A few weeks ago I came across this problem:
Find a number consisting of 9 digits in which each of the digits from 1 to 9 appears only once. This number must also satisfy these divisibility requirements:
- The number should be divisible by 9.
- If the rightmost digit is removed, the remaining number should be divisible by 8.
- If the rightmost digit of the new number is removed, the remaining number should be divisible by 7.
- And so on, until there’s only one digit (which will necessarily be divisible by 1).
This problem has been around for a few years and many people have solved it using a variety of different programming languages. I decided to tackle it using Verilog targeting an Altera Cyclone IV FPGA development board. My design uses the board’s 50MHz clock and finds the solution in approximately 2.2ms.
There are three main components to the design: a module to calculate all of the unique combinations of the digits, a module to perform the division on the resulting combinations, and a LCD interface module to display the answer once found.
I started with pen and paper writing out all of the unique combinations for small sets of symbols to get a feel for the underlying algorithm.

It became clear that the process is recursive: to find all the unique combinations of the elements in a set you pull out each element in turn and find the unique combinations of the remaining elements. I try to show this process in the graph below.

Traversing every path from trunk to leaf yields all of the combinations. Notice that each node with n elements has n branches. This means that a set with n unique elements has n! unique combinations. Therefore, for this problem with nine digits there are 9! = 362,880 combinations to check (worst case).
At first I was afraid this would be one of those problems that just doesn’t solve cleanly without using recursion. Real recursion can’t be implemented with synthesizable Verilog (and it just doesn’t have that warm fuzzy feeling of a bunch of logic gates working together in parallel). Fortunately, I was able to unwrap the recursive process into something reasonable using for loops. Here’s some C-like pseudo-code for finding combinations of a set with 4 characters:
set = "abcd"
for (i=0; i<4; i++) {
stage1 = set
stage1[0] = set[i]
stage1[i] = set[0]
for (j=1; j<4; j++) {
stage2 = stage1
stage2[1] = stage1[j]
stage2[j] = stage1[1]
for (k=2; k<4; k++) {
stage3 = stage2
stage3[2] = stage2[k]
stage3[k] = stage2[2]
}
}
}
I implemented this by cascading a series of counters with a module to handle digit swapping. For a number with n unique digits we need n-1 stages. Note that it’s n-1 and not n stages because the nth stage is the trivial case of finding all the combinations of a set with just one element. Here’s a schematic showing how it looks for a set with 3 digits.

The counter to implement the for loop functionality was straight forward. Each counter increments by 1 when the downstream counter has maxed out. Once a counter hits the max value it rolls over back to its initial value.
always@(posedge rst_i, posedge clk_i) begin
if (rst_i) begin
cntr_o <= CNTR_INIT;
end else if (incr_i) begin
if (cntr_o < CNTR_LIMIT) begin
cntr_o <= cntr_o + 1;
end else begin
cntr_o <= CNTR_INIT;
end
end
end
assign done_o = ena_i & (cntr_o == CNTR_LIMIT);
Note that for the counter’s initial value and limit (maximum) are constants so I made them parameters instead of inputs to the module.
Below is the part of DigitSwap that does the swapping:
always@(*) begin
digits_o = digits_i;
digits_o[DIGIT(position1_i)] = digits_i[DIGIT(position2_i)];
digits_o[DIGIT(position2_i)] = digits_i[DIGIT(position1_i)];
end
Notice that this is a combinatorial process with blocking assignments (rather than synchronous non-blocking) so that every stage swaps its digits in the same clock cycle. This results in a unique combination every clock. Also, I defined a macro called `DIGIT (which makes indexing the digits in a bit array more friendly) but I’ve removed the leading ` in these code snippets because of a quirk in the syntax highlighting.
Finally, I used a generate loop to cascade the modules together:
genvar c;
generate
for (c=1; c<NUM_OF_DIGITS; c=c+1) begin : GEN_COMBINATIONS
assign incr[c] = done[c+1];
Counter #(
.CNTR_INIT ( c ),
.CNTR_LIMIT ( NUM_OF_DIGITS )
)
Counter(
.rst_i ( rst_i ),
.clk_i ( clk_i ),
.ena_i ( incr[c] ),
.cntr_o ( cntr[c] ),
.done_o ( done[c] )
);
DigitSwap #(
.NUM_OF_DIGITS ( NUM_OF_DIGITS ),
.DIGIT_WIDTH ( DIGIT_WIDTH )
)
DigitSwap(
.clk_i ( clk_i ),
.digits_i ( digits[c-1] ),
.position1_i ( cntr[c] ),
.position2_i ( c ),
.digits_o ( digits[c] )
);
end
endgenerate
assign done[NUM_OF_DIGITS] = comb_ena_i;
assign comb_vld_o = 1;
assign comb_digits_o = digits[NUM_OF_DIGITS-1];
Now that all of the combinations are generated the next step is to check them. However, up until this point the digits were just packed into an array of bits (like BCD). These digits must first be converted into an actual binary number (for example,’h01020304 => ‘d1234). I’m pretty happy with the conversion method I came up with. I was amazed that it met timing and that I didn’t have to pipeline it:
always@(posedge clk_i) begin
binary_o = 0;
for (d=1; d<=NUM_OF_DIGITS; d=d+1) begin
binary_o = binary_o + digits_i[DIGIT(d)]*(10 ** (NUM_OF_DIGITS-d));
end
done_o <= ena_i;
end
Now it’s just divide and check.

All eight dividers operate in parallel. When all of the dividers report a zero remainder then the solution has been found. I decided against being clever and didn’t use any tricks to do the division. Each division is done with a divider module generated by the Altera tools. They were instantiated using a generate loop (similar to the one shown earlier).
Once I had my answer I just had to display it on the LCD. This part was fairly anti-climatic and kind of a pain. At least now I have a LCD interface module that I can use in the future.
Here’s what it looks like:

Someone on Reddit was asking about implementing this function in synthesizable Verilog:
\[\Large{\frac{x}{3}e^{(1-\frac{x}{3})}}\]
I came up with this (meta) solution that uses behavioral Verilog to generate a synthesizable Verilog case statement which implements the desired function as a look-up table:
module SomebodysHomework;
parameter BITS_OF_PRECISION = 8;
parameter SCALING_FACTOR = 1.0;
real x = 0.0;
integer result = -1;
function integer JustSomeMath;
input real x;
parameter real e = 2.7183;
real y;
begin
y = x/3.0 * e ** (1.0 - x/3.0);
JustSomeMath = integer(y * (1<<BITS_OF_PRECISION));
end
endfunction
initial begin
$display ("case (x)");
while (result != 0) begin
#1; x = x + 1;
result = JustSomeMath(x/SCALING_FACTOR);
$display(" %0d : answer = %0d;", x, result);
end
$display(" default: answer = 0;");
$display("endcase");
end
endmodule
Here’s part of the case statement that gets generated by the code above.
case (x)
1 : answer = 166;
2 : answer = 238;
3 : answer = 256;
...
27 : answer = 1;
28 : answer = 1;
29 : answer = 0;
default: answer = 0;
endcase
Here are some pretty pictures of the results with different values for BITS_OF_PRECISION and SCALING_FACTOR.

