--- /dev/null
+---
+postid: 06c
+title: Egyptian mul and div "work": a correctness proof for the arithmetic operations in FFA Chapter 5
+date: February 23, 2018
+author: Lucian Mogoșanu
+tags: math
+---
+
+In the [fifth chapter][ffa-ch5] of the [FFA][ffa] series, Stanislav
+presents a slow yet entirely usable constant-time implementation of
+division and multiplication, and most importantly, one that
+[fits in head][fits-in-head]. Following up on our
+[previous exercise][ffacalc-puzzle], we read the chapter and finally
+arrive at a set of challenges. Cherry-picking[^1] through them, we see:
+
+> * Prove that the provided Division and Multiplication operations,
+> work.
+
+This article is an exercise in exactly that. But before delving into the
+gory details, we must clarify what it means (from the author's point of
+view and in this particular context) a. to prove something, and b. for
+something to "work".
+
+While the proof of correctness is indeed mathematical, it is not of the
+"formal verification" type in the sense of the
+[mechanical gargle][reversing-lists] found in today's literature. That
+said, we aim to import a minimal body of prior knowledge, consisting
+mostly of basic arithmetic and high-school algebra. We will also, when
+absolutely needed, introduce assumptions, and state them explicitly as
+much as possible.
+
+The article presents an *algorithmic* proof. That is, we read through
+the code, manually translate it into very similar pseudo-code where
+needed, and then manually translate that into mathematical expressions,
+for the purpose of algebraic manipulation. Thus the egyptian division
+and multiplication implementations "work" only in the sense that we
+carefully examine and understand them, and then in the sense that their
+algorithmic counterparts are led through a deductive process which
+brings us to the conclusion. Sadly, we can make no strong statements
+about either the compiler[^2] or the hardware (architecture and
+implementation thereof), so we are forced to make the otherwise
+unwarranted assumption that they also work.
+
+We also assume that the basic operations presented in previous
+[FFA chapters][ffa] -- e.g. addition, subtraction, muxing, shifts --
+work by the definition of "works" in the previous paragraph. The reader
+is of course encouraged to not take this statement at face value and
+(re)examine the implementation of the operations carefully.
+
+We divide the remainder of this article into two pieces. As both the
+implementation and the underlying equations for egyptian multiplication
+are simpler, we begin with a correctness proof for it, and only then
+continue with the division operation.
+
+## Part the first: multiplication
+
+We begin with a description and a code snippet[^3] of
+`FZ_Mul_Egyptian`. The partial and final results of the operation are
+held in a variable (`XY`) double the size of the inputs (`X` and
+`Y`). At first, `XY` is cleared (all bits set to zero) and a pair of
+"slides", `XS` and `YS`, is initialized with the values of `X` and `Y`
+respectively; in particular, `XS` is the same size as `XY`, as `X` must
+be repeatedly shifted and added into `XY`.
+
+Then the implementation loops over `YS`, also modifying `XS`:
+
+~~~~ {.ada}
+ -- For each bit of Y:
+ for i in 1 .. FZ_Bitness(Y) loop
+
+ -- If lowest bit of Y-Slide is 1, X-Slide is added into XY
+ FZ_Add_Gated(X => XY, Y => XS, Sum => XY,
+ Gate => FZ_OddP(YS));
+
+ -- X-Slide := X-Slide * 2
+ FZ_ShiftLeft(XS, XS, 1);
+
+ -- Y-Slide := Y-Slide / 2
+ FZ_ShiftRight(YS, YS, 1);
+
+ end loop;
+~~~~
+
+Then the output variables are set to `XY`.
+
+The first observation we make is that, `FZ_Add_Gated` being a
+conditional add operation very similar to `FZ_Mux`, we can also look at
+it as:
+
+~~~~ {.ada}
+ -- If lowest bit of Y-Slide is 1, X-Slide is added into XY
+ if FZ_OddP(YS) then
+ FZ_Add(X => XY, Y => XS, Sum => XY);
+ else
+ FZ_Add(X => XY, Y => FZ_Zero, Sum => XY);
+ end if;
+~~~~
+
+where `FZ_Zero` is a `FZ` constant of the same size as `XY` and `XS`,
+with all bits set to zero. Using our previous assumption about the
+correctness of `FZ_Add`, we can rewrite this in the following pseudocode
+form:
+
+~~~~
+ if YSbit(1) is 1 then
+ XY := XY + XS;
+ else -- YSbit(1) is 0
+ XY := XY + 0;
+ end if;
+~~~~
+
+where `+` here is an equivalent of `FZ_Add` (and consequently we will
+later assume that `-` represents an equivalent of `FZ_Sub`) and
+`YSbit(I)` represents the `I`th bit of `YS`, counting from the least
+significant end. We will further say that `FZ_Bitness(Y) = N`, so that
+`1 <= I <= N`, i.e. bit 1 is the least significant bit of `YS`, while
+bit N is the most significant.
+
+Observe that `XY` is updated (`XS` is added to it) only when `YSbit(1) =
+1`. Thus we can make one (final) modification to this snippet, by
+explicitly introducing `YSbit(1)` into the update instruction. The
+equivalent (branchless!) assignment would be:
+
+~~~~
+ XY := XY + YSbit(1) * XS;
+~~~~
+
+where `*` here denotes a multiplication operation between a bit and a
+`FZ`, which in this case is perfectly equivalent to the muxing in
+`FZ_Add_Gated`.
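+
+To make the equivalence concrete on a single machine word: for a gate
+bit `G` in `{0, 1}` and a modular word `W`, the product `G * W`
+coincides with `W` masked by the two's-complement negation of `G`, i.e.
+with what a mux selecting between `W` and zero yields. The sketch below
+merely illustrates this identity on `Unsigned_64`; it is not a claim
+about how `FZ_Add_Gated` gates its addend internally:
+
+~~~~ {.ada}
+with Interfaces;  use Interfaces;
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Gate_Sketch is
+   -- For Gate in {0, 1}:  Gate * W  =  W and (0 - Gate);
+   -- 0 - 1 wraps to all-ones on a modular type, 0 - 0 stays zero.
+   W : constant Unsigned_64 := 16#DEAD_BEEF#;
+begin
+   for Gate in Unsigned_64 range 0 .. 1 loop
+      Put_Line (Boolean'Image (Gate * W = (W and (0 - Gate))));  -- TRUE
+   end loop;
+end Gate_Sketch;
+~~~~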
+
+Furthermore we rewrite the shifting operations as equivalent operations
+that multiply (`*`) and divide (`/`) by 2:
+
+~~~~
+ -- X-Slide
+ XS := XS * 2;
+
+ -- Y-Slide
+ YS := YS / 2;
+~~~~
+
+The full algorithmic translation of `FZ_Mul_Egyptian` then becomes:
+
+~~~~
+FZ_Mul_Egyptian(X, Y) is:
+-- Initialize variables
+1. XY := 0;
+2. XS := X;
+3. YS := Y;
+-- Over the bits of Y
+4. for I in 1 .. N loop
+ -- Partial product
+5. XY := XY + YSbit(1) * XS;
+ -- Partial X-Slide
+6. XS := XS * 2;
+ -- Partial Y-Slide
+7. YS := YS / 2;
+8. end loop;
+9. Result := XY;
+~~~~
+
+This version of the algorithm looks pretty good and it can help us model
+the problem mathematically, but it's not quite the final version. Diana
+[identified][diana-ys] that instead of using the first bit of `YS` for
+muxing and then shifting `YS` to the right at each step, one can
+optimize the execution time of `FZ_Mul_Egyptian` by simply using the
+`I`th bit of `Y`. We'll do the same here, noting that this makes our
+algorithm smaller and helps us more easily identify an
+[inductive][induction] pattern. Let's first view the modified (and
+final) version of the egyptian multiplication algorithm:
+
+~~~~
+FZ_Mul_Egyptian(X, Y) is:
+-- Initialize variables
+1. XY := 0;
+2. XS := X;
+-- Over the bits of Y
+3. for I in 1 .. N loop
+ -- Partial product
+4. XY := XY + Ybit(I) * XS;
+ -- Partial X-Slide
+5. XS := XS * 2;
+6. end loop;
+7. Result := XY;
+~~~~
+
+Note the changes: a. we removed the Y-slide; and b. at each step of the
+loop, `Ybit(I)` is used instead of `YSbit(1)`.
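+
+To make the correspondence a bit more tangible before we turn to the
+mathematics, here is a minimal sketch of this final algorithmic form on
+plain 64-bit machine words -- emphatically not FFA code, with none of
+its constant-time or arbitrary-width properties; the bitness `N` and the
+test values below are picked arbitrarily for illustration:
+
+~~~~ {.ada}
+with Interfaces;  use Interfaces;
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Mul_Sketch is
+
+   N : constant := 16; -- illustrative bitness, not an FFA constant
+
+   function Mul_Egyptian (X, Y : Unsigned_64) return Unsigned_64 is
+      XY : Unsigned_64 := 0;
+      XS : Unsigned_64 := X;
+   begin
+      for I in 1 .. N loop
+         -- Step 4 above: XY := XY + Ybit(I) * XS,
+         -- where Ybit(I) is the Ith (least significant first) bit of Y
+         XY := XY + ((Y / 2 ** (I - 1)) mod 2) * XS;
+         -- Step 5 above: XS := XS * 2
+         XS := XS * 2;
+      end loop;
+      return XY;
+   end Mul_Egyptian;
+
+begin
+   -- Spot-check against the built-in '*':
+   Put_Line (Unsigned_64'Image (Mul_Egyptian (177, 229)));  -- 40533
+   Put_Line (Unsigned_64'Image (177 * 229));                -- 40533
+end Mul_Sketch;
+~~~~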
+
+Now we can begin to reason about the hardest part of this algorithm, the
+loop. While, as described, the loop goes over the bits of `Y` (i.e. the
+bit index `I` is our "induction variable"), we can see it as the
+computation of two recurrence relations, that is, of the following
+finite sequences. We begin with `XS`:
+
+~~~~
+XS(0) = X
+XS(I) = XS(I-1) * 2, 1 <= I <= N
+~~~~
+
+This has so far been the largest leap in logic in our article, crossing
+suddenly from an algorithmic view of the problem to a mathematical one;
+thus I ask the reader to sit for a moment and contemplate this leap. At
+the end, the correspondence between the `XS(I)` in the recurrence and
+the `XS` in the algorithm should hopefully become obvious: at each step,
+the value of `XS` is doubled, and this has an algebraic significance, as
+we will see in a moment. Now we model the update of `XY` similarly:
+
+~~~~
+XY(0) = 0
+XY(I) = XY(I-1) + Ybit(I) * XS(I-1), 1 <= I <= N
+~~~~
+
+Observe how `XY(I)` is computed based on `XS(I-1)`; to get this
+intuition, try to mentally (or on paper) run the first step of the
+algorithm: `XY` is updated *before* `XS`, and thus `XY(1)` is computed
+based on `XS(0)`, `XY(2)` based on `XS(1)` and so on.
+
+The problem of computing the arithmetic product egyptologically then
+becomes equivalent to finding the value of `XY(N)`. And thus in order to
+show that egyptian multiplication is correct, we need to find the
+general form of this sequence. This can be solved by induction. First we
+need to do `XS`:
+
+~~~~
+XS(0) = X
+XS(1) = XS(0) * 2 = X * 2
+XS(2) = XS(1) * 2 = X * 2 * 2 = X * 2^2
+...
+-- by induction
+XS(I) = XS(I-1) * 2 = X * 2^(I-1) * 2 = X * 2^I
+-- thus, the general form of XS(I)
+XS(I) = X * 2^I
+~~~~
+
+Now for the harder part: we do the same for `XY`:
+
+~~~~
+XY(0) = 0
+XY(1) = XY(0) + Ybit(1) * XS(0)
+ = 0 + Ybit(1) * X = Ybit(1) * X
+XY(2) = XY(1) + Ybit(2) * XS(1)
+ = Ybit(1) * X + Ybit(2) * X * 2
+-- let's do one more
+XY(3) = XY(2) + Ybit(3) * XS(2)
+ = Ybit(1) * X * 2^0 +
+ Ybit(2) * X * 2^1 +
+ Ybit(3) * X * 2^2
+...
+-- induction hypothesis
+XY(N-1) = Ybit(1) * X * 2^0 +
+ Ybit(2) * X * 2^1 +
+ ... +
+ Ybit(N-1) * X * 2^(N-2)
+-- induction step proof
+XY(N) = XY(N-1) + Ybit(N) * XS(N-1) -- subst. XS(N-1)
+ = XY(N-1) + Ybit(N) * X * 2^(N-1) -- subst. ind. hyp.
+ = Ybit(1) * X * 2^0 +
+ Ybit(2) * X * 2^1 +
+ ... +
+ Ybit(N-1) * X * 2^(N-2) +
+ Ybit(N) * X * 2^(N-1)
+-- QED.
+~~~~
+
+The reader is of course encouraged to replicate the result using pen and
+paper.
+
+Now that we have the general form of `XY(N)` beneath our eyes, the
+question remains: what do we do with it? The astute reader has probably
+observed that we have been juggling several meanings of our ideal
+multiplication operation, i.e. `*`. For example `Ybit(I) * X` has a
+somewhat different meaning than `X * 2^(I-1)`. We will therefore intuit
+that `*` is in fact arithmetic multiplication, with the very same
+properties that we know from school. We then observe that we can
+factor out the common `X` in the sum, and thus `XY(N)` becomes:
+
+~~~~
+XY(N) = X * [Ybit(1) * 2^0 + Ybit(2) * 2^1 + ... + Ybit(N) * 2^(N-1)]
+~~~~
+
+which brings us to some notions of basic algebra and number theory:
+given an arbitrary natural number `K` and a base `B > 1`, `K` can be
+decomposed into the sum:
+
+~~~~
+K(1) * B^0 + K(2) * B^1 + K(3) * B^2 + ...
+~~~~
+
+where `0 <= K(I) < B`, i.e. `K(I)` is a "modulo-B digit". Our particular
+`K(I)` is `Ybit(I)` and it denotes the elements of the bit-vector
+`Y`. From this it follows that the sum-of-products factor in the
+definition of `XY(N)` is in fact `Y`, and thus `XY(N)` becomes:
+
+~~~~
+XY(N) = X * Y
+~~~~
+
+This then settles it. A number of refinement steps have brought us from
+repeated `FZ_Add` operations to the mathematical `*`; the reverse
+(un-)refinement is also possible[^4], which serves to prove that the
+algorithm implemented by `FZ_Mul_Egyptian` and the algebraic `*` are
+equivalent. The reader is, of course, encouraged to challenge this.
+
+More importantly, these refinement steps serve to show us what egyptian
+multiplication *is*: given a base `B` and two numbers `X` and `Y`,
+egyptian multiplication finds the answer to `X * Y` by going over the
+digits of `Y` (in base-`B` representation), each time adding the current
+slide of `X` into the result -- scaled by the current digit -- and then
+multiplying the slide by `B`. In base 2 this scaling is just the
+conditional add we saw above; in a larger base it amounts to at most
+`B - 1` repeated additions (or one single-digit multiplication).
+Programmers can then implement the same algorithm on ternary or
+quaternary computers, etc.[^5]
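+
+The following is a sketch of that base-`B` generalization on machine
+words, under the assumption that a single-digit multiplication (or,
+equivalently, up to `B - 1` repeated additions of the slide) is
+available; again, this is an illustration, not FFA code, and `B`, `N`
+and the test values are arbitrary:
+
+~~~~ {.ada}
+with Interfaces;  use Interfaces;
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Mul_Base_B_Sketch is
+
+   B : constant := 3; -- ternary, for illustration
+   N : constant := 8; -- number of base-B digits considered
+
+   function Mul_Egyptian_Base_B (X, Y : Unsigned_64) return Unsigned_64 is
+      XY : Unsigned_64 := 0;
+      XS : Unsigned_64 := X;
+      YS : Unsigned_64 := Y;
+   begin
+      for I in 1 .. N loop
+         -- Add the slide, scaled by the current base-B digit of Y
+         XY := XY + (YS mod B) * XS;
+         -- Slide X up by one digit, slide Y down by one digit
+         XS := XS * B;
+         YS := YS / B;
+      end loop;
+      return XY;
+   end Mul_Egyptian_Base_B;
+
+begin
+   Put_Line (Unsigned_64'Image (Mul_Egyptian_Base_B (177, 229)));  -- 40533
+end Mul_Base_B_Sketch;
+~~~~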
+
+We move on to the trickier part of our article: egyptian division.
+
+## Part the second: division
+
+Egyptian division is very similar to its multiplication counterpart,
+only it is considerably trickier to comprehend. As `X * Y` means "to add
+`X` a number of times equal to `Y`", so does `X / Y` mean "to subtract
+`Y` from `X` a number of times equal to `Q`, until a number `R` smaller
+than `Y` remains". The egyptological principle remains the same: we are
+in base 2 and we shift, subtract and conditionally add back `N`
+times. But before laying the implementation of `FZ_IDiv` before our
+eyes, let us consider a few aspects inherent to division.
+
+Computing `X / Y` involves finding two numbers, `Q` and `R`, so that `X
+= Y * Q + R`. We call `X` the dividend and `Y` the divisor; and an old
+theorem called the quotient-remainder theorem[^6] guarantees us that for
+an arbitrary natural `X` and a nonzero natural `Y`, there exist `Q` and
+`R` with `0 <= R < Y` satisfying the relation given here; and
+furthermore, that these `Q` and `R` *are unique*. For example, both
+`13 = 3 * 4 + 1` and `13 = 3 * 3 + 4` hold, but only the first pair
+satisfies `0 <= R < 3`; thus `13 / 3` yields `Q = 4` and `R = 1`.
+`FZ_IDiv` is thus supposed to correctly find these two numbers.
+
+Now, for the implementation. `FZ_IDiv` declares `Q` and `R` as a single
+register, `QR`, whose `N` least significant bits are initially equal to
+the dividend, and at the end will contain `Q`. The rest of the bits in
+`QR` are initially zero, and at the end will contain the value of
+`R`. The part of the loop that we will reason about inductively looks as
+follows:
+
+~~~~ {.ada}
+ for i in 1 .. FZ_Bitness(Dividend) loop
+
+ -- Advance QR by 1 bit:
+ FZ_ShiftLeft(QR, QR, 1);
+
+ -- Subtract Divisor from R; Underflow goes into C
+ FZ_Sub(X => R, Y => Divisor, Difference => R, Underflow => C);
+
+ -- If C=1, subtraction underflowed, and then Divisor gets added back:
+ FZ_Add_Gated(X => R, Y => Divisor, Gate => C, Sum => R);
+
+ -- Current result-bit is equal to Not-C, i.e. 1 if Divisor 'went in'
+ FZ_Or_W(Q, W_Not(C));
+
+ end loop;
+~~~~
+
+We take the left shift first. Shifting `QR` does (from an arithmetic
+point of view) two things: it multiplies `Q` by 2 and it overflows the
+most significant bit of `Q` into `R`; the latter amounts to multiplying
+`R` by 2 and adding that bit of `Q` to it. We can thus rewrite the shift
+as:
+
+~~~~
+ -- Advance QR by 1 bit:
+ R := R * 2 + Qbit(N);
+ Q := Q * 2; -- NB: mod 2^N!
+~~~~
+
+The remaining operations in the loop depend upon the result of the call
+to `FZ_Sub`, which acts as an implicit `R < Divisor` comparison: the
+underflow `C` is 1 exactly when `R < Divisor`. If `R < Divisor`, then
+the subtraction is undone using `FZ_Add_Gated` and the least significant
+bit of `Q` is set to zero; else, `Divisor` remains subtracted from `R`
+and the least significant bit of `Q` is set to one. As in the case of
+multiplication, we rewrite this into pseudo-code using explicit
+conditionals:
+
+~~~~
+ -- Check (R - Divisor) underflow
+ if R < Divisor then
+ R := R - 0;
+ Q := Q + 0;
+ else
+ R := R - Divisor;
+ Q := Q + 1;
+ end if;
+~~~~
+
+The intuition is exactly the one from long division: once (or if) enough
+bits of the dividend have overflowed into `R`, we subtract the divisor
+and set the digit (i.e. bit) of the quotient accordingly. As with
+multiplication, we can get rid of the complicated dependency on specific
+bits at each step. We observe that at the first step (`I = 1`) `Qbit(N)`
+addresses the `N`th bit of `Dividend`, at `I = 2` the `(N-1)`th, and so
+on, until at `I = N` it addresses the first bit of `Dividend`. Thus we
+make the first update of `R` depend on `Dividendbit(N-I+1)` instead of
+`Qbit(N)`.
+
+Thus the algorithmic form (or one of the forms) of `FZ_IDiv` is:
+
+~~~~
+FZ_IDiv(Dividend, Divisor) is:
+-- Initialize variables
+1. Q := Dividend;
+2. R := 0;
+-- Over the bits of Dividend
+3. for I in 1 .. N loop
+ -- Update non-conditional values of R and Q
+4. R := R * 2 + Dividendbit(N-I+1);
+5. Q := Q * 2;
+ -- Conditional updates of R and Q
+6. if R < Divisor then
+7. R := R - 0;
+8. Q := Q + 0;
+9. else
+10. R := R - Divisor;
+11. Q := Q + 1;
+12. end if;
+13. end loop;
+14. Result := Q, R;
+~~~~
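+
+As with multiplication, a minimal machine-word sketch of this
+algorithmic form may help in convincing ourselves that it behaves as
+advertised before we prove it. Again, this is not FFA code: the bitness
+`N` and the test values are arbitrary, and a nonzero divisor is assumed:
+
+~~~~ {.ada}
+with Interfaces;  use Interfaces;
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Div_Sketch is
+
+   N : constant := 16; -- illustrative bitness, not an FFA constant
+
+   Quotient, Remainder : Unsigned_64;
+
+   -- Divisor is assumed nonzero.
+   procedure Div_Egyptian (Dividend, Divisor : in  Unsigned_64;
+                           Q, R              : out Unsigned_64) is
+   begin
+      Q := Dividend;
+      R := 0;
+      for I in 1 .. N loop
+         -- Steps 4 and 5 above; "mod 2**N" keeps Q at N bits, its
+         -- senior bit having just been pushed into R
+         R := R * 2 + ((Dividend / 2 ** (N - I)) mod 2);
+         Q := (Q * 2) mod 2 ** N;
+         -- Steps 6 onward: subtract when possible, record the quotient bit
+         if R >= Divisor then
+            R := R - Divisor;
+            Q := Q + 1;
+         end if;
+      end loop;
+   end Div_Egyptian;
+
+begin
+   Div_Egyptian (40533, 229, Quotient, Remainder);
+   Put_Line (Unsigned_64'Image (Quotient)
+             & Unsigned_64'Image (Remainder));  -- 177 0
+   Div_Egyptian (40534, 229, Quotient, Remainder);
+   Put_Line (Unsigned_64'Image (Quotient)
+             & Unsigned_64'Image (Remainder));  -- 177 1
+end Div_Sketch;
+~~~~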
+
+Now comes the tricky part, that is, putting the algorithm in a
+mathematical form that allows us to reason about it. At least two
+aspects are not exactly easy to spot: `R` and `Q` suffer two updates
+each, and the `R` in `R < Divisor` (on which the conditional updates to
+`Q` and `R` depend) is not the same `R` as in the previous iteration of
+the loop. We'll try to untangle this first, by defining a
+*conditional*-`R` (`CR`) which maps to the first update to `R`:
+
+~~~~
+-- for 1 <= I <= N
+CR(I) = R(I-1) * 2 + Dividendbit(N-I+1)
+~~~~
+
+Next, since it's not obvious yet how we could get rid of the explicit
+conditionals -- as we did with multiplication by factoring in `Ybit(.)`
+-- we define *deltas* for `Q` and `R`, depending explicitly on
+`CR(I)`. That is:
+
+~~~~
+-- for 1 <= I <= N
+DR(I) = { 0, if CR(I) < Divisor
+ Divisor, otherwise }
+DQ(I) = { 0, if CR(I) < Divisor
+ 1, otherwise }
+~~~~
+
+The recurrence relation for `R` is then:
+
+~~~~
+R(0) = 0
+R(I) = CR(I) - DR(I)
+ = R(I-1) * 2 + Dividendbit(N-I+1) - DR(I),
+ 1 <= I <= N
+~~~~
+
+and for `Q`:
+
+~~~~
+Q(0) = Dividend
+Q(I) = Q(I-1) * 2 + DQ(I), 1 <= I <= N
+~~~~
+
+As before, these recurrences are a leap from the algorithmic to the
+mathematical, so it might be completely non-obvious where the relations
+come from. The reader is recommended to go through the mental gymnastics
+necessary to obtain them by using pen and paper. At the end it should be
+clear that `Q(N)` is the resulting quotient, while `R(N)` is the
+resulting remainder; what remains to be shown is that the equation:
+
+~~~~
+Dividend = Divisor * Q(N) + R(N)
+~~~~
+
+holds true.
+
+Now let's try to figure out a general form for `Q(I)` first, since it is
+the simpler of the two:
+
+~~~~
+Q(0) = Dividend
+Q(1) = Q(0) * 2 + DQ(1) = Dividend * 2 + DQ(1)
+Q(2) = Q(1) * 2 + DQ(2)
+ = [Dividend * 2 + DQ(1)] * 2 + DQ(2)
+ = Dividend * 2^2 + DQ(1) * 2 + DQ(2)
+-- let's do one more
+Q(3) = Q(2) * 2 + DQ(3)
+ = [Dividend * 2^2 + DQ(1) * 2 + DQ(2)] * 2 + DQ(3)
+ = Dividend * 2^3 + DQ(1) * 2^2 + DQ(2) * 2 + DQ(3)
+...
+-- induction hypothesis
+Q(N-1) = Dividend * 2^(N-1) +
+ DQ(1) * 2^(N-2) +
+ DQ(2) * 2^(N-3) +
+ ... +
+ DQ(N-2) * 2^1 +
+ DQ(N-1) * 2^0
+-- induction step proof
+Q(N) = Q(N-1) * 2 + DQ(N) -- subst. Q(N-1)
+ = [Dividend * 2^(N-1) +
+ DQ(1) * 2^(N-2) +
+ DQ(2) * 2^(N-3) +
+ ... +
+ DQ(N-2) * 2^1 +
+ DQ(N-1) * 2^0 ]
+ * 2 + DQ(N)
+ = Dividend * 2^N +
+ DQ(1) * 2^(N-1) +
+ DQ(2) * 2^(N-2) +
+ ... +
+ DQ(N-1) * 2^1 +
+ DQ(N) * 2^0
+-- QED.
+~~~~
+
+Moreover, the updates to `Q` are performed modulo `2^N` (we fit the
+results in `N` bits; the senior bit shifted out of `Q` goes into `R`),
+so the term `Dividend * 2^N` vanishes. We are left with a base-2 number:
+
+~~~~
+Q(N) = DQ(1) * 2^(N-1) + DQ(2) * 2^(N-2) + ... + DQ(N) * 2^0
+~~~~
+
+As for `R(I)`:
+
+~~~~
+R(0) = 0
+R(1) = R(0) * 2 + Dividendbit(N) - DR(1)
+ = Dividendbit(N) - DR(1)
+R(2) = R(1) * 2 + Dividendbit(N-1) - DR(2)
+ = [Dividendbit(N) - DR(1)] * 2 + Dividendbit(N-1) - DR(2)
+ = Dividendbit(N) * 2 - DR(1) * 2 + Dividendbit(N-1) - DR(2)
+-- one moar
+R(3) = R(2) * 2 + Dividendbit(N-2) - DR(3)
+ = [Dividendbit(N) * 2 + Dividendbit(N-1) - DR(1) * 2 - DR(2)] * 2 +
+ Dividendbit(N-2) - DR(3)
+ = [Dividendbit(N) * 2^2 +
+ Dividendbit(N-1) * 2 +
+ Dividendbit(N-2)] -
+ [DR(1) * 2^2 + DR(2) * 2 + DR(3)]
+~~~~
+
+Observe how we grouped the terms by `Dividendbit(.)` and `DR(.)`; we
+anticipate that `R(.)` will turn out to be the difference of two
+numbers. Now for the induction hypothesis and step substitution we use a
+generic index `I`, so as not to confuse it with the `N` in
+`Dividendbit(N)`. We are very careful to match the index of `R(.)` with
+the powers of two in the terms:
+
+~~~~
+-- induction hypothesis
+R(I-1) = [Dividendbit(N) * 2^(I-2) +
+ Dividendbit(N-1) * 2^(I-3) +
+ ... +
+ Dividendbit(N-I+3) * 2^1 +
+ Dividendbit(N-I+2) * 2^0] -
+ [DR(1) * 2^(I-2) +
+ DR(2) * 2^(I-3) +
+ ... +
+ DR(I-2) * 2^1 +
+ DR(I-1) * 2^0]
+-- induction step
+R(I) = R(I-1) * 2 + Dividendbit(N-I+1) - DR(I)
+ = [Dividendbit(N) * 2^(I-2) +
+ ... +
+ Dividendbit(N-I+2)] * 2 -
+ [DR(1) * 2^(I-2) +
+ ... +
+ DR(I-1)] * 2 +
+ Dividendbit(N-I+1) - DR(I)
+ = [Dividendbit(N) * 2^(I-1) +
+ Dividendbit(N-1) * 2^(I-2) +
+ ... +
+ Dividendbit(N-I+2) * 2^1 +
+ Dividendbit(N-I+1) * 2^0] -
+ [DR(1) * 2^(I-1) +
+ DR(2) * 2^(I-2) +
+ ... +
+ DR(I-1) * 2^1 +
+ DR(I) * 2^0]
+-- QED.
+~~~~
+
+Then `R(N)` is:
+
+~~~~
+R(N) = [Dividendbit(N) * 2^(N-1) +
+ Dividendbit(N-1) * 2^(N-2) +
+ ... +
+ Dividendbit(2) * 2^1 +
+ Dividendbit(1) * 2^0] -
+ [DR(1) * 2^(N-1) + DR(2) * 2^(N-2) + ... + DR(N)]
+~~~~
+
+The reader is again recommended to follow the reasoning using pen and
+paper.
+
+The (by now very) keen reader will notice that the first bracket in the
+definition of `R(N)` is the very base-2 decomposition of `Dividend`; the
+second bracket doesn't seem to be anything in particular, so we'll give
+it the (not-so-inspired) name `DR`, leading to:
+
+~~~~
+R(N) = Dividend - DR
+-- same as:
+Dividend = DR + R(N)
+~~~~
+
+At first glance, it would seem that we're stuck, but there's a
+wonderful relation hidden behind the equation above. Remember, our goal
+is to find out whether:
+
+~~~~
+Dividend = Divisor * Q(N) + R(N)
+~~~~
+
+and we also know that:
+
+~~~~
+-- for 1 <= I <= N
+DR(I) = { 0, if CR(I) < Divisor
+ Divisor, otherwise }
+DQ(I) = { 0, if CR(I) < Divisor
+ 1, otherwise }
+~~~~
+
+So what if we wrote `DR(I)` this way?
+
+~~~~
+DR(I) = Divisor * DQ(I), 1 <= I <= N
+~~~~
+
+This relation is perfectly valid -- when `CR(I) < Divisor` both sides
+are zero, and otherwise both equal `Divisor` -- and it helps us rewrite
+`DR` as:
+
+~~~~
+DR = DR(1) * 2^(N-1) + DR(2) * 2^(N-2) + ... + DR(N)
+ = Divisor * DQ(1) * 2^(N-1) +
+ Divisor * DQ(2) * 2^(N-2) +
+ ... +
+ Divisor * DQ(N) * 2^0
+ = Divisor * [DQ(1) * 2^(N-1) + DQ(2) * 2^(N-2) + ... + DQ(N)]
+ = Divisor * Q(N)
+~~~~
+
+which seals the deal for us.
+
+Let's now take a few steps back and explain how long egyptological
+division works (a small machine trace follows the list below):
+
+1. We initialize `R(0)` and `Q(0)`.
+2. At each step `I` we:
+ a. compute a `CR(I)`, deciding whether the divisor should be
+ subtracted;
+ b. based on `CR(I)`, compute `DQ(I)` and `Q(I)`, adding 1 to the
+ quotient when a successful subtraction was performed; and
+ c. compute `DR(I)` and `R(I)`, which performs the actual
+ subtraction when `CR(I) >= Divisor`.
+3. At the end of `N` steps, we will have computed `R(N)` and `Q(N)`, the
+ unique values for which the quotient-remainder theorem holds.
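+
+To tie the summary together, it may help to trace the recurrences on a
+tiny example. The sketch below (again arbitrary values, not FFA code)
+prints `CR(I)`, `DQ(I)`, `R(I)` and `Q(I)` for `13 / 3` with `N = 4`,
+ending with `Q = 4` and `R = 1`, i.e. `13 = 3 * 4 + 1`:
+
+~~~~ {.ada}
+with Interfaces;  use Interfaces;
+with Ada.Text_IO; use Ada.Text_IO;
+
+procedure Div_Trace is
+   N        : constant := 4;               -- tiny bitness, for a readable trace
+   Dividend : constant Unsigned_64 := 13;  -- 1101b
+   Divisor  : constant Unsigned_64 := 3;
+   CR, DR, DQ : Unsigned_64;
+   R : Unsigned_64 := 0;
+   Q : Unsigned_64 := Dividend;
+begin
+   for I in 1 .. N loop
+      -- Step 2a: conditional-R, deciding whether the divisor goes in
+      CR := R * 2 + ((Dividend / 2 ** (N - I)) mod 2);
+      if CR < Divisor then
+         DR := 0;
+         DQ := 0;
+      else
+         DR := Divisor;
+         DQ := 1;
+      end if;
+      -- Steps 2b and 2c; "mod 2**N" keeps Q at N bits, as in FZ_IDiv
+      R := CR - DR;
+      Q := (Q * 2 + DQ) mod 2 ** N;
+      Put_Line ("I =" & Integer'Image (I)
+                & "  CR =" & Unsigned_64'Image (CR)
+                & "  DQ =" & Unsigned_64'Image (DQ)
+                & "  R ="  & Unsigned_64'Image (R)
+                & "  Q ="  & Unsigned_64'Image (Q));
+   end loop;
+   -- Expect Q = 4, R = 1, i.e. 13 = 3 * 4 + 1
+end Div_Trace;
+~~~~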
+
+In conclusion, and to the best of my knowledge, the FFA implementations
+of egyptian multiplication and division work. I have therefore added my
+signature to the [code shelf][v]. Comments, reviews, improvements,
+generalizations and succinctizations are more than welcome!
+
+[^1]: Though I wouldn't advise the reader to be selective, as there's a
+ lot of knowledge to be gained from doing the entire homework.
+
+ As for myself, I'm writing only this particular piece because a. to
+ my knowledge no one has yet published something similar, b. there's
+    only so much space on (virtual or otherwise) paper to cover
+ everything and c. I'd like to encourage readers to do their own
+ homework, publish it etc. instead of reading second-hand
+ material. That's why it's called home-work, after all.
+
+[^2]: Bear in mind that while AdaCore provides auditable
+    [sources][adacore] for their Ada implementation, the compiler can
+ only be bootstrapped using another Ada compiler, which makes it
+ susceptible to [Thompson's hack][thompson],
+ i.e. [the problem of trust][problem-of-trust].
+
+[^3]: The reader is encouraged to read the
+ [full implementation][ffa-ch5] etc.
+
+[^4]: And a good exercise for the mathematically-inclined reader.
+
+[^5]: History has shown egyptian multiplication to work with some degree
+ of success on [decimal][two-two] computers. This works still, and
+ may continue to work for generations to come -- as long as one does
+ not forget his [whip][zarathustra].
+
+[^6]: Whose proof is left as an exercise to the reader.
+
+[ffa]: http://www.loper-os.org/?cat=49
+[ffa-ch5]: http://www.loper-os.org/?p=2071
+[fits-in-head]: http://btcbase.org/log-search?q=fits+in+head
+[ffacalc-puzzle]: /posts/y04/06a-ffa-ch4-puzzle.html
+[reversing-lists]: /posts/y03/057-reversing-lists.html
+[adacore]: https://www.adacore.com/download/more
+[thompson]: http://archive.is/neZCe
+[problem-of-trust]: /posts/y03/059-the-problem-of-trust.html
+[diana-ys]: http://www.loper-os.org/?p=2071&cpage=1#comment-18617
+[induction]: /posts/y00/01d-on-numbers-structure-and-induction.html
+[two-two]: /posts/y03/05f-two-plus-two.html
+[zarathustra]: http://archive.is/DyBi0#selection-4866.0-4870.1
+[v]: http://lucian.mogosanu.ro/v/