Variants of the Newton-Raphson Iteration

Posted by Beetle B. on Tue 26 March 2019

Assume \(\beta=2\) for this section. Some of it may not work for decimal. We want to approximate \(b/a\).

Assume \(1\le a,b<2\). In other words, they are significands of floating point numbers.

To apply Newton’s algorithm, we want the root of the function \(f(x)=1/x-a\). This will give us \(1/a\). We multiply the result by \(b\). Now Newton’s algorithm is:

\begin{equation*} x_{n+1}=x_{n}-\frac{f(x_{n})}{f'(x_{n})} \end{equation*}

Thus we get \(x_{n+1}=x_{n}(2-ax_{n})\).

Now this converges to \(1/a\) for any \(x_{0}\in(0,2/a)\).

This has quadratic convergence. In practice, people use a combination of a lookup table and this method (the lookup provides the starting point).
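To make this concrete, here is a minimal C sketch of the basic iteration, assuming double precision. The linear starting guess \(x_{0}=24/17-(8/17)a\) is a standard linear first approximation of \(1/a\) on \([1,2)\) and stands in for the lookup table mentioned above; it and the fixed iteration count are my choices, not the book's.

```c
#include <stdio.h>

/* Approximate b/a for 1 <= a, b < 2 with x_{n+1} = x_n * (2 - a*x_n). */
static double nr_divide(double a, double b)
{
    /* Linear first guess for 1/a on [1,2): |1 - a*x0| <= 1/17. */
    double x = 24.0 / 17.0 - (8.0 / 17.0) * a;
    for (int i = 0; i < 4; i++)     /* quadratic convergence: error ~ (1/17)^16 */
        x = x * (2.0 - a * x);      /* x converges to 1/a */
    return b * x;                   /* multiply by b at the end */
}

int main(void)
{
    printf("%.17g\n", nr_divide(1.5, 1.25));   /* exact value: 0.8333... */
    return 0;
}
```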

One of the problems with this method is that it cannot be parallelized. Also, beware of accumulated rounding errors that have not been taken into account. In this particular case, the iteration self-corrects, so these errors are not that important.

We can restate the iteration as follows:

Let \(\epsilon_{n}=1-ax_{n}\); then \(x_{n+1}=x_{n}+x_{n}\epsilon_{n}\). This is the same iteration, but it has the nice property that \(\epsilon_{n}\) can be computed exactly when \(x_{n}\) is within an ulp of \(1/a\), due to the corollary defined here.
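As a sketch in C, assuming a fused multiply-add is available: using fma means \(1-ax_{n}\) and \(x_{n}+x_{n}\epsilon_{n}\) each incur only one rounding. The starting guess and iteration count are again illustrative choices on my part.

```c
#include <math.h>   /* fma */
#include <stdio.h>

/* Restated iteration: eps_n = 1 - a*x_n, x_{n+1} = x_n + x_n*eps_n.
   fma(-a, x, 1.0) evaluates 1 - a*x with a single rounding, so near
   convergence eps_n is obtained (essentially) exactly. */
static double nr_reciprocal(double a)
{
    double x = 24.0 / 17.0 - (8.0 / 17.0) * a;   /* same illustrative guess */
    for (int i = 0; i < 4; i++) {
        double eps = fma(-a, x, 1.0);            /* eps_n = 1 - a*x_n */
        x = fma(x, eps, x);                      /* x_{n+1} = x_n + x_n*eps_n */
    }
    return x;
}

int main(void)
{
    printf("%.17g\n", nr_reciprocal(1.5));       /* 1/1.5 = 0.6666... */
    return 0;
}
```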

Now consider another approach. We know that \(a\in[1,2)\). Thus if we let \(\epsilon=1-a\), then \(|\epsilon|<1\). Then

\begin{equation*} \frac{1}{a}=\frac{1}{1-\epsilon}=1+\epsilon+\epsilon^{2}+\epsilon^{3}+\cdots \end{equation*}
\begin{equation*} 1+\epsilon+\epsilon^{2}+\cdots=(1+\epsilon)(1+\epsilon^{2})(1+\epsilon^{4})(1+\epsilon^{8})\cdots \end{equation*}

Now let \(\epsilon_{n}=\epsilon^{2^{n}}\). Then

\begin{equation*} \frac{1}{1-\epsilon}=(1+\epsilon_{0})(1+\epsilon_{1})(1+\epsilon_{2})\cdots \end{equation*}

Let \(x_{n}\) denote the product of the first \(n\) factors on the right-hand side (so \(x_{0}=1\)).

Then:

  • \(x_{n+1}=x_{n}+x_{n}\epsilon_{n}\)
  • \(\epsilon_{n+1}=\epsilon_{n}^{2}\)

From a mathematical point of view, this is the same iteration as before: start with \(\epsilon_{0}=1-ax_{0}\). Then, since \(\epsilon_{n}=1-ax_{n}\), squaring gives:

\begin{equation*} \epsilon_{n}^{2}=1-a\left[x_{n}(2-ax_{n})\right] \end{equation*}

But the quantity in brackets is just \(x_{n+1}\), so the right-hand side is \(1-ax_{n+1}=\epsilon_{n+1}\).

From a computational point of view, however, the two computations can be parallelized. But since \(a\) no longer appears in the iteration, rounding errors can accumulate: the algorithm does not self-correct. One way to get around this is to start with this scheme, and then switch to the previous one as you approach convergence.
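An illustrative C sketch of this scheme (with the same hedged starting guess as before): the two updates in the loop body are independent of each other, which is what allows them to be computed in parallel, and \(a\) is only used to form \(x_{0}\) and \(\epsilon_{0}\).

```c
#include <stdio.h>

/* Product-form iteration: x_{n+1} = x_n + x_n*eps_n, eps_{n+1} = eps_n^2.
   a is only used to form x_0 and eps_0, so rounding errors in eps are
   never corrected. */
static double reciprocal_parallel(double a)
{
    double x   = 24.0 / 17.0 - (8.0 / 17.0) * a;  /* x_0 (illustrative guess) */
    double eps = 1.0 - a * x;                     /* eps_0 = 1 - a*x_0 */
    for (int i = 0; i < 4; i++) {
        /* The two updates are independent: on suitable hardware they can
           be issued in parallel or pipelined. */
        double x_next   = x + x * eps;
        double eps_next = eps * eps;
        x   = x_next;
        eps = eps_next;
    }
    return x;   /* approx 1/a; a final self-correcting step could refine it */
}

int main(void)
{
    printf("%.17g\n", reciprocal_parallel(1.5));
    return 0;
}
```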

You can also calculate \(b/a\) directly by letting \(y_{n}=bx_{n}\). But then picking a good initial value is tricky.

The book also covers the Goldschmidt iteration, but it is not self-correcting.

It covers one other iteration:

  • \(y_{n}=bx_{n}\)
  • \(\delta_{n}=b\epsilon_{n}\)

and the iteration is:

  • \(\delta_{n}=b-ay_{n}\)
  • \(y_{n+1}=y_{n}+\delta_{n}x_{n}\)

Usually one would use the earlier iterations to calculate \(1/a\), and then switch to this to approximate \(b/a\).
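Putting the pieces together, a hedged C sketch of that combination: drive \(x\) towards \(1/a\) with the self-correcting form, then perform one step of the \(y,\delta\) iteration to refine \(b/a\). The use of fma (so \(\delta=b-ay\) is computed with a single rounding), the starting guess, and the iteration count are my own illustrative choices.

```c
#include <math.h>   /* fma */
#include <stdio.h>

/* First drive x towards 1/a with the self-correcting iteration, then do one
   correction step directly on y ~ b/a:
       delta_n = b - a*y_n,   y_{n+1} = y_n + delta_n * x_n. */
static double divide(double a, double b)
{
    double x = 24.0 / 17.0 - (8.0 / 17.0) * a;   /* illustrative guess for 1/a */
    for (int i = 0; i < 4; i++) {
        double eps = fma(-a, x, 1.0);            /* eps_n = 1 - a*x_n */
        x = fma(x, eps, x);                      /* x_{n+1} = x_n(1 + eps_n) */
    }
    double y     = b * x;                        /* y ~ b/a */
    double delta = fma(-a, y, b);                /* delta = b - a*y, one rounding */
    return fma(delta, x, y);                     /* y + delta*x */
}

int main(void)
{
    printf("%.17g\n", divide(1.5, 1.25));        /* exact value: 0.8333... */
    return 0;
}
```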