From ken@turtlevax.UUCP (Ken Turkowski) Sun Nov 18 15:41:29 1984
Date: 18 Nov 84 20:41:29 GMT

> >> It would also be nice if sin( x ) and cos( x ) could be
> >> computed simultaneously with reduced cost.  I doubt if this is
> >> possible but would like to know if it is.
>
> > sine = sin(x);
> > cosine = sqrt(1.0 - sine*sine);
>
> I wish people would check things out before posting them.  On our
> 4.2BSD system, the sqrt() approach is actually SLOWER than using
> cos().  (Under UNIX System V it is very slightly faster.)  Now,
> if I wanted cos(x)^2, I would certainly use 1 - sin(x)^2, which
> I already knew about (I too went to high school).
>
> What I had in mind was more along the lines of a CORDIC algorithm
> or some other obscure but useful approach.

In Tom Duff's SIGGRAPH '84 tutorial on Numerical Methods for
Computer Graphics, he writes (quote until end of article):

Calculating Euclidean distance probably accounts for 90% of the
square roots expended in computer graphics applications.  Aside from
the obvious geometric uses, most illumination models require that
the unit normal to each surface be computed at each pixel.

[Moler-Morrison, "Replacing Square Roots by Pythagorean Sums", IBM
Journal of Research and Development, vol. 27, no. 6, pp. 577-581,
Nov. 1983] describes an algorithm for computing sqrt(a^2+b^2) which
is fast, robust and portable.  The naive approach to this problem
(just writing sqrt(a*a+b*b)) has the problem that in many cases
where the result is a representable floating point number, the
intermediate results may overflow or underflow.  This seriously
restricts the usable range of a and b.  Moler and Morrison's method
never causes harmful overflow or underflow when the result is in
range.  The algorithm has cubic convergence, and depending on
implementation details may be even faster than sqrt(a*a+b*b).
Here is a C function that implements their algorithm:

	double
	hypot(p, q)
	double p, q;
	{
		double r, s;

		if (p < 0) p = -p;
		if (q < 0) q = -q;
		if (p < q) { r = p; p = q; q = r; }	/* now p >= q */
		if (p == 0) return 0;
		for (;;) {
			r = q / p;
			r *= r;
			s = r + 4;
			if (s == 4)
				return p;
			r /= s;
			p += 2 * r * p;
			q *= r;
		}
	}

This routine produces a result good to 6.5 digits after 2
iterations, to 20 digits after 3 iterations, and to 62 digits after
4 iterations.  In normal use you would probably unwind the loop as
many times as you need and omit the test altogether.

Moler and Morrison's paper describes the generalization of this
algorithm to n dimensions, and exhibits a related square root
algorithm which also has cubic convergence.

[Dubrulle, "A Class of Numerical Methods for the Computation of
Pythagorean Sums", IBM Journal of Research and Development, vol. 27,
no. 6, pp. 582-589, Nov. 1983] gives a geometric justification of
the algorithm and presents a set of generalizations that have
arbitrarily large order of convergence.
-- 
Ken Turkowski @ CADLINC, Palo Alto (soon Menlo Park), CA
UUCP: {amd,decwrl,flairvax,nsc}!turtlevax!ken
ARPA: turtlevax!ken@DECWRL.ARPA