Wednesday, August 3, 2016

Pentium Bug Revisited

The jokes still circulate on the Internet:

Q: How many Pentium designers does it take to screw in a light bulb?
A: 1.99995827903, but that's close enough for nontechnical people.
At Intel, quality is job 0.999999998.
Q. What do you get when you cross a Pentium PC with a research grant?
A. A mad scientist.
The incident that triggered this barrage of ridicule occurred in the fall of 1994, a few months after Intel had introduced its Pentium microprocessor. The furor started with an e-mail message from Thomas R. Nicely, a mathematician at Lynchburg College in Virginia, who was doing computations related to the distribution of prime numbers. Nicely pointed out that the chip gave incorrect answers to certain floating-point (decimal) division calculations. Other users quickly confirmed the problem and identified additional examples in which an error occurs.

Here's one of the more dramatic instances:
x = 4195835, y = 3145727, and z = x - (x/y)*y.

Computed exactly, the answer, z, should be zero. The Pentium chip gave 256 as the answer.

Typically, the flawed chip generated a slightly inaccurate answer for only a few pairs of numbers. For example, dividing 5505001 by 294911 produced 18.66600093 instead of 18.66665197. Multiplying (or dividing) either number of the pair by any power of 2 also gave an incorrect answer. Intel claimed that only one in nine billion division operations would exhibit reduced precision.

So, the probability of randomly coming across the affected numbers was very small. But computations don't necessarily involve random selections of numbers, and such errors can matter in scientific and engineering applications and even in some spreadsheet calculations.

The fault itself had occurred because of the omission of five entries in a table of 1,066 values (part of the chip's circuitry) used by a speedy algorithm known as SRT division, after the initials of the three researchers who independently introduced the scheme more than 30 years ago. The five cells should have contained the constant +2, but because the cells were empty, the processor fetched a zero. This could throw off a calculation, which would result in a number that was always slightly less precise than the correct answer.

At risk were the binary divisors beginning with 1.0001, 1.0100, 1.0111, 1.1010, and 1.1101. These particular bit patterns had to be followed by a string of six 1s to boost the probability of error.

Intel corrected the problem and a variety of other glitches, and the Pentium chip (in all its variants) is now the workhorse of the desktop information age. However, that's not really the end of the story.

"It turns out the mathematics is more interesting than people think," says Alan Edelman, an applied mathematician at the Massachusetts Institute of Technology. "Most people link [the Pentium bug] to a simple arithmetic mistake, but it is far more subtle."

Edelman describes some of these nuances in a paper published in the March 1997 SIAM Review. "The bug in the Pentium was an easy mistake to make, and a difficult one to catch," he says.

The radix-4 SRT algorithm used on the Pentium chip represents a variation of the long-division method we learned in school. Suppose we want to divide 63 into 819762: 63 goes into 81 once, leaving a remainder of 18; then 9 is appended to the remainder to produce 189, and 63 divides into this number 3 times, leaving a remainder of zero. So far, the quotient is 13. The same procedure is repeated until the computation is complete.

The SRT algorithm does essentially the same thing, except that it works with the numbers -2, -1, 0, 1, and 2 instead of 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9 -- that is, in base 4 instead of base 10. Its advantage in a computer is that each step adds another 2-bit digit to the quotient, speeding up the rate at which division can be performed.

The Pentium chip's table, implemented as a triangular array, allowed it to look up values required for doing the computation. "When the algorithm was implemented in silicon, five entries on the boundary of this triangular portion were dropped," says Cleve Moler of MathWorks, Inc., who worked out a method for circumventing the problem. "These missing entries . . . effectively mean[s] that, for certain combinations of bits developed during the division process, the chip makes an error while updating the remainder."

Edelman's analysis reveals that a buggy value can be reached only from the entry below it in the chip's table. "This entry just below the bad entry plays the role of a foothold," he says. "The existence of this foothold is a subtle phenomenon that may not have been readily guessed from general properties of SRT division.

"I believe that its existence has surprised everybody," Edelman continues. "Any thought that each table entry is somehow equally likely to be reached is very wrong."

In effect, there are only two viable routes to arrive at a bug. Even then, it is quite possible to reach the foothold entry yet not hit the buggy entry on the next step. So, errors in the final output would occur only under very specific circumstances.

"Naively, one might have thought that the bad table entries would be hit uniformly at random," Edelman observes. "If this had been the case, the bug surely would have been noticed earlier and would have had the potential to cause more significant damage in every instance it went unnoticed."

Now, the Internet is buzzing with reports of a bug involving floating-point arithmetic in the new Pentium II microprocessor. Floating-point numbers are stored inside the microprocessor as strings of 80 bits, whereas integers are stored in 16 or 32 bits. Sometimes a floating-point number converted to an integer doesn't fit the smaller format, and the microprocessor is supposed to provide an "overflow" warning. Apparently, the Pentium II chip fails to do so in many cases.

Interestingly, the launch failure of the Ariane 5 rocket, which exploded 37 seconds after liftoff on June 4, 1996, occurred because of a software error that resulted from converting a 64-bit floating point number to a 16-bit integer. The value of the floating point number happened to be larger than could be represented by a 16-bit integer. The overflow wasn't handled properly, and in response, the computer cleared its memory. The memory dump was interpreted by the rocket as instructions to its rocket nozzles, and an explosion resulted.
It doesn't take much!

Originally posted May 12, 1997


Cipra, Barry. 1995. How number theory got the best of the Pentium chip. Science 267(Jan. 13):175.

Coe, Tim, et al. 1995. Computational aspects of the Pentium affair. IEEE Computational Science & Engineering (Spring):18-30.

Edelman, Alan. 1997. The mathematics of the Pentium division bug. SIAM Review 39(March):54-67. Available at

Fisher, Lawrence M. 1997. Flaw reported in new Intel chip. New York Times (May 5). Available at

Geppert, Linda. 1995. Biology 101 on the Internet: Dissecting the Pentium bug. IEEE Spectrum (February):16-17.

Moler, Cleve. 1995. A tale of two numbers. SIAM News 28(January):1.

Nuseibeh, Bashar. 1997. Ariane 5: Who dunnit? IEEE Software 14(May/June):15-16.

Peterson, Ivars. 1996. Fatal Defect: Chasing Killer Computer Bugs. New York: Vintage.

Thursday, January 2, 2014

Losing to Win

It's a gift to born losers. Researchers have demonstrated that two games of chance, each guaranteed to give a player a predominance of losses in the long term, can add up to a winning outcome if the player alternates randomly between the two games.

This striking result in game theory is now called Parrondo's paradox, after its discoverer, Juan M.R. Parrondo, a physicist at the Universidad Complutense de Madrid in Spain.

A combination of two losing gambling games illustrates this counterintuitive phenomenon. The two games involve tossing biased coins. In the simpler game, the player gambles with a coin that's been loaded to make the probability of winning less than 50 percent. Winning means that the player receives $1 and losing means that the player loses $1 on each turn.

Probability of winning: ½ - α
Probability of losing: ½ + α

The second, more complicated game requires two biased coins. One of the coins wins more often than it loses, and the other loses more often than it wins. The game is set up so that even though the winning coin is tossed more often, this is outweighed by the much lower probability of winning with the other coin.

Here's the rule for the two coins in the second game. If the player's total amount of cash on hand is a multiple of 3, the chance of winning is just 1/10 – α. If not, the chance of winning is higher: ¾ - α.

Is the total amount of cash on hand a multiple of 3?

Coin 2
Probability of winning: ¾ - α
Probability of losing: ¼ - α

Coin 3
Probability of winning: 1/10 – α
Probability of losing: 9/10 + α

When α is greater than zero, each game played repeatedly on its own gradually depletes a player's capital.

However, if a player starts switching between the two games, playing two turns of game 1, then two turns of game 2, and so on, he or she starts winning. Randomly switching between the games also results in a steady increase in capital. Indeed, playing games 1 and 2 in any sequence leads to a win.

Gregory P. Harmer and Derek Abbott of the University of Adelaide in Australia ran computer simulations of the games, demonstrating this counterintuitive result for 50,000 trials at α = 0.005.

Alternating between the games produces a ratchet-like effect. Imagine an uphill slope with its steepness related to a coin's bias. Winning means moving uphill. In the single-coin game, the slope is smooth, and in the two-coin game, the slope has a sawtooth profile. Going from one game to the other is like switching between smooth and sawtooth profiles. In effect, any winnings that happen to come along are trapped by the switch to the other game before subsequent repetitions of the original game can contribute to the otherwise inevitable decline.

The same type of ratchet effect can occur in a bag or can of mixed nuts, Abbott says. Brazil nuts tend to rise to the top because smaller nuts block downward movement of the larger nuts.

"There are actually many ways to construct such gambling scenarios," Harmer and Abbott commented in the Dec. 23/30, 1999 Nature. The researchers suggested that similar strategies may operate in the economic, social, or ecological realms to extract benefits from what look like detrimental situations.

Unfortunately, Parrondo's paradox doesn't work for the types of games played in casinos.

Original version posted March 6, 2000


Ball, P. 1999. Good news for losers. Nature Science Update (Dec. 23).

Blakeslee, S. 2000. Paradox in game theory: Losing strategy that winsNew York Times (Jan. 25).

Bogomolny, A. 2001. Parrondo paradox. Cut the Knot! (June).

Harmer, G.P., and D. Abbott. 1999. Losing strategies can win by Parrondo's paradoxNature 402(Dec. 23/30): 864.

McClintock, P.V.E. 1999. Random fluctuations: Unsolved problems of noiseNature 401(Sept. 2):24.

Peterson, I. 2000. Losing to winScience News 157(Jan. 15):47.