Sunday, November 17, 2013

Teaching Robots to Sneeze

Modern digital computers are not designed to do unexpected things. When unexpected things do occur, we usually call them "failures."

But, as I discussed in The Divine Random in Physics, sometimes it's highly desirable, even necessary, to do something that is universally unpredictable. Or, in other words, to compute a truly random number. Even the phrase is possibly an oxymoron, because if you can compute something, or program a computer to compute it, can that ever be truly random? (I will discuss the relationship between computability and randomness in more detail later on.)

Now, if you've written much software, you may be wondering what the big deal is. After all, every operating system and pretty much every programming language has built-in functions called random() or rand() that generate random numbers for you using deep, complex mathematics that, thankfully, we programmers don't need to understand in order to use the functions. Unfortunately, despite their names, these functions do not produce random numbers in the sense that we have used the term here. In fact, the numbers computed by these so-called pseudo-random number generators (PRNGs) are completely, 100% predictable. They do have a sometimes useful property called statistical uniformity or statistical randomness, but they are in no way truly random. Think of the successive digits of the number pi, which are statistically random, but easily predictable by anyone who can divide the circumference of a circle by its diameter.
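To see just how predictable these generators are, here is a quick illustration using Python's built-in PRNG (a Mersenne Twister): seed two generators with the same value and they will produce exactly the same "random" sequence, forever.

```python
import random

# Two generators seeded with the same value produce identical "random" streams.
a = random.Random(2013)
b = random.Random(2013)

print([a.randint(0, 99) for _ in range(5)])
print([b.randint(0, 99) for _ in range(5)])  # the same five numbers, every time
```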

The extensive use of such PRNGs in commercial and government cryptography products has caused immeasurable harm, the extent of which is only just beginning to emerge. A few weeks ago, it was revealed that the default and recommended PRNG algorithm, whose formidable name is Dual Elliptic Curve Deterministic Random Bit Generator, used by many commercial companies in the US and abroad contains a backdoor vulnerability that was deliberately inserted by its inventor. That inventor, the US National Security Agency or NSA, now has the capability to predict pseudo-random sequences for any software that uses the tainted PRNG. If this doesn't seem all that important to you, consider this. According to the New York Times article,

The agency has circumvented or cracked much of the encryption, or digital scrambling, that guards global commerce and banking systems, protects sensitive data like trade secrets and medical records, and automatically secures the e-mails, Web searches, Internet chats and phone calls of Americans and others around the world...
Incidentally, the tainted algorithm was adopted so widely, despite its known origins in the NSA, because another US government agency, the National Institute of Standards and Technology, NIST, blessed it as a standard and recommended that it be used wherever possible. Government contractors and vendors supplying cryptography products to the federal government were then required to be certified, and that certification in turn required the use of the approved PRNG algorithm.

PRNGs aren't all bad. There are many applications, e.g. Monte Carlo simulation and automated testing, where a repeatable stream of statistically random numbers is just what the doctor ordered. But there are other uses, cryptography being one, for which only the best, genuinely random, numbers will do. Another area where universally unpredictable, truly random numbers are highly desirable is for UUIDs - Universally Unique IDentifiers - used, for example, in massively scalable digital storage clusters that have global namespaces. Here, each stored object, and there may be billions or trillions of them, is assigned a UUID that is different from every other object's UUID. Not only that, it is different from every other UUID ever used or that ever will be used. Thus the term "universal." Such UUIDs become the true names for these objects, in the sense discussed in Satnam and other Dangers.
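For example, Python's standard library will mint a version 4 UUID, which is just 122 random bits dressed up in a standard format, drawn from the operating system's randomness source (so its quality is only as good as that source):

```python
import uuid

# A version 4 UUID: 122 bits from the OS randomness source plus 6 format bits.
object_id = uuid.uuid4()
print(object_id)          # a fresh 128-bit identifier, different on every run
print(object_id.version)  # 4
```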

There are several different ways to generate UUIDs that are guaranteed to be unique. Most of these involve some sort of centralized naming authority. For example, the namespace for Internet URLs is ultimately managed by the Internet Assigned Numbers Authority, IANA. But deterministically generating UUIDs in this way is slow and requires a great deal of cooperation. And did I mention there's still some pseudo-government entity controlling the namespace?

An alternative approach, using large, truly random numbers for UUIDs, can't guarantee uniqueness, but it can make it very, very unlikely that two different objects will receive the same UUID (a "collision"). And if the calculable probability of a collision of truly random UUIDs is smaller than the probability of an undetected hardware or software failure in the deterministic algorithms (see Provably Probable is Better than Probably Provable), then the truly random approach has an important practical advantage over the "deterministic" ones. Namely, each process in a distributed network can, in theory, generate UUIDs independently of every other process; no cooperation with other processes or with a central naming authority is required.
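To put a number on "very, very unlikely," the standard birthday-problem approximation says the chance of any collision among n uniformly random 128-bit UUIDs is roughly n(n-1)/2^129. A quick back-of-the-envelope calculation in Python, assuming a trillion stored objects:

```python
# Birthday-problem estimate: probability of at least one collision among n
# independent, uniformly random 128-bit UUIDs is roughly n * (n - 1) / 2**129.
n = 10**12                     # a trillion stored objects
p = n * (n - 1) / 2**129
print(f"{p:.1e}")              # about 1.5e-15, i.e. around one chance in 700 trillion
```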

Which brings us, finally, to the main topic of this chapter. How can we program a modern digital computer to do something that is universally unpredictable — truly random? How can we teach a robot to sneeze?

Consider the following programming problem. We have four server-class computers, no keyboards or mice attached, all plugged into the same power strip. The servers are all identical - same hardware, same memory, same peripherals. All are networked together through a single network switch. The power strip is switched on so that all four computers start at the same moment and they all boot the exact same operating system, say Linux, and then synchronize their clocks. They then auto-execute an application program whose sole task is to generate a 128-bit truly random number and share it with the other servers. Each server stores the four numbers generated in this round, then compares its number against all previously recorded ones. If its generated number is different from every other number ever generated by any of the servers, it reboots itself and the whole process starts over again. If there is ever a duplicate number generated, an alarm goes off, lights flash, sirens wail, and the V.P. of Engineering gets a frantic phone call in the middle of an otherwise relaxing and much needed vacation in Taos, New Mexico.

There are several different ways to solve this problem, and some are better than others. A naive (meaning terrible) approach would be to simply look around the system to find "random" or unpredictable values, then hash them together to form a candidate number. Simply scrambling the value of the real-time clock would generate a number that is statistically unique across reboots, but every server would likely generate the same value, since their clocks are synchronized. You could mix in other more or less random values, like the serial numbers of peripherals, if available, or the MAC addresses of the network interface cards. But these numbers are static across reboots, so the same machine would find the same values each time and from them generate the same candidate number. Also, these numbers are unique only if you trust the central naming authorities, usually manufacturers associations, that assigned them. Specifically in the case of MAC addresses for NICs, manufacturers are assigned blocks of numbers they then use in serial fashion for their network cards. Since our servers come from the same manufacturer and perhaps the same batch, there is a good chance the MACs are very close to one another. Contrary to common belief, MAC addresses can also be reused over time.
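Here is roughly what that naive approach looks like as a sketch in Python, and why it fails: identical, clock-synchronized servers feed identical inputs into the hash, so nothing prevents two of them from producing the same "unique" number.

```python
import hashlib
import time
import uuid

def naive_candidate():
    """Hash together 'unpredictable-looking' system values (don't do this)."""
    clock = str(int(time.time()))   # synchronized across all four servers
    mac = hex(uuid.getnode())       # static, and nearly sequential within a batch
    digest = hashlib.sha256((clock + mac).encode()).hexdigest()
    return digest[:32]              # 128 bits' worth of hex

# Identical servers with synchronized clocks feed identical inputs into the
# hash, so two of them can happily emit the same "unique" number.
print(naive_candidate())
```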

A better all-around approach is to use measured timings for hardware-related events, which relies in one way or another on detecting and measuring small discrepancies (a.k.a failures) in the hardware. The Linux operating system (and most commercial Unix variants) includes a random number generator like this in its kernel. This mechanism measures tiny variations in interrupt timings in various hardware drivers, including keyboard, mouse, hard disk, and network drivers, to accumulate an "entropy pool" of randomness. When an application wishes to use a random number, it simply reads from a special device called /dev/random and asks for some number of bytes of randomness. If there is sufficient entropy in the pool at the time of the request, the kernel returns the requested number of bytes of randomness and reduces the pool by the same number of bytes. If there is not sufficient entropy yet, the read from /dev/random will block until enough randomness accumulates.
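In code, drawing from the kernel pool is just a read from that device. A minimal sketch, which asks for 128 bits and, on Linux, may block until enough entropy has accumulated:

```python
import os

# Ask the kernel entropy pool for 16 bytes (128 bits) of randomness.
with open("/dev/random", "rb") as dev:
    seed = dev.read(16)
print(seed.hex())

# os.urandom(16) is the non-blocking alternative, backed by /dev/urandom.
print(os.urandom(16).hex())
```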

The word entropy here is an allusion to Claude Shannon's idea of entropy in communication theory. The actual calculation of this supposed measure of randomness in the Linux random number generator is somewhat mysterious and ill-understood. Even though the source code for it is publicly available, there seems to be no good theoretical explanation for why it should work. Nevertheless, it is clear that the best sources for randomness are the human interfaces: the keyboard and mouse. Ironically, the best way for a robot to do something unexpected is to ask a human! For our sample network of servers, which is based on a real-world cluster of storage servers I once helped design, there are no human interfaces from which to draw randomness. Analysis has shown that other sources, notably hard disk drivers, are very poor sources of randomness, generating something like one bit of entropy per minute of operation. It seems the Linux random number pool is exceedingly shallow, which is once again ironic given that we know randomness is not a kiddie pool in the middle of our otherwise deterministic universe, but rather a vast ocean of unpredictability surrounding us all.

There is a way to draw randomness directly from that ocean by using specialized devices called Quantum Random Number Generators, QRNGs. As the name suggests, these devices tap into quantum-level phenomena, including the mother of all weirdness, quantum entanglement, to pull randomness from between the grid lines of the rational universe. Unfortunately, these devices are generally not a part of standard, off-the-shelf, computers like the ones we have conjectured for our example network. Given the importance of this kind of functionality for effective cryptography, that's a little odd, don't you think? If I were a conspiracy theorist, I would suspect the NSA has had a part in this omission.

But the sad fact is, to solve our proposed puzzle using only commodity servers, we cannot rely on fancy QRNGs, at least not today. The question now boils down to this: How can we tap into the realm of quantum-level events using ordinary hardware?


One answer my colleagues and I came up with is to measure the tiny irregularities in the heartbeat of every digital computer, the CPU clock. This clock is derived from a very mundane source: a physically vibrating quartz crystal — basically an atomic tuning fork. Most modern CPUs use a crystal oscillator that vibrates in the neighborhood of 30-80 million times per second and that signal is then multiplied electrically to produce a clock frequency of three or four billion cycles per second, which is the rate at which the computer can execute its simplest instructions. Although modern crystal oscillators are highly stable in their frequencies, things like temperature, humidity and external vibrations do cause small variations to occur, and when the frequency is multiplied by electrical circuits, so is the noise. That's usually okay, since CPUs are remarkably tolerant of lower or higher clock frequencies — most will run just fine at half the clock rate and can generally be overclocked a bit to run faster. But if we could measure these small variations in the vibrational frequency of chemical bonds between individual atoms of a crystal lattice, that would be a good source of quantum randomness. The challenge there is, how can a computer measure variations in its own heartbeat without some sort of external reference?

Luckily, there is a second clock within most every modern computer, one that uses an independent crystal oscillator as its frequency source — the real-time clock or RTC. It is quite important for different computers within a network to agree on what time it is, or at least on how long a second is. To do this, most computers use a much lower-frequency, and less noisy, crystal oscillator operating down in the tens of thousands of cycles per second (typically 32,768 Hz) to measure the passage of time. Of course, this second crystal is subject to the same kind of noise as the CPU oscillator, though less so because of its lower frequency. But that's good for our purposes! We can simply count how many CPU clock cycles occur within each RTC cycle, subtract successive samples, and contribute the least-significant few digits to the entropy pool. Conceptually, this sampling can be done by executing a simple instruction loop a few million times and asking the RTC how long it took to complete. Given the realities of things like hardware interrupts and process scheduling, the details are of course more complicated, but the solution remains practical and it works for almost any commercially-available digital computer.
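As a rough illustration of the idea (a toy sketch, not the production implementation, which has to live much closer to the hardware), here is what the sampling loop might look like in Python: time a fixed busy loop against a coarser, independent clock, keep only the least-significant bits of the successive differences, and hash the collected bits to whiten them.

```python
import hashlib
import time

def timed_loop(iterations=1_000_000):
    """Time a fixed busy loop against a coarser, independent clock."""
    start = time.monotonic_ns()
    x = 0
    for _ in range(iterations):
        x += 1
    return time.monotonic_ns() - start

def harvest_bits(samples=64):
    """Keep only the low bits of successive timing differences, then whiten."""
    readings = [timed_loop() for _ in range(samples + 1)]
    deltas = [abs(readings[i + 1] - readings[i]) for i in range(samples)]
    low_bits = bytes(d & 0x0F for d in deltas)   # timing jitter lives in the low bits
    return hashlib.sha256(low_bits).digest()     # 256 whitened bits

print(harvest_bits().hex())
```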

Computers do exactly what we tell them to do. When they don't, we typically think of it as a failure. But unexpected does not always mean faulty. Not only is it useful for software developers to plan for failure when designing complex systems, it is sometimes necessary to actually elicit and embrace failures, to swim briefly in the ocean of randomness. Randomness, one could argue, is at the very core of humanity and human creativity, yet it isn't possible to program a computer to do something completely unexpected. To teach a robot to sneeze, we must allow her to surprise us.

Wednesday, August 28, 2013

The Divine Random in Physics

A place where gently blowing semi-trees
Stand constant in the changing sands,
And pretty birds flying flawless paths
Sing palindromic melodies

A while back, I sat in a conference room with a group of smart software engineers discussing randomness. Our new distributed software product relied very heavily on being able to generate truly random numbers on generic digital computers, a surprisingly hard problem as it turns out. The question had arisen, how sure were we that the numbers we generated were really, truly random? And by the way, what the heck does that actually mean?

Star System Arp 194 (Courtesy NASA)
The Question Mark Galaxy

Someone posited that a truly random number might be one that is universally unpredictable, a definition I quite liked. But instead of saying I liked it, I flippantly suggested there might be no such thing, that given enough information and a big enough computer, we might possibly predict the outcome of any conceivable physical process. Two members of my team (defensively, I will point out they both hold PhDs and very large brains) immediately spoke up to point out how wrong I was and exclaimed in unison, "Bell's Theorem!" I was simultaneously mortified that I had made such a mistake and proud that not one but two members of my team knew enough about quantum metaphysics to correct me. In hindsight, it isn't surprising that it was the concept of randomness that took us so quickly from a mundane discussion of Bug 1356 to the esoteric nuances of life, the universe, and everything. Here's how we got there.

Randomness, as a natural phenomenon, was discovered in 1927 by a young German physicist named Werner Heisenberg. Prior to that time, the universe was completely predictable, or at least that's how we viewed it. Physicists from Newton to Einstein all believed if we just looked hard enough we would find the rulebook of the universe, the answer to everything. And even if that answer were more complicated than, say, 42, it would still be possible to predict the outcome of any physical process, as I had suggested in the engineering meeting. The most brilliant minds the human race had produced all rejected the idea that the universe might be a giant game of chance. Albert Einstein famously rejected it explicitly. Then came the Heisenberg Uncertainty Principle.

The Uncertainty Principle basically says, certain pairs of properties of physical objects — simple things like where it is and how fast it's going — cannot be simultaneously measured with perfect precision. The more carefully you measure the position of, say, an electron, the less certain you can be about its velocity at that same moment. If you are very, very, very careful measuring the position, then whatever number you observe for the velocity is essentially meaningless; it is random beyond a certain number of decimal places. Now this limit on how accurate one can be with these combined measurements is quite negligible for larger objects like bowling balls or BBs, but for small things like electrons and photons it makes a difference. The combined limit on our accuracy of measurement is set by the reduced Planck constant, which works out to about 35 decimal places of accuracy. Beyond that, physical properties are universally un-measurable. This can be understood by thinking about how one's measurements affect the object being measured. Measuring the position of an electron involves shining a light on it, and a more accurate measurement requires shorter wavelength, higher-energy photons. When the electron is impacted by the high-energy photon, its velocity is affected, thus introducing randomness.
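For reference, the quantitative statement of the position-momentum version is Δx · Δp ≥ ħ/2, where ħ, the reduced Planck constant, is roughly 1.05 × 10⁻³⁴ joule-seconds; its tiny size is why the effect only matters for very small things.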

And that is the way it was presented and talked about at first, as a limit on experimental accuracy. The Quantum Physics textbook I used in college, the 1974 edition of Quantum Physics, by Eisberg and Resnick, explained the Uncertainty Principle by saying, "our precision of measurement is inherently limited by the measurement process itself [...]." Albert Einstein, and many other prominent contemporaries of Heisenberg, believed there must still be an underlying set of "hidden variables" that control the universe and provide precise, deterministic answers to any question, even if we were forever limited in our ability to experimentally verify those answers due to the Uncertainty Principle.

Einstein, together with his colleagues Boris Podolsky and Nathan Rosen, even wrote a famous paper in which they, almost mockingly, proved that Quantum Mechanics must be wrong, or else the world as we know it would be a truly strange place. To do this, they assumed only two seemingly obvious things about the world. First, that objects have intrinsic properties like position and velocity, even when no one is measuring them. This they called "reality." And second, that measurements of reality in one place and time cannot instantaneously affect other, far away realities, a property they called "locality." Einstein, Podolsky and Rosen basically said, who would want to live in a world where reality and locality did not hold. In other words, they believed our friendly, orderly universe could not possibly be intrinsically random.

But they were wrong.

In 1964, Professor John Stewart Bell proved a result that some have called "the most profound discovery of science." The unassuming title of his brilliant paper, On the Einstein Podolsky Rosen Paradox, referred back to the "paradox" outlined by Einstein and his pals. Bell proved that the universe is in fact fundamentally, inherently, inescapably random. More precisely, he showed that no local, deterministic theory based on hidden variables could possibly explain all the observed results of Quantum Mechanics. And if that means there is no such thing as reality or locality, then so be it. Either the principle of reality or the principle of locality (or both) does not apply in our universe! A strange place indeed.

And so my brilliant colleagues were right. Heisenberg's Uncertainty Principle is not just a limit on how accurately we can measure things. It's a limit on what we are allowed to know about the universe in which we live. There are physical quantities that are universally unpredictable. At the very foundation of our familiar physical world, lies the Divine Random.

Thursday, November 1, 2012

Truth Dealers

There will always be war and politics. Just as the ready availability of weapons escalates and prolongs wars, the ready availability of information escalates and prolongs political battles.

As the 2012 U.S. election season draws to a close, there have been many possible reasons proffered for why this one seems to have been particularly, historically divisive. The battle lines between Democrats and Republicans have grown deeper, wider, and more extensive than for any other election — at least any I can remember. This, despite the facts that (1) the two parties have nearly identical ideologies, and (2) most voters do not fully support the published platforms of either party.

One of the reasons I've heard suggested to explain this latest battle in the ideological war that began, really, back in the photo-finish election of 2000, is that we have somehow lost our respect for the truth. He who tells the biggest whopper gets the most votes. In The Truth and Nothing But...A Good Story, I wrote about how partisan lies often spread like wildfire through the Internet, propagated by people who are either too lazy to check the validity of the facts, or too cynical to care. In the latter days of the campaigns, this overt dishonesty isn't limited to the Internet, with TV and radio ads now claiming things that are blatantly, provably, unequivocally false.

What's more maddening, nobody ever seems to want to retract these lies once they're caught. The phenomena of "doubling down" and "tripling down," basically repeating the original lie louder and to a broader audience, have become epidemic. Why? Because it works! And here's how...

Think of political operatives as carnival barkers. Their job is to draw attention to a certain set of facts that favor their candidate. But the facts themselves may be dry and boring without some dramatic context. "See the man with no stomach!" they scream. The actual fact is, there's a skinny guy who can suck in his gut to the point where his spine is visible from the front, still worth the price of admission perhaps (if you're interested in such oddities), just not as catchy and seductive as the shouted hook would have you believe. Some customers, after viewing the side-show, may complain about the lie, pointing out the original claim was misleading and blatantly, provably, unequivocally false. "Pants on fire!" they exclaim. But who cares? They have witnessed something they probably wouldn't have seen otherwise, and the carnival has their $1 so everybody's happy. Meanwhile, the barker continues to shout even louder, "See the man with no stomach!"

Even carnival barkers have legal restrictions that prevent them from going too far in making false claims and false implications. Under certain circumstances, their utterances can be considered to be advertising, which is regulated by the Federal Trade Commission and by similar state agencies. Under penalty of law, advertisers are not allowed to make false claims, or even misleading ones, about their products or their competitors'. The FTC defines false advertising to include the following:

means of advertisement [...] which is misleading in a material respect; and in determining whether an advertisement is misleading, there shall be taken into account (among other things) not only representations made or suggested [...] but also the extent to which the advertisement fails to reveal facts material in the light of such representations.
In this campaign cycle, there have been many, many political ads, including the one linked above, that clearly would have violated FTC laws if the product being sold had been, say, soap instead of a candidate for political office. Why do we hold carnival barkers and purveyors of dish soaps to a higher standard of honesty than our presidential candidates? The answer is, the First Amendment to the Constitution of the U.S. The courts have held that political ads cannot be considered to be advertising. They are political speech, which is protected under the First Amendment. As a result, it is legal for anyone to tell any political lie, however outrageous, as loudly and often as they wish.

And, honestly, how else could it work? We certainly don't want the government telling us which political claims we should believe and which ones we should discount. The problem is really a practical one stemming, ironically, from the success the FTC has had in their enforcement of truth in advertising. I've heard people say things like, "There must be some truth in that political ad, otherwise they wouldn't let it be shown on TV." This "nanny expectation" is completely unfounded when it comes to the strange world of political advertising and no comfort whatsoever can be drawn from it. (Interestingly, the truth-in-advertising expectation does not seem to extend to the Internet, even though the same FTC laws apply there as on traditional broadcast media.)

Legally, practically, and in every other sense, it is the responsibility of each individual voter in the U.S. to research any campaign ad they find compelling and discover whether it is indeed truthful, a daunting task. The only reasonable alternative is to rely on a trusted source to do the fact checking for you. There are now legions of so-called fact checkers, and nearly every claim made by every major political candidate is combed over, investigated, parsed and re-parsed to try to determine whether it is true, false, or somewhere in between. Campaigns often shoot for that middle ground, the grey area where claims can still be sensational, provocative, even mildly misleading, without actually being so blatantly false as to reflect poorly on the candidate's public character and facade of honesty. They seem to follow some sage advice I received early in life.

How does one become a good liar?
Tell the truth almost always.

The truth is a fluid and relative thing these days. It depends on whom one chooses to believe. Even things we once called "facts", things like, "this person was born in the U.S.," or, "this person is a Christian," or, "this person never cheated on his taxes," all are now written in various shades of grey. It's not that we treasure the truth any less than before; if anything, just the opposite is the case due to increased skepticism. It's not that there are too few facts; the Internet, among other things, now provides a treasure trove of information about every conceivable topic.

The challenge, IMO, is that there are so many facts, so much information available to each of us. Facts are weapons in the political wars that currently rage in this country, and fact checkers are the arms dealers who make money selling to both sides of the conflict. Like weapons, there are high-quality facts and low-quality ones. Facts purchased from the "fact checkers" at Fox News or MSNBC are somewhat more likely to jam on you in the heat of battle or blow up in your face like a crude IED. But these aren't the really dangerous ones. The reputable arms dealers, the neutral fact checkers and think tanks sell large quantities of durable, usable information to either side for a price. In political skirmishes these days, you cannot be lulled into a false feeling of security because you are armed with the facts. The other guys are probably packing heat as well, just as much as you and maybe more. Standing up to declare victory because truth is on your side is a sure way to become a bloody casualty.

I am not advocating that fact checkers be outlawed, of course. Nor am I saying that free and open information is bad. The NRA has the right idea, facts do not kill candidates, ideologies kill candidates. An IED in the hands of a radical ideology can do more damage than an aircraft carrier in the hands of a civilized one. Wars and political debates are won not with weapons or facts, but by leaders who know what they are defending, whose loyalties are steadfast, and whose vision extends far beyond the next battlefield.

Saturday, September 15, 2012

Consensual Wisdom

I have a theory that almost all of us agree on almost everything. But we're wired somehow to zero in on the things we disagree about, however minor they may be, and debate/argue/fight over them. It seems the larger our base of common understanding, the more intense the disagreements over the details. Sunnis and Shiites, Christians and Muslims, Republicans and Democrats, kill each other (figuratively and sometimes literally) over the smallest ideological differences, despite their enormous shared premise. As a result, we ignore the big problems that challenge us all and focus instead on the minutiae of our dissonance.

But what if there were an environment where we were rewarded for finding our kernel of consensus? A game, say, where the rules guide us toward addressing the problems we all agree exist. An economy where one becomes individually wealthy only by taking actions that benefit everyone. A society that can grow as large and diverse as necessary and still make progress toward common goals.

I'm not talking about Utopia here. I'm just asking, is there a set of rewards and punishments, however artificially enforced, that will lead a group of people to focus on their considerable shared world view rather than their comparatively tiny differences?

Science, politics, and religion all have enormous edifices of consensual wisdom. Evidence and ideas that diverge from or seem to contradict those shared bodies of knowledge induce different reflexes in each discipline. Religious thinkers tend to reject them, scientists embrace them, and politicians use them as leverage. That's all fine, but the incessant preoccupation with what makes us different tends naturally to fragment us, to balkanize our churches, our research disciplines, and our societies until, in the limit, we will be left completely harmonious but quite alone. Social entropy will be at its maximum.

None of our institutions have evolved mechanisms of inclusion, systems that seek to expand groups by including more and more people who actually disagree with us. It's so rare that you may not even see the point in doing so. Maybe I am being hopelessly naive here, but it seems to me that recognizing our kernel of consensus, the common knowledge, shared problems, and joint solutions becomes more and more valuable as the size of the union gets larger, even if the intersection becomes smaller; particularly so if my hunch is correct that almost all of us agree on almost everything.

Perhaps I'm wrong though. Maybe, if all the religions of the world were to somehow identify their kernel of consensus, it would turn out to be trivial; something stupid and obvious like, we all have an ethical core and we should all give ourselves permission to obey it. But wait. Is that really so trivial? I, of course, don't know what the intersection of all our beliefs would be, but imagine the incredible power inherent in being able to make a statement like that! "We, the combined theological intelligentsia of the entire planet, believe the following to be true..." Imagine the societal and spiritual benefit that might ensue from simply communicating that consensus, whatever it turns out to be. I believe that benefit might far outweigh whatever good comes from arguing about who has the coolest prophet and whether we should utter its name or view its likeness.

I don't mean to pick on religion here. In many ways science is even less concerned with consensus and more obsessed with the deltas and outliers. That's what scientists do for a living, and it's what makes it so difficult for societies to choose courses of action based on reason and scientific consensus at a point in time. Many scientists believe it isn't their job to even try to identify consensus among themselves; rather, they should provide politicians and decision-makers with all the possible alternatives and let them decide what actions, if any, to take. I think this is an abdication of their social responsibility. We are left to guess, infer, and in many cases, attempt to obfuscate scientific consensus on important topics.

Is the Earth's climate changing at a rate unprecedented in recorded history? I think I'm safe in saying almost all climate scientists agree that it is. Is this climate change anthropogenic (caused by human activities)? Maybe there's a consensus answer to that question, maybe not. Does evolution exist? Absolutely it does, in the same sense that quicksort (a well-known sorting algorithm) exists. Is it sufficient to explain the diversity of life on the planet? Without inviting criticism and controversy, I can't say whether there is a scientific consensus on that question. My point is, there are no conventional mechanisms in science for reaching and communicating conventional wisdom. Claiming that no consensus exists because one or a few scientists disagree is fallacious. Consensus does not require unanimity. It does, however, require a collaborative mechanism of discovery and refinement.

No, I'm not having a Kumbaya moment. My advanced age guarantees I have a healthy daily dose of skepticism and pessimism. The events of the past few days have done absolutely nothing to make me suddenly feel optimistic about our ability to live together without killing each other for nonsensical reasons.

But even with all this negative energy swirling around today, I'm left with an intellectual curiosity about the central question of this post: Now that the Internet has given the world a way to communicate instantly, can we also find a way to agree?

Tuesday, August 21, 2012

Introduction to The Divine Random

Come travel with me, far from now and here,
To a niche carved into entropy.
A perfect island, in a magic sea,
Where only miracles and fools are feared.

Does true randomness truly exist?

We know that randomness is useful, essential even, in solving a number of important problems in distributed computing, social choice, and identity. It is also the case that truly comprehensive solutions to those problems require true randomness, not the more limited kind one typically finds in digital computing.


A Julia Island
(Courtesy Wikimedia Commons)
The question of whether such universally unpredictable events actually exist in our universe gets very deep very fast. It quickly bumps up against some of the most enigmatic and fundamental principles of physics, mathematics, computation, and theology. Together, these principles define the boundary of the known universe. And by that, I don't mean just the limit of what we know today, to be extended as we learn more tomorrow. I mean the absolute limits of science and reason, beyond which we can never venture, no matter how clever we are.

For a long time, I've been fascinated with a set of theorems that seem to define that boundary. They include the following:

  • Heisenberg's Uncertainty (Physics) — There is a limit to how accurately we can measure the properties of physical objects.
  • Bell's Inequality (Physics) — That limit applies not just to our ability to measure things accurately, but to our fundamental ability to know things about physical objects.
  • Gödel's Incompleteness (Mathematics) — Any attempt to explain everything using a small(er) set of axioms is doomed to be either unfinished or wrong.
  • Turing's Undecidability (Computing) — There are infinitely many problems that cannot be solved by any digital computer.
  • Chaitin's Irreducibility (Computing & Mathematics) — Almost every number (probability = 1) is "random" in the sense that it cannot be computed by an algorithm that is much shorter than the digits of the number. That is, the shortest name for the number is the number itself. Randomness exists in mathematics as well as physics!

I will talk more about these ideas and theorems in later posts in this series (informally of course - please don't confuse me for a real physicist or mathematician). For now, notice the last word in their names. Each of them expresses a negative. Each of them tells us something about what we cannot do, where we cannot go, what we cannot, under any circumstances, know.

Intuition suggests these five principles, despite their different fields of application, are somehow related. In fact, they seem to be exactly the same, or at least stem from the same underlying phenomenon. Namely this:

Fundamental randomness exists. It isn't just a small wart on our logical world, but rather an unfathomable ocean surrounding it. We cannot ever know what goes on in this ocean of randomness. We can only glimpse the shape of the coastline of our little island of reason, as the theorems and principles listed above begin to illuminate.

Thursday, July 19, 2012

And the Winner is Probably...


In Provably Probable Social Choice I talked about the Gibbard-Satterthwaite theorem and its somewhat depressing conclusion that no deterministic voting system can completely prevent dishonest/tactical voters from unfairly influencing elections. Also in that post, I compared this social choice theorem with a similar result in distributed computing, Byzantine Agreement. In both realms, it seems that randomness must be an essential ingredient in any full solution.

There are many good, efficient solutions to various distributed computing problems that are probabilistic in nature. But what about random voting? The very idea seems silly. But is it?

In fact there has been a good bit of recent research into probabilistic voting algorithms that is starting to yield some interesting results.  For example, weighted lottery voting basically puts all the votes into a hat and chooses one at random. The single ballot chosen determines the winner. Surprisingly perhaps, this simple probabilistic procedure may be a good way to elect legislatures with multiple representatives from multiple districts. One reason is that it is completely immune to tactical voting.

Suppose that, for a given weighted lottery election, candidate A has 60% of the votes, candidate B has 30%, candidate C has slightly less than 10% and candidate D has only a single vote. With this voting procedure, the favored candidate will probably win the election. But there is some chance A will lose in favor of B or C. And there is even a small chance that candidate D, with only a single vote, will win! I know what you're thinking — this isn't practical. But think what happens when we do this for each member of a legislature, say a hundred representatives, all elected from their individual districts using weighted lottery voting. In the end, the distribution of representatives will conform to the distribution of voters across all districts. Yes, there may be some anomalies in the state house, so to speak, but if you trust the laws of probability (and I do), the overall make-up of the legislature will reflect the opinions and expressed wishes of the electorate. Now, I don't know of anyone who has seriously suggested that weighted lotteries be used in real world elections, but it is nonetheless tantalizing to look at some of the practical advantages to be gained by doing so.
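If you don't trust the laws of probability, a few lines of Python can make the point. This toy model elects a 100-seat legislature, one seat per district, with every district drawing a single ballot from the same 60/30/10-plus-one-vote pool described above:

```python
import random
from collections import Counter

def lottery_winner(ballots):
    return random.choice(ballots)   # pull one ballot out of the hat at random

# Every district has the same split: A 60%, B 30%, C just under 10%, one vote for D.
district_ballots = ["A"] * 60 + ["B"] * 30 + ["C"] * 9 + ["D"]

# Elect a 100-seat legislature, one weighted-lottery winner per district.
legislature = Counter(lottery_winner(district_ballots) for _ in range(100))
print(legislature)   # typically lands near A: 60, B: 30, C: 9, with D a long shot
```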

First, like most probabilistic algorithms in the field of distributed computing, this procedure is dirt simple. There is no tallying of votes, no backroom subjective judgement about which ballots to count and which to ignore, no arithmetic of any kind involved in determining who the winner is. You simply put all the ballots into a basket (physically or electronically), shake it up, and then have a good-looking spokesmodel choose the winner. This procedure will likely promote higher voter turnout. With most of the deterministic voting schemes in use today, a voter who finds herself in an extreme minority may decide to skip the whole voting hassle if she feels there is absolutely no way her vote will matter. With the probabilistic scheme, however, there is always a possibility her vote will not only matter, but completely determine the outcome of the election! Similarly, there will be higher candidate participation because this algorithm (unlike winner-take-all deterministic plurality algorithms) does not inherently favor a two-party system. It is always possible for a third party to win an election.

The most promising advantage of probabilistic voting schemes like this one is that they are impossible to "cheat." More precisely stated, the best way to get your favorite candidate to win is to actually vote for him. This is not the case with any deterministic voting algorithm if there are three or more candidates. You might say, well, cheating isn't very common because it's so hard for individual voters to devise effective cheating strategies. Not so. In fact, there is one kind of cheating that is especially widespread and particularly destructive: the gerrymander. Ruling parties, with the help of computer software, commonly re-draw district lines so as to minimize or completely remove the voting influence of those who would vote against them. This practice is completely legal, completely constitutional, and completely despicable. And it works solely because of our choice of voting systems. Perhaps the most egregious example of gerrymandering in the entire US occurs in my home town of Austin, Texas. In congressional District 25, shaped like the practice's namesake salamander, Austinites share a voting pool that stretches hundreds of miles to the South, all the way to the Mexican border. No one disagrees the only purpose for this strange arrangement is to cheat in the congressional elections. A probabilistic voting scheme, like weighted lottery voting, would completely eliminate the gerrymander. As long as districts contain an equal number of voters, as required by the Supreme Court, the ideological composition of the legislature will, with high probability, reflect the ideological distribution of the voters.

But this simple probabilistic voting scheme is only remotely practical for legislative-type elections, where the possibly negative effects of an improbable choice of representative can be diluted by all the other members of the legislature. The method is ill-suited for executive elections where a single, powerful governor or president is chosen by chance. For those kinds of elections, you really want a probabilistic algorithm that usually honors the majority choice, if there is one, and steps in with a coin flip only when it needs to. One example of that sort of algorithm is called double range voting.

With this clever, though more complex, voting procedure, each voter casts two ballots (or one ballot with two sub-ballots), awarding preference points in a given range, say zero to five stars, to each of the candidates. The first range ballot is tallied to identify the first-, second-, and third-place winners. Usually, the first-place winner is chosen as the winner of the whole election and we're done. But there is a (tunably small) chance the algorithm will ignore the first-place choice and pick the winner to be one of the second- or third-place candidates instead. In that case, the second ballot is used to make the binary choice. Since binary choices are immune to the Gibbard-Satterthwaite limitation, there is no reason for voters to vote tactically on the second ballot, and since it isn't possible to know which candidates will be #2 and #3, the second ballot may just as well be a completely honest reflection of the voter's choices among all candidates. Obviously, this is not a mathematical proof, but I'm sure you get the idea. The outcome is, usually the majority candidate wins, but sometimes the second or third most favored candidate wins, and as reward for suffering this small uncertainty, the election becomes immune to tactical voting.
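Here is a toy sketch of the procedure in Python; the 10% runoff probability and the three-voter example are made up purely for illustration:

```python
import random

# Toy sketch of double range voting. EPSILON and the sample ballots are made up.
EPSILON = 0.1   # tunably small chance of ignoring the first-place finisher

def double_range_winner(voters, candidates):
    # Ballot "a" (possibly tactical) picks the 1st, 2nd, and 3rd place finishers.
    totals_a = {c: sum(v["a"][c] for v in voters) for c in candidates}
    ranked = sorted(candidates, key=totals_a.get, reverse=True)
    first, second, third = ranked[0], ranked[1], ranked[2]
    if random.random() >= EPSILON:
        return first
    # Occasional runoff between 2nd and 3rd place, decided by the honest "b" ballots.
    second_score = sum(v["b"][second] for v in voters)
    third_score = sum(v["b"][third] for v in voters)
    return second if second_score >= third_score else third

# Three voters scoring candidates A-D from 0 to 5 on both sub-ballots.
voters = [
    {"a": {"A": 5, "B": 3, "C": 1, "D": 0}, "b": {"A": 5, "B": 3, "C": 1, "D": 0}},
    {"a": {"A": 0, "B": 5, "C": 4, "D": 2}, "b": {"A": 0, "B": 5, "C": 4, "D": 2}},
    {"a": {"A": 2, "B": 1, "C": 5, "D": 3}, "b": {"A": 2, "B": 1, "C": 5, "D": 3}},
]
print(double_range_winner(voters, ["A", "B", "C", "D"]))
```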

These and other probabilistic voting procedures all suffer the same drawback: lack of transparency. When the outcome of an election involves a random choice, how do you prove ex post facto that the choice was indeed random? If I randomly choose a number, say 42, how can I prove to you my choice was just as likely as any other integer? How do I prove that the good-looking spokesmodel didn't use sleight of hand to deliberately choose a certain ballot out of the basket? More practically, how do I prove that the computer-generated random selection was indeed, random? If I use a physical process to generate a random selection, like repeatedly flipping a "fair" coin, how can you be sure the coin was really fair? And for the deeply paranoid (or inquisitive) among us, how do we know there is even such a thing as randomness? Maybe everything in the universe is predictable, if you just have a big enough computer.

My next few posts will examine these kinds of questions from the points of view of computer science, physics, mathematics, and theology. Given the demonstrated importance of randomness in distributed computing and social choice, we should at least investigate whether it actually exists.

Wednesday, June 6, 2012

Democracy at Scale

Democracy is a flawed concept. Everybody knows it, and we have known it for many, many years. The principal problem with a pure democracy is, it doesn't scale. That's why large democratic institutions, notably the United States of America, are always something sort of like a democracy, but not quite. The U.S. is, of course, a republic. What's the difference between a democracy and a republic? James Madison explained it best in The Federalist #10.

The two great points of difference between a democracy and a republic are: first, the delegation of government, in the latter, to a small number of citizens elected by the rest; secondly, the greater number of citizens, and greater sphere of country, over which the latter may be extended.

The effect of the first difference is, on the one hand, to refine and enlarge the public views, by passing them through the medium of a chosen body of citizens, whose wisdom may best discern the true interest of their country, and whose patriotism and love of justice will be least likely to sacrifice it to temporary or partial considerations.

If that last part doesn't make your heart ache with nostalgia then you haven't paid much attention to U.S. politics for a long time. I personally cannot remember a time, and perhaps no one alive can remember a time, when the Congress of the United States consisted chiefly of citizens with enough wisdom, patriotism, and love of justice to refuse a lobbyist or vote against a good pork barrel spending bill in order to get reelected.

What happened? Are people fundamentally more corrupt than in the past? No, we are generally as selfish as always. The problem is, we have outgrown our system of government. Again. The brilliant idea of a republic, electing a small body of representatives who then govern among themselves by direct democracy (more or less), is inherently more scalable than pure democracy, but it also has its limits. Our numerous modern representatives — presidents, electors, congressmen, judges, mayors, city council members, constables (whatever those are) and so on — are often elected based not on their character, but on superficial bases like parentage, wealth, party affiliation, and hair style. That's largely because nobody has the time to really know and understand who these people are. And so we bestow enormous, disproportionate voting power on people based on silly criteria, including what they say they'll do in office, because we just don't have the time to examine their history to determine what they really believe in.

The answer, or at least one answer, can be found in the Ethosphere. Instead of voting for people to represent us, we might vote only on ideas, or their written embodiment that we call props. It's much easier to decide whether you are for or against an idea, a proposal, or a proposition, than it is to decide whether a complex person will be more or less likely to represent your own views. In order to scale, there will still be some citizens who have more voting power than others, but these representatives will be chosen based solely on their history of constructive participation in whatever society you both choose to be members. Citizens who write props that are eventually ratified by the voters at large will receive more voting power, as a side-effect of the normal activity of the group. Representatives can get elected only by constructive participation, not by campaigning.

This idea of conveying greater resources, in this case voting power or rep, to some individuals in order to increase the welfare of the entire group was first identified and discussed by a nineteenth century economist named Vilfredo Pareto. Pareto Inequality is exactly this somewhat paradoxical idea that the overall social welfare (measured by some metric) of a society can sometimes be increased by bestowing special powers or additional resources on small numbers of individuals.

The question now becomes, can a reputational voting paradigm like the one I discussed in Building Ethos out of Logos actually result in a Pareto inequality and benefit the group as a whole by allowing the votes of those individuals with higher reputations to count more than others? To try to answer that question, I wrote a simple simulation. For the geeks out there, I'll describe the details of the simulation in a separate post. In general terms, it mimics the activities of five independent populations with 10,000 members in each. Props are proposed by randomly-chosen members and each member then votes on them based on its own internal preferences. A graphical summary of the results can be seen below. You might want to click on the graph to see the full-sized version.


The three plots display a measure of overall displeasure or regret after each of 1,000 props have been considered, voted upon, and either ratified or not. Regret is the inverse of social welfare, so lower regret numbers mean the system is doing a better job of identifying the population consensus. Under ideal conditions, when every voter is fully informed and there is 100% voter turnout, the result is the blue curve, which shows very low average regret. In other words, pure democracy works great when everyone participates and they have all the necessary information to accurately vote their individual preferences. Unfortunately, this is rarely the case for large populations.

The red curve shows what happens under more realistic circumstances, where only 40% of the population votes and only 25% of them are fully informed. Notice the overall regret numbers increase significantly, meaning more voters are more unhappy with the election results. The regret measurements include all voters, even those who didn't vote and those who didn't really understand what they voted for/against. As in real life, everyone has an opinion, even those who are too busy, too lazy, or too ignorant to express it by voting.

The green curve introduces reputational voting for the first time. We leave turnout percentage and informed percentage at 40% and 25% respectively, but each voter gets a boost in its reputation when it authors a prop that is eventually ratified by the population at large. Its rep is decreased when one of its props is rejected by the voters. Of course, each voter votes its current reputation, so voters who have more success historically in capturing the consensus of the whole group will eventually have higher reps and, therefore, disproportionately high voting power. As the green plot shows, this strategy starts off no better than the realistic scenario, but gradually gets better (lower regret numbers) until it has more or less completely compensated for the low turnout and poorly informed electorate. Thus, reputational voting does indeed implement a Pareto inequality by benefiting the population as a whole when additional resources (voting power) are granted to a small number of individuals.
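For the geeks who can't wait for that separate post, here is a drastically simplified toy version of the mechanism in Python. The population size, thresholds, and regret scoring below are placeholder choices for illustration, not the parameters of the actual simulation:

```python
import random

# Toy sketch of reputational voting. All constants below are placeholders.
N_VOTERS = 1000
N_PROPS = 500
TURNOUT = 0.40     # fraction of members who actually vote on a given prop
INFORMED = 0.25    # fraction of voters who perceive a prop accurately

random.seed(1)
preferences = [random.random() for _ in range(N_VOTERS)]  # each member's leaning
reputation = [1.0] * N_VOTERS                             # everyone starts equal

def favors(member, prop_position, informed):
    """A member votes yes if the prop (as perceived) is close to its own leaning."""
    perceived = prop_position if informed else random.random()
    return abs(preferences[member] - perceived) < 0.25

total_regret = 0
for _ in range(N_PROPS):
    author = random.randrange(N_VOTERS)
    prop_position = preferences[author]   # authors write props they agree with

    yes_weight = no_weight = 0.0
    for member in range(N_VOTERS):
        if random.random() > TURNOUT:
            continue                      # this member skipped the vote
        informed = random.random() < INFORMED
        weight = reputation[member]       # each voter votes its current rep
        if favors(member, prop_position, informed):
            yes_weight += weight
        else:
            no_weight += weight
    ratified = yes_weight > no_weight

    # Reputation is a side-effect of authorship: ratified props raise it,
    # rejected props lower it.
    reputation[author] = max(0.1, reputation[author] + (0.5 if ratified else -0.25))

    # Regret counts every member, voters and non-voters alike, whose true
    # preference disagrees with the outcome.
    for member in range(N_VOTERS):
        if favors(member, prop_position, informed=True) != ratified:
            total_regret += 1

print("average regret per prop:", total_regret / N_PROPS)
```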

There are other practical advantages of reputational versus representational (e.g. republics) systems that the simulation cannot show. Reputation is dynamic and continuously computed, which leads to a more robust system. Term limits are no longer necessary because reps increase and decrease over time based on recent history of constructive participation. Even when populations have shifting consensus opinions, which often happens, the reputation system is robust enough to shift with them. Also, reputation is computed as a side-effect of doing real work, proposing and voting on ideas, rather than as the side-show beauty contests representative elections often become. As I discussed in "Jane, You Ignorant Slut", it seems nobler to discuss ideas than people.

In order to bring democracy to the Internet, we will need to teach it to scale. As the above simulation shows, reputational voting is a straightforward mechanism that can help us do that.