Mar 07, 2019 tags: | js | visualisation |
How random can you be?
Let’s play a game. You press ← and → as randomly as you can, and I will try to guess your next input. Use your keyboard or the buttons below (on touchscreen devices).
Every time I guess right, I will take $1 from your virtual account. Every time I guess wrong, you’ll get $1.05. If you are as random as you think, you’ll be making virtual dough in no time. Don’t worry, there is no cheating involved. I simply keep track of the patterns you produce and use them to predict your next move.
Once you’ve become tired of watching your virtual money evaporate, press the “randomize” button below, and I will add 10 pseudo-random inputs on your behalf. On average (i.e. not all the time), this is going to earn you some money. Isn’t it ironic that a pseudo-random number generator is more “random” than you are?
Update: I’ve added five custom events for google analytics to collect statistics on correct guesses after 100, 200, 300, 500 and 1000 inputs. If you’d rather not be counted, please load the html file from my github repository instead. Update 2: I’ve also started collecting the first 180 left-right inputs (anonymously). The purpose of this is to generate test data for an improved model. Update 3: I have more than 10,000 datapoints, so I’ve disabled game stats collection.
So how does it work exactly? Your fingers tend to repeat certain patterns even if you don’t notice it. The program keeps a database of each possible combination of 5 presses, and two counters are stored under each entry — one is for every zero that follows the combination, and the other one is for all the ones that follow this combination. So every time you press a key, an entry in the database gets updated. To make a prediction, the program needs only to look up the entry corresponding to the last 5 presses and decide by looking at the counters which key press is more likely to follow. The rest is up to Fortuna (velut luna). I’ve run this script with 200 pseudo-random inputs 100,000 times, and found that the distribution of correct guesses is approximately normal with µ=50% and σ=3.5% (this agrees with the binomial estimation, of course). The probability of the program guessing your inputs >57% (µ+2σ) of the time purely by chance is very slim, which suggests that you really aren’t good at making random choices.
Edit: Professor Ariel Rubinstein has kindly shared his paper co-authored with prof. Kfir Eliaz in 2011: Edgar Allan Poe’s riddle: framing effects in repeated matching pennies games. I think you will enjoy reading it as much as a I did.
Edit 2: I also recommend watching this video by Numberphile.
Update: I have some statistics collected through google analytics: 2700 data points for 100 key presses and 2000 data points for 200 key presses.
Let’s compare the actual distributions of correct guesses with those expected for random inputs. Please note the actual data includes cases where visitors probed the algorithm with De Bruijn sequences or fed it predictable patterns.
Credits: I got the idea from PBS Infinite Series channel on YouTube (a highly recommended channel!). One of the videos linked this demo which inspired me to experiment with the idea. I didn’t dig deeply, but it seems my implementation (5-grams) ended up the same (or at least very similar). I’ve also tried 3-grams, 4-grams, and adaptive n-grams (n<6), but it seemed they did no better than the current 5-gram model. Although, to be honest, I didn’t have the time to do thorough testing, so I may be wrong. The plot is powered by plotly.js.
Source: github repository