Imagine a little game. I give you a 3 by 3 word search and ask you to find the word FOX.
The thing is, there is no guarantee a fox exists in the word search
I give you. Each grid is constructed randomly, with each square
equally likely to contain an F, O, or
X.
Given a randomly constructed 3 by 3 grid, what is the probability you can find a fox?
Count Them
Well, if we could look at every possible configuration of a 3 by 3 grid, we could just count how many have a fox and how many do not.
Since each square can have one of F, O,
or X, and there are 9 squares, the total number of
combinations is 3^9 = 19,683.
This is a pretty small number for computers. It turns out 8,038 of those grids have a fox, around 41%.
4 by 4 Grids
Well, how about a 4 by 4 grid? We can use the same method: look at all possible grids and count how many have foxes.
With a 4 by 4 grid there are 16 squares, so
3^16 = 43,046,721 possibilities. A lot more, but
still small for a computer. Turns out 34,755,800
grids have foxes, around 81% this time.
As the grid gets larger, the probability of finding a fox gets higher. This makes sense because there are more letters we can use.
5 by 5 Grids
Okay, well how about a 5 by 5 grid? Now we start to run into
trouble. There are 3^25 = 847,288,609,443 possible
configurations. That is nearly one quadrillion,
and not feasible to brute force by counting.
Instead of looking at one square at a time, we can look at entire
rows at a time. There are 3^5 = 243 unique grid rows
and 191 of these don't contain any foxes.
If we construct a grid using 5 of these foxless rows, the only foxes left to worry about are vertical and diagonal. When adding a new row, we only need to consider the two rows above it and make sure we're not adding any foxes.
This blends nicely with a technique called Dynamic Programming that lets us reduce the amount of checks from a quadrillion to around 20 million.
Turns out there's a 96.4% chance to find a fox in a random 5 by 5 grid.
6 by 6 Grids and Beyond
There are 150,094,635,296,999,121 possible 6 by 6
grids. This problem gets exponentially harder to solve as our grids
get larger.
But all is not lost. We can't always find the exact answer, but we can approximate it using Monte Carlo sampling.
We can generate a random grid and check if we can find a fox. We then do this over and over again, millions of times. We keep track of how many foxes we find and, with enough samples, this converges to the exact answer.
Three Dimensional Foxes
What if instead of looking at a 3 by 3 grid, we looked at a 3 by 3 by 3 cube of foxes?
It turns out there's a 92% chance of finding a fox in a three dimensional cube.
But why stop at three dimensions...
Four Dimensional Foxes
Bring out the 4x4x4x4 hypercube! As three dimensional beings we can't actually see a 4d hypercube but this is what it would look like projected into 3d space.
But wait, what even is a line in 4 dimensions? Let's step back to two dimensions.
Lines
Pick a square and then step in a direction. You get a line when you keep taking that same step.
We can write a square as x and y coordinates, like
(0,0). We can represent the step as a vector like
[0,1] which tells us which direction we're going.
To figure out the next point in the line, we just need to add the vector to our point.
Start at (0,0), add the same step, and keep the
points that stay inside the grid.
Because we are in two dimensions, there are two different directions we can step. And we can decide to step in either or both.
In three dimensions our step is of the form [0,1,1],
where each of these three values can be a 1 or a
0.
Start at (0,0,0), add [0,1,1], and
stop when the next point would leave the cube.
Four dimensions is the same. Our steps are defined by vectors of
the form [0,1,1,0]. We just take a 4 dimensional point
and add that step to it over and over, and that creates our line.
Instead of representing our hypercube as a 3d projection, we can also represent it as sixteen 4x4 grids.
Start at (0,0,0,0), add [0,1,1,0],
and stop when the next point would leave the hypercube.
Infinite Foxes
Almost to the end, stay with me.
We can generalize the problem and instead of looking for foxes, we're looking for sequences of numbers.
E.g. we can replace F, O, and X with the numbers 0,
1, 2 in our 3d cube and search for the
sequence 0,1,2. The problem is the same.
We're looking for a 3-length sequence in a 3-dimensional cube.
We can generalize further and look for n-length sequences in
n-dimensional hypercubes. For example imagine looking for the
sequence [0, 1, ..., 100] in a 100 dimensional
hypercube.
What is the probability of finding a [0..n] sequence
in an n-dimensional hypercube as n tends towards infinity?
On one hand, we might never expect to find such a sequence. It
needs to be exactly [0, 1, 2, ..., ∞], the odds of
that are 1 in ∞∞.
On the other hand, high dimensional spaces are exponentially more connected and there are so many more ways we can draw lines. It'd be almost guaranteed some line somewhere had that sequence.
What if somehow these two effects canceled out. What if the increased rarity of the sequence grew perfectly with the increased connectivity of higher dimensional spaces.
In a shocking conclusion, that's exactly what happens. They grow at the exact same rate and the probability converges to a constant.
In an infinite dimensional hypercube, we have a 99.832% chance of finding at least one sequence. No matter how large our cube gets, that number stays the same.
Mathy details for those curious
Let X be the number of valid sequences in a
random n-dimensional hypercube of side n. We
want P(X ≥ 1).
The number of length-n oriented lines in the hypercube is
(n+2)n − nn, and the
chance any single line spells the target sequence in order
is 1 / nn. So the expected number
of sequences is:
E[X] = ((n+2)n − nn) / nn = (1 + 2/n)n − 1
As n → ∞, the term
(1 + 2/n)n approaches
e2, so:
E[X] → e2 − 1 ≈ 6.389
For rare, nearly-independent events like this, the count is
well-approximated by a Poisson distribution
with rate λ = E[X]. The chance of zero events
is P(X = 0) = e−λ, so the chance of
finding at least one sequence is:
P(X ≥ 1) = 1 − e−(e2 − 1) ≈ 0.99832
Conclusion
Now you're ready! I officially dub you a professional fox finder. Your hypercube is in the mail.
From an innocuous word search to a land of infinities, what a strange journey. Thanks for coming.