Imagine a little game. I give you a 3 by 3 word search and ask you to find the word FOX.

Click on the letters to spell FOX in a straight line.

The thing is, there is no guarantee a fox exists in the word search I give you. Each grid is constructed randomly, with each square equally likely to contain an F, O, or X.

Given a randomly constructed 3 by 3 grid, what is the probability you can find a fox?

Count Them

Well, if we could look at every possible configuration of a 3 by 3 grid, we could just count how many have a fox and how many do not.

Since each square can have one of F, O, or X, and there are 9 squares, the total number of combinations is 3^9 = 19,683.

This is a pretty small number for computers. It turns out 8,038 of those grids have a fox, around 41%.

4 by 4 Grids

Well, how about a 4 by 4 grid? We can use the same method: look at all possible grids and count how many have foxes.

With a 4 by 4 grid there are 16 squares, so 3^16 = 43,046,721 possibilities. A lot more, but still small for a computer. Turns out 34,755,800 grids have foxes, around 81% this time.

As the grid gets larger, the probability of finding a fox gets higher. This makes sense because there are more letters we can use.

Board Number

5 by 5 Grids

Okay, well how about a 5 by 5 grid? Now we start to run into trouble. There are 3^25 = 847,288,609,443 possible configurations. That is nearly one quadrillion, and not feasible to brute force by counting.

Instead of looking at one square at a time, we can look at entire rows at a time. There are 3^5 = 243 unique grid rows and 191 of these don't contain any foxes.

If we construct a grid using 5 of these foxless rows, the only foxes left to worry about are vertical and diagonal. When adding a new row, we only need to consider the two rows above it and make sure we're not adding any foxes.

This blends nicely with a technique called Dynamic Programming that lets us reduce the amount of checks from a quadrillion to around 20 million.

Turns out there's a 96.4% chance to find a fox in a random 5 by 5 grid.

6 by 6 Grids and Beyond

There are 150,094,635,296,999,121 possible 6 by 6 grids. This problem gets exponentially harder to solve as our grids get larger.

But all is not lost. We can't always find the exact answer, but we can approximate it using Monte Carlo sampling.

We can generate a random grid and check if we can find a fox. We then do this over and over again, millions of times. We keep track of how many foxes we find and, with enough samples, this converges to the exact answer.

0 samples
-- estimate
0 FOX 0 NO

Three Dimensional Foxes

What if instead of looking at a 3 by 3 grid, we looked at a 3 by 3 by 3 cube of foxes?

It turns out there's a 92% chance of finding a fox in a three dimensional cube.

But why stop at three dimensions...

Four Dimensional Foxes

Bring out the 4x4x4x4 hypercube! As three dimensional beings we can't actually see a 4d hypercube but this is what it would look like projected into 3d space.

But wait, what even is a line in 4 dimensions? Let's step back to two dimensions.

Lines

Pick a square and then step in a direction. You get a line when you keep taking that same step.

We can write a square as x and y coordinates, like (0,0). We can represent the step as a vector like [0,1] which tells us which direction we're going.

To figure out the next point in the line, we just need to add the vector to our point.

Start at (0,0), add the same step, and keep the points that stay inside the grid.

Because we are in two dimensions, there are two different directions we can step. And we can decide to step in either or both.

In three dimensions our step is of the form [0,1,1], where each of these three values can be a 1 or a 0.

Start at (0,0,0), add [0,1,1], and stop when the next point would leave the cube.

Four dimensions is the same. Our steps are defined by vectors of the form [0,1,1,0]. We just take a 4 dimensional point and add that step to it over and over, and that creates our line.

Instead of representing our hypercube as a 3d projection, we can also represent it as sixteen 4x4 grids.

Start at (0,0,0,0), add [0,1,1,0], and stop when the next point would leave the hypercube.

Infinite Foxes

Almost to the end, stay with me.

We can generalize the problem and instead of looking for foxes, we're looking for sequences of numbers.

E.g. we can replace F, O, and X with the numbers 0, 1, 2 in our 3d cube and search for the sequence 0,1,2. The problem is the same.

We're looking for a 3-length sequence in a 3-dimensional cube.

We can generalize further and look for n-length sequences in n-dimensional hypercubes. For example imagine looking for the sequence [0, 1, ..., 100] in a 100 dimensional hypercube.

What is the probability of finding a [0..n] sequence in an n-dimensional hypercube as n tends towards infinity?

On one hand, we might never expect to find such a sequence. It needs to be exactly [0, 1, 2, ..., ∞], the odds of that are 1 in ∞.

On the other hand, high dimensional spaces are exponentially more connected and there are so many more ways we can draw lines. It'd be almost guaranteed some line somewhere had that sequence.

What if somehow these two effects canceled out. What if the increased rarity of the sequence grew perfectly with the increased connectivity of higher dimensional spaces.

n = 3

In a shocking conclusion, that's exactly what happens. They grow at the exact same rate and the probability converges to a constant.

In an infinite dimensional hypercube, we have a 99.832% chance of finding at least one sequence. No matter how large our cube gets, that number stays the same.

Mathy details for those curious

Let X be the number of valid sequences in a random n-dimensional hypercube of side n. We want P(X ≥ 1).

The number of length-n oriented lines in the hypercube is (n+2)n − nn, and the chance any single line spells the target sequence in order is 1 / nn. So the expected number of sequences is:

E[X] = ((n+2)n − nn) / nn = (1 + 2/n)n − 1

As n → ∞, the term (1 + 2/n)n approaches e2, so:

E[X] → e2 − 1 ≈ 6.389

For rare, nearly-independent events like this, the count is well-approximated by a Poisson distribution with rate λ = E[X]. The chance of zero events is P(X = 0) = e−λ, so the chance of finding at least one sequence is:

P(X ≥ 1) = 1 − e−(e2 − 1) ≈ 0.99832

Conclusion

Now you're ready! I officially dub you a professional fox finder. Your hypercube is in the mail.

From an innocuous word search to a land of infinities, what a strange journey. Thanks for coming.