Birthday Paradox Simulation

Exploration of probability principles through simulation of a counterintuitive phenomenon

Proem

In your class or any small group, have there ever been a case where two people had the same birthday? Recall how big the group was? In fact, most groups where you found out and realize having at least one duplicate birthday are large enough that there's more than just one duplicate birthday occurring. So how small that value actually is?

Where we headin', Captain?

Considering the world without leap years. How many people exactly we need in a room to have so that there is 50% probability there are at least 2 people (1 pair) having the same birthday?

Give it a shot:

Try some number to see if there is actually at least 1 pair of people having the same birthday.

What does the previous 50% mean here? If you try 10 times, theoretically there should be 5 times you witness having at least 1 pair of people having the same birthday; If you try 100 times, that should be 50, etc. Though not to be exactly, the number you get should be not too far away around these values.

Revelation from simulations:

Remember law of large number? When the sample size is large enough, empirical probability, or frequency will approach probability!

In the simulation below, you can choose your own sample size (Num of trials) to see starting from which number of people, probability of having at least 1 pair of people having the same birthday begin to exceed 50%. You may hover or touch the dots to see its more precise value.

Increase sample size to get more accurate result and hence a smoother plot!

The math behind

Think of the event we care:

Two people having the same birthday

The probability of this seems not intuitive to calculate. Fortunately, by law of probability, we know that the probability of some event is just 1 minus the probability of its complementary event. .. So , what shall be the complementary event of "Two people having the same birthday"?

Not a single pair of people having the same birthday

Transfer that into math notation:

$$B_i \neq B_j ,\forall i \neq j \in \mathcal{P}, $$ and

$$ (\exists i \neq j: B_i = B_j) = (B_i \neq B_j ,\forall i \neq j)^C , \forall i \neq j \in \mathcal{P} \text{ }$$

Starting from an basic example, if there is only 2 people, the probability of not having a single pair of people with the same birthday is strightforward: Fix the first person's birthday to be \(B_1\), there remains 364 options to avoid shared birthday with the first person, i.e. $$\mathbb{P}=1\times \frac{365-1}{365}$$ When you add more people, remember the new people need to have birthday that is different than any of the people before. Do you already feel something? Based on the speculation, the probability of \(n\) people all having different birthday is, hold tight: $$1- 1\times \prod_{n=1}^{365} \frac{365-n}{365}$$

Construction afoot...

.. And relationship with hash collision

Construction afoot...