# Speed dating match statistics

The participants in the dataset each attended only a single speed dating event, not several, so it’s not possible to check directly whether the model would predict behavior at future events from behavior at past ones. I instead simulated a situation where participants had attended similar events in the past, by imagining that for a given date, all other dates that the two people had been on had occurred at a past event.

But since comments focused on methodology rather than the empirical phenomena, I decided to write about methodology first, so that readers wouldn’t have to suspend disbelief while reading my next post. This post is denser and more technical than my last one.

From the point of view of discovery, this was very helpful insofar as it helped me find the core phenomena that I used.

One could argue that the filters are collectively too strict, but I’ve chosen to use them for several reasons. I’ve enumerated the criteria below.

This has to do with the fact that R(B) is generically formed using ratings that B was given after the date that A and B went on, ratings that one would not have access to in practice.

The problem of understanding what’s going on is closely related to the Monty Hall problem.

One example of this is that the most popular participants are more likely than the other participants to have been at their best on the day of the event. As a result, the confidence one can have that someone who was chosen by most of their dates at an event will be chosen by partners at a different event is lower than the confidence one can have that the person will be chosen by partners at the same event.

We model the surrogate of A using another participant A’ that B dated.

Conceptually, A’ is a randomly selected participant amongst the participants who B dated, but literally picking one at random would break the symmetry of the data in a way that could dilute the statistical power of the data, so I instead made a uniform choice to replace A by the participant who B would have dated that round if the speed dating schedule had been slightly different.
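As a rough illustration of what a fixed, non-random surrogate choice could look like, here is a minimal sketch. It assumes that "slightly different schedule" means shifting B's schedule by one round, which is only one possible reading. The `schedule` structure and the function name `surrogate_partner` are hypothetical, made up for this example.

```python
def surrogate_partner(schedule, b, round_index, shift=1):
    """Return the partner B would have met in this round under a shifted schedule.

    schedule: hypothetical dict mapping a participant to the ordered list of
              partners they met, one entry per round.
    shift:    how many rounds to rotate by (assumed to be 1 here).
    Returns None when no distinct surrogate exists.
    """
    partners = schedule[b]
    n = len(partners)
    # With fewer than two partners, or a shift that wraps back to the same
    # round, the "surrogate" would just be the original partner.
    if n < 2 or shift % n == 0:
        return None
    return partners[(round_index + shift) % n]


# B met A in round 0, C in round 1, D in round 2.
schedule = {"B": ["A", "C", "D"]}
a_prime = surrogate_partner(schedule, "B", 0)  # surrogate for A on the (A, B) date
```

Because the same rule is applied to every date, every pair is treated the same way, which is the point of avoiding an independent random draw per date.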

But when an event involves only ~15 people, the impact of a single rating on somebody’s average can be large enough that failing to exclude the two individuals’ ratings of one another would substantially overstate the predictive power of the model, while simultaneously obscuring what was going on.

Given a rating type R, and two participants A and B whose decisions we’re trying to predict, let R(A, B) be the rating that A gave B, and let R(B) be the sum of the ratings that were given to B. One might think that the right features to look at are

[R(B) - R(A, B)]/(N - 1)   (**)

where N is the number of ratings that B received, so that (**) is the average rating that B received from everyone other than A.

But perhaps surprisingly, these features are contaminated with the decisions we’re trying to predict, to such a degree that if one didn’t notice this, one would end up with a model with far greater predictive power than one would have in practice.
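To make the feature (**) concrete, here is a minimal sketch of the leave-one-out average. It assumes that N is the number of ratings B received, and that ratings are stored in a dict keyed by (rater, ratee); both the data layout and the function name `leave_one_out_average` are hypothetical choices for this example, not the code used for the analysis.

```python
def leave_one_out_average(ratings, a, b):
    """Compute [R(B) - R(A, B)] / (N - 1) for one rating type.

    ratings: hypothetical dict mapping (rater, ratee) -> numeric rating.
    Returns None if A did not rate B or if B has no other ratings.
    """
    received = [r for (_, ratee), r in ratings.items() if ratee == b]
    n = len(received)  # N: number of ratings B received (assumed meaning)
    if (a, b) not in ratings or n < 2:
        return None
    total = sum(received)  # R(B): sum of all ratings given to B
    return (total - ratings[(a, b)]) / (n - 1)


ratings = {
    ("A", "B"): 8,
    ("C", "B"): 6,
    ("D", "B"): 4,
    ("A", "C"): 5,
}
feature = leave_one_out_average(ratings, "A", "B")  # (18 - 8) / 2
```

Note that this sketch still draws on every other rating B received at the event, including ratings given after the date with A, so it illustrates the feature itself rather than a fix for the contamination.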