
Predicting Survivor Outcomes (Spoiler-Free)

In this sportless world, I (like many others, I am sure) have tried to find the next best thing to live sport. For me, this has been watching the reality game show Survivor. For those not familiar with the show, it involves a group of strangers taken to a remote location to live and survive as a tribe. Each episode culminates in a player being voted off by the other members of the tribe. Players can make themselves immune by winning immunity challenges or by finding and playing hidden immunity idols. The eventual winner is chosen by players who have previously been voted off.

The aim of this project is to build a model to predict the various outcomes for each Survivor episode. There are several outcomes to predict, each with some correlation to the others:

  • Winning Immunity

  • Being voted off (booted)

  • Winning the entire game

I decided to base the models purely on the definite facts of what happened on the island, not on what was shown on TV. This avoids subjective data: judging a player's performance based on the 'edit' they receive. I am sure there is a lot of predictive power in a player's 'edit', but I was more interested in the actual gameplay than in predicting the winner from how they are portrayed on TV. Each 45-minute episode compresses three actual days of footage, so the producers can (intentionally or not) portray a player in whatever way suits the narrative they want to tell, even if that is not the most accurate picture of the player. This is the main issue with analysing Survivor through what is shown on TV: the show is built purely to entertain, and covering each player evenly and objectively would likely make for a painfully boring program. Hence, only the absolute facts that could not be edited or changed were used for the model. This left me with:

  • data from tribal councils (i.e. how each player voted)

  • data from challenges (i.e. who won tribal or individual challenges)

  • basic demographic information (age, gender, etc).

The main objective of this project was to build models that could predict the various outcomes and be used as a companion when watching each episode (much like live sport having expert tips or model predictions as part of the coverage). These models could then be dissected to find which aspects of gameplay matter most for avoiding the boot and for winning the game.


The Data

Building predictive models requires good data, and this project would not have been possible without the amazing site "The True Dork Times" - they have produced data from every episode, found here: https://www.truedorktimes.com/survivor/boxscores/data.htm. Forty seasons of US Survivor have been played, and data was available for the first 39. The data was randomly split: 70% became a training set used to fit the models, and the remaining 30% was held aside to test and validate them.
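As a minimal sketch of that split using scikit-learn - the toy table and its column names are placeholders of my own, not the actual True Dork Times schema:

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy stand-in for the episode-level player data; columns are illustrative only
data = pd.DataFrame({
    "age":          [24, 31, 45, 28, 52, 37, 29, 41, 33, 26],
    "won_immunity": [0, 1, 0, 0, 1, 0, 0, 1, 0, 0],
    "booted":       [1, 0, 0, 1, 0, 0, 1, 0, 0, 1],
})

X = data.drop(columns="booted")  # features
y = data["booted"]               # outcome to predict

# 70% of the rows train the models; the remaining 30% are held out for testing
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.7, random_state=42
)
```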

Building The Models

A separate model was built to predict immunity, being booted, and winning the game. Initially I had thought building the immunity and boot models together would make more sense because they are obviously connected (you can't get booted if you win immunity). However, this method did not improve the accuracy for predicting who would be booted. I think this is down to the counter-intuitive fact that the model found that a player who is more likely to win immunity is actually more likely to be booted as well. This is probably because other players in the game see the immunity-winning player as a threat and so want to vote them out to have more chance at winning immunity themselves.


The models were assessed not only on the accuracy of their predictions (via log loss), but also on how consistent that accuracy was across the training and test data sets. Although a random forest model was found to be the most accurate on the test data, it was not chosen because it was frighteningly accurate on the training data. Such a model might be fine for predicting future seasons, but since the main purpose here is a viewing companion and a way to learn which factors matter for being booted, the disparity between training and test performance showed it was overfitting - its unrealistically accurate predictions would not hold up in a new data environment (as the test results showed). Considering this dual criterion for model selection, a k-nearest neighbours model was found to be the best for all three outcomes. I will spare you the intricate details and instead look at some of the more interesting aspects of the model.
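A sketch of that dual criterion - score a candidate model by log loss on both sets and watch the gap between them - using made-up random features and an illustrative k, not the actual model settings:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import log_loss

# Synthetic stand-in data: 4 features, binary boot/no-boot outcome
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 4))
y_train = rng.integers(0, 2, size=200)
X_test = rng.normal(size=(80, 4))
y_test = rng.integers(0, 2, size=80)

model = KNeighborsClassifier(n_neighbors=15).fit(X_train, y_train)

# Clip probabilities away from exact 0/1 so the log loss stays finite
p_train = np.clip(model.predict_proba(X_train), 1e-15, 1 - 1e-15)
p_test = np.clip(model.predict_proba(X_test), 1e-15, 1 - 1e-15)

train_ll = log_loss(y_train, p_train)
test_ll = log_loss(y_test, p_test)

# A large gap between the two scores is the overfitting red flag described above
gap = abs(train_ll - test_ll)
```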

Some Interesting Findings

To look at the effects of variables on each prediction, we will use absolute difference from average. This takes out the effect of just being further in the game - obviously when there are 5 players left a player has a much higher chance of winning and being booted than when there are 16 players. Comparing how far a player is above or below the average mark (1/number of players) gives a more interesting reflection on what the model thinks than simply saying they are at a higher probability because they are later in the game. Additionally, I found that combining variables for visualising and understanding player types made a lot more sense than using each individual variable. So I produced ratings for:

  • "Controlling The Vote" (voting correctly for the person booted)

  • "Staying Out Of Trouble" (not receiving votes)

  • "Tribe Strength" (being on a tribe that wins challenges)

  • "Winning Challenges" (winning individual challenges)

  • "Finding Idols" (finding hidden immunity idols).

These ratings combine a number of variables and are built as z-scores (with a mean of zero and a standard deviation of one), which makes it easy to analyse the effect for players who are better or worse than average. By no means do these ratings explain all of the models' predictions, but they are an easier, more concise way of understanding various aspects of the game. I will do a future post explaining more about the ratings and assessing types of players.
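The two devices described above - the difference-from-average adjustment and the z-score ratings - can be sketched roughly as follows. The probabilities, input variables, and the way they are combined are all my own illustrative assumptions, not the actual model internals:

```python
import numpy as np

# Difference from average: compare a predicted win probability against the
# naive baseline of 1 / (players remaining). Probabilities here are made up.
players_left = 5
baseline = 1 / players_left  # 0.2 with five players remaining
predicted_win = {"Player A": 0.35, "Player B": 0.22, "Player C": 0.10}
diff_from_avg = {name: p - baseline for name, p in predicted_win.items()}

# Z-score rating: combine related variables into one raw score, then
# standardise to mean 0 and standard deviation 1. Inputs are hypothetical.
correct_votes = np.array([8, 5, 2, 6, 3])       # votes cast for the booted player
tribals_attended = np.array([10, 9, 8, 10, 7])  # tribal councils attended
raw = correct_votes / tribals_attended          # fraction of "correct" votes
rating = (raw - raw.mean()) / raw.std()         # "Controlling The Vote" style rating
```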


I won't go through all of the other variables the models were built on now (that may require another blog post), but I will take a few of the ones that I found interesting.

First, let's look at the immunity challenge effect mentioned earlier.


Winning Individual Challenges

Whilst the effect is small, it is clear that being better at winning challenges makes a player more likely to be booted and less likely to win. The effect of being bad at challenges is even more marginal, but appears to show a slight decrease both in being booted (i.e. not being seen as a threat) and in winning the game.


Controlling The Vote

One of the more obvious observations from the model is that controlling the vote has a strong negative correlation with the likelihood of being booted: if you control the vote, you have less chance of being booted, and vice versa. The interesting aspect of this rating is the apparent lack of correlation between controlling the vote and any increase in win probability. This might be explained by the jury not wanting to give $1,000,000 to the player who was responsible for voting them out of the game.


Staying Out Of Trouble

Unsurprising again is the strong relationship between staying out of trouble and staying in the game - the rating is practically defined by avoiding the boot. This correlation shows that avoiding votes in the past makes a player more likely to avoid votes in the future. However, the effect of staying out of trouble on winning is very different to that of controlling the vote. The below-zero ratings are understandable: the more votes received, the less likely you are to win the game. The above-zero ratings are interesting - players who are above average but nothing special (i.e. who received a couple of votes but not many) don't get a boost to their win probability, yet players who are very good at avoiding votes become much more likely to win. Again, this may be explained by the jury, who are unlikely to have ever wanted these players out of the game; it could also be explained by players who fly under the radar and make it to the final without attracting attention.


Finding Idols

Players receive a negative rating if they do not find an idol in a season involving idols (more negative if there are more idols in play and the more episodes they last without finding one). Players who find idols are more likely to win than those who don't, perhaps because they are better able to control the vote or make moves knowing they have a safety net. They are (unsurprisingly) also less likely to be booted, which shows that the power of finding an immunity idol is not negligible.
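The rating rule described here can be sketched as a hypothetical function. The direction of the penalty matches the description above, but the exact formula and the 0.1 scaling factor are my own assumptions, not taken from the actual model:

```python
# Hypothetical raw score behind the "Finding Idols" rating: finding idols is
# rewarded, while going idol-less is penalised more heavily the more idols are
# in play and the more episodes the player lasts without one.
def raw_idol_score(idols_found: int, idols_in_play: int, episodes_lasted: int) -> float:
    if idols_found > 0:
        return float(idols_found)
    # No idol found: penalty grows with idols available and episodes survived
    return -0.1 * idols_in_play * episodes_lasted
```

These raw scores would then be standardised into z-scores like the other ratings.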


Tribe Strength

The strength of the tribe a player is on does not appear to impact their chances of being booted. However, a strong tribe does seem to increase the chances of winning, which makes sense because it increases a player's chances of making it further in the game.


Appearances Relative To Other Players

Players who compete against more experienced Survivor players appear to have much less chance of winning and a much higher chance of being booted (note: a rank of 1 means the player has fewer previous appearances on Survivor than all other players in the game). The change is stark: if a player is less experienced than half the other players in the game, they are at a disadvantage. This is probably because players who are chosen to return are likely to be better at the game than the average player.


Age And Gender

The model sees men as more likely to win than women overall (in reality, men have won 24 of the 39 seasons in the data). The peak improvement for men is seen around age 30, where they have a greater chance of winning than other demographics. Both women and men see a decrease in win probability as age increases past 40.


Gender Balance On Tribe

This is another variable that stood out when I was looking at the models. It appears that both men and women have an increased chance of winning when their tribe has more males than females. The effect becomes even clearer when only pre-merge data is used.

If women are on a tribe of mostly men, they become much more vulnerable to being booted. However, the converse isn't true: if the tribe is mostly women, neither men nor women become more vulnerable or safer.


Is There A Foolproof Strategy To Win?

If nothing else, this analysis shows that there are many factors that impact a player's chances of winning Survivor. Some are counter-intuitive and might be bad strategies in the long run, such as winning challenges, or women voting off another woman before the merge. Obviously, these findings are only marginal and will not apply to every game of Survivor, but they show trends that appear to give slight advantages or disadvantages - some manageable and some not (like age or gender). The analysis also fails to completely capture a huge part of the game: the 'social' game, how people get along living together and surviving on a deserted island. Although, part of this may be captured by players attracting votes because of their camp life rather than because of strategy.


See What The Model Thinks For Each Episode

I have made a dashboard to visualise the models' findings and the player ratings. I have also used clustering to determine player types, but explaining that will have to wait for another post.


I have found that having the predictions at hand whilst watching an episode increases my enjoyment, in the same way that knowing the relative strengths of teams or players adds context to a sporting contest. There is added value in a blindside when even the model doesn't see it coming, and sometimes a player who seems certain to be sent home wins immunity, adding more significance to that win.


If you enjoyed this article or the dashboard, and have a question for the model or ideas for other Survivor analyses, please just send me a message.
