Serasa Experian’s DataLab uses machine learning to predict that Brazil has a 20.9% chance of winning the FIFA World Cup this year
Hyderabad, 25 November 2022: Brazil has a 53.4% chance of reaching the semifinals and a 20.9% chance of winning this year’s FIFA World Cup, according to DataLab, Serasa Experian’s innovation laboratory. DataLab’s data scientists used machine learning techniques to predict which teams will advance through the tournament and which will win it.
The graph below shows the probabilities for the countries most likely to win the 2022 FIFA World Cup, based on the countries that won the last ten World Cups:
Brazil is most likely to win the World Cup at 20.9%
Argentina’s probability is 14.3%
France’s probability is 11.4%
Spain’s probability is 9%
Germany’s probability is 3.4%
A machine learning model was created based on data from the last three football World Cup cycles. Using algorithms similar to those that streaming platforms use to recommend shows to users, DataLab set up a machine learning system that predicts the probability of each possible result of a match between two countries.
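The release does not say how per-match probabilities were combined into an overall chance of winning the tournament. One common approach, shown here purely as an assumption, is Monte Carlo simulation: play the bracket many times using the match probabilities and count how often each team wins. The team names, the probabilities in P_WIN, and the function names below are hypothetical and are not DataLab's figures or code.

```python
import random

# Hypothetical per-match probabilities: P_WIN[(a, b)] is the chance a beats b.
# These values are invented for illustration only.
P_WIN = {
    ("A", "B"): 0.60,
    ("A", "C"): 0.55,
    ("A", "D"): 0.70,
    ("B", "C"): 0.45,
    ("B", "D"): 0.65,
    ("C", "D"): 0.60,
}


def win_probability(a, b):
    """Return the probability that team a beats team b."""
    if (a, b) in P_WIN:
        return P_WIN[(a, b)]
    return 1.0 - P_WIN[(b, a)]


def play(a, b, rng):
    """Simulate one knockout match and return the winner."""
    return a if rng.random() < win_probability(a, b) else b


def simulate_title_odds(bracket, n_runs=20000, seed=0):
    """Estimate each team's title probability for a single-elimination bracket.

    Adjacent teams in `bracket` meet in the first round; the number of
    teams is assumed to be a power of two.
    """
    rng = random.Random(seed)
    titles = {team: 0 for team in bracket}
    for _ in range(n_runs):
        remaining = list(bracket)
        while len(remaining) > 1:
            remaining = [
                play(remaining[i], remaining[i + 1], rng)
                for i in range(0, len(remaining), 2)
            ]
        titles[remaining[0]] += 1
    return {team: count / n_runs for team, count in titles.items()}


odds = simulate_title_odds(["A", "B", "C", "D"])
```

For this toy bracket the exact title probability of team A is 0.60 × (0.60 × 0.55 + 0.40 × 0.70) = 0.366, and the simulated estimate should land close to that value.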
Teams most likely to qualify for the knockout stage
Considering the groups of these five major teams, the model also predicts each team’s chances of advancing from the group stage to the knockout stage:
Brazil has the highest probability at 97.48% (Group G)
Argentina has a 96.1% probability (Group C)
France has a 93.4% probability (Group D)
Spain has an 89.6% probability, while Germany has a 69.6% probability (both are in Group E)
The data set contained all the results of the last three World Cup cycles, including the countries involved in each match, the date, the type of competition (friendlies, qualifiers, regional tournaments, World Cups, etc.), and the result.
The information was processed with recommendation-system and deep learning techniques (matrix factorization and autoencoders), using a similarity metric that indicates how similar two teams are based on their history of match results. This data representation was then fed into a boosting model, which was trained to predict the probability of the outcome of each match considering the current scenario, that is, the period before the championship.
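To make the matrix factorization and similarity idea concrete, here is a minimal sketch using only the Python standard library. It treats match history like a ratings matrix (rows are teams, columns are opponents, entries are outcomes), learns a short vector per team with stochastic gradient descent, and compares teams by cosine similarity. The toy results, the outcome encoding (1.0 win, 0.5 draw, 0.0 loss), the hyperparameters, and all function names are assumptions made for illustration. The autoencoder and boosting stages mentioned above are not shown, and this is not DataLab's actual implementation.

```python
import math
import random

TEAMS = ["A", "B", "C", "D"]

# Invented results: (team, opponent, outcome for team).
# Outcome encoding is an assumption: 1.0 win, 0.5 draw, 0.0 loss.
RESULTS = [
    ("A", "B", 0.5), ("B", "A", 0.5),
    ("A", "C", 1.0), ("C", "A", 0.0),
    ("A", "D", 1.0), ("D", "A", 0.0),
    ("B", "C", 1.0), ("C", "B", 0.0),
    ("B", "D", 1.0), ("D", "B", 0.0),
    ("C", "D", 0.5), ("D", "C", 0.5),
]


def predict(row_vec, col_vec):
    """Predicted outcome is the dot product of the two latent vectors."""
    return sum(p * q for p, q in zip(row_vec, col_vec))


def factorize(results, teams, k=2, epochs=500, lr=0.05, reg=0.01, seed=0):
    """Learn k-dimensional team and opponent vectors by SGD on squared error."""
    rng = random.Random(seed)
    row = {t: [rng.uniform(0.0, 0.5) for _ in range(k)] for t in teams}
    col = {t: [rng.uniform(0.0, 0.5) for _ in range(k)] for t in teams}
    for _ in range(epochs):
        for team, opp, outcome in results:
            err = outcome - predict(row[team], col[opp])
            for f in range(k):
                p, q = row[team][f], col[opp][f]
                # Gradient step with L2 regularization on both factors.
                row[team][f] += lr * (err * q - reg * p)
                col[opp][f] += lr * (err * p - reg * q)
    return row, col


def mse(results, row, col):
    """Mean squared error of the factorization on the observed results."""
    errors = [(o - predict(row[t], col[opp])) ** 2 for t, opp, o in results]
    return sum(errors) / len(errors)


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


row, col = factorize(RESULTS, TEAMS)
similarity_a_b = cosine(row["A"], row["B"])
similarity_a_c = cosine(row["A"], row["C"])
```

In a pipeline like the one described, learned vectors or similarity scores of this kind could serve as input features for a downstream classifier such as a boosting model; how DataLab wired these stages together is not specified in the release.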