How to win at Cartola?

Let’s use some data to check a few points and try to build a competitive team in Cartola.

Paulo Bernardo
Dec 17, 2020

Introduction

Today I will draw some insights from the Cartola dataset (https://www.kaggle.com/lgmoneda/cartola-fc-brasil-scouts?select=jogadores.csv).
Cartola is a fantasy game built on data from the Brasileirão (the Brazilian soccer championship): users buy players and score points with them.

I saw my friends playing Cartola and making decisions based on hunches, without statistics or data. So I thought: “Can I make decisions in Cartola based on data?”

Another inspiration is the movie Moneyball, where the manager signs players based on statistics and builds a competitive team on a small budget. Can we do something similar? Look, I have price, profit, and score… I will try it.

Before I start, a note on the abbreviations used in the graphs. For example, ZAG means defender. The full list:
ATA = Attacker

ZAG = Defender

TEC = Manager

LAT = Full-back (lateral)

MEI = Midfielder

GOL = Goalkeeper

Who are the best players per position?

Well, according to the data, who would be the best players by position?

Result:
{'ATA': {'mean': 10.73, 'name': 'Rodrygo (ATA)'},
 'GOL': {'mean': 10.0, 'name': 'Alexander (GOL)'},
 'LAT': {'mean': 8.4, 'name': 'Abner Felipe (LAT)'},
 'MEI': {'mean': 10.84, 'name': 'Arrascaeta (MEI)'},
 'TEC': {'mean': 6.4, 'name': 'Andrey Lopes (TEC)'},
 'ZAG': {'mean': 10.0, 'name': 'Rodrigues (ZAG)'}}
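A result of this shape can be produced by averaging each player's scores and keeping the top average in each position. The sketch below is only an illustration under assumptions: the record layout (name, position, score) and the sample values are hypothetical and are not taken from the Kaggle dataset.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical per-round records: (player name, position, score in that round).
rounds = [
    ("Player A", "ATA", 12.0),
    ("Player A", "ATA", 8.0),
    ("Player B", "ATA", 5.0),
    ("Player C", "GOL", 7.0),
    ("Player C", "GOL", 3.0),
    ("Player D", "GOL", 6.0),
]


def best_per_position(records):
    """Return {position: {"mean": ..., "name": ...}} for the top average scorer."""
    scores = defaultdict(list)
    for name, position, score in records:
        scores[(name, position)].append(score)

    best = {}
    for (name, position), values in scores.items():
        avg = round(mean(values), 2)
        # Keep the player only if no one in this position has a higher average.
        if position not in best or avg > best[position]["mean"]:
            best[position] = {"mean": avg, "name": name}
    return best


print(best_per_position(rounds))
```

A player with a single very good round can top this ranking, so the averages are worth reading together with the number of rounds played.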

Let’s look at this in a graph

Apparently, the best player in each position scores more than 50% higher than the others in that position. The MEI (midfielder) position has the largest discrepancy in values.

What is the team with the highest average?

In this step we will generate the team with the highest average per position using the 4–4–2 formation. So, for each position we will take the players with the highest averages and add them to this team.

This would be the team with the highest average score: a user who fielded it would score around 82.10 points per round. However, fielding this team would also cost about 172.13 cartoletas on average.
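As a rough sketch of this selection step: sort the players of each position by average score and fill the slots from the top. The slot counts are an assumption inferred from the 12-name team listed in the 100 cartoletas section (1 GOL, 2 ZAG, 2 LAT, 4 MEI, 2 ATA, 1 TEC), and the player records are synthetic.

```python
# Assumed slots per position for a 4-4-2 line-up, including the manager (TEC).
FORMATION_442 = {"GOL": 1, "ZAG": 2, "LAT": 2, "MEI": 4, "ATA": 2, "TEC": 1}

# Synthetic players: five per position, with average score i and price 2 * i.
players = [
    {"name": f"{pos}-{i}", "position": pos, "mean": float(i), "price": 2.0 * i}
    for pos in FORMATION_442
    for i in range(1, 6)
]


def highest_average_team(players, formation=FORMATION_442):
    """Fill every slot with the highest-average players of that position."""
    team = []
    for position, slots in formation.items():
        candidates = [p for p in players if p["position"] == position]
        candidates.sort(key=lambda p: p["mean"], reverse=True)
        team.extend(candidates[:slots])
    return team


team = highest_average_team(players)
total_mean = sum(p["mean"] for p in team)
total_price = sum(p["price"] for p in team)
print(len(team), total_mean, total_price)
```

Price plays no role in this selection, which is why the resulting team can end up very expensive.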

What is the highest expected value team?

First, let’s clarify what expected value (EV) is.

The EV calculation weights each outcome by its probability. Suppose a player’s average profit is 20 with a 5% chance of making a profit, and his average loss is 8 with a 95% chance of making a loss. The calculation would then look like this:

EV = 20 * 0.05 - 8 * 0.95 = -6.6

In the long run, this player gives you a loss.
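The same calculation as a small function, reproducing the numbers from the example above. The function and parameter names are hypothetical, chosen only for this illustration.

```python
def expected_value(avg_profit, p_profit, avg_loss, p_loss):
    """Probability-weighted profit minus probability-weighted loss."""
    return avg_profit * p_profit - avg_loss * p_loss


# Values from the example: profit of 20 with a 5% chance, loss of 8 with a 95% chance.
ev = expected_value(avg_profit=20, p_profit=0.05, avg_loss=8, p_loss=0.95)
print(round(ev, 2))  # -6.6
```

A negative EV means the player is expected to lose value over many rounds, and a positive EV means the opposite.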

The best EV team is this:

This team would score 31.11 points per round, much less than the team with the highest average. However, the user would invest only 47.76 cartoletas, so it is much cheaper than the team with the highest average and scores more for the price paid.
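The "more for the price paid" claim can be checked by dividing points per round by cost, using the figures quoted in this article. The helper name is hypothetical.

```python
def points_per_cartoleta(points, cost):
    """Points scored per round for each cartoleta invested."""
    return points / cost


highest_average = points_per_cartoleta(82.10, 172.13)
best_ev = points_per_cartoleta(31.11, 47.76)

print(round(highest_average, 2))  # 0.48
print(round(best_ev, 2))          # 0.65
```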

A quick comparison on a graph between the EV team and the team with the highest average:

What is the best team with 100 cartoletas?

I think this is where we get the best insights. We know that players have a price, so let’s assume that we always have 100 cartoletas (the in-game currency) per round.

So, we would have the following team:

‘Yony González (ATA)’, ‘Wellington Paulista (ATA)’, ‘Vanderlei Luxemburgo (TEC)’, ‘Éverson (GOL)’, ‘Fabinho (MEI)’, ‘Thiago Galhardo (MEI)’, ‘Patrick (MEI)’, ‘Felipe (MEI)’, ‘Quintero (ZAG)’, ‘Bruno Alves (ZAG)’, ‘Danilo Avelar (LAT)’, ‘Gilberto (LAT)’

This team would be able to score 50 points per round with an investment of 100 cartoletas.
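One possible way to pick a team under a fixed budget is a greedy heuristic: start from the cheapest valid line-up, then repeatedly make the swap that adds the most average points per extra cartoleta while staying within the budget. This is a sketch under assumptions, not necessarily the method used for the team above, and it does not guarantee the best possible team. The formation slots and the synthetic players are hypothetical.

```python
FORMATION_442 = {"GOL": 1, "ZAG": 2, "LAT": 2, "MEI": 4, "ATA": 2, "TEC": 1}

# Synthetic players: five per position, with average score i and price 2 * i.
players = [
    {"name": f"{pos}-{i}", "position": pos, "mean": float(i), "price": 2.0 * i}
    for pos in FORMATION_442
    for i in range(1, 6)
]


def build_team_with_budget(players, budget, formation=FORMATION_442):
    """Greedy heuristic: cheapest team first, then the best-value upgrades."""
    by_pos = {
        pos: [p for p in players if p["position"] == pos] for pos in formation
    }

    # Start from the cheapest players for every slot.
    team = []
    for pos, slots in formation.items():
        cheapest = sorted(by_pos[pos], key=lambda p: p["price"])[:slots]
        if len(cheapest) < slots:
            raise ValueError(f"not enough players for position {pos}")
        team.extend(cheapest)

    spent = sum(p["price"] for p in team)
    if spent > budget:
        raise ValueError("budget too small even for the cheapest team")

    while True:
        names = {p["name"] for p in team}
        best_swap, best_ratio = None, 0.0
        for i, current in enumerate(team):
            for cand in by_pos[current["position"]]:
                if cand["name"] in names:
                    continue
                gain = cand["mean"] - current["mean"]
                extra = cand["price"] - current["price"]
                # Only consider swaps that improve the average and fit the budget.
                if gain <= 0 or spent + extra > budget:
                    continue
                ratio = gain / extra if extra > 0 else float("inf")
                if ratio > best_ratio:
                    best_swap, best_ratio = (i, cand), ratio
        if best_swap is None:
            return team
        i, cand = best_swap
        spent += cand["price"] - team[i]["price"]
        team[i] = cand


team = build_team_with_budget(players, budget=60.0)
print(sum(p["price"] for p in team), sum(p["mean"] for p in team))
```

Every swap strictly raises the team average, so the loop always terminates. An exact answer would need a knapsack-style search, which this sketch does not attempt.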

What is the best position to invest in?

First, let’s look at how the players’ scores behave on a graph:

Apparently, if we are going to invest a lot of cartoletas in a single player, the best positions would be full-back, defender, goalkeeper, and midfielder. Expensive attackers do not give as much return. We will check this in a table, for which I considered players priced between 13 and 15 cartoletas.

Looking at this data, the best investments would be full-backs, midfielders and goalkeepers.
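A table like this can be built by keeping only the players in the 13 to 15 price band and averaging their scores per position. The sketch below assumes a hypothetical record layout and uses made-up sample players, so its numbers are not the ones behind the conclusion above.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical players with an average score and a price in cartoletas.
players = [
    {"name": "A", "position": "ATA", "price": 14.0, "mean": 4.0},
    {"name": "B", "position": "ATA", "price": 13.5, "mean": 6.0},
    {"name": "C", "position": "LAT", "price": 15.0, "mean": 7.0},
    {"name": "D", "position": "LAT", "price": 20.0, "mean": 9.0},  # outside the band
    {"name": "E", "position": "MEI", "price": 12.9, "mean": 8.0},  # outside the band
]


def mean_score_by_position(players, min_price=13.0, max_price=15.0):
    """Average score per position for players inside the price band."""
    grouped = defaultdict(list)
    for p in players:
        if min_price <= p["price"] <= max_price:
            grouped[p["position"]].append(p["mean"])
    return {pos: round(mean(vals), 2) for pos, vals in grouped.items()}


table = mean_score_by_position(players)
print(table)
```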

Conclusion

In this case the EV team, although more profitable in the long run, does not pay off, because we want to score more points even if that means taking more risk and losing cartoletas. We also saw that putting together a team with a very high average is out of reach because it requires a very large amount of cartoletas. We also found that fielding full-backs, defenders and midfielders is better than investing in expensive attackers.

A very important note: these algorithms do not take into account the team captain, whose score is always doubled.

For technical details see: https://paulo-bernardo.medium.com/lets-analyze-the-data-of-brazilian-soccer-f5030c251010

Paulo Bernardo
Passionate about technology, science and data. Developer at Matera