How I created an Expected points table for Indian Super League using Python

Mar 27, 2022

Source: https://www.indiansuperleague.com/press-releases/hero-indian-super-league-trophy-unveiled

In my previous article, I covered how I scraped stats such as xG and open-play xG from fotmob.com. I’ll be using that data to build an expected points table.

Expected points (xP) indicate how many points a team could have been expected to earn from a match, based on the chances created in it. The value ranges from 0 to 3 points per match. To earn the full 3 xP, a team must not allow the opposition even a single shot, so that the opposition's xG is 0. Such scenarios are very rare, so dominant teams usually end up with an xP of around 2.1 to 2.7.

How to calculate expected points

Expected points are calculated from expected goals (xG), a metric that quantifies the quality of a shot. For instance, a penalty has an xG of approximately 0.75, which means it has roughly a 75% chance of being scored.

To calculate expected points, you need every shot taken in the match and its xG value. The match is then simulated many times, say 10,000, and the result of each simulation is recorded based on the goals scored. Goals are estimated from the xG values: each shot is compared against a random value between 0 and 1.

Suppose a shot has an xG of 0.24. A random number is generated, and it will fall below 0.24 about 24% of the time and above 0.24 about 76% of the time. If the random number is less than the xG value, the shot is counted as a goal; otherwise it is not. Over many simulations, the 0.24 xG shot is therefore scored about 24% of the time, which accurately represents this shot's xG.
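The simulation described above can be sketched roughly as follows. This is only an illustration and not the code used for the table in this article: the function name `simulate_expected_points`, the fixed seed and the shot lists are all hypothetical, since I did not have per-shot xG values for the ISL.

```python
import random

def simulate_expected_points(home_shots, away_shots, n_sims=10_000, seed=0):
    """Estimate (home_xP, away_xP) from per-shot xG lists by simulation."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    home_points = 0
    away_points = 0
    for _ in range(n_sims):
        # A shot counts as a goal when the random draw falls below its xG.
        home_goals = sum(rng.random() < xg for xg in home_shots)
        away_goals = sum(rng.random() < xg for xg in away_shots)
        if home_goals > away_goals:
            home_points += 3
        elif home_goals < away_goals:
            away_points += 3
        else:  # draw: one point each
            home_points += 1
            away_points += 1
    # Average points per simulated match
    return home_points / n_sims, away_points / n_sims

# Hypothetical shot xG values, for illustration only
home_xp, away_xp = simulate_expected_points([0.75, 0.24, 0.10, 0.05], [0.08, 0.03])
print(round(home_xp, 2), round(away_xp, 2))
```

Because a drawn simulation hands out only 2 points in total, the two xP values from this approach add up to somewhere between 2 and 3 rather than exactly 3.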

Although this way of calculating expected points is the norm, we could also use the aggregated xG values to find the expected points. Since I did not have the xG value of each shot taken, I used the table below for my algorithm.

Source: https://theshortfuse.sbnation.com/2017/11/15/16655916/how-to-calculate-xpoints-analysis-stats-xg

Calculating expected points with this method is quite simple and the results are fairly accurate. Let’s dive into the code.

The scraped data holds the stats in per-match format. These are the columns of the data frame with the scraped info:

 1   match_id             110 non-null    int64  
 2   home_team            110 non-null    object 
 3   away_team            110 non-null    object 
 4   home_team_score      110 non-null    float64
 5   away_team_score      110 non-null    float64
 6   home_xG              110 non-null    float64
 7   away_xG              110 non-null    float64
 8   home_shots           110 non-null    float64
 9   away_shots           110 non-null    float64
 10  home_xG_first_half   110 non-null    float64
 11  away_xG_first_half   110 non-null    float64
 12  home_xG_second_half  110 non-null    float64
 13  away_xG_second_half  110 non-null    float64
 14  home_xG_Open_Play    110 non-null    float64
 15  away_xG_Open_Play    110 non-null    float64
 16  home_xG_Set_Play     110 non-null    float64
 17  away_xG_Set_Play     110 non-null    float64
 18  home_xGOT            110 non-null    float64
 19  away_xGOT            110 non-null    float64

1. Create an xG difference column

df['xG_differential'] = df['home_xG'] - df['away_xG']

2. Allotting expected points based on xG difference

for idx, row in df.iterrows():
    diff = row['xG_differential']
    # Bands are checked from the largest differential down, so each branch
    # only needs its lower bound. A value exactly on a boundary falls in the
    # band below it, except 0, which is treated as an even match.
    if diff > 1.5:
        home_xP, away_xP = 2.7, 0.3
    elif diff > 1.0:
        home_xP, away_xP = 2.3, 0.7
    elif diff > 0.5:
        home_xP, away_xP = 2.0, 1.0
    elif diff >= 0:
        home_xP, away_xP = 1.5, 1.5
    elif diff > -0.5:
        home_xP, away_xP = 0.7, 2.3
    elif diff > -1.0:
        home_xP, away_xP = 0.5, 2.5
    elif diff > -1.5:
        home_xP, away_xP = 0.3, 2.7
    else:  # differential of -1.5 or lower
        home_xP, away_xP = 0.1, 2.9
    df.loc[idx, 'home_xP'], df.loc[idx, 'away_xP'] = home_xP, away_xP
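The same allotment can also be written without a row-by-row loop. Below is a minimal sketch using `numpy.select`, which checks the conditions in order much like an if/elif chain. The helper name `assign_xp` is hypothetical, and the boundary handling is my assumption: a differential of exactly 0 gets 1.5 points each, every other boundary value falls in the band below it, and anything at or below -1.5 gets 0.1 and 2.9.

```python
import numpy as np
import pandas as pd

def assign_xp(df):
    """Return a copy of df with home_xP and away_xP columns added."""
    diff = df["xG_differential"]
    # Conditions are evaluated in order; the first one that is true wins.
    conditions = [
        diff > 1.5,
        diff > 1.0,
        diff > 0.5,
        diff >= 0,
        diff > -0.5,
        diff > -1.0,
        diff > -1.5,
    ]
    home_values = [2.7, 2.3, 2.0, 1.5, 0.7, 0.5, 0.3]
    out = df.copy()
    # default covers differentials of -1.5 or lower
    out["home_xP"] = np.select(conditions, home_values, default=0.1)
    # Every pair in the lookup sums to 3, so away xP is the remainder.
    out["away_xP"] = (3.0 - out["home_xP"]).round(1)
    return out

# Hypothetical differentials, one per band
demo = pd.DataFrame({"xG_differential": [2.0, 1.2, 0.7, 0.0, -0.2, -0.7, -1.2, -2.0]})
print(assign_xp(demo))
```

On 110 matches the loop is fast enough, so this is mostly a matter of taste; the vectorised form just keeps the thresholds and their values side by side.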

3. Splitting the dataframe — Home and Away matches:

df_home = df.groupby('home_team')
df_away = df.groupby('away_team')

4. Aggregating the expected points:

# fill_value=0 keeps a team even if it appears only as home or only as away
team_xPoints = df_home['home_xP'].sum().add(df_away['away_xP'].sum(), fill_value=0)
team_xPoints = team_xPoints.rename_axis('team').reset_index(name='xPoints')  # converts to a df
team_xPoints = team_xPoints.sort_values('team')
team_xPoints
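To set the expected points against the real table, the actual points can be derived from the score columns in the same data frame. The sketch below assumes the usual 3 points for a win and 1 for a draw; the function name `actual_points` and the three toy matches are hypothetical and only there to make the example runnable.

```python
import numpy as np
import pandas as pd

def actual_points(df):
    """Sum league points (3 win, 1 draw, 0 loss) per team from match scores."""
    home_pts = np.select(
        [df["home_team_score"] > df["away_team_score"],
         df["home_team_score"] == df["away_team_score"]],
        [3, 1], default=0)
    away_pts = np.select(
        [df["away_team_score"] > df["home_team_score"],
         df["away_team_score"] == df["home_team_score"]],
        [3, 1], default=0)
    # Group each side's points by the team that earned them
    home = pd.Series(home_pts, index=df.index).groupby(df["home_team"]).sum()
    away = pd.Series(away_pts, index=df.index).groupby(df["away_team"]).sum()
    total = home.add(away, fill_value=0)
    return total.rename_axis("team").reset_index(name="points")

# Toy matches, for illustration only
matches = pd.DataFrame({
    "home_team": ["A", "B", "C"],
    "away_team": ["B", "C", "A"],
    "home_team_score": [2.0, 1.0, 0.0],
    "away_team_score": [0.0, 1.0, 3.0],
})
points_table = actual_points(matches)
print(points_table)
```

Merging the result with the xPoints frame on the `team` column and subtracting one column from the other then shows which teams earned more or fewer points than their chances suggested.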

Conclusion

And that’s how I calculated the expected points for Indian Super League teams. Jamshedpur FC were deserving winners of the ISL shield since they had the highest expected points and the highest actual points.

Using this table you can judge how a team actually performed. For instance, FC Goa finished 9th in the actual table despite creating more chances than the opposition on various occasions. Sometimes you do need a little bit of luck to get through.

Thank you for reading, do check out my other articles on football analytics. Any kind of suggestions or comments would be appreciated.


Written by Joyan Bhathena
