ratingslib.ratings.massey module

Massey Rating System

class Massey(version=ratings.MASSEY, data_limit=0)

Bases: RatingSystem

This method was proposed by Kenneth Massey in 1997 for ranking college football teams [1]. In addition to the numbers of wins and losses, the Massey method uses point score data to rate items, solving a system of linear equations by linear least squares regression. What counts as point score data depends on the application; in soccer, for instance, the points are the goals scored by each team.

Parameters
  • version (str, default=ratings.MASSEY) – A string identifying the version of the rating system. The available versions are listed in the ratingslib.utils.enums.ratings class.

  • data_limit (int, default=0) – The minimum number of observations in the dataset. The default, 0, indicates no limit.

Madj

The adjusted Massey matrix. The last row of this matrix is replaced with a vector of all ones.

Type

numpy.ndarray

d_adj

The adjusted point differentials vector. The last item of this vector is replaced with zero.

Type

numpy.ndarray
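
To make the two adjustments above concrete, the following is a minimal sketch on a hypothetical three-item schedule. It is not the library's implementation: the games list, the variable names and the use of numpy.linalg.solve (instead of a least squares routine) are assumptions made for illustration. The sketch builds the Massey matrix and the point differentials vector, replaces the last row with ones and the last differential with zero, and solves the resulting square system.

```python
import numpy as np

# Hypothetical toy data: (item i, item j, points of i, points of j).
games = [(0, 1, 3, 1), (1, 2, 2, 0), (0, 2, 4, 1)]
n_items = 3

M = np.zeros((n_items, n_items))  # Massey matrix
d = np.zeros(n_items)             # cumulative point differentials

for i, j, points_i, points_j in games:
    # Diagonal entries count the games played by each item.
    M[i, i] += 1
    M[j, j] += 1
    # Off-diagonal entries are minus the number of games between i and j.
    M[i, j] -= 1
    M[j, i] -= 1
    # Each item accumulates its own point differential.
    d[i] += points_i - points_j
    d[j] += points_j - points_i

# Adjustment: last row becomes all ones, last differential becomes zero.
# This forces the ratings to sum to zero and makes the system solvable
# when every item is connected to the others through the schedule.
M_adj = M.copy()
d_adj = d.copy()
M_adj[-1, :] = 1.0
d_adj[-1] = 0.0

# Assumption: a direct solve is used here because M_adj is square and
# nonsingular for this toy schedule.
ratings = np.linalg.solve(M_adj, d_adj)
print(ratings)
```

Because the last equation constrains the ratings to sum to zero, the ratings in the library example further down also sum to zero, and a value such as 2.220446e-15 is simply zero up to floating-point error.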

References

[1]

Massey, K. (1997). Statistical models applied to the rating of sports teams.

Examples

The following example demonstrates the Massey rating system on the first 20 soccer matches of the 2018-2019 English Premier League season.

>>> from ratingslib.datasets.filenames import dataset_path, FILENAME_EPL_2018_2019_20_GAMES
>>> from ratingslib.ratings.massey import Massey
>>> filename = dataset_path(FILENAME_EPL_2018_2019_20_GAMES)
>>> Massey().rate_from_file(filename)
              Item        rating  ranking
0          Arsenal  2.500000e+00       11
1      Bournemouth  4.781250e+00        3
2         Brighton -4.781250e+00       14
3          Burnley -6.031250e+00       17
4          Cardiff  3.031250e+00        9
5          Chelsea  3.250000e+00        8
6   Crystal Palace  5.031250e+00        2
7          Everton -6.281250e+00       18
8           Fulham  2.781250e+00       10
9     Huddersfield  2.220446e-15       12
10       Leicester -5.531250e+00       16
11       Liverpool  7.281250e+00        1
12        Man City  4.750000e+00        4
13      Man United -5.156250e+00       15
14       Newcastle  3.281250e+00        7
15     Southampton -6.656250e+00       19
16       Tottenham  4.531250e+00        5
17         Watford -3.406250e+00       13
18        West Ham  3.531250e+00        6
19          Wolves -6.906250e+00       20
computation_phase()

To be overridden in subclasses.

create_massey_matrix(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None) → Tuple[ndarray, ndarray]

Constructs the adjusted Massey matrix (M_adj) and the adjusted point differential vector (d_adj).

preparation_phase(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None)

To be overridden in subclasses.

rate(data_df: DataFrame, items_df: DataFrame, sort: bool = False, columns_dict: Optional[Dict[str, Any]] = None) → DataFrame

This method computes ratings for pairwise data (e.g. games between soccer teams). To be overridden in subclasses.

Parameters
  • data_df (pandas.DataFrame) – The pairwise data.

  • items_df (pandas.DataFrame) – Set of items (e.g. teams) to be rated

  • sort (bool, default=False) – If true, the output is sorted by rating value

  • columns_dict (Optional[Dict[str, Any]]) – The column names of the data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.

Returns

items_df – The set of items with their rating and ranking.

Return type

pandas.DataFrame
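
As an illustration of how the rating and ranking columns relate, and of what the sort flag changes, here is a minimal standard-library sketch. It does not call the library: the four ratings are copied from the example output above, and the descending order, the variable names and the tie-free ranking are assumptions. The ranks shown are relative to this four-item subset, not to the full table.

```python
# Ratings copied from the example output above (subset of four items).
ratings = {
    "Arsenal": 2.5,
    "Crystal Palace": 5.03125,
    "Liverpool": 7.28125,
    "Wolves": -6.90625,
}

# Assumption: a higher rating means a better (smaller) ranking number.
ordered = sorted(ratings, key=ratings.get, reverse=True)
ranking = {item: position for position, item in enumerate(ordered, start=1)}

# With sort=False the rows keep the original item order.
rows = [(item, ratings[item], ranking[item]) for item in ratings]

# With sort=True the rows are ordered by rating value instead.
rows_sorted = sorted(rows, key=lambda row: row[1], reverse=True)

for row in rows_sorted:
    print(row)
```

Either way, the ranking column carries the same information; sorting only changes the order in which the rows appear.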