ratingslib.ratings.massey module
Massey Rating System
- class Massey(version=ratings.MASSEY, data_limit=0)
Bases:
RatingSystem
This method was proposed by Kenneth Massey in 1997 for ranking college football teams [1]. In addition to the numbers of wins and losses, the Massey method uses point score data to rate items. The ratings are obtained by solving a system of linear equations with linear least squares regression. What counts as point score data depends on the application; in soccer, for instance, the points are the goals scored by each team.
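The least squares formulation mentioned above can be written out as follows. This is the standard textbook derivation of the Massey method, given as a sketch; the symbols X, y, M, p and r are notation assumed here and are not names taken from the library.

```latex
% Game k, in which item i beat item j by y_k points, gives one equation:
r_i - r_j = y_k

% Stacking all m games over n items: X is m x n, and row k has
% +1 in column i, -1 in column j, and 0 elsewhere.
X r = y

% The least squares solution satisfies the normal equations:
X^{\top} X \, r = X^{\top} y
\quad\Longleftrightarrow\quad
M r = p

% M_{ii} = number of games played by item i
% M_{ij} = -(number of games between items i and j), for i \neq j
% p_i    = total point differential of item i

% Every row of M sums to zero, so M has rank at most n - 1.
% Replacing the last row of M with ones and the last entry of p with
% zero adds the constraint
\sum_{i=1}^{n} r_i = 0
```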
- Parameters
version (str, default=ratings.MASSEY) – A string that shows the version of the rating system. The available versions can be found in the ratingslib.utils.enums.ratings class.
data_limit (int, default=0) – The minimum number of observations in the dataset. The default, 0, indicates no limit.
- Madj
The adjusted Massey matrix. The last row of this matrix is replaced with a vector of all ones.
- Type
numpy.ndarray
- d_adj
The adjusted point differentials vector. The last item of this vector is replaced with zero.
- Type
numpy.ndarray
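The effect of the two adjustments can be seen on a small example. The following is a minimal sketch with plain NumPy, assuming a hypothetical three-item dataset (A beat B by 2, B beat C by 1, A beat C by 4); it is not the library's implementation, and the variable names are chosen for illustration only.

```python
import numpy as np

# Hypothetical data: A beat B by 2, B beat C by 1, A beat C by 4.
# Diagonal: games played by each item; off-diagonal: minus the number
# of games between the two items.
M = np.array([[ 2.0, -1.0, -1.0],
              [-1.0,  2.0, -1.0],
              [-1.0, -1.0,  2.0]])
# Total point differential of each item.
p = np.array([6.0, -1.0, -5.0])

# Every row of M sums to zero, so M is singular (rank 2 here, not 3).
rank_before = np.linalg.matrix_rank(M)

# Adjustment: last row of M becomes all ones, last entry of p becomes 0.
M_adj = M.copy()
M_adj[-1, :] = 1.0
d_adj = p.copy()
d_adj[-1] = 0.0

# The adjusted system has a unique solution whose ratings sum to zero.
ratings = np.linalg.solve(M_adj, d_adj)
```

Because the added row forces the ratings to sum to zero, an item with a rating near zero is rated as average for the dataset; tiny values such as 2.220446e-15 are floating-point round-off around zero.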
References
- 1
Massey, K. (1997). Statistical models applied to the rating of sports teams.
Examples
The following example demonstrates the Massey rating system on the first 20 soccer matches of the 2018-2019 English Premier League season.
>>> from ratingslib.datasets.filenames import dataset_path, FILENAME_EPL_2018_2019_20_GAMES
>>> from ratingslib.ratings.massey import Massey
>>> filename = dataset_path(FILENAME_EPL_2018_2019_20_GAMES)
>>> Massey().rate_from_file(filename)
              Item         rating  ranking
0          Arsenal   2.500000e+00       11
1      Bournemouth   4.781250e+00        3
2         Brighton  -4.781250e+00       14
3          Burnley  -6.031250e+00       17
4          Cardiff   3.031250e+00        9
5          Chelsea   3.250000e+00        8
6   Crystal Palace   5.031250e+00        2
7          Everton  -6.281250e+00       18
8           Fulham   2.781250e+00       10
9     Huddersfield   2.220446e-15       12
10       Leicester  -5.531250e+00       16
11       Liverpool   7.281250e+00        1
12        Man City   4.750000e+00        4
13      Man United  -5.156250e+00       15
14       Newcastle   3.281250e+00        7
15     Southampton  -6.656250e+00       19
16       Tottenham   4.531250e+00        5
17         Watford  -3.406250e+00       13
18        West Ham   3.531250e+00        6
19          Wolves  -6.906250e+00       20
- computation_phase()
To be overridden in subclasses.
- create_massey_matrix(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None) → Tuple[ndarray, ndarray]
Constructs the adjusted Massey matrix (M_adj) and the adjusted point differential vector (d_adj).
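As an illustration of what such a construction involves, here is a minimal sketch that builds both outputs from a table of games. The function name build_adjusted_massey and the column names home, away, home_goals and away_goals are hypothetical; the library reads its column names from columns_dict, and its internals may differ.

```python
from typing import List, Tuple

import numpy as np
import pandas as pd


def build_adjusted_massey(games: pd.DataFrame,
                          items: List[str]) -> Tuple[np.ndarray, np.ndarray]:
    """Illustrative only; column names are assumptions, not the library's."""
    index = {name: i for i, name in enumerate(items)}
    n = len(items)
    M = np.zeros((n, n))
    d = np.zeros(n)
    for row in games.itertuples(index=False):
        i, j = index[row.home], index[row.away]
        diff = row.home_goals - row.away_goals
        # Each game adds one to both diagonal entries ...
        M[i, i] += 1
        M[j, j] += 1
        # ... and subtracts one from the two off-diagonal entries.
        M[i, j] -= 1
        M[j, i] -= 1
        # Point differential is credited to one item, debited to the other.
        d[i] += diff
        d[j] -= diff
    # Adjustment: last row of ones, last differential set to zero.
    M[-1, :] = 1.0
    d[-1] = 0.0
    return M, d


# Hypothetical games: A 3-1 B, B 2-1 C, A 4-0 C.
games = pd.DataFrame({
    "home": ["A", "B", "A"],
    "away": ["B", "C", "C"],
    "home_goals": [3, 2, 4],
    "away_goals": [1, 1, 0],
})
M_adj, d_adj = build_adjusted_massey(games, ["A", "B", "C"])
```

The explicit loop keeps the correspondence between one game and its four matrix updates visible; for large datasets a vectorized construction would be faster.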
- preparation_phase(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None)
To be overridden in subclasses.
- rate(data_df: DataFrame, items_df: DataFrame, sort: bool = False, columns_dict: Optional[Dict[str, Any]] = None) → DataFrame
Computes ratings from pairwise data (e.g. games between soccer teams). To be overridden in subclasses.
- Parameters
data_df (pandas.DataFrame) – The pairwise data.
items_df (pandas.DataFrame) – Set of items (e.g. teams) to be rated
sort (bool, default=False) – If True, the output is sorted by rating value.
columns_dict (Optional[Dict[str, str]]) – The column names of the data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.
- Returns
items_df – The set of items with their rating and ranking.
- Return type
pandas.DataFrame
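To make the shape of the returned table and the effect of sort concrete, the sketch below derives a ranking column from a rating column with plain pandas. The column names Item, rating and ranking follow the example output shown earlier; the ratings are hypothetical, and the tie-handling choice (method="min") is an assumption rather than documented library behaviour.

```python
import pandas as pd

# Hypothetical ratings for three items, in their original (unsorted) order.
items = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "rating": [-1.0 / 3.0, 2.0, -5.0 / 3.0],
})

# Highest rating gets ranking 1. Tied items would share the lowest rank
# (an assumed convention for this sketch).
items["ranking"] = (
    items["rating"].rank(ascending=False, method="min").astype(int)
)

# Equivalent of asking for sorted output: order rows by rating, best first.
sorted_items = items.sort_values("rating", ascending=False).reset_index(drop=True)
```

Without sorting, the rows keep the order of the input items and only the ranking column reflects the ordering by rating, which matches the alphabetical layout of the example output above.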