ratingslib.ratings.colley module

Colley rating system

class Colley(version=ratings.COLLEY)

Bases: RatingSystem

This class implements the Colley rating system, which astrophysicist Dr. Wesley Colley proposed in 2001 for ranking sports teams. Colley’s method [1] makes use of an idea from probability theory known as Laplace’s “rule of succession”. It is a modified form of the win-loss method, which rates each team by its percentage of wins.

Parameters

version (str, default=ratings.COLLEY) – A string that identifies the version of the rating system. The available versions are listed in the ratingslib.utils.enums.ratings class.

C

The Colley matrix of shape (n,n) where n = the total number of items.

Type

numpy.ndarray

b

The right-hand side vector b of shape (n,) where n = the total number of items.

Type

numpy.ndarray

References

[1]

Colley, W. (2002). Colley’s bias free college football ranking method: The Colley Matrix Explained.

Examples

The following example demonstrates the Colley rating system on the first 20 soccer matches of the 2018-2019 English Premier League season.

>>> from ratingslib.datasets.filenames import dataset_path, FILENAME_EPL_2018_2019_20_GAMES
>>> from ratingslib.ratings.colley import Colley
>>> filename = dataset_path(FILENAME_EPL_2018_2019_20_GAMES)
>>> Colley().rate_from_file(filename)
              Item    rating  ranking
0          Arsenal  0.333333       16
1      Bournemouth  0.686012        3
2         Brighton  0.562500        6
3          Burnley  0.401786       10
4          Cardiff  0.394345       11
5          Chelsea  0.666667        5
6   Crystal Palace  0.501488        8
7          Everton  0.562500        6
8           Fulham  0.293155       17
9     Huddersfield  0.333333       16
10       Leicester  0.473214        9
11       Liverpool  0.712798        2
12        Man City  0.666667        5
13      Man United  0.508929        7
14       Newcastle  0.391369       12
15     Southampton  0.366071       14
16       Tottenham  0.671131        4
17         Watford  0.741071        1
18        West Ham  0.349702       15
19          Wolves  0.383929       13

computation_phase()

Solve the system Cr=b to obtain the Colley rating vector r.
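
The solve step can be illustrated with a minimal sketch. This is not the library's own code: it assumes a hypothetical three-team season (team 0 beat teams 1 and 2, team 1 beat team 2) and the standard Colley definitions, where the diagonal of C is 2 plus the number of games played, each off-diagonal entry is minus the number of games between the pair, and b_i = 1 + (wins_i - losses_i) / 2. The variable names are illustrative only.

```python
import numpy as np

# Hypothetical 3-team example (an assumption, not taken from the library):
# every team played 2 games, and each pair met exactly once.
C = np.array([
    [4.0, -1.0, -1.0],
    [-1.0, 4.0, -1.0],
    [-1.0, -1.0, 4.0],
])

# b_i = 1 + (wins_i - losses_i) / 2 for records 2-0, 1-1 and 0-2.
b = np.array([2.0, 1.0, 0.0])

# C is symmetric and strictly diagonally dominant, so the system
# has a unique solution and a direct solver is sufficient.
r = np.linalg.solve(C, b)

print(np.round(r, 3))  # [0.7 0.5 0.3]
```

The ratings average to 0.5, which is a known property of the Colley system.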

create_colley_matrix(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None) → Tuple[ndarray, ndarray]

Constructs the Colley coefficient matrix C and the right-hand side vector b.
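
As a rough sketch of what this construction involves, the hypothetical helper below builds C and b from a plain list of (winner, loser) index pairs. It is an assumption-laden illustration, not the signature or implementation of create_colley_matrix, which works on DataFrames. For simplicity it handles only decisive games and leaves ties out.

```python
import numpy as np


def build_colley_system(games, n_items):
    """Hypothetical helper: build the Colley matrix C and vector b.

    games   -- iterable of (winner_index, loser_index) pairs
    n_items -- total number of items (e.g. teams)
    """
    # Start from C = 2*I and b = 1, the state before any game is played.
    C = 2.0 * np.eye(n_items)
    b = np.ones(n_items)
    for winner, loser in games:
        # Each game adds one to both teams' diagonal entries ...
        C[winner, winner] += 1.0
        C[loser, loser] += 1.0
        # ... and subtracts one from the symmetric off-diagonal pair.
        C[winner, loser] -= 1.0
        C[loser, winner] -= 1.0
        # b_i = 1 + (wins_i - losses_i) / 2
        b[winner] += 0.5
        b[loser] -= 0.5
    return C, b


# Team 0 beat teams 1 and 2; team 1 beat team 2.
C, b = build_colley_system([(0, 1), (0, 2), (1, 2)], n_items=3)
```

With no games at all, the sketch returns C = 2I and b = 1, so every item starts at a rating of 0.5.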

preparation_phase(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None)

To be overridden in subclasses.

rate(data_df: DataFrame, items_df: DataFrame, sort: bool = False, columns_dict: Optional[Dict[str, Any]] = None) → DataFrame

This method computes ratings from pairwise data (e.g. games between soccer teams). To be overridden in subclasses.

Parameters
  • data_df (pandas.DataFrame) – The pairwise data.

  • items_df (pandas.DataFrame) – Set of items (e.g. teams) to be rated

  • sort (bool, default=False) – If True, the output is sorted by rating value.

  • columns_dict (Optional[Dict[str, Any]]) – The column names of the data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.

Returns

items_df – The set of items with their rating and ranking.

Return type

pandas.DataFrame