ratingslib.ratings.markov module

Markov (GeM) Rating System

class Markov(*, version=ratings.MARKOV, b: float = 1, stats_markov_dict: Optional[Union[Dict[str, dict], Set[str]]] = None)

Bases: RatingSystem

This class implements the Markov (GeM, Generalized Markov Method) rating system. GeM was first used by graduate students Angela Govan [1] and Luke Ingram [2] to successfully rank NFL football and NCAA basketball teams, respectively. The Markov (GeM) method is related to the PageRank method [3] and uses elements of finite Markov chain theory and graph theory to generate ratings of n objects in a finite set. It is not limited to sports: any problem that can be represented as a weighted directed graph can be solved with the GeM model.

  • version (str, default=ratings.MARKOV) – A string that shows the version of rating system. The available versions can be found in ratingslib.utils.enums.ratings class.

  • b (float, default=1) – The damping factor. Valid values are in the range [0, 1].

  • stats_markov_dict (Optional[Dict[str, Dict[Any, Any]]], default=None) –

    A dictionary containing the details of each statistic used by the method. For instance, for rating soccer teams, the following dictionary stats_markov_dict:

    stats_markov_dict = {
        'TotalWins': {'VOTE': 10, 'ITEM_I': 'FTHG', 'ITEM_J': 'FTAG',
                      'METHOD': 'VotingWithLosses'},
        'TotalGoals': {'VOTE': 10, 'ITEM_I': 'FTHG', 'ITEM_J': 'FTAG',
                       'METHOD': 'WinnersAndLosersVotePoint'}
    }

    specifies the following details:

    • TotalWins and TotalGoals are the names of the two statistics.

    • 'VOTE': 10 means that the statistic receives a vote of 10. The votes are converted to weights; since both statistics have the same vote in this example, they are equally weighted.

    • 'ITEM_I': 'FTHG' and 'ITEM_J': 'FTAG' are the column names for the home and away team, respectively.

    • The key 'METHOD' specifies which method constructs the voting matrix. The available methods are:

      1. 'VotingWithLosses' when the losing team casts a number of votes equal to the margin of victory in its matchup with a stronger opponent.

      2. 'WinnersAndLosersVotePoint' when both the winning and losing teams vote with the number of points given up.

      3. 'LosersVotePointDiff' when the losing team casts a number of votes

    See also the implementation of the method
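As a rough illustration of how the 'VOTE' values might translate into weights, the sketch below normalizes them so that they sum to 1 and then forms a weighted sum of per-statistic stochastic matrices. The helper name votes_to_weights and the two 2×2 matrices are hypothetical and are not part of ratingslib; the library's actual conversion may differ.

```python
import numpy as np


def votes_to_weights(stats_markov_dict):
    """Normalize the 'VOTE' values so the weights sum to 1 (assumed scheme)."""
    total = sum(details['VOTE'] for details in stats_markov_dict.values())
    return {name: details['VOTE'] / total
            for name, details in stats_markov_dict.items()}


stats_markov_dict = {
    'TotalWins': {'VOTE': 10, 'ITEM_I': 'FTHG', 'ITEM_J': 'FTAG',
                  'METHOD': 'VotingWithLosses'},
    'TotalGoals': {'VOTE': 10, 'ITEM_I': 'FTHG', 'ITEM_J': 'FTAG',
                   'METHOD': 'WinnersAndLosersVotePoint'},
}
weights = votes_to_weights(stats_markov_dict)  # equal votes give equal weights

# Hypothetical row-stochastic matrices, one per statistic.
s_by_stat = {
    'TotalWins': np.array([[0.0, 1.0], [1.0, 0.0]]),
    'TotalGoals': np.array([[0.5, 0.5], [0.25, 0.75]]),
}
# A convex combination of row-stochastic matrices is again row-stochastic.
combined = sum(weights[name] * s_by_stat[name] for name in weights)
```

With unequal votes, for example 30 and 10, the same helper would give weights of 0.75 and 0.25.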



Dictionary that holds the voting and stochastic arrays. Keys that start with V map to the voting matrices, and keys that start with S map to the stochastic matrices.


Dict[str, np.ndarray]


Dictionary that maps parameters to their values.


Dict[str, Optional[Dict[str, Dict[Any, Any]]]]


A stochastic Markov matrix is a square matrix in which entry (i, j) describes the probability that item i will vote for item j.




A Stochastic Markov matrix that is irreducible




The stationary vector or dominant eigenvector of the stochastic_matrix.




Set of statistics names




ValueError – If the value of b is not in the range [0, 1].


The following example demonstrates the GeM rating system on the first 20 soccer matches of the 2018-2019 English Premier League season.

>>> from ratingslib.datasets.filenames import dataset_path, FILENAME_EPL_2018_2019_20_GAMES
>>> from ratingslib.ratings.markov import Markov
>>> filename = dataset_path(FILENAME_EPL_2018_2019_20_GAMES)
>>> votes = {
        'TW': {
            'VOTE': 10,
            'ITEM_I': 'FTHG',
            'ITEM_J': 'FTAG',
            'METHOD': 'VotingWithLosses'},
        'TG': {
            'VOTE': 10,
            'ITEM_I': 'FTHG',
            'ITEM_J': 'FTAG',
            'METHOD': 'WinnersAndLosersVotePoint'},
        'TST': {
            'VOTE': 10,
            'ITEM_I': 'HST',
            'ITEM_J': 'AST',
            'METHOD': 'WinnersAndLosersVotePoint'},
        'TS': {
            'VOTE': 10,
            'ITEM_I': 'HS',
            'ITEM_J': 'AS',
            'METHOD': 'WinnersAndLosersVotePoint'},
    }
>>> Markov(b=0.85, stats_markov_dict=votes).rate_from_file(filename)
                Item    rating  ranking
    0          Arsenal  0.050470       11
    1      Bournemouth  0.039076       15
    2         Brighton  0.051460       10
    3          Burnley  0.071596        2
    4          Cardiff  0.024085       20
    5          Chelsea  0.045033       13
    6   Crystal Palace  0.037678       16
    7          Everton  0.066307        3
    8           Fulham  0.036356       17
    9     Huddersfield  0.032164       19
    10       Leicester  0.055491        7
    11       Liverpool  0.056879        6
    12        Man City  0.048325       12
    13      Man United  0.061052        4
    14       Newcastle  0.035814       18
    15     Southampton  0.051716        9
    16       Tottenham  0.053079        8
    17         Watford  0.082788        1
    18        West Ham  0.041824       14
    19          Wolves  0.058807        5



[1] Govan, A. Y. (2008). Ranking Theory with Application to Popular Sports. Ph.D. dissertation, North Carolina State University.


[2] Ingram, L. C. (2007). Ranking NCAA Sports Teams with Linear Algebra. Charleston.


[3] Brin, S. and Page, L. (1998). The Anatomy of a Large-Scale Hypertextual Web Search Engine. Computer Networks and ISDN Systems, 30:107-117.


Set the group of statistics.

static do_stochastic(voting_matrix: ndarray)

Normalize the rows of the voting matrix to develop a stochastic transition probability matrix.
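The row normalization described above can be sketched with NumPy as follows. This is a minimal illustration and not the library's implementation: the function name row_normalize is hypothetical, and replacing all-zero rows (items that cast no votes) with a uniform row is an assumption borrowed from the usual treatment of such rows in Markov rating methods.

```python
import numpy as np


def row_normalize(voting_matrix):
    """Divide each row by its sum; all-zero rows become uniform (an assumption)."""
    v = np.asarray(voting_matrix, dtype=float)
    n = v.shape[0]
    row_sums = v.sum(axis=1, keepdims=True)
    # Avoid division by zero: rows without votes are handled separately below.
    safe_sums = np.where(row_sums == 0, 1.0, row_sums)
    stochastic = v / safe_sums
    zero_rows = (row_sums == 0).ravel()
    stochastic[zero_rows] = 1.0 / n
    return stochastic


voting = [[0, 3, 1],
          [2, 0, 2],
          [0, 0, 0]]   # the third item cast no votes
s = row_normalize(voting)
```

Every row of s sums to 1, so s can be read as a transition probability matrix.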


voting_matrix (List[list]) –


stochastic_matrix – Stochastic matrix built from the corresponding voting matrix.

Return type


static compute(stochastic_matrix, b)

Compute the stationary vector, i.e. the dominant eigenvector of the transpose of the irreducible matrix. The stationary vector is the rating vector. Note: the irreducible matrix is stochastic_matrix_asch and the stationary vector is pi_steady.
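To make the computation concrete, here is a hedged sketch that assumes a PageRank-style adjustment, G = b·S + (1 − b)/n·E with E the all-ones matrix, followed by power iteration. The function name stationary_vector, the tolerance, and the iteration cap are illustrative choices; the library's own compute may build the irreducible matrix or solve the eigenproblem differently.

```python
import numpy as np


def stationary_vector(stochastic_matrix, b=0.85, tol=1e-12, max_iter=10_000):
    """Power iteration on G = b*S + (1 - b)/n * ones (assumed adjustment)."""
    s = np.asarray(stochastic_matrix, dtype=float)
    n = s.shape[0]
    g = b * s + (1.0 - b) / n * np.ones((n, n))
    pi = np.full(n, 1.0 / n)          # start from the uniform distribution
    for _ in range(max_iter):
        new_pi = pi @ g               # left multiplication: pi^T G
        if np.abs(new_pi - pi).sum() < tol:
            pi = new_pi
            break
        pi = new_pi
    return pi / pi.sum()


s = np.array([[0.0, 0.75, 0.25],
              [0.5, 0.0, 0.5],
              [1 / 3, 1 / 3, 1 / 3]])
pi = stationary_vector(s, b=0.85)
```

The result satisfies pi = pi·G, which is the same as saying that pi is the dominant eigenvector of the transpose of G. Under this assumed form, b = 1 applies no adjustment, so the stochastic matrix itself would need to be irreducible.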

create_voting_matrix(*, voting_method: Literal['VotingWithLosses', 'WinnersAndLosersVotePoint', 'LosersVotePointDiff'], data_df: DataFrame, items_df: DataFrame, col_name_home: str, col_name_away: str, columns_dict: Optional[Dict[str, Any]] = None) → ndarray

Select the method used to build the voting matrix. The available methods are:

  1. 'VotingWithLosses' when the losing team casts a number of votes equal to the margin of victory in its matchup with a stronger opponent.

  2. 'WinnersAndLosersVotePoint' when both the winning and losing teams vote with the number of points given up.

  3. 'LosersVotePointDiff' when the losing team casts a number of votes.
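As one possible reading of the 'WinnersAndLosersVotePoint' description, the sketch below lets every team vote for its opponent with the number of points it gave up. The function name points_given_up_votes and the tuple layout of games are hypothetical; the real method works on data_df and items_df and may differ in detail.

```python
import numpy as np


def points_given_up_votes(games, teams):
    """Each team votes for its opponent with the points it conceded."""
    index = {team: k for k, team in enumerate(teams)}
    votes = np.zeros((len(teams), len(teams)))
    for home, away, home_points, away_points in games:
        i, j = index[home], index[away]
        votes[i, j] += away_points   # home conceded away_points
        votes[j, i] += home_points   # away conceded home_points
    return votes


teams = ['A', 'B', 'C']
games = [('A', 'B', 2, 1),   # (home, away, home points, away points)
         ('B', 'C', 0, 3),
         ('C', 'A', 1, 1)]
v = points_given_up_votes(games, teams)
```

Row i of v collects the votes cast by team i, so a team that concedes many points distributes many votes to its opponents.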

preparation_phase(data_df: DataFrame, items_df: DataFrame, columns_dict: Optional[Dict[str, Any]] = None)

During the preparation phase, voting and stochastic matrices are constructed for each statistic according to the method specified in the stats_markov_dict attribute.

rate(data_df: DataFrame, items_df: DataFrame, sort: bool = False, columns_dict: Optional[Dict[str, Any]] = None) → DataFrame

This method computes ratings for pairwise data (e.g. games between soccer teams). To be overridden in subclasses.

  • data_df (pandas.DataFrame) – The pairwise data.

  • items_df (pandas.DataFrame) – Set of items (e.g. teams) to be rated

  • sort (bool, default=False) – If True, the output is sorted by rating value.

  • columns_dict (Optional[Dict[str, str]]) – The column names of data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.
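The shape of the returned table and the effect of sort can be illustrated with plain pandas. This is only a sketch: the three items and their ratings are made up, the column names follow the example output shown earlier in this module, and the tie-handling rule (method='min') is an assumption rather than documented library behavior.

```python
import pandas as pd

items_df = pd.DataFrame({
    'Item': ['A', 'B', 'C'],        # hypothetical items
    'rating': [0.2, 0.5, 0.3],      # hypothetical ratings
})
# Rank 1 goes to the highest rating; tied ratings share the smallest rank.
items_df['ranking'] = (items_df['rating']
                       .rank(ascending=False, method='min')
                       .astype(int))
# Optional sorting by rating value, analogous to sort=True.
sorted_df = items_df.sort_values('rating', ascending=False).reset_index(drop=True)
```

Without sorting, rows keep the order of items_df, as in the example output where teams appear alphabetically.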


items_df – The set of items with their rating and ranking.

Return type

pandas.DataFrame


static validate_stats_markov_dict(stats_markov_dict: dict)