ratingslib.ratings.methods module

This module gathers helper functions for ratings: computing item statistics, normalizing rating values, and counting outcomes.

calc_items_stats(data_df, items_df: DataFrame, items_dict: Optional[Dict[Any, int]] = None, normalization: bool = False, stats_columns_dict: Optional[Dict[str, Dict[Any, Any]]] = None, columns_dict: Optional[Dict[str, Any]] = None) DataFrame

Calculation of the items’ statistics

Parameters
  • data_df (pandas.DataFrame) – Items data

  • items_df (pandas.DataFrame) – The names of the items

  • items_dict (Optional[Dict[Any, int]], default=None) – Dictionary with teams, where the key is

  • normalization (bool, default=False) – If True, values are divided by the number of times an item appeared in the dataset.

  • stats_columns_dict (Optional[Dict[str, Dict[Any, Any]]]) –

    Dictionary that maps the statistic names to column names. Below is the explanation of dictionary:

    H: column name of statistic for home team e.g. in football-data.co.uk the column for goals is FTHG.

    A: column name of statistic for away team e.g. in football-data.co.uk the column for goals is FTAG.

    TYPE: {WIN, POINTS}

    1. if type is WIN then compare whether H > A or A > H

    2. if type is POINTS then count points

  • columns_dict (Optional[Dict[str, str]]) – The column names of data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.

Returns

items_df – DataFrame of items and their computed statistics

Return type

pandas.DataFrame
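The effect of the normalization flag can be illustrated with plain pandas. This is a minimal sketch, not the library's implementation: the games table is made up, and only the FTHG/FTAG column names are taken from the football-data.co.uk convention mentioned above.

```python
import pandas as pd

# Hypothetical games table (values are invented for illustration).
games = pd.DataFrame({
    "HomeTeam": ["Wolves", "West Ham", "Wolves"],
    "AwayTeam": ["West Ham", "Wolves", "Leeds"],
    "FTHG": [2, 1, 3],  # goals scored by the home team
    "FTAG": [0, 1, 1],  # goals scored by the away team
})

# Total goals per item, summed over home and away appearances.
goals = (
    games.groupby("HomeTeam")["FTHG"].sum()
    .add(games.groupby("AwayTeam")["FTAG"].sum(), fill_value=0)
)

# Number of times each item appears in the dataset.
appearances = (
    games["HomeTeam"].value_counts()
    .add(games["AwayTeam"].value_counts(), fill_value=0)
)

# Normalized statistic: each total divided by the item's appearance count.
goals_per_game = goals / appearances
print(goals_per_game)
```

Without normalization the statistic would be the raw totals in `goals`; with it, items that played different numbers of games become comparable.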

normalization_rating(items_df: DataFrame, col_name: str) Series

Normalize the rating column across all items so that the minimum is 0 and the maximum is 1.

Parameters
  • items_df (pandas.DataFrame) – The items with their names and rating values

  • col_name (str) – The name of column with rating values

Returns

normalized – Series with the normalized rating values

Return type

pandas.Series
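A mapping to a minimum of 0 and a maximum of 1 is commonly done with min-max scaling. The sketch below assumes that approach; the function name `min_max_normalize` and the handling of a constant column are illustrative choices, not taken from the library.

```python
import pandas as pd


def min_max_normalize(items_df: pd.DataFrame, col_name: str) -> pd.Series:
    """Rescale a rating column linearly to the range [0, 1]."""
    ratings = items_df[col_name]
    lowest, highest = ratings.min(), ratings.max()
    if highest == lowest:
        # Assumption: a constant column maps to zeros to avoid dividing by zero.
        return pd.Series(0.0, index=ratings.index, name=col_name)
    return (ratings - lowest) / (highest - lowest)


# Hypothetical items table with made-up rating values.
items = pd.DataFrame({
    "Item": ["Wolves", "West Ham", "Leeds"],
    "rating": [1500.0, 1550.0, 1400.0],
})
normalized = min_max_normalize(items, "rating")
print(normalized)
```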

count_classes_outcome_and_perc(outcomes_array: ndarray) Tuple[OrderedDict, OrderedDict]

Count the occurrences of each outcome and their percentages. Creates an ordered dictionary with the total count of each outcome and a second ordered dictionary with the corresponding percentages.

Parameters

outcomes_array (numpy.ndarray) –

Return type

Tuple[OrderedDict, OrderedDict]
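One way to build such a pair of dictionaries is with `numpy.unique`. This is a sketch under assumptions: the helper name `count_outcomes` is hypothetical, and percentages are expressed on a 0 to 100 scale, which the description above does not specify.

```python
from collections import OrderedDict

import numpy as np


def count_outcomes(outcomes_array: np.ndarray):
    """Return (counts, percentages) as ordered dictionaries keyed by outcome."""
    labels, counts = np.unique(outcomes_array, return_counts=True)
    total = counts.sum()
    count_dict = OrderedDict(
        (label.item(), int(count)) for label, count in zip(labels, counts)
    )
    # Assumption: percentages are on a 0-100 scale.
    perc_dict = OrderedDict(
        (label.item(), float(100 * count / total))
        for label, count in zip(labels, counts)
    )
    return count_dict, perc_dict


# Made-up outcome labels, e.g. 1 = home win, 2 = away win, 3 = draw.
outcomes = np.array([1, 2, 1, 3, 1, 2])
counts, percentages = count_outcomes(outcomes)
print(counts)
print(percentages)
```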

rating_systems_to_dict(rating_systems: Union[Dict[str, RatingSystem], List[RatingSystem], RatingSystem], key_based_on: Literal['params_key', 'version'] = 'params_key') Dict[str, RatingSystem]

Create a dictionary that maps the given rating systems.

Parameters

rating_systems (Dict[str, RatingSystem] or List[RatingSystem] or RatingSystem) – A list of rating system instances, a dictionary of them, or a single RatingSystem instance. If a dictionary is passed, the function only validates its values.

Returns

rating_sys_dict – Dictionary of rating systems.

Return type

Dict[str, RatingSystem]
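The three accepted input shapes can be sketched as follows. Everything here is an assumption made for illustration: `StubRatingSystem` is a hypothetical stand-in for the real class, and treating `params_key` and `version` as attributes is a guess based on the `key_based_on` options in the signature.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class StubRatingSystem:
    """Hypothetical stand-in; not the library's RatingSystem class."""
    version: str
    params_key: str


def to_dict_sketch(
    rating_systems: Union[Dict[str, StubRatingSystem],
                          List[StubRatingSystem],
                          StubRatingSystem],
    key_based_on: str = "params_key",
) -> Dict[str, StubRatingSystem]:
    if isinstance(rating_systems, dict):
        # A dictionary is only validated: every value must be a rating system.
        for value in rating_systems.values():
            if not isinstance(value, StubRatingSystem):
                raise TypeError("dictionary values must be rating systems")
        return dict(rating_systems)
    if isinstance(rating_systems, StubRatingSystem):
        # A single instance is treated as a one-element list.
        rating_systems = [rating_systems]
    # Assumption: the key is read from the attribute named by key_based_on.
    return {getattr(rs, key_based_on): rs for rs in rating_systems}


elo = StubRatingSystem(version="EloWin", params_key="EloWin[HA=0_K=40_ks=400]")
by_params = to_dict_sketch(elo)
by_version = to_dict_sketch([elo], key_based_on="version")
```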

plot_ratings(data_df: DataFrame, item_i: str, item_j: str, ratings_i: str, ratings_j: str, starting_game=2, items_list: Optional[List] = None)

Plots the ratings based on games ratings data

Parameters

  • data_df (pd.DataFrame) – Games dataframe (item_i vs item_j), e.g.

        HomeTeam  AwayTeam  HEloWin[HA=0_K=40_ks=400]  AEloWin[HA=0_K=40_ks=400]
        Wolves    West Ham  1500.0000                  1500.0000

  • item_i (str) – Item i column name, e.g. HomeTeam

  • item_j (str) – Item j column name, e.g. AwayTeam

  • ratings_i (str) – Ratings i column name, e.g. HEloWin[HA=0_K=40_ks=400]

  • ratings_j (str) – Ratings j column name, e.g. AEloWin[HA=0_K=40_ks=400]

  • starting_game (int, default=2) – Game number at which the plot starts (x-axis).

  • items_list (Optional[List], default=None) – Items to be included in the plot. If None, all items will be plotted.
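To show how a games table of this shape relates to per-item rating curves, the sketch below reshapes it into one row per item and game using plain pandas. It is not the function's implementation: the ratings after the first row are made up, and filtering on `starting_game` reflects one reading of that parameter.

```python
import pandas as pd

home_col = "HEloWin[HA=0_K=40_ks=400]"
away_col = "AEloWin[HA=0_K=40_ks=400]"

# Hypothetical games table; only the first row mirrors the example above.
games = pd.DataFrame({
    "HomeTeam": ["Wolves", "West Ham", "Leeds"],
    "AwayTeam": ["West Ham", "Leeds", "Wolves"],
    home_col: [1500.0, 1480.0, 1510.0],
    away_col: [1500.0, 1500.0, 1520.0],
})

# Stack home and away sides into a single (item, rating) table.
home = games[["HomeTeam", home_col]].rename(
    columns={"HomeTeam": "item", home_col: "rating"})
away = games[["AwayTeam", away_col]].rename(
    columns={"AwayTeam": "item", away_col: "rating"})

# A stable sort on the original row index restores chronological order.
long = pd.concat([home, away]).sort_index(kind="mergesort")

# Number each item's games 1, 2, 3, ... in order of appearance.
long["game"] = long.groupby("item").cumcount() + 1

# Assumption: the x-axis begins at starting_game, so earlier games are dropped.
starting_game = 2
history = long[long["game"] >= starting_game]
print(history)
```

Each item's rows in `history` could then be drawn as one line, with `game` on the x-axis and `rating` on the y-axis.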