ratingslib.ratings.methods module

This module gathers helper functions for ratings, including the calculation of statistics, the normalization of rating values, and the counting of outcomes.

calc_items_stats(data_df, items_df: DataFrame, items_dict: Optional[Dict[Any, int]] = None, normalization: bool = False, stats_columns_dict: Optional[Dict[str, Dict[Any, Any]]] = None, columns_dict: Optional[Dict[str, Any]] = None) → DataFrame

Calculation of the items’ statistics

Parameters

data_df (pandas.DataFrame) – Items data
items_df (pandas.DataFrame) – The names of the items
items_dict (Optional[Dict[Any, int]], default=None) – Dictionary with teams, where the key is
normalization (bool, default=False) – If True, values are divided by the number of times an item appeared in the dataset.
stats_columns_dict (Optional[Dict[str, Dict[Any, Any]]]) –
Dictionary that maps statistic names to column names. The keys of the dictionary are explained below:
H: column name of the statistic for the home team, e.g. in football-data.co.uk the column for goals is FTHG.

A: column name of the statistic for the away team, e.g. in football-data.co.uk the column for goals is FTAG.

TYPE: {WIN, POINTS}
if type is WIN, then compare whether H > A or A > H

if type is POINTS, then count points
columns_dict (Optional[Dict[str, Any]]) – The column names of the data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.

Returns

items_df – DataFrame of items and their computed statistics

Return type

pandas.DataFrame
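
As a rough illustration of what the WIN and POINTS types and the normalization flag describe, the sketch below computes per-item totals with plain pandas. It does not call calc_items_stats. The helper name sketch_item_stats, the toy dataset, and the output column names (Item, Goals, Wins) are assumptions made for this example; only the FTHG and FTAG column names come from the football-data.co.uk convention mentioned above.

```python
import pandas as pd

# Hypothetical toy data in the football-data.co.uk layout.
games = pd.DataFrame({
    "HomeTeam": ["A", "B", "A"],
    "AwayTeam": ["B", "C", "C"],
    "FTHG": [2, 1, 0],
    "FTAG": [1, 1, 3],
})


def sketch_item_stats(df, normalization=False):
    """Illustrative only: goals (POINTS-like) and wins (WIN-like) per team."""
    # One row per (team, game) from the home side ...
    home = pd.DataFrame({
        "Item": df["HomeTeam"],
        "Goals": df["FTHG"],
        "Wins": (df["FTHG"] > df["FTAG"]).astype(int),
    })
    # ... and one from the away side.
    away = pd.DataFrame({
        "Item": df["AwayTeam"],
        "Goals": df["FTAG"],
        "Wins": (df["FTAG"] > df["FTHG"]).astype(int),
    })
    long = pd.concat([home, away], ignore_index=True)
    stats = long.groupby("Item")[["Goals", "Wins"]].sum()
    if normalization:
        # Divide each total by the number of games the item appeared in.
        games_played = long.groupby("Item").size()
        stats = stats.div(games_played, axis=0)
    return stats


totals = sketch_item_stats(games)
per_game = sketch_item_stats(games, normalization=True)
```

Here totals holds raw sums per team, while per_game holds the same statistics divided by each team's number of appearances, which is the effect the normalization parameter is documented to have.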

normalization_rating(items_df: DataFrame, col_name: str) → Series

Normalize the rating column across all items so that the minimum is 0 and the maximum is 1.

Parameters

items_df (pandas.DataFrame) – The items with their names and rating values
col_name (str) – The name of column with rating values

Returns

normalized – Series with the normalized rating values

Return type

pandas.Series
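
The description above corresponds to min-max scaling. A minimal sketch of that idea follows; the function name min_max_normalize, the sample data, and the handling of a constant column are assumptions and may differ from what normalization_rating actually does.

```python
import pandas as pd


def min_max_normalize(items_df, col_name):
    """Scale a rating column so that its minimum maps to 0 and its maximum to 1."""
    ratings = items_df[col_name]
    lowest, highest = ratings.min(), ratings.max()
    if highest == lowest:
        # Assumption: with no spread, return zeros instead of dividing by zero.
        return pd.Series(0.0, index=ratings.index, name=col_name)
    return (ratings - lowest) / (highest - lowest)


items = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "rating": [1400.0, 1500.0, 1600.0],
})
normalized = min_max_normalize(items, "rating")
```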

count_classes_outcome_and_perc(outcomes_array: ndarray) → Tuple[OrderedDict, OrderedDict]

Count the number of outcomes and their percentages. Creates an ordered dictionary with the total number of each outcome and a second ordered dictionary with the percentages.

Parameters: outcomes_array (numpy.ndarray) –
Return type: Tuple[OrderedDict, OrderedDict]
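
One way to build such a pair of dictionaries is sketched below with numpy.unique. The function name count_outcomes, the sample outcomes, and the choice to express percentages on a 0 to 100 scale are assumptions; the library may order keys or scale percentages differently.

```python
from collections import OrderedDict

import numpy as np


def count_outcomes(outcomes_array):
    """Return (counts, percentages) as two ordered dictionaries keyed by outcome."""
    # np.unique returns the sorted distinct labels and how often each occurs.
    labels, counts = np.unique(outcomes_array, return_counts=True)
    total = int(counts.sum())
    count_dict = OrderedDict(zip(labels.tolist(), counts.tolist()))
    perc_dict = OrderedDict(
        (label, 100.0 * count / total) for label, count in count_dict.items()
    )
    return count_dict, perc_dict


outcomes = np.array([1, 1, 2, 3, 1, 2, 1, 3])
counts, percentages = count_outcomes(outcomes)
```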

rating_systems_to_dict(rating_systems: Union[Dict[str, RatingSystem], List[RatingSystem], RatingSystem], key_based_on: Literal['params_key', 'version'] = 'params_key') → Dict[str, RatingSystem]

Create a dictionary that maps the given rating systems.

Parameters: rating_systems (Dict[str, RatingSystem] or List[RatingSystem] or RatingSystem) – A list that contains rating systems instances or a dictionary or just a RatingSystem instance. In the case of dictionary the function only validates the values.
Returns: rating_sys_dict – Dictionary of rating systems.
Return type: Dict[str, RatingSystem]
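
The input handling described above (single instance, list, or dictionary) can be sketched as follows. This is not the library's code: FakeRatingSystem is a hypothetical stand-in for RatingSystem, and the assumption that instances expose params_key and version attributes is inferred only from the key_based_on values in the signature.

```python
from dataclasses import dataclass


@dataclass
class FakeRatingSystem:
    """Hypothetical stand-in; the real RatingSystem class is not reproduced here."""
    version: str
    params_key: str


def to_dict_sketch(rating_systems, key_based_on="params_key"):
    """Map rating systems to a dictionary keyed by the chosen attribute."""
    if isinstance(rating_systems, dict):
        # A dictionary is returned unchanged after its values are validated.
        for value in rating_systems.values():
            if not isinstance(value, FakeRatingSystem):
                raise TypeError("all dictionary values must be rating systems")
        return rating_systems
    if isinstance(rating_systems, FakeRatingSystem):
        # A single instance is treated as a one-element list.
        rating_systems = [rating_systems]
    return {getattr(rs, key_based_on): rs for rs in rating_systems}


elo = FakeRatingSystem(version="Elo", params_key="Elo[K=40]")
colley = FakeRatingSystem(version="Colley", params_key="Colley")
by_params = to_dict_sketch([elo, colley])
by_version = to_dict_sketch(elo, key_based_on="version")
```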

plot_ratings(data_df: DataFrame, item_i: str, item_j: str, ratings_i: str, ratings_j: str, starting_game=2, items_list: Optional[List] = None)

Plots the ratings based on the games' rating data.

Parameters

data_df (pd.DataFrame) – Games dataframe (item_i vs item_j), e.g.

HomeTeam  AwayTeam  HEloWin[HA=0_K=40_ks=400]  AEloWin[HA=0_K=40_ks=400]
Wolves    West Ham  1500.0000                  1500.0000

item_i (str) – Item i column name, e.g. HomeTeam
item_j (str) – Item j column name, e.g. AwayTeam
ratings_i (str) – Ratings i column, e.g. HEloWin[HA=0_K=40_ks=400]
ratings_j (str) – Ratings j column, e.g. AEloWin[HA=0_K=40_ks=400]
starting_game (int, default=2) – Game number from which the plot starts (x-axis).
items_list (Optional[List], default=None) – Items to be included in the plot. If None, all items will be plotted.