ratingslib.ratings.methods module
This module gathers helper functions for ratings, including the calculation of statistics, normalization of rating values, and counting of outcomes.
- calc_items_stats(data_df, items_df: DataFrame, items_dict: Optional[Dict[Any, int]] = None, normalization: bool = False, stats_columns_dict: Optional[Dict[str, Dict[Any, Any]]] = None, columns_dict: Optional[Dict[str, Any]] = None) DataFrame
Calculation of the items’ statistics
- Parameters
data_df (pandas.DataFrame) – Items data
items_df (pandas.DataFrame) – The names of the items
items_dict (Optional[Dict[Any, int]], default=None) – Dictionary with teams, where the key is
normalization (bool, default=False) – If True, values are divided by the number of times an item appeared in the dataset.
stats_columns_dict (Optional[Dict[str, Dict[Any, Any]]]) –
Dictionary that maps statistic names to column names. The dictionary is explained below:
H: column name of the statistic for the home team, e.g. in football-data.co.uk the column for goals is FTHG.
A: column name of the statistic for the away team, e.g. in football-data.co.uk the column for goals is FTAG.
TYPE: {WIN, POINTS}
if TYPE is WIN, H and A are compared to decide the winner (H > A or A > H)
if TYPE is POINTS, points are counted
columns_dict (Optional[Dict[str, Any]]) – The column names of the data file. See ratingslib.datasets.parameters.COLUMNS_DICT for more details.
- Returns
items_df – DataFrame of items and their computed statistics
- Return type
pandas.DataFrame
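The WIN statistic and the normalization option described above can be sketched with plain pandas. This is a minimal illustration, not the library's implementation: the toy games table, the `spec` dictionary, and the variable names are assumptions made for the example; only the H / A / TYPE keys and the FTHG / FTAG column names come from the description above.

```python
import pandas as pd

# Toy games data (illustrative only), using football-data.co.uk column names.
games = pd.DataFrame({
    "HomeTeam": ["Wolves", "West Ham", "Wolves"],
    "AwayTeam": ["West Ham", "Wolves", "West Ham"],
    "FTHG": [2, 1, 0],   # goals of the home team
    "FTAG": [0, 2, 3],   # goals of the away team
})

# Hypothetical entry in the H / A / TYPE shape described above.
spec = {"H": "FTHG", "A": "FTAG", "TYPE": "WIN"}

# TYPE == WIN: compare the home and away columns game by game.
home_win = games[spec["H"]] > games[spec["A"]]
away_win = games[spec["A"]] > games[spec["H"]]

# Total wins per item = wins at home + wins away.
wins = (
    home_win.groupby(games["HomeTeam"]).sum()
    .add(away_win.groupby(games["AwayTeam"]).sum(), fill_value=0)
)

# Normalization: divide by the number of times each item appears in the data.
games_played = pd.concat([games["HomeTeam"], games["AwayTeam"]]).value_counts()
normalized_wins = wins / games_played
```

Here Wolves win two of their three games and West Ham one of three, so the normalized values are 2/3 and 1/3.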
- normalization_rating(items_df: DataFrame, col_name: str) Series
Normalize the rating column across all items so that the minimum is 0 and the maximum is 1.
- Parameters
items_df (pandas.DataFrame) – The items with their names and rating values
col_name (str) – The name of column with rating values
- Returns
normalized – Series with the normalized rating values
- Return type
pandas.Series
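Scaling a column so that its minimum is 0 and its maximum is 1 is usually done with min-max scaling, (x - min) / (max - min). The sketch below shows that formula with pandas; `min_max_normalize` is a hypothetical stand-in, not the library function, and mapping a constant column to 0.0 is an assumption made here to avoid dividing by zero.

```python
import pandas as pd


def min_max_normalize(items_df: pd.DataFrame, col_name: str) -> pd.Series:
    """Hypothetical helper: min-max scale one rating column to [0, 1]."""
    col = items_df[col_name].astype(float)
    lo, hi = col.min(), col.max()
    if hi == lo:
        # Assumption: a constant column is mapped to 0.0 (no spread to scale).
        return pd.Series(0.0, index=col.index, name=col_name)
    return (col - lo) / (hi - lo)


items = pd.DataFrame({
    "Item": ["A", "B", "C"],
    "rating": [1400.0, 1500.0, 1600.0],
})
normalized = min_max_normalize(items, "rating")
```

With these ratings the lowest item maps to 0.0, the middle one to 0.5, and the highest to 1.0.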
- count_classes_outcome_and_perc(outcomes_array: ndarray) Tuple[OrderedDict, OrderedDict]
Count the number of outcomes and their percentages. Creates an ordered dictionary with the total number of each outcome and another dictionary with the percentages.
- Parameters
outcomes_array (numpy.ndarray) –
- Return type
Tuple[OrderedDict, OrderedDict]
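The counting described above can be sketched with the standard library alone. `count_outcomes` is a hypothetical stand-in, not the library function; sorting the outcome labels and expressing percentages on a 0 to 100 scale are assumptions, since the source does not state either.

```python
from collections import Counter, OrderedDict
from typing import Iterable, Tuple


def count_outcomes(outcomes: Iterable) -> Tuple[OrderedDict, OrderedDict]:
    """Hypothetical helper: count each outcome and its share of the total."""
    counts = Counter(outcomes)
    total = sum(counts.values())
    # Assumption: outcomes are ordered by sorted label.
    ordered_counts = OrderedDict(sorted(counts.items()))
    # Assumption: percentages are on a 0-100 scale.
    percentages = OrderedDict(
        (label, 100.0 * n / total) for label, n in ordered_counts.items()
    )
    return ordered_counts, percentages


# A plain list is used here; iterating over a numpy array works the same way.
counts, perc = count_outcomes(["H", "D", "H", "A"])
```

For this input the counts are A: 1, D: 1, H: 2 and the percentages are 25.0, 25.0, and 50.0.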
- rating_systems_to_dict(rating_systems: Union[Dict[str, RatingSystem], List[RatingSystem], RatingSystem], key_based_on: Literal['params_key', 'version'] = 'params_key') Dict[str, RatingSystem]
Create a dictionary that maps the given rating systems.
- Parameters
rating_systems (Dict[str, RatingSystem] or List[RatingSystem] or RatingSystem) – A list of rating system instances, a dictionary, or a single RatingSystem instance. If a dictionary is given, the function only validates its values.
- Returns
rating_sys_dict – Dictionary of rating systems.
- Return type
Dict[str, RatingSystem]
- plot_ratings(data_df: DataFrame, item_i: str, item_j: str, ratings_i: str, ratings_j: str, starting_game=2, items_list: Optional[List] = None)
Plot the ratings from the per-game ratings data.
- Parameters
data_df (pd.DataFrame) – Games dataframe (item_i vs item_j), e.g.
HomeTeam  AwayTeam  HEloWin[HA=0_K=40_ks=400]  AEloWin[HA=0_K=40_ks=400]
Wolves    West Ham  1500.0000                  1500.0000
item_i (str) – Item i column name, e.g. HomeTeam
item_j (str) – Item j column name, e.g. AwayTeam
ratings_i (str) – Ratings i column, e.g. HEloWin[HA=0_K=40_ks=400]
ratings_j (str) – Ratings j column, e.g. AEloWin[HA=0_K=40_ks=400]
starting_game (int, default=2) – Game number at which the plot starts (x-axis).
items_list (Optional[List], default=None) – Items to be included in the plot. If None, all items are plotted.