ratingslib.datasets.preprocess module

Module for preprocessing functions

class Preprocess

Bases: ABC

A base class for preprocessing data

abstract preprocessing(data_df: DataFrame, col_names_or_dict: Optional[Union[SimpleNamespace, Dict[str, Any]]]) → DataFrame

To be overridden in subclasses.
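
A minimal sketch of how a subclass might override preprocessing(). To keep it runnable without ratingslib installed, the Preprocess class below is a stand-in that only mirrors the documented signature; it is not the library source. The subclass DropUnplayedGames and the column names home_goals and away_goals are hypothetical.

```python
from abc import ABC, abstractmethod
from types import SimpleNamespace
from typing import Any, Dict, Optional, Union

import pandas as pd


class Preprocess(ABC):
    """Stand-in mirroring the documented base class (not the library source)."""

    @abstractmethod
    def preprocessing(
        self,
        data_df: pd.DataFrame,
        col_names_or_dict: Optional[Union[SimpleNamespace, Dict[str, Any]]],
    ) -> pd.DataFrame:
        """To be overridden in subclasses."""


class DropUnplayedGames(Preprocess):
    """Hypothetical subclass: drops rows that have no recorded home score."""

    def preprocessing(self, data_df, col_names_or_dict=None):
        # Assumption: the home-score column is called "home_goals".
        return data_df.dropna(subset=["home_goals"]).reset_index(drop=True)


# Hypothetical data: the second game has not been played yet.
games = pd.DataFrame({"home_goals": [2, None, 1], "away_goals": [0, None, 1]})
cleaned = DropUnplayedGames().preprocessing(games, None)
```

Because preprocessing() is abstract, instantiating the base class directly raises TypeError; only subclasses that implement the method can be created.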

class BasicPreprocess(weeks_to_remove: Optional[List[int]] = None)

Bases: Preprocess

The basic preprocessing class removes the games of the first match week from a sports dataset.

_remove_week(data_df, num_list, col_names) → DataFrame

Removes the match weeks listed in num_list from a sports dataset.

Parameters
  • data_df (pandas.DataFrame) – Dataset with games

  • num_list (list) – Match weeks to remove, e.g. [1, 2] removes the first two weeks from the dataset

  • col_names (SimpleNamespace) – Column names of the dataset. For more details, see ratingslib.utils.methods.parse_columns()

Returns

data_df – The modified data

Return type

pandas.DataFrame
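
The filtering described above can be sketched with plain pandas. This is an illustration under stated assumptions, not the library's implementation: the helper name remove_weeks, the week attribute on col_names, the example column names, and the index reset are all hypothetical.

```python
from types import SimpleNamespace
from typing import List

import pandas as pd


def remove_weeks(data_df: pd.DataFrame, num_list: List[int],
                 col_names: SimpleNamespace) -> pd.DataFrame:
    """Illustrative only: keep the rows whose match week is not in num_list."""
    # Assumption: col_names.week holds the name of the match-week column.
    keep = ~data_df[col_names.week].isin(num_list)
    return data_df[keep].reset_index(drop=True)


# Hypothetical data and column names, for illustration only.
games = pd.DataFrame({
    "Week": [1, 1, 2, 3],
    "HomeTeam": ["A", "C", "B", "A"],
    "AwayTeam": ["B", "D", "C", "D"],
})
cols = SimpleNamespace(week="Week")
trimmed = remove_weeks(games, [1, 2], cols)
```

The boolean mask returns a new DataFrame and leaves the original games frame unchanged.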

preprocessing(data_df: DataFrame, col_names_or_dict: Optional[Union[SimpleNamespace, Dict[str, Any]]]) → DataFrame

Removes the match-weeks and returns the modified dataset.