ratingslib.datasets.preprocess module
Module for preprocessing functions
- class Preprocess
Bases: ABC
A base class for preprocessing data
- abstract preprocessing(data_df: DataFrame, col_names_or_dict: Optional[Union[SimpleNamespace, Dict[str, Any]]]) → DataFrame
To be overridden in subclasses.
- _abc_impl = <_abc_data object>
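As a rough illustration of the override pattern described above, the sketch below defines a stand-in Preprocess base class that mirrors the documented signature, plus a hypothetical subclass. The subclass name DropMissingScores, the column names, and the choice to drop rows with missing values are assumptions made for this example and are not part of ratingslib.

```python
from abc import ABC, abstractmethod

import pandas as pd


class Preprocess(ABC):
    """Stand-in for the documented base class (not the ratingslib source)."""

    @abstractmethod
    def preprocessing(self, data_df, col_names_or_dict=None) -> pd.DataFrame:
        """To be overridden in subclasses."""


class DropMissingScores(Preprocess):
    """Hypothetical subclass: drops games that contain missing values."""

    def preprocessing(self, data_df, col_names_or_dict=None) -> pd.DataFrame:
        # dropna() returns a new frame, so the input is left untouched.
        return data_df.dropna().reset_index(drop=True)


# Hypothetical data: the second game has no recorded home score.
games = pd.DataFrame({"home_goals": [2, None, 1], "away_goals": [0, 3, 1]})
cleaned = DropMissingScores().preprocessing(games, None)
```

Because preprocessing is abstract, instantiating the stand-in base class directly raises a TypeError; only subclasses that implement the method can be created.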
- class BasicPreprocess(weeks_to_remove: Optional[List[int]] = None)
Bases: Preprocess
The basic preprocessing class removes the games of the first match-week from sports data.
- _remove_week(data_df, num_list, col_names) → DataFrame
Remove match-weeks from a sports dataset according to the given list of week numbers.
- Parameters
data_df (pandas.DataFrame) – Dataset with games
num_list (list) – Match-weeks to remove, e.g. [1, 2] removes the first two weeks from the dataset
col_names (SimpleNamespace) – Column names based on the dataset. For more details, see
ratingslib.utils.methods.parse_columns()
- Returns
data_df – The modified data
- Return type
pandas.DataFrame
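The week-removal step can be sketched with plain pandas as follows. This is a minimal illustration under stated assumptions, not the library's implementation: the function name remove_weeks, the week attribute on col_names, the "Week" column, and the index reset are all hypothetical choices made for the example.

```python
from types import SimpleNamespace

import pandas as pd


def remove_weeks(data_df, num_list, col_names):
    """Return a copy of data_df without the games played in the listed weeks."""
    # Boolean mask: True for rows whose match-week is NOT in num_list.
    keep = ~data_df[col_names.week].isin(num_list)
    # Resetting the index is a choice of this sketch, not documented behavior.
    return data_df[keep].reset_index(drop=True)


# Hypothetical column mapping and data.
col_names = SimpleNamespace(week="Week")
games = pd.DataFrame(
    {
        "Week": [1, 1, 2, 3],
        "Home": ["A", "C", "A", "B"],
        "Away": ["B", "D", "C", "D"],
    }
)
trimmed = remove_weeks(games, [1, 2], col_names)
```

Here only the week-3 game remains in trimmed, and games itself is unchanged because the filter builds a new frame.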
- _abc_impl = <_abc_data object>
- preprocessing(data_df: DataFrame, col_names_or_dict: Optional[Union[SimpleNamespace, Dict[str, Any]]]) → DataFrame
Removes the match-weeks and returns the modified dataset.