ratingslib.utils.methods module

Helper functions for the project

parse_columns(columns_dict: Optional[Dict[str, Any]] = None) SimpleNamespace

Return SimpleNamespace for the dictionary of columns

Parameters

columns_dict (Optional[Dict[str, str]], default=None) – A dictionary mapping the column names of the dataset. If None is given, COLUMNS_DICT is used. See the module ratingslib.datasets.parameters for more details.

Returns

n – A simple object subclass that provides attribute access to its namespace. Attributes are the keys of columns_dict.

Return type

SimpleNamespace
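
The attribute access described above can be illustrated with a minimal sketch. It is not the library's implementation: the function name parse_columns_sketch, the DEFAULT_COLUMNS mapping and its keys are assumptions made for illustration, and the real COLUMNS_DICT may contain different entries.

```python
from types import SimpleNamespace
from typing import Any, Dict, Optional

# Hypothetical stand-in for COLUMNS_DICT; the real mapping may differ.
DEFAULT_COLUMNS: Dict[str, Any] = {"item_i": "HomeTeam", "item_j": "AwayTeam"}


def parse_columns_sketch(
    columns_dict: Optional[Dict[str, Any]] = None,
) -> SimpleNamespace:
    """Expose the keys of a columns dictionary as attributes."""
    if columns_dict is None:
        columns_dict = DEFAULT_COLUMNS
    # Each key becomes an attribute name, each value the attribute value.
    return SimpleNamespace(**columns_dict)


col_names = parse_columns_sketch({"item_i": "Home", "item_j": "Away"})
print(col_names.item_i)
```

Because keys become attribute names, this approach only works when every key is a valid Python identifier.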

create_items_dict(items_df: DataFrame) Dict[Any, int]

Create dictionary containing all items

Parameters

items_df (pandas.DataFrame) – Set of items (e.g. teams)

Returns

items_dict – Dictionary that maps item names to integer values. For instance, in the case of soccer teams:

items_dict = {'Arsenal': 0,
              'Aston Villa': 1,
              'Birmingham': 2,
              'Blackburn': 3
              }

Return type

Dict[Any, int]
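
A mapping of this shape could be built as in the sketch below. It takes a plain list of names rather than a pandas.DataFrame to stay dependency-free, and it assigns integers in first-seen order; both choices are assumptions, and the library may order items differently.

```python
from typing import Any, Dict, Iterable


def items_dict_sketch(item_names: Iterable[Any]) -> Dict[Any, int]:
    """Map each distinct item name to a consecutive integer."""
    items_dict: Dict[Any, int] = {}
    for name in item_names:
        # Only the first occurrence of a name receives a new index.
        if name not in items_dict:
            items_dict[name] = len(items_dict)
    return items_dict


teams = ["Arsenal", "Aston Villa", "Birmingham", "Blackburn", "Arsenal"]
items_dict = items_dict_sketch(teams)
print(items_dict)
```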

get_indices(*args, data=None, data_columns_dict=None) Tuple[int, ...]

Return indices for variable length arguments

points(row, home_points_col_index: int, away_points_col_index: int) Tuple[int, int]

Return points ij, ji for the given pair. The meaning of points depends on the application type; in soccer, points are goals.

Parameters
  • row (numpy.ndarray) – Details (names, scores, etc.) of the pair of items.

  • home_points_col_index (int) – Column index of home item scores (points)

  • away_points_col_index (int) – Column index of away item scores (points)

Returns

  • points_ij (int) – The number of points that item i scores against item j

  • points_ji (int) – The number of points that item j scores against item i

indices(row: ndarray, items_dict: Dict[Any, int], home_col_index: int, away_col_index: int) Tuple[int, int]

Return indices i, j for the given pair. Indices are the integer values that items_dict assigns to the item names.

Parameters
  • row (numpy.ndarray) – Details (names, scores, etc) of the pair of items.

  • items_dict (Dict[Any, int]) – Dictionary that maps item names to integer values

  • home_col_index (int) – Column index of home item name

  • away_col_index (int) – Column index of away item name

Returns

  • i (int) – Index number of home team

  • j (int) – Index number of away team

indices_and_points(row, items_dict: Dict[Any, int], home_col_index: int, away_col_index: int, home_points_col_index: int, away_points_col_index: int) Tuple[int, int, int, int]

Return indices i,j and points ij,ji for the given pair
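
The lookup performed for one pair can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name is hypothetical, and the row is a plain tuple here, although any positionally indexable sequence (such as a numpy.ndarray) would behave the same way.

```python
from typing import Any, Dict, Sequence, Tuple


def indices_and_points_sketch(
    row: Sequence[Any],
    items_dict: Dict[Any, int],
    home_col_index: int,
    away_col_index: int,
    home_points_col_index: int,
    away_points_col_index: int,
) -> Tuple[int, int, int, int]:
    """Return (i, j, points_ij, points_ji) for one pair of items."""
    # Names in the row are translated to integer indices via items_dict.
    i = items_dict[row[home_col_index]]
    j = items_dict[row[away_col_index]]
    # Points are read directly from the row by column position.
    points_ij = row[home_points_col_index]
    points_ji = row[away_points_col_index]
    return i, j, points_ij, points_ji


items_dict = {"Arsenal": 0, "Aston Villa": 1}
row = ("Arsenal", "Aston Villa", 2, 1)  # home, away, home goals, away goals
result = indices_and_points_sketch(row, items_dict, 0, 1, 2, 3)
print(result)
```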

clean_args(parameters: dict, name: Callable) dict

Select parameters that are valid for a function

Parameters
  • parameters (dict) – Dictionary of parameters.

  • name (Callable) – The function whose valid parameters are selected.

Returns

fparams – Dictionary of valid parameters

Return type

dict
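
One common way to select the parameters a function accepts is to inspect its signature. The sketch below shows that technique; it is an assumption about how such filtering can be done, not a statement of how clean_args is implemented, and the names clean_args_sketch and rate are hypothetical.

```python
import inspect
from typing import Any, Callable, Dict


def clean_args_sketch(parameters: Dict[str, Any], func: Callable) -> Dict[str, Any]:
    """Keep only the entries whose keys are parameter names of func."""
    accepted = inspect.signature(func).parameters
    return {key: value for key, value in parameters.items() if key in accepted}


def rate(version: str, decimals: int = 2) -> str:
    """Hypothetical target function used only for this example."""
    return f"{version}:{decimals}"


fparams = clean_args_sketch({"version": "a", "decimals": 3, "verbose": True}, rate)
print(rate(**fparams))
```

The filtered dictionary can be unpacked into the call safely, since unknown keys such as verbose have been dropped.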

set_options_pandas()

Set option for the presentation of pandas output

log_pandas(pandas_df: DataFrame, msg: str = '')

Print (by logging function) a pandas dataframe with options for better presentation. Warning: set logging.level <= logmsg.PANDAS to show logs.

Parameters

pandas_df (DataFrame) – A dataframe for logging.

print_pandas(pandas_df: DataFrame, msg: str = '')

Print (in console) a pandas dataframe with options for better presentation.

Parameters

pandas_df (pandas.DataFrame) – A dataframe for printing.

set_options_numpy(decimals: int)

Set display options for numpy matrices.

log_numpy_matrix(matrix: ndarray, msg: str = '', decimals: int = 2)

Print (by logging function) a numpy array with options for better presentation. Warning: set logging.level <= logmsg.MATRIX to show logs.

str_info(name: str) str

Return pretty format of a string

print_info(name: str)

Pretty print a string

print_loading(loading_percentage: float, demicals: int = 1, end: str = '')

Print the loading percentage of a progress bar as a formatted string

get_filename(filename: str, directory_path: Optional[str] = None, current_dirname: Optional[str] = None, check_if_not_exists: bool = True) str

Return the filename path.

Parameters
  • filename (str) – The file name, or the file path including the file name

  • directory_path (str, default=None) – The path where the file is stored

  • current_dirname (str, default=None) – The path of current directory

  • check_if_not_exists (bool, default=True) – If the file does not exist and check_if_not_exists is True, no error is raised

Returns

The absolute path of filename.

Return type

str

get_filenames(*filenames, directory_path: str, current_dirname: str) Tuple[str, ...]

Return the filename paths. An error is raised if any of the files does not exist.

Parameters
  • filenames (variable length list of str) – The file names, or the file paths including the file names

  • directory_path (str, default=None) – The path where the files are stored

  • current_dirname (str, default=None) – The path of current directory

Returns

paths_and_filenames_tuple – The absolute paths of filenames

Return type

tuple

concat_csv_files_without_header(filenames: List[str], outputfilename: str)

Concatenate csv files into a new file, keeping the header row only once, at the beginning of the output

Parameters
  • filenames (List[str]) – List of names of files.

  • outputfilename (str) – Filename of new file.
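
The concatenation described above can be sketched with the standard library alone. The sketch assumes that every input file starts with exactly one header line and that all files share the same header; the function name concat_csv_sketch and the sample file names are hypothetical, and the library's own implementation may differ.

```python
import os
import tempfile
from typing import List


def _with_newline(line: str) -> str:
    """Make sure a line ends with a newline so rows never run together."""
    return line if line.endswith("\n") else line + "\n"


def concat_csv_sketch(filenames: List[str], outputfilename: str) -> None:
    """Write all rows to outputfilename, keeping only the first header."""
    with open(outputfilename, "w", newline="") as out:
        for position, filename in enumerate(filenames):
            with open(filename, newline="") as src:
                header = src.readline()
                # Only the header of the first file is copied.
                if position == 0 and header:
                    out.write(_with_newline(header))
                for line in src:
                    out.write(_with_newline(line))


with tempfile.TemporaryDirectory() as tmp:
    first = os.path.join(tmp, "season1.csv")
    second = os.path.join(tmp, "season2.csv")
    merged = os.path.join(tmp, "merged.csv")
    with open(first, "w", newline="") as f:
        f.write("Home,Away\nArsenal,Blackburn\n")
    with open(second, "w", newline="") as f:
        f.write("Home,Away\nBirmingham,Aston Villa")  # no trailing newline
    concat_csv_sketch([first, second], merged)
    with open(merged, newline="") as f:
        merged_text = f.read()

print(merged_text)
```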