ratingslib.datasets.filenames module

This module contains the filenames of example datasets and exposes helper functions for building dataset directory paths.

dataset_path(filename)

Returns the absolute path of filename in the dataset directory.

datasets_paths(*filenames)

Returns a tuple with the absolute path of each filename in the dataset directory.

dataset_sports_path(season: int, path: Optional[str] = None, current_dir: Optional[str] = None, championship: str = 'EPL', file_suffix: str = 'footballdata', file_ext: str = '.csv') -> str

Returns the file path of a sports dataset for the given championship and season (e.g. the file of soccer games for the English Premier League, season 2007-2008).

Parameters
  • season (int) – The sport season-year, e.g. 2007 means 2007-2008.

  • path (Optional[str], default=None) – The path of sports dataset.

  • current_dir (Optional[str], default=None) – The path of current directory.

  • championship (str, default='EPL') – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.parameters.championships.

  • file_suffix (str, default="footballdata") – Filename suffix (before filename extension).

  • file_ext (str, default=".csv") – Filename extension.

Returns

filename – The absolute path of the file, whose name looks like 2007-2008EPLfootballdata.csv

Return type

str
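The example file name above suggests how the name is put together from the arguments. The following is a minimal sketch of that naming pattern only, not the library's implementation. The helper name build_sports_filename is hypothetical, and the pattern is inferred from the single example 2007-2008EPLfootballdata.csv.

```python
def build_sports_filename(season: int,
                          championship: str = "EPL",
                          file_suffix: str = "footballdata",
                          file_ext: str = ".csv") -> str:
    """Hypothetical helper: compose '<season>-<season+1><championship><suffix><ext>'.

    The pattern is an assumption inferred from the documented example
    '2007-2008EPLfootballdata.csv'; it only builds the file name, not the
    absolute path that dataset_sports_path returns.
    """
    return f"{season}-{season + 1}{championship}{file_suffix}{file_ext}"


print(build_sports_filename(2007))  # 2007-2008EPLfootballdata.csv
```

Under this assumption, dataset_sports_path would join such a name to the dataset directory to produce the absolute path.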

datasets_sports_seasons_path(season_start: int, season_end: int = -1, path: Optional[str] = None, current_dir: Optional[str] = None, championship: str = 'EPL', file_suffix: str = 'footballdata', file_ext: str = '.csv') -> Dict[int, str]

Returns the file paths of a sports dataset for the given championship and season range. The result is a dictionary in which the season number (int) is the key and the file path (str) is the value.

Parameters
  • season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.

  • season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2017 means 2017-2018. The default value indicates only one season.

  • path (Optional[str], default=None) – The path of sports dataset.

  • current_dir (Optional[str], default=None) – The path of current directory.

  • championship (str, default='EPL') – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.parameters.championships.

  • file_suffix (str, default="footballdata") – Filename suffix (before filename extension).

  • file_ext (str, default=".csv") – Filename extension.

Returns

seasons_dict – Dictionary that maps seasons to csv file paths

Return type

Dict[int, str]

Raises

ValueError – If season_start or season_end is not a positive integer.
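The season-to-file mapping described above can be sketched as follows. This is an illustration under stated assumptions, not the library's code: the function name season_files is hypothetical, season_end is treated as inclusive (as the parameter description suggests), the default -1 is treated as "only season_start", and the file-name pattern is inferred from the example 2007-2008EPLfootballdata.csv.

```python
from typing import Dict


def season_files(season_start: int, season_end: int = -1) -> Dict[int, str]:
    """Hypothetical sketch: map each season in a range to a file name."""
    # Assumption: -1 is a sentinel meaning "a single season".
    if season_end == -1:
        season_end = season_start
    # Mirror the documented ValueError for non-positive seasons.
    for value in (season_start, season_end):
        if not isinstance(value, int) or value <= 0:
            raise ValueError("season_start and season_end must be positive integers")
    # Assumption: season_end is inclusive; names follow the documented example.
    return {
        season: f"{season}-{season + 1}EPLfootballdata.csv"
        for season in range(season_start, season_end + 1)
    }


for season, name in season_files(2007, 2009).items():
    print(season, name)
```

The real function returns full paths rather than bare file names; the sketch only shows the shape of the returned dictionary.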

get_season_footballdata_online(season: int, championship: str = championships.PREMIERLEAGUE)

Gets the football-data link to the csv file for the given championship and sport season. Example link for the 2021-2022 English Premier League: https://www.football-data.co.uk/mmz4281/2122/E0.csv

Parameters
  • season (int) – The sport season-year, e.g. 2007 means 2007-2008.

  • championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.

Returns

link – The csv file link

Return type

str

Raises
  • ValueError – If season is not a positive integer

  • ValueError – If championship is not listed in FOOTBALL_DATA_LEAGUES dictionary
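The example link above hints at the URL layout: a season code made of the last two digits of both years (2122 for 2021-2022) followed by a league code (E0). The sketch below shows that construction under those assumptions. The name season_link and the one-entry LEAGUE_CODES mapping are hypothetical stand-ins; the actual contents of FOOTBALL_DATA_LEAGUES are not shown in this documentation.

```python
BASE_URL = "https://www.football-data.co.uk/mmz4281"

# Hypothetical stand-in for FOOTBALL_DATA_LEAGUES; only the EPL -> E0 pair
# can be read off the documented example link.
LEAGUE_CODES = {"EPL": "E0"}


def season_link(season: int, championship: str = "EPL") -> str:
    """Hypothetical sketch: build a football-data csv link for one season."""
    if not isinstance(season, int) or season <= 0:
        raise ValueError("season must be a positive integer")
    if championship not in LEAGUE_CODES:
        raise ValueError(f"unknown championship: {championship}")
    # Last two digits of the start year and of the following year.
    season_code = f"{season % 100:02d}{(season + 1) % 100:02d}"
    return f"{BASE_URL}/{season_code}/{LEAGUE_CODES[championship]}.csv"


print(season_link(2021))  # https://www.football-data.co.uk/mmz4281/2122/E0.csv
```

Both error branches mirror the two documented ValueError cases.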

get_seasons_dict_footballdata_online(season_start: int, season_end: int = -1, championship: str = championships.PREMIERLEAGUE) -> Dict[int, str]

Gets links to csv files from https://www.football-data.co.uk for the given championship and season range.

Parameters
  • season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.

  • season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2017 means 2017-2018. The default value indicates only one season.

  • championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.

Returns

seasons_dict – Dictionary that maps seasons to csv file links

Return type

Dict[int, str]

Raises

ValueError – If season_start or season_end is not a positive integer.

download_and_store_footballdata(season_start: int, season_end: int = -1, championship: str = championships.PREMIERLEAGUE, save_path: Optional[str] = None)

Download and store data from www.football-data.co.uk for the given championship and season range.

Parameters
  • season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.

  • season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2018 means 2017-2018. The default value indicates only one season.

  • championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.

  • save_path (Optional[str], default=None) – The path in which to save the csv files.

Examples

Download and store csv files from 2005/06 to 2017/18

>>> from ratingslib.datasets.filenames import download_and_store_footballdata
>>> from ratingslib.datasets.soccer import championships
>>> epl = championships.PREMIERLEAGUE  # ENGLISH PREMIER LEAGUE
>>> download_and_store_footballdata(2005, 2018, championship=epl)