ratingslib.datasets.filenames module
This module contains the filenames of example datasets and exposes helper functions for building the directory paths of datasets.
- dataset_path(filename)
Returns the absolute path of filename in the dataset directory.
- datasets_paths(*filenames)
Returns a tuple with the absolute path of each filename in the dataset directory.
- dataset_sports_path(season: int, path: Optional[str] = None, current_dir: Optional[str] = None, championship: str = 'EPL', file_suffix: str = 'footballdata', file_ext: str = '.csv') -> str
Return the file path of a sports dataset for the given championship and season (e.g. the file of soccer games for the English Premier League, season 2007-2008).
- Parameters
season (int) – The sport season-year, e.g. 2007 means 2007-2008.
path (Optional[str], default=None) – The path of sports dataset.
current_dir (Optional[str], default=None) – The path of current directory.
championship (str, default='EPL') – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.parameters.championships.
file_suffix (str, default="footballdata") – Filename suffix (before filename extension).
file_ext (str, default=".csv") – Filename extension.
- Returns
filename – The absolute path of filename (e.g. 2007-2008EPLfootballdata.csv)
- Return type
str
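The naming pattern implied by the example above (2007-2008EPLfootballdata.csv) can be sketched as follows. This is only an illustration, not the library's implementation: build_filename is a hypothetical helper, and it assumes the file name is the season span followed by the championship initials, the suffix, and the extension. Directory handling (path, current_dir) is left out.

```python
def build_filename(season: int, championship: str = "EPL",
                   file_suffix: str = "footballdata",
                   file_ext: str = ".csv") -> str:
    """Hypothetical helper: compose a dataset file name for one season."""
    # A season-year such as 2007 stands for the span 2007-2008.
    return f"{season}-{season + 1}{championship}{file_suffix}{file_ext}"


print(build_filename(2007))  # prints 2007-2008EPLfootballdata.csv
```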
- datasets_sports_seasons_path(season_start: int, season_end: int = -1, path: Optional[str] = None, current_dir: Optional[str] = None, championship: str = 'EPL', file_suffix: str = 'footballdata', file_ext: str = '.csv') -> Dict[int, str]
Return the file paths of a sports dataset for the given championship and season range. The result is a dictionary in which the season number (int) is the key and the file path (str) is the value.
- Parameters
season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.
season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2017 means 2017-2018. The default value indicates only one season.
path (Optional[str], default=None) – The path of sports dataset.
current_dir (Optional[str], default=None) – The path of current directory.
championship (str, default='EPL') – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.parameters.championships.
file_suffix (str, default="footballdata") – Filename suffix (before filename extension).
file_ext (str, default=".csv") – Filename extension.
- Returns
seasons_dict – Dictionary that maps seasons to csv file paths
- Return type
Dict[int, str]
- Raises
ValueError – If season_start or season_end are not positive integers.
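The season-to-path mapping described above could look roughly like the sketch below. It is a hedged illustration rather than the library's code: seasons_to_filenames is a hypothetical helper, the file name pattern is taken from the 2007-2008EPLfootballdata.csv example, and treating season_end as inclusive is an assumption that the documentation does not confirm.

```python
from typing import Dict


def seasons_to_filenames(season_start: int, season_end: int = -1) -> Dict[int, str]:
    """Hypothetical helper: map each season in a range to a file name."""
    # The default of -1 indicates a single season (season_start only).
    if season_end == -1:
        season_end = season_start
    if season_start <= 0 or season_end <= 0:
        raise ValueError("season_start and season_end must be positive integers")
    # Assumption: season_end is inclusive; the real function may differ.
    return {
        season: f"{season}-{season + 1}EPLfootballdata.csv"
        for season in range(season_start, season_end + 1)
    }


print(seasons_to_filenames(2007))
```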
- get_season_footballdata_online(season: int, championship: str = championships.PREMIERLEAGUE)
Get the football-data link to the csv file for the given championship and sport season. Link example for 2021-2022 English Premier League: https://www.football-data.co.uk/mmz4281/2122/E0.csv
- Parameters
season (int) – The sport season-year, e.g. 2007 means 2007-2008.
championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.
- Returns
link – The csv file link
- Return type
str
- Raises
ValueError – If season is not a positive int.
ValueError – If championship is not listed in the FOOTBALL_DATA_LEAGUES dictionary.
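The link example above suggests how such a URL is composed: the two-digit start and end years of the season, followed by a league code. The sketch below reproduces that example under stated assumptions. build_link and LEAGUE_CODES are hypothetical names, the single EPL to E0 entry is read off the example link, and the actual contents of FOOTBALL_DATA_LEAGUES are not shown in this documentation.

```python
# Hypothetical mapping for illustration only; the library keeps its own
# FOOTBALL_DATA_LEAGUES dictionary, whose contents may differ.
LEAGUE_CODES = {"EPL": "E0"}


def build_link(season: int, championship: str = "EPL") -> str:
    """Hypothetical helper: compose a football-data.co.uk csv link."""
    if not isinstance(season, int) or season <= 0:
        raise ValueError("season must be a positive int")
    if championship not in LEAGUE_CODES:
        raise ValueError(f"championship {championship!r} is not listed")
    # Season 2021 means 2021-2022, which becomes the code "2122".
    season_code = f"{season % 100:02d}{(season + 1) % 100:02d}"
    league_code = LEAGUE_CODES[championship]
    return f"https://www.football-data.co.uk/mmz4281/{season_code}/{league_code}.csv"


print(build_link(2021))  # prints https://www.football-data.co.uk/mmz4281/2122/E0.csv
```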
- get_seasons_dict_footballdata_online(season_start: int, season_end: int = -1, championship: str = championships.PREMIERLEAGUE) -> Dict[int, str]
Get links to csv files from https://www.football-data.co.uk for the given championship and season range.
- Parameters
season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.
season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2017 means 2017-2018. The default value indicates only one season.
championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.
- Returns
seasons_dict – Dictionary that maps seasons to csv file links
- Return type
Dict[int, str]
- Raises
ValueError – If season_start or season_end are not positive integers.
- download_and_store_footballdata(season_start: int, season_end: int = -1, championship: str = championships.PREMIERLEAGUE, save_path: Optional[str] = None)
Download and store data from www.football-data.co.uk for the given championship and season range.
- Parameters
season_start (int) – The starting sport season-year, e.g. 2007 means 2007-2008.
season_end (Optional[int], default=-1) – The last sport season-year, e.g. 2018 means 2017-2018. The default value indicates only one season.
championship (Optional[str], default=championships.PREMIERLEAGUE) – Championship initials, e.g. EPL for English Premier League. Championship initials are available at data.soccer.championships.
save_path (Optional[str], default=None) – The path to save csv files.
Examples
Download and store csv files from 2005/06 to 2017/18
>>> from ratingslib.datasets.filenames import download_and_store_footballdata
>>> from ratingslib.datasets.soccer import championships
>>> epl = championships.PREMIERLEAGUE  # ENGLISH PREMIER LEAGUE
>>> download_and_store_footballdata(2005, 2018, championship=epl)