{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Tuning ELO parameters (2009-2010 EPL)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this example we are tuning Elo parameters in order to improve accuracy. The tuning is performed by grid-search after defining the search space of each parameter." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Load packages" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from ratingslib.app_sports.methods import Predictions, prepare_sports_seasons\n", "from ratingslib.application import SoccerOutcome\n", "from ratingslib.datasets.filenames import get_seasons_dict_footballdata_online\n", "from ratingslib.datasets.parameters import championships\n", "from ratingslib.ratings.elo import Elo\n", "from ratingslib.utils.enums import ratings\n", "from sklearn.naive_bayes import GaussianNB" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Set target outcome" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "outcome = SoccerOutcome()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Get the filename from football-data.co.uk for season 2009-2010 (English Premier League)." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "filename = get_seasons_dict_footballdata_online(\n", " season_start=2009, season_end=2010, championship=championships.PREMIERLEAGUE)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We set the version list which contains the Elo-Win and Elo-Point version. Then, we create a dictionary that maps all possible combinations of the ranges for each parameter we have set." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "dict_keys(['EloWin[HA=70_K=10_ks=100]', 'EloPoint[HA=70_K=10_ks=100]', 'EloWin[HA=80_K=10_ks=100]', 'EloPoint[HA=80_K=10_ks=100]', 'EloWin[HA=70_K=10_ks=200]', 'EloPoint[HA=70_K=10_ks=200]', 'EloWin[HA=80_K=10_ks=200]', 'EloPoint[HA=80_K=10_ks=200]', 'EloWin[HA=70_K=20_ks=100]', 'EloPoint[HA=70_K=20_ks=100]', 'EloWin[HA=80_K=20_ks=100]', 'EloPoint[HA=80_K=20_ks=100]', 'EloWin[HA=70_K=20_ks=200]', 'EloPoint[HA=70_K=20_ks=200]', 'EloWin[HA=80_K=20_ks=200]', 'EloPoint[HA=80_K=20_ks=200]'])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "version_list = [ratings.ELOWIN, ratings.ELOPOINT]\n", "ratings_dict = Elo.prepare_for_gridsearch_tuning(version_list=version_list,\n", " k_range=[10, 20],\n", " ks_range=[100, 200],\n", " HA_range=[70, 80])\n", "ratings_dict.keys()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The ratings in the dataset start from the second match week." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Load season: 2009 - 2010\n", "2.9%5.7%8.6%11.4%14.3%17.1%20.0%22.9%25.7%28.6%31.4%34.3%37.1%40.0%42.9%45.7%48.6%51.4%54.3%57.1%60.0%62.9%65.7%68.6%71.4%74.3%77.1%80.0%82.9%85.7%88.6%91.4%94.3%97.1%100.0%\n" ] } ], "source": [ "data = prepare_sports_seasons(filename,\n", " outcome,\n", " rating_systems=ratings_dict,\n", " start_week=2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We test three diffent methods (RANK, MLE, and the Naive Bayes classifier) and we start making predictions from the 4th week.\n", "We apply the anchored walk-farward procedure with window size = 1 which means that every week we make predictions\n", "by using previous weeks data for training set. For example, for the 4th week, the training set is consisted of the 1st, 2nd and 3rd week.\n", "The best parameters for each method and for each version are printed in the console." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "\n", "\n", "\n", "=====Prediction method: GaussianNB=====\n", "EloWin[HA=70_K=20_ks=100] 0.5203488372093024\n", "\n", "\n", "=====Prediction method: GaussianNB=====\n", "EloPoint[HA=70_K=10_ks=200] 0.5290697674418605\n", "\n", "\n", "=====Prediction method: MLE=====\n", "EloWin[HA=80_K=20_ks=100] 0.5377906976744186\n", "\n", "\n", "=====Prediction method: MLE=====\n", "EloPoint[HA=80_K=20_ks=100] 0.5465116279069767\n", "\n", "\n", "=====Prediction method: RANK=====\n", "EloWin[HA=80_K=10_ks=100] 0.49127906976744184\n", "\n", "\n", "=====Prediction method: RANK=====\n", "EloPoint[HA=70_K=20_ks=100] 0.5087209302325582\n" ] } ], "source": [ "prediction_methods = [GaussianNB(), 'MLE', 'RANK']\n", "print()\n", "for predict_with in prediction_methods:\n", " best = Predictions(data, outcome, start_from_week=4, print_accuracy_report=False).rs_tuning_params(\n", " ratings_dict=ratings_dict, predict_with=predict_with,\n", " metric_name='accuracy')\n" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3.8.13 ('py38')", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.8.13" }, "orig_nbformat": 4, "vscode": { "interpreter": { "hash": "23ceb7112fbf9d0e38ecbf60d6e6d5e2dcebcc82200eeb1e5a5d5f9ffb9e27ca" } } }, "nbformat": 4, "nbformat_minor": 2 }