So, I'm a big League of Legends fan. I don't play as much as I'd like, but I like to think I have reasonably good instincts for guessing the winner of a professional match. As a data scientist, though, I believe it's very possible to build a model that predicts a match outcome.
My initial "predict LoL matches" Google search yielded some interesting results. The one that caught my eye is an implementation that uses a Kaggle dataset of ranked matches, along with XGBoost and SHAP to both fit a model and explain it. Here's the reference notebook.
Since my go-to programming language is Python, the implementation is in Python.
I'm trying to be more organized and leave the tedious stuff to other tools, which is why this project uses Poetry for dependency management.
One nice thing about Poetry is that it makes running scripts very easy, like the very simple tests currently implemented.
To set up the project you need Poetry on your system. Since I'm on a Mac, my preferred way to install it is with Homebrew.
```shell
# use homebrew to install poetry
brew install poetry
# install the dependencies
poetry install
# run the tests
poetry run pytest
```
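Poetry can also expose project scripts as commands. As a sketch of how that might look here (the `get-matches` entry and the `get_matches:main` path are hypothetical, not taken from the repo's actual `pyproject.toml`):

```toml
# hypothetical pyproject.toml excerpt: expose get_matches as a poetry script
[tool.poetry.scripts]
get-matches = "get_matches:main"
```

With an entry like this, `poetry run get-matches` would invoke the script inside the project's virtual environment.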
To gather match data there's a very simplified version of a "match scraper" (in quotes, because it's not really scraping). Most of the heavy lifting is done by the `riotwatcher` Python package.
To use the `MatchScraper` class, you only need to set the environment variable `RIOT_API_KEY` to a valid Riot API key. You can get a free 24-hour key from the Riot developer portal, or submit a request for a long-lived one.
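As a rough sketch of that setup (the class internals below are assumptions for illustration, not the actual implementation in this repo):

```python
import os


class MatchScraper:
    """Hypothetical sketch of a scraper keyed by the RIOT_API_KEY env var."""

    def __init__(self):
        # Fail fast if the key is missing, rather than erroring mid-scrape.
        api_key = os.environ.get("RIOT_API_KEY")
        if not api_key:
            raise RuntimeError("Set RIOT_API_KEY to a valid Riot API key")
        self.api_key = api_key
        # In the real class this key would be handed to riotwatcher
        # (e.g. LolWatcher(api_key)) to make the actual API calls.
```

Reading the key from the environment keeps it out of the codebase, which matters since the script ends up running unattended.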
The implementation in this repo lives in the `get_matches.py` module. Currently this script runs on a Raspberry Pi, gathering data and saving it to an S3 bucket using the `S3Helper` class.
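A minimal sketch of what such a helper might look like (the class, its key layout, and the dataset names are assumptions; only the `boto3` upload call reflects the real AWS SDK):

```python
from datetime import datetime, timezone


class S3Helper:
    """Hypothetical sketch: writes scraped records to S3 under dated keys."""

    def __init__(self, bucket):
        self.bucket = bucket

    def object_key(self, dataset, match_id):
        # Partition objects by dataset name and UTC date, e.g.
        # "matches/2024-01-31/NA1_1234.json", so downstream jobs
        # can pick up one day's worth of data at a time.
        day = datetime.now(timezone.utc).strftime("%Y-%m-%d")
        return f"{dataset}/{day}/{match_id}.json"

    def upload(self, dataset, match_id, body):
        # Imported lazily so the key-building logic stays usable
        # (and testable) without the AWS dependency installed.
        import boto3

        s3 = boto3.client("s3")
        s3.put_object(
            Bucket=self.bucket,
            Key=self.object_key(dataset, match_id),
            Body=body,
        )
```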
Currently three datasets are being saved to an S3 bucket: matches, teams, and participants. Once the raw data is available in S3, Databricks is used to process it and save the results back to the same bucket.
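To illustrate the three-way split: the field names below mirror the shape of a Riot match-v5 payload, but the helper itself is a hypothetical sketch, not the repo's actual pipeline code.

```python
def split_match(match):
    """Split one raw match payload into match, team, and participant rows."""
    info = match["info"]
    match_id = match["metadata"]["matchId"]

    # One row per match, with match-level attributes.
    match_row = {"match_id": match_id, "duration": info["gameDuration"]}
    # One row per team (two per match).
    team_rows = [
        {"match_id": match_id, "team_id": t["teamId"], "win": t["win"]}
        for t in info["teams"]
    ]
    # One row per participant (ten per match).
    participant_rows = [
        {"match_id": match_id, "champion": p["championName"], "win": p["win"]}
        for p in info["participants"]
    ]
    return match_row, team_rows, participant_rows
```

Keeping `match_id` on every row lets the three tables be joined back together downstream.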
The data pipeline can be found in the `notebooks` directory. Currently there's only one version of the dataset, but that's enough to get started.
So far this stage is in very early development. We're trying to accomplish a full end-to-end model deployment, and to accomplish that we'll be using MLflow. We've set up an MLflow tracking server on AWS using EC2, but that's mainly the fancy part.
For modeling, we've thought of submitting pull requests that each train a model and report its results, so we can evaluate the performance of candidate models and decide which one is right for the job.
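The decision step could be as simple as comparing the metrics each PR reports. A hypothetical sketch (the metric name and model names here are made up for illustration):

```python
def pick_best_model(candidates, metric="auc"):
    """Return the candidate name with the highest reported metric.

    `candidates` maps a model name to its reported metrics, i.e. the
    numbers a training PR would post in its results report.
    """
    return max(candidates, key=lambda name: candidates[name][metric])


# e.g. two hypothetical PRs reporting their validation AUC:
best = pick_best_model({"xgboost": {"auc": 0.78}, "logreg": {"auc": 0.71}})
# best == "xgboost"
```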