Add SkillCorner extrapolated_data.jsonl loading functionality #215

Closed
wants to merge 13 commits into from
11 changes: 10 additions & 1 deletion docs/contributing.md
@@ -6,6 +6,13 @@ is nobody already working on the same issue and to ensure your time as a contrib

## How to Contribute

There are two ways to contribute:

1. [Contributing to Code](#contributing-to-code)
2. [Contributing to Documentation](#contributing-to-documentation)

## Contributing to Code

All code changes happen through Pull Requests. If you would like to contribute, follow the steps below to set up
the project and make changes:

@@ -55,7 +62,7 @@ follow these instructions:
files. *Note*: if _black_ needs to re-format a file, the commit will fail, meaning you will then need to execute
`git add .` and `git commit` again to commit the files updated by _black_.

## Documentation
## Contributing to Documentation

This project uses [MkDocs](https://www.mkdocs.org/) to generate documentation from pages written in Markdown.

@@ -71,6 +78,8 @@ mkdocs serve

Open up [http://127.0.0.1:8000/](http://127.0.0.1:8000/) in your browser to preview your documentation.

Use `mkdocs.yml` to add new page references or update the file layout.

## Contributors (sorted alphabetically)

Many thanks to the following developers for contributing to this project:
58 changes: 58 additions & 0 deletions docs/functionality/coordinate-systems.md
@@ -0,0 +1,58 @@
Coordinate system options for data loading functionality.

Reference: `kloppy/domain/models/common.py`

## Kloppy `"kloppy"`
- **Origin:** Top Left
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Ranges from 0 to 1 on both axes, e.g., x-dim: [0, 1] and y-dim: [0, 1]

## Metrica `"metrica"`
- **Origin:** Top Left
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Ranges from 0 to 1 on both axes, e.g., x-dim: [0, 1] and y-dim: [0, 1]

## Tracab `"tracab"`
- **Origin:** Center
- **Vertical Orientation:** Bottom to Top
- **Pitch Dimensions:** Scaled to `self.length` and `self.width` in hundreds with the origin at the center, e.g., x-dim: [-5250, 5250] and y-dim: [-3400, 3400]

## Second Spectrum `"secondspectrum"`
- **Origin:** Center
- **Vertical Orientation:** Bottom to Top
- **Pitch Dimensions:** Defined by `self.length` and `self.width` with the origin at the center, e.g., x-dim: [-52.5, 52.5] and y-dim: [-34, 34]

## Opta `"opta"`
- **Origin:** Bottom Left
- **Vertical Orientation:** Bottom to Top
- **Pitch Dimensions:** Fixed at x-dim: [0, 100] and y-dim: [0, 100].

## Sportec `"sportec"`
- **Origin:** Bottom Left
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Defined by `self.length` and `self.width`, e.g., x-dim: [0, 105] and y-dim: [0, 68]

## StatsBomb `"statsbomb"`
- **Origin:** Top Left
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Fixed at 120 x 80 units. So, x-dim: [0, 120], y-dim: [0, 80].

## Wyscout `"wyscout"`
- **Origin:** Top Left
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Fixed at x-dim: [0, 100] and y-dim: [0, 100].

## SkillCorner `"skillcorner"`
- **Origin:** Center
- **Vertical Orientation:** Bottom to Top
- **Pitch Dimensions:** Defined by `self.length` and `self.width` with the origin at the center, e.g., x-dim: [-52.5, 52.5] and y-dim: [-34, 34]

## Datafactory `"datafactory"`
- **Origin:** Center
- **Vertical Orientation:** Top to Bottom
- **Pitch Dimensions:** Ranges from -1 to 1 on both axes. So, x-dim: [-1, 1], y-dim: [-1, 1].

## StatsPerform `"statsperform"`
- **Origin:** Bottom Left
- **Vertical Orientation:** Bottom to Top
- **Pitch Dimensions:** Defined by `self.length` and `self.width`, e.g., x-dim: [0, 105] and y-dim: [0, 68]
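The origin, vertical orientation, and axis ranges listed above are enough to map a point from one system to another. The following is a minimal sketch of that mapping, not kloppy's own implementation: the `Axes` class, the `convert` function, and the constants are hypothetical names introduced for illustration, and the Tracab ranges assume a 105 x 68 m pitch as in the example above.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Axes:
    """Hypothetical description of a coordinate system (not a kloppy class)."""
    x_min: float
    x_max: float
    y_min: float
    y_max: float
    top_to_bottom: bool  # True when y increases from the top of the pitch downwards


def convert(x: float, y: float, src: Axes, dst: Axes) -> Tuple[float, float]:
    """Map a point from `src` to `dst` via fractions of the pitch."""
    # Express the point as a fraction of the pitch along each axis.
    fx = (x - src.x_min) / (src.x_max - src.x_min)
    fy = (y - src.y_min) / (src.y_max - src.y_min)
    # Mirror the vertical axis when the two systems are oriented differently.
    if src.top_to_bottom != dst.top_to_bottom:
        fy = 1.0 - fy
    # Scale the fractions into the target ranges.
    return (
        dst.x_min + fx * (dst.x_max - dst.x_min),
        dst.y_min + fy * (dst.y_max - dst.y_min),
    )


# Ranges taken from the sections above.
TRACAB = Axes(-5250, 5250, -3400, 3400, top_to_bottom=False)
KLOPPY = Axes(0, 1, 0, 1, top_to_bottom=True)
STATSBOMB = Axes(0, 120, 0, 80, top_to_bottom=True)

centre = convert(0, 0, TRACAB, KLOPPY)               # centre spot
top_left = convert(-5250, 3400, TRACAB, KLOPPY)      # top-left corner
bottom_right = convert(120, 80, STATSBOMB, TRACAB)   # bottom-right corner
```

Working in pitch fractions keeps the sketch independent of the real pitch size, which is why fixed-range systems such as Opta or StatsBomb can be handled the same way as metric ones.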
221 changes: 221 additions & 0 deletions docs/functionality/event-data.md
@@ -0,0 +1,221 @@
## Supported Event Data Providers

- [DataFactory](#datafactory)
- [Metrica](#metrica)
- [Opta](#opta)
- [Sportec](#sportec)
- [SportsCode](#sportscode)
- [StatsBomb](#statsbomb)
- [WyScout](#wyscout)


### DataFactory

#### load
`kloppy.kloppy._providers.datafactory.load(event_data, event_types=None, coordinates=None, event_factory=None)`

This function loads DataFactory event data into an `EventDataset`.

##### Parameters
- `event_data: FileLike`: The filename (or another file-like object) of the JSON file that contains the events to be loaded. The file must follow DataFactory's format.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: This is an optional parameter where you can specify a list of the types of events you're interested in. If this is `None`, then all event types in the data will be loaded.
- `coordinates: Optional[str] = None`: An optional parameter for specifying the coordinate system to be used. The default is `None`, which means the default coordinate system of the data will be used.
- `event_factory: Optional[EventFactory] = None`: An optional `EventFactory` object that will be used to create the events. If this is `None`, then the default `EventFactory` specified in the configuration (via `get_config("event_factory")`) will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded events.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.
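The `event_types` behaviour described above (a list restricts the result, `None` keeps everything) can be illustrated with a small stand-alone sketch. This is not kloppy's implementation: `filter_by_event_type` and the dictionary-based events are hypothetical and only mirror the documented semantics.

```python
from typing import Iterable, List, Optional


def filter_by_event_type(
    events: Iterable[dict], event_types: Optional[List[str]] = None
) -> List[dict]:
    """Keep only events whose type is listed; keep all events when the list is None."""
    if event_types is None:
        # No filter requested: every event is returned.
        return list(events)
    # Compare case-insensitively so "Shot" and "shot" are treated alike.
    wanted = {name.lower() for name in event_types}
    return [event for event in events if event["type"].lower() in wanted]


# Hypothetical raw events, for illustration only.
raw_events = [{"type": "pass"}, {"type": "shot"}, {"type": "pass"}]

all_events = filter_by_event_type(raw_events)
shots_only = filter_by_event_type(raw_events, ["Shot"])
```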

---

### Metrica

#### load_event
`kloppy.kloppy._providers.metrica.load_event(event_data, meta_data, event_types=None, coordinates=None, event_factory=None)`

This function loads event data into an `EventDataset`.

##### Parameters
- `event_data: FileLike`: The filename (or another file-like object) of the file that contains the event data.
- `meta_data: FileLike`: The filename (or another file-like object) of the file that contains the meta data.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: An optional parameter for specifying the coordinate system to be used. The default is `None`, which means the default coordinate system of the data will be used. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: An optional `EventFactory` object that will be used to create the events. If this is `None`, then the default `EventFactory` specified in the configuration (via `get_config("event_factory")`) will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.

---

### Opta

#### load
`kloppy.kloppy._providers.opta.load(f7_data, f24_data, event_types=None, coordinates=None, event_factory=None)`

This function loads Opta event data into an `EventDataset`.

##### Parameters
- `f7_data: FileLike`: The filename (or another file-like object) of the file that contains the Opta F7 feed (match and lineup information).
- `f24_data: FileLike`: The filename (or another file-like object) of the file that contains the Opta F24 feed (events data).

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: An optional parameter for specifying the coordinate system to be used. The default is `None`, which means the default coordinate system of the data will be used. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: An optional `EventFactory` object that will be used to create the events. If this is `None`, then the default `EventFactory` specified in the configuration (via `get_config("event_factory")`) will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.
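As a usage sketch for a loader that takes two input files, the wrapper below passes both Opta feeds and keeps Opta's own coordinate system. It assumes the provider module is importable as `from kloppy import opta`; the wrapper name `load_opta_events` and the file names in the comment are hypothetical.

```python
from typing import List, Optional


def load_opta_events(
    f7_path: str, f24_path: str, event_types: Optional[List[str]] = None
):
    """Load Opta events from an F7/F24 file pair, keeping Opta coordinates."""
    # Imported lazily so this sketch can be defined without kloppy installed
    # (assumption: kloppy exposes the provider module as `kloppy.opta`).
    from kloppy import opta

    return opta.load(
        f7_data=f7_path,
        f24_data=f24_path,
        event_types=event_types,  # None loads every event type
        coordinates="opta",       # see the coordinate system options
    )


# Hypothetical file names, for illustration only:
# dataset = load_opta_events("f7.xml", "f24.xml", event_types=["pass", "shot"])
```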

---

### Sportec

#### load
`kloppy.kloppy._providers.sportec.load(event_data, meta_data, event_types=None, coordinates=None, event_factory=None)`

This function loads Sportec event data into an `EventDataset`.

##### Parameters
- `event_data: FileLike`: The filename (or another file-like object) of the file that contains the events.
- `meta_data: FileLike`: The filename (or another file-like object) of the file that contains the match information.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: An optional parameter for specifying the coordinate system to be used. The default is `None`, which means the default coordinate system of the data will be used. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: An optional `EventFactory` object that will be used to create the events. If this is `None`, then the default `EventFactory` specified in the configuration (via `get_config("event_factory")`) will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.

---

### SportsCode

#### load
`kloppy.kloppy._providers.sportscode.load(data)`

This function loads SportsCode data into a `CodeDataset`.

##### Parameters
- `data: str`: The filename (or a file-like object) of the SportsCode data file to load.

##### Returns
- `CodeDataset`: An instance of the `CodeDataset` class, filled with the loaded SportsCode data.

Please consult the `CodeDataset` documentation for more details on this class.

---

#### save
`kloppy.kloppy._providers.sportscode.save(dataset, output_filename)`

This function saves a `CodeDataset` to a SportsCode data file.

##### Parameters
- `dataset: CodeDataset`: The `CodeDataset` instance to save.
- `output_filename: str`: The name of the file to save the dataset to.

##### Returns
This function does not return any value.

Note: The data is written in binary mode to the specified file.

Please consult the `CodeDataset` documentation for more details on this class.

---

### StatsBomb

#### load
`kloppy.kloppy._providers.statsbomb.load(event_data, lineup_data, three_sixty_data=None, event_types=None, coordinates=None, event_factory=None)`

This function loads StatsBomb event data into an `EventDataset`.

##### Parameters
- `event_data: FileLike`: The filename (or another file-like object) of the file containing the events.
- `lineup_data: FileLike`: The filename (or another file-like object) of the file containing the lineup information.

##### Optional Parameters
- `three_sixty_data: Optional[FileLike] = None`: The filename (or another file-like object) of the file containing the 360 data. If this is not provided, the function will still run, but without the 360 data.
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: The coordinate system to be used. The default is `None`. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: The `EventFactory` that will be used to create the events. If `None`, the default `EventFactory` will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

---

#### load_open_data
`kloppy.kloppy._providers.statsbomb.load_open_data(match_id='15946', event_types=None, coordinates=None, event_factory=None)`

This function loads StatsBomb public data into an `EventDataset`.

##### Parameters
- `match_id: Union[str, int] = '15946'`: The ID of the match to be loaded.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: The coordinate system to be used. The default is `None`. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: The `EventFactory` that will be used to create the events. If `None`, the default `EventFactory` will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.

##### Note
By using this function, you agree to the StatsBomb public data user agreement, which can be found [here](https://github.com/statsbomb/open-data/blob/master/LICENSE.pdf).

---

### WyScout

#### load
`kloppy.kloppy._providers.wyscout.load(event_data, event_types=None, coordinates=None, event_factory=None, data_version=None)`

This function loads Wyscout event data into an `EventDataset`.

##### Parameters
- `event_data: FileLike`: The filename (or another file-like object) of the XML file containing the events and metadata.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: The coordinate system to be used. The default is `None`. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: The `EventFactory` that will be used to create the events. If `None`, the default `EventFactory` will be used.
- `data_version: Optional[str] = None`: The version of the data to load. If `None`, the deserializer will be automatically identified.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

---

#### load_open_data
`kloppy.kloppy._providers.wyscout.load_open_data(match_id='2499841', event_types=None, coordinates=None, event_factory=None)`

This function loads Wyscout open data into an `EventDataset`.

##### Parameters
- `match_id: Union[str, int] = '2499841'`: The ID of the match to be loaded.

##### Optional Parameters
- `event_types: Optional[List[str]] = None`: A list of the types of events to load. If `None`, all events will be loaded.
- `coordinates: Optional[str] = None`: The coordinate system to be used. The default is `None`. See [Coordinate Systems](../coordinate-systems/) for more information.
- `event_factory: Optional[EventFactory] = None`: The `EventFactory` that will be used to create the events. If `None`, the default `EventFactory` will be used.

##### Returns
- `EventDataset`: An instance of the `EventDataset` class, filled with the loaded event data.

Please consult the `EventFactory` and `EventDataset` documentation for more details on these classes.

43 changes: 43 additions & 0 deletions docs/functionality/providers.md
@@ -0,0 +1,43 @@
# Supported providers

Kloppy supports two types of providers:

1. [Event data providers](#event-data-providers)
2. [Tracking data providers](#tracking-data-providers)

### Event data providers

kloppy aims to support as many features as possible for each provider. The table below shows what kloppy supports per provider. A provider's files may contain more information than kloppy currently parses.

Please [open a ticket](https://github.com/PySport/kloppy/issues) if you would like to implement additional features.

||| Datafactory | Metrica | Opta | Sportec | Statsbomb | Wyscout |
|-|-|:-:|:-:|:-:|:-:|:-:|:-:|
|**File format**||JSON|JSON|XML|XML|JSON|JSON|
|**Event types**|
||[Pass][kloppy.domain.models.event.PassEvent]|✓|✓|✓|✓|✓|✓|
||[Shot][kloppy.domain.models.event.ShotEvent]|✓|✓|✓|✓|✓|✓|
||[TakeOn][kloppy.domain.models.event.TakeOnEvent]||✓|✓||✓|✓|✓|
||[Carry][kloppy.domain.models.event.CarryEvent]||✓|||✓||
||[Substitution][kloppy.domain.models.event.SubstitutionEvent]|✓|||✓|✓||
||[PlayerOn][kloppy.domain.models.event.PlayerOnEvent]/[Off][kloppy.domain.models.event.PlayerOffEvent]|||||✓||
||[Card][kloppy.domain.models.event.CardEvent]|✓|||✓|✓|✓|
||[Recovery][kloppy.domain.models.event.RecoveryEvent]|✓|✓|✓|✓|✓|✓|
||[BallOut][kloppy.domain.models.event.BallOutEvent]|✓[^2]|✓|✓|✓[^2]|✓|✓|
||[FoulCommitted][kloppy.domain.models.event.FoulCommittedEvent]|✓|✓|✓|✓|✓|✓|
||[Generic][kloppy.domain.models.event.GenericEvent][^1]|✓|✓|✓|✓|✓|✓|
|**Qualifiers**|
||SetPiece|✓[^3]|✓[^3]|✓[^3]|✓[^3]|✓[^3]|✓[^3]
||BodyPart||`Head`|`Head` `RightFoot` `LeftFoot` `Other`|`Head` `RightFoot` `LeftFoot`|`Chest` `Head` `RightFoot` `LeftFoot` `Other` [^4]|`RightFoot` `LeftFoot`
||PassType|||`Cross` `LongBall` `ThroughBall` `Launch` `ChippedBall` `Assist` `2nd Assist` `SwitchOfPlay` |||`Cross` `Hand` `Head` `High` `Launch` `Simple` `Smart`

[^1]: All other event types
[^2]: Synthetic event generated by kloppy
[^3]: Full support means support for these types: `Corner` `FreeKick` `Penalty` `ThrowIn` `KickOff` `GoalKick`
[^4]: All body parts can be found here: https://github.com/statsbomb/open-data/blob/master/doc/StatsBomb%20Open%20Data%20Specification%20v1.1.pdf

### Tracking data providers

||| Metrica | SecondSpectrum | SkillCorner | StatsPerform | Tracab
|-|-|:-:|:-:|:-:|:-:|:-:|
|**File format**||CSV, EPTS|JSONL|JSON|TXT|DAT|