[TRANSFORMERS] to_df(orient="rows") #475
UnravelSports wants to merge 20 commits into PySport:master
…change row to rows and column to columns for orient
As discussed on the Kloppy Dev call, I've changed
Close #68
I've significantly refactored this one. Main changes:
Does this look good, @UnravelSports?
Thanks @probberechts!
from kloppy import skillcorner
dataset = skillcorner.load_open_data()
dataset.to_df(engine="polars", layout="long")
Throws an error because team_id is a mix of integers and strings ('ball'), while ball_owning_team_id is int only. There are two ways to handle this: convert all team_id values to strings, or setting
We can do
@koenvo, any thoughts on this?
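To illustrate the first option from the comment above (converting every ID to a string), here is a minimal sketch in plain Python. It does not use kloppy or polars; `normalise_ids` and the sample rows are hypothetical names invented for this example, and the real fix in the PR may look different.

```python
from typing import Any, Dict, Iterable, List

# Assumption: these are the ID-like columns that can mix int and str values.
ID_COLUMNS = ("team_id", "player_id", "ball_owning_team_id")


def normalise_ids(
    rows: Iterable[Dict[str, Any]],
    id_columns: Iterable[str] = ID_COLUMNS,
) -> List[Dict[str, Any]]:
    """Return copies of the rows with every non-null ID cast to str."""
    id_columns = tuple(id_columns)
    normalised = []
    for row in rows:
        fixed = dict(row)  # copy so the caller's rows are left untouched
        for column in id_columns:
            value = fixed.get(column)
            if value is not None:
                fixed[column] = str(value)
        normalised.append(fixed)
    return normalised


# Hypothetical long-format rows: integer team IDs next to the "ball" marker.
rows = [
    {"frame_id": 1, "ball_owning_team_id": 100, "team_id": 100, "player_id": 7},
    {"frame_id": 1, "ball_owning_team_id": 100, "team_id": "ball", "player_id": "ball"},
]

clean_rows = normalise_ids(rows)
```

After this step every ID column holds a single type, so a strictly typed DataFrame engine no longer sees a column that mixes integers and strings. The trade-off is that integer IDs become strings for every provider, not only when the ball rows are present.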
I think these are the options available to us to resolve the above:
This has been on my list for a long time...
This addition allows users to do tracking_dataset.to_df(orient="row") (default remains orient="column").
This will only work for TrackingDataset and it returns a DataFrame with the following columns:
["period_id", "timestamp", "frame_id", "ball_state", "ball_owning_team_id", "team_id", "player_id", "x", "y", "z", "d", "s"]
team_id and player_id are "ball" for the ball object.
Each key in frame.other_data gets its own column as well. Currently that would only be "visible_area" if we convert StatsBomb Freeze Frames to a TrackingDataset as discussed in Issue #474.
This PR adds:
- RowWiseFrameTransformer to kloppy/domain/services/transformers/attribute.py
- to_dict_rowwise to the Dataset, with an error if we try to run orient="row" on anything that is not a TrackingDataset
- to_df update to allow for orient="row" and orient="column"
- test_sportec
Edit: I'm contemplating if we should change "row" to "rows" and "column" to "columns"
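The row-wise layout described above can be sketched as follows. This is a plain-Python illustration, not the RowWiseFrameTransformer from the PR: the `frames` dictionaries and the `to_rows` helper are hypothetical stand-ins for kloppy's Frame objects, chosen only to show one row per object per frame with the ball marked as "ball".

```python
from typing import Any, Dict, List

# Columns shared by every row that comes from the same frame.
FRAME_COLUMNS = (
    "period_id", "timestamp", "frame_id", "ball_state", "ball_owning_team_id",
)

# Hypothetical, simplified frames; kloppy's real Frame objects differ.
frames = [
    {
        "period_id": 1,
        "timestamp": 0.0,
        "frame_id": 1,
        "ball_state": "alive",
        "ball_owning_team_id": "home",
        "ball": {"x": 0.5, "y": 0.5, "z": 0.1},
        "players": [
            {"team_id": "home", "player_id": "h1", "x": 0.2, "y": 0.4, "d": 1.2, "s": 3.4},
            {"team_id": "away", "player_id": "a1", "x": 0.7, "y": 0.6, "d": 0.8, "s": 2.1},
        ],
        "other_data": {"visible_area": None},
    }
]


def to_rows(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Emit one row for the ball and one row per player for each frame."""
    rows = []
    for frame in frames:
        shared = {column: frame[column] for column in FRAME_COLUMNS}
        ball = frame["ball"]
        # The ball uses "ball" for both team_id and player_id.
        rows.append({
            **shared,
            "team_id": "ball", "player_id": "ball",
            "x": ball["x"], "y": ball["y"], "z": ball.get("z"),
            "d": None, "s": None,
            **frame["other_data"],  # each other_data key becomes a column
        })
        for player in frame["players"]:
            rows.append({
                **shared,
                "team_id": player["team_id"], "player_id": player["player_id"],
                "x": player["x"], "y": player["y"], "z": None,
                "d": player.get("d"), "s": player.get("s"),
                **frame["other_data"],
            })
    return rows


rows = to_rows(frames)
```

Each frame with N players yields N + 1 rows, so the number of columns stays fixed regardless of how many players or which player IDs appear in the data.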
For future reference: Because of the janky player IDs in the StatsBomb freeze frames, we can't convert StatsBomb TrackingData (created from freeze frames) into a to_df(orient="columns"). That's why we need orient="rows".
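One plausible reading of the problem, sketched in plain Python: a column-wise layout needs one set of columns per player ID, so IDs that are not stable across frames make the table grow wider with every frame. The `to_columns` helper, the `<player_id>_x` naming, and the `anon_*` IDs are assumptions made for this illustration and are not taken from the StatsBomb data or the kloppy code.

```python
from typing import Any, Dict, List

# Hypothetical freeze frames where each frame carries freshly generated IDs.
frames = [
    {"frame_id": 1, "players": [
        {"player_id": "anon_0", "x": 0.1, "y": 0.2},
        {"player_id": "anon_1", "x": 0.3, "y": 0.4},
    ]},
    {"frame_id": 2, "players": [
        {"player_id": "anon_2", "x": 0.5, "y": 0.6},
        {"player_id": "anon_3", "x": 0.7, "y": 0.8},
    ]},
]


def to_columns(frames: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Build one wide record per frame with <player_id>_x / _y columns."""
    records = []
    for frame in frames:
        record = {"frame_id": frame["frame_id"]}
        for player in frame["players"]:
            record[f"{player['player_id']}_x"] = player["x"]
            record[f"{player['player_id']}_y"] = player["y"]
        records.append(record)

    # Collect every column seen in any frame, in first-seen order.
    columns: List[str] = []
    for record in records:
        for name in record:
            if name not in columns:
                columns.append(name)

    # Pad each record so all of them share the same columns.
    return [{name: record.get(name) for name in columns} for record in records]


wide = to_columns(frames)
```

With only two frames and two players each, the wide table already has nine columns and half of the coordinate cells are empty. The row-wise layout avoids this because the player ID is a value in a column rather than part of a column name.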