something odd with WPA #36

Open
ak47twq opened this issue Oct 26, 2020 · 4 comments

ak47twq commented Oct 26, 2020

I compared the difference between consecutive plays' home_wp_post values with the WPA stored in the database.
Is WPA supposed to be the difference between two consecutive plays' home_wp_post?
Most numbers check out, but some don't make sense.

Why does a timeout have a different home_wp_post?

Here is what I did:

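library(dplyr)
library(tictoc)

# Pull one game from the play-by-play database table `pbp`
tic()
test <- pbp %>%
  filter(game_id == "2009_18_GB_ARI", !is.na(home_wp_post)) %>%
  select(game_id, play_id, qtr, desc, total, spread_line, home_wp_post, wpa) %>%
  collect()
toc()

# Absolute WPA as stored in the database
tic()
test <- test %>%
  mutate(wp_diff1 = abs(wpa))
toc()

# Absolute change in home_wp_post between consecutive plays
tic()
test$wp_diff2 <- 0
for (i in seq_len(nrow(test))[-1]) {
  test$wp_diff2[i] <- abs(test$home_wp_post[i] - test$home_wp_post[i - 1])
}
toc()

# Plays where the two measures disagree
temp <- test %>% filter(wp_diff2 != wp_diff1)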


@ak47twq ak47twq changed the title something odd something odd with WPA Oct 26, 2020
@mrcaseb
Member

mrcaseb commented Oct 28, 2020

Here is some more efficient code to reproduce this:

pbp %>%
  filter(game_id == "2009_18_GB_ARI", !is.na(home_wp_post)) %>%
  select(game_id, play_id, play_type, desc, home_team, posteam, wp, home_wp, wpa, home_wp_post) %>%
  mutate(
    wp_diff1 = abs(wpa),
    wp_diff2 = abs(home_wp_post - lag(home_wp_post))
  ) %>%
  filter(wp_diff2 != wp_diff1)

Output:

# A tibble: 4 x 12
  game_id   play_id play_type desc                                home_team posteam    wp home_wp      wpa home_wp_post wp_diff1 wp_diff2
  <chr>       <dbl> <chr>     <chr>                               <chr>     <chr>   <dbl>   <dbl>    <dbl>        <dbl>    <dbl>    <dbl>
1 2009_18_~    1416 no_play   (7:42) J.Kuhn right tackle to ARI ~ ARI       GB      0.153   0.847  0.00151        0.847  0.00151  0      
2 2009_18_~    1437 run       (7:02) A.Rodgers up the middle for~ ARI       GB      0.155   0.845 -0.00730        0.852  0.00730  0.00580
3 2009_18_~    4108 no_play   Timeout #1 by ARI at 01:46.         ARI       GB      0.639   0.361  0              0.361  0        0.278  
4 2009_18_~    4125 pass      (1:46) (Shotgun) K.Warner pass sho~ ARI       ARI     0.639   0.639  0.0216         0.661  0.0216   0.300  
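
As a complementary check, here is a rough sketch that tests each play on its own instead of against the previous row. It assumes (this is not confirmed anywhere in this thread) that wpa is reported from the possession team's perspective, so it flips the sign when posteam is not the home team and then compares home_wp plus that value with home_wp_post. The toy data are the four rows from the output above, rounded as printed, which is why a loose tolerance is used. The names `rows`, `home_wpa`, `expected_post`, and `consistent` are made up for illustration.

```r
library(dplyr)

# The four rows printed above, with values rounded as shown in the tibble output
rows <- tibble(
  play_id      = c(1416, 1437, 4108, 4125),
  home_team    = "ARI",
  posteam      = c("GB", "GB", "GB", "ARI"),
  home_wp      = c(0.847, 0.845, 0.361, 0.639),
  wpa          = c(0.00151, -0.00730, 0, 0.0216),
  home_wp_post = c(0.847, 0.852, 0.361, 0.661)
)

check <- rows %>%
  mutate(
    # ASSUMPTION: wpa is from the posteam's perspective, so flip it for the away team
    home_wpa      = if_else(posteam == home_team, wpa, -wpa),
    expected_post = home_wp + home_wpa,
    # Loose tolerance because the inputs are rounded to three significant digits
    consistent    = near(expected_post, home_wp_post, tol = 1e-3)
  )

check %>% select(play_id, posteam, home_wpa, expected_post, home_wp_post, consistent)
```

Under that assumption, a row flagged as inconsistent is one where home_wp_post was changed after WPA was computed, while rows that are consistent here but differ from the lagged value point at the previous row instead.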

The home_wp_post of play 1416 is modified in this line:
https://github.com/mrcaseb/nflfastR/blob/9ae4bb1951a5b4302bc0e3e83261f5bb4406af32/R/helper_add_ep_wp.R#L1011
There, home_wp_post is set to the previous play's value if both the current play and the previous play are "no_play"s.
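
To make the rule described above concrete, here is a minimal sketch on made-up data. It is not the code from helper_add_ep_wp.R, only an illustration of "carry the previous value forward when two no_plays follow each other"; the toy values and the names `plays` and `adjusted` are hypothetical.

```r
library(dplyr)

# Hypothetical toy data: two consecutive "no_play" rows in the middle
plays <- tibble(
  play_type    = c("run", "no_play", "no_play", "pass"),
  home_wp_post = c(0.60, 0.62, 0.65, 0.70)
)

adjusted <- plays %>%
  mutate(
    home_wp_post = if_else(
      # default = "" keeps the first row from producing an NA condition
      play_type == "no_play" & lag(play_type, default = "") == "no_play",
      lag(home_wp_post),  # reuse the previous play's value
      home_wp_post        # otherwise leave the value untouched
    )
  )

adjusted
```

In this sketch only the third row changes (0.65 becomes 0.62), which is the kind of override that makes a play's home_wp_post disagree with its own WPA.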

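Play 4108 appears to have home_wp and away_wp switched (posteam is GB with wp 0.639 and home_wp 0.361, while the next play has ARI as posteam with the same wp of 0.639).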

Any insights @guga31bb ?

@guga31bb
Member

This is the equivalent part in nflscrapR, and I guess we must have modified it at some point, though I can't remember why. I personally have never used home_wp_post or WPA, so I'm surprised we bothered to modify nflscrapR's version here. There must have been some bug we addressed at some point?

@mrcaseb
Member

mrcaseb commented Oct 28, 2020

Finally found the commit, but it's not really informative lol
https://github.com/mrcaseb/fastscraper/commit/12a03f956b313bcf6b247159474100aa93ae7403#diff-0a766e08dadf2046e3cf5c64e0d680e7315073ec7f764627a0d55f68e13136c0

It's lines 766-769 in that commit.

@guga31bb
Member

That commit was mostly me just copying and pasting nflscrapR's part. But it's weird, because it doesn't look identical to nflscrapR in that section.
