Monday, October 11, 2010

The Home Run Derby effect?

Do players who participate in the All-Star Game Home Run Derby screw up their swing, go into a slump, and have a poor second half? I've heard this one from the talking heads before, and it sounds completely false to me. Mark McGwire used to put on a show in batting practice in 1998, and it didn't stop him from hitting 70 home runs.

The Home Run Derby data came from the MLB website, but they weren't in CSV or tab-separated format, so I had to do some manipulation. I would have liked to get data on just the few games following the HR Derby to check for slumps, but I settled for the first- and second-half splits (as determined by the All-Star break) from Baseball-Reference. I used the years 2003-2009.

OPS is a good measure of how effective a hitter is, so I thought it would be best to compare pre- and post-break numbers in terms of OPS (I tried some other things and they led to similar conclusions). Baseball-Reference has a statistic called sOPS+ which measures a player's OPS relative to the league. This controls for season, but since I was comparing the first and second halves of the same season, this wasn't too important. I tried it anyway, and it gave almost exactly the same results as OPS, so I stuck with OPS.

Although some players appeared in more than one HR Derby between 2003 and 2009, I assumed that the 56 differences between pre-break OPS and post-break OPS were independent. The differences looked Gaussian. The average pre-break OPS was .958 and the average post-break OPS was .924 (a drop of .034), giving a one-sided p-value of 0.02 in the paired t-test - so it's true, the participants do worse after the break! This idea is supported by the fact that the mean career pre- and post-break OPS for the players are not significantly different - the decrease seems to happen specifically in the year the players compete in the HR Derby.
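For anyone who wants to see the mechanics of the paired test, here is a minimal sketch in Python. The function name and the five pairs of OPS values are made up purely for illustration (they are not the 56 differences from this post), and the sketch stops at the t statistic; the one-sided p-value would come from a t distribution with n - 1 degrees of freedom in whatever stats package you prefer.

```python
import math
import statistics


def paired_t_statistic(pre, post):
    """Return (mean difference, t statistic, degrees of freedom) for paired data.

    The differences are computed as pre - post, so a positive t statistic
    means the pre-break numbers were higher on average.
    """
    if len(pre) != len(post):
        raise ValueError("pre and post must have the same length")
    diffs = [a - b for a, b in zip(pre, post)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference (sample standard deviation / sqrt(n)).
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return mean_diff, mean_diff / std_err, n - 1


# Hypothetical OPS values, NOT the data used in this post.
pre_ops = [0.950, 1.010, 0.900, 0.980, 0.930]
post_ops = [0.910, 0.990, 0.920, 0.930, 0.900]

mean_diff, t_stat, df = paired_t_statistic(pre_ops, post_ops)
print(f"mean drop = {mean_diff:.3f}, t = {t_stat:.2f}, df = {df}")
```

The test treats each player-season as one observation of (pre - post), which is why the independence assumption above matters.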

But... how are hitters selected for the HR Derby? By having a very good first half. The participants are often hitters who have done unusually well in the first half, and were heading for a drop-off in the second half whether they took part in the derby or not (regression to the mean). I can think of two ways to get around this and answer the question of whether the HR Derby causes the poorer second half. One is to compare the second-half OPS in HR Derby years to the second-half OPS in non-HR Derby years, and the other is to see if players who take part in more rounds of the derby have a bigger second-half drop-off than players who are eliminated early. (I could also look at the second-half drop-off for players in the All-Star Game who weren't in the HR Derby, but it was enough of a pain getting the data for just these players, so I'll try to avoid this approach.)
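The selection argument is easy to check with a toy simulation. The sketch below is only an illustration under assumptions I made up: each simulated hitter has a fixed "true" OPS, each half-season adds independent noise, there is no derby effect at all, and the 56 best first halves get "invited". None of the numbers (league mean, spreads, league size) are estimated from real data.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

N_PLAYERS = 2000      # hypothetical league size
N_SELECTED = 56       # same count as the derby sample, chosen for illustration
TALENT_MEAN = 0.750   # assumed average true OPS
TALENT_SD = 0.060     # assumed spread of true talent
NOISE_SD = 0.060      # assumed half-season luck

players = []
for _ in range(N_PLAYERS):
    talent = rng.gauss(TALENT_MEAN, TALENT_SD)
    first_half = talent + rng.gauss(0, NOISE_SD)
    second_half = talent + rng.gauss(0, NOISE_SD)  # no derby effect anywhere
    players.append((first_half, second_half))

# "Invite" the hitters with the best first halves.
selected = sorted(players, key=lambda p: p[0], reverse=True)[:N_SELECTED]

selected_first = statistics.mean(p[0] for p in selected)
selected_second = statistics.mean(p[1] for p in selected)
selected_drop = selected_first - selected_second

print(f"selected first half:  {selected_first:.3f}")
print(f"selected second half: {selected_second:.3f}")
print(f"drop with no derby effect: {selected_drop:.3f}")
```

The invited group shows a second-half drop even though nothing in the simulation makes anyone worse after the break, which is the pattern selection alone would produce.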

The mean career second-half OPS of the 56 HR Derby hitters is .894, and in HR Derby years it is .924, so their second halves in derby years are actually better than their career norm. This is still a bit unsatisfactory because the HR Derby year is presumably in the prime of their career, so let's try the second way. Consider the number of swings taken in the competition by each player; this is equal to ten times the number of rounds they were in, plus their HR total. Fitting a linear regression of the decrease in OPS on the number of HR Derby swings, it is apparent that the more swings a player takes, the smaller the drop from pre-break to post-break OPS, i.e. the opposite of the proposed effect. So I'm pretty comfortable writing the poorer second halves off to selection bias.
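Here is a minimal sketch of the swing count and the regression fit, using plain least squares. The helper names are my own, and the five (rounds, home runs, OPS drop) rows are placeholders invented to show the calculation, not the actual derby results.

```python
def derby_swings(rounds, home_runs):
    """Swings taken in the derby: ten per round, plus the player's HR total."""
    return 10 * rounds + home_runs


def least_squares(x, y):
    """Return (slope, intercept) of the ordinary least-squares line y = a*x + b."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical rows: (rounds, home runs, pre-break OPS minus post-break OPS).
toy_rows = [
    (1, 3, 0.05),
    (1, 7, 0.04),
    (2, 12, 0.03),
    (3, 20, 0.01),
    (3, 30, 0.00),
]

swings = [derby_swings(r, hr) for r, hr, _ in toy_rows]
ops_drop = [drop for _, _, drop in toy_rows]

slope, intercept = least_squares(swings, ops_drop)
print(f"slope = {slope:.5f} OPS per swing, intercept = {intercept:.3f}")
```

A negative slope means that more swings go with a smaller OPS drop; if the derby really wrecked swings, the slope should be positive.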

Thanks to Bret Hanlon for the idea for this post.
