Saturday, April 2, 2011

Run / Pass Play Selection: First Down (Part 2)

As I talked about in my previous post, the run / pass play ratio on first down depends on field position, time remaining, and point spread.  The field position effect seems to be pretty independent of the other two; no matter what the point spread or time remaining was, the run / pass play ratio was pretty much that curve you can see below.  Time remaining was only a factor in the second half and in the last two minutes of the first half.  Also, the relationship between the run / pass play ratio and the point spread depended on time remaining (but again, only in the second half and the final two minutes of the first).  If you look at all the curves in my previous post, 0.50 is the midpoint of everything (midfield, 0 point spread, lazy time in the first half), so my predicted run / pass play ratio for a given scenario starts there.  Through trial and error, here's the model I came up with:

First Half
Predicted Run / Pass Play Ratio = 0.5 + f(Field Position Factor) + f(Point Spread)

Second Half (and Last Two Minutes of the First)
Predicted Run / Pass Play Ratio = 0.5 + f(Field Position Factor) + f(Time Remaining, Point Spread)

f(Field Position Factor) and f(Time Remaining, Point Spread) were both second-order functions.  If anyone's really interested I can show the numbers.
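To make the structure of the model concrete, here is a minimal Python sketch.  The coefficients are placeholders set to zero, not my fitted numbers, and the function names, the linear form of the first-half spread term, and the clamp to the 0-1 range are all assumptions made for illustration.

```python
def quadratic(x, a, b, c):
    """Second-order polynomial a*x^2 + b*x + c."""
    return a * x**2 + b * x + c

# Placeholder coefficients, NOT the fitted values from the post.
FIELD_COEFFS = (0.0, 0.0, 0.0)

def field_position_term(field_position):
    # f(Field Position Factor): second-order in field position.
    return quadratic(field_position, *FIELD_COEFFS)

def first_half_spread_term(point_spread, slope=0.0):
    # f(Point Spread): assumed linear here; the slope is a placeholder.
    return slope * point_spread

def late_game_term(minutes_remaining, point_spread, coeffs=(0.0,) * 6):
    # f(Time Remaining, Point Spread): a generic second-order surface.
    t, s = minutes_remaining, point_spread
    c0, c1, c2, c3, c4, c5 = coeffs
    return c0 + c1 * t + c2 * s + c3 * t * t + c4 * t * s + c5 * s * s

def predicted_run_ratio(field_position, point_spread, minutes_remaining, late_game):
    """late_game=True means second half or last two minutes of the first."""
    ratio = 0.5 + field_position_term(field_position)
    if late_game:
        ratio += late_game_term(minutes_remaining, point_spread)
    else:
        ratio += first_half_spread_term(point_spread)
    # Clamp so the prediction stays a valid fraction (an added assumption).
    return min(1.0, max(0.0, ratio))
```

With every coefficient at zero the sketch just returns the 0.5 baseline; the real model only differs in the numbers plugged in.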

So the question I posed last time was, do run plays work better in pass scenarios, and vice versa?  I have a graph below, showing the average and median run yardage, plotted up against the predicted run / pass play ratio (from all my formulas above).

Run plays work a bit better in passing scenarios (and a bit worse in running scenarios), but it's not a strong bias overall.  And you can see why I included both average and median run yardage per play: averages are universally higher than median values.  It's pretty easy to understand; long-yardage runs pull the average up.  So while the average run play goes around 4-5 yards, the odds that a run play goes longer than 3 yards are about 50/50.  The sheer skew demonstrated above leads me to believe that average yards-per-play is kind of a meaningless statistic.  I mean, think about it this way: say it's third down with 2 yards to go.  The average run yardage is, let's say, 4 yards.  Hey, a pretty good chance at a first down.  However, that average run yardage is based on a bunch of plays that go 1 yard, and a handful of plays that go much longer.  So, on average, your run play will go 4 yards, but that's not the same thing as saying that, yes, on average, you'll convert the third down.
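Here's a tiny Python illustration of that third-and-2 point.  The list of run gains is made up (it is not my play-by-play data); it's just built so the average comes out to 4 yards while most runs go 1 yard.

```python
from statistics import mean, median

# Made-up run gains in yards, for illustration only:
# a bunch of 1-yard runs and a handful of longer ones.
runs = [1, 1, 1, 1, 1, 1, 2, 3, 7, 22]

avg = mean(runs)      # pulled up by the 22-yard run
med = median(runs)    # middle of the sorted gains

yards_needed = 2
conversion_rate = sum(1 for y in runs if y >= yards_needed) / len(runs)

print(avg, med, conversion_rate)  # 4 1.0 0.4
```

So the "average" run clears the marker with room to spare, yet only 4 of these 10 runs would actually convert.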

For pass plays, I have two graphs: one is the incompletion rate (per predicted run / pass play ratio), and the other is average and median pass yardage, assuming it's a completed pass.

The behavior is a little different for pass plays; the success of a pass play doesn't get much worse in passing scenarios, just more chaotic.  In running scenarios, I assume that teams aren't going for long passes.  Below I combine the two graphs into average and median pass yardages (assuming, of course, that an incomplete pass is 0 yards).
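Combining the two is just a matter of counting every incompletion as a 0-yard play before taking the average and median.  A minimal sketch, with a hypothetical function name and made-up numbers:

```python
from statistics import mean, median

def combined_pass_yardage(completed_yards, incompletions):
    """Average and median over all attempts, incompletions counted as 0 yards."""
    all_attempts = list(completed_yards) + [0] * incompletions
    return mean(all_attempts), median(all_attempts)

# Made-up example: three completions and two incompletions.
avg, med = combined_pass_yardage([5, 7, 12], incompletions=2)
print(avg, med)  # 4.8 5
```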

So if you kind of squint at the data, median yardages for both pass and run plays fall around 3-4 yards.  And this makes sense: generally the run / pass play ratio is 50%, and I assume that if one type of play had a real yardage advantage, it would be called more often than the other.  And finally, this is all for first down; my next posts will cover second and third down.

Thursday, March 24, 2011

Run / Pass Play Selection: First Down

With all this individual play data at my disposal, I've been looking at the trends of play selection, particularly run v. pass (i.e., I'm not looking into field goal / punt frequencies).  My end goal is to come up with a way to roughly estimate under which scenarios a team would pass / run, and the certainty of that estimate.  Once I have a model which works fairly well, it'll allow me to do a few things: (1) I can determine, mathematically, what is a run scenario vs. a pass scenario.  Then, I can look at how well teams perform in each of these scenarios.  For instance, my off-hand example is during a definite passing scenario (down by two scores with 2 minutes left, for example), if a team runs the ball they can probably get about, what, 5 yards or so?  Basically, I can build a basic game theory grid: avg. pass yards vs pass defense, avg. pass yards vs. run defense, etc.  And (2) if I know when, on average, teams pass, I can determine which teams / coaches like to buck the trend, and see if bucking the trend is successful or not.

I'm going to model run / pass ratios for each down separately.  Logically, I'm starting with 1st down.  Provided below are average run ratios, broken down by starting field position.
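For anyone curious how a breakdown like this gets computed, here's a rough Python sketch.  The (field position, play type) tuple layout and the 5-yard bin width are assumptions for illustration, not the actual format of my data.

```python
from collections import defaultdict

def run_ratio_by_field_position(plays, bin_width=5):
    """Fraction of run plays among run + pass plays, per field-position bin.

    plays: iterable of (field_position, play_type) tuples (hypothetical layout).
    """
    counts = defaultdict(lambda: [0, 0])  # bin start -> [runs, runs + passes]
    for field_position, play_type in plays:
        if play_type not in ("run", "pass"):
            continue  # punts, field goals, etc. are not part of the ratio
        start = (field_position // bin_width) * bin_width
        counts[start][1] += 1
        if play_type == "run":
            counts[start][0] += 1
    return {b: runs / total for b, (runs, total) in sorted(counts.items())}

# Made-up plays, for illustration only.
sample = [(80, "run"), (82, "pass"), (81, "run"), (20, "pass"), (22, "punt")]
ratios = run_ratio_by_field_position(sample)
```

In this toy sample the 80-84 bin comes out to 2 runs in 3 plays and the 20-24 bin to 0 runs in 1 play; the punt is ignored.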


So you can see a second-order relationship between field position and this ratio.  Also, it looks like the relationship changes around 65 yards, with the slope being more gradual from 65 to 0 yards.  For my purposes, I'll model these two portions of the field separately (0-65, and 65-100).

Next, the relationship between time and run ratio.

There's a pretty big break at the end of each half, right around 10 minutes left, and I can't tell if there's a big difference between the end of the half and the end of the game.  For now, I'm going to treat both halves the same, except for the last ten minutes (I'll break those out separately).
And finally, here's the relationship between point spread and run ratio.

Pretty linear relationship, breaking down when the spread hits thirty points.  Teams really give up when down by 40 or more.  In fact, I'm just going to go ahead and throw out everything beyond a thirty-point spread.

So, in summary, here's what I'm looking at: field position above and below 65 yards (second-order relationship in each portion); first 20 and last 10 minutes of the half (linear for the first 20, second-order for the last 10); and point spread up/down to 30 (linear).  In the near future, I'll develop some regressions through all of these scenarios, so stay tuned.  I doubt the regressions will say anything earth-shattering, but, you know, I can mathematically say how often teams run the ball in the last two minutes.
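As a rough sketch of that bookkeeping, each first-down play can be sorted into one of these pieces before any fitting happens.  The function name and the handling of the exact boundaries (65 yards, 10 minutes, 30 points) are assumptions here.

```python
def classify_first_down(field_position, minutes_left_in_half, point_spread):
    """Return (field segment, clock segment), or None if the play is thrown out."""
    if abs(point_spread) > 30:
        return None  # beyond a thirty-point spread: excluded from the model
    field = "0-65" if field_position <= 65 else "65-100"
    clock = "last 10 minutes" if minutes_left_in_half <= 10 else "first 20 minutes"
    return field, clock

print(classify_first_down(80, 25, 3))    # ('65-100', 'first 20 minutes')
print(classify_first_down(40, 2, -7))    # ('0-65', 'last 10 minutes')
print(classify_first_down(50, 12, 35))   # None
```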

Sunday, March 13, 2011

Going for it: The Numbers Against Punting

There's this high school football coach somewhere who will always go for it on fourth down.  I can see his reasoning pretty clearly: in high school, every punt is this adventure of teenage-level talent, of bad kicks and bad coverage and botched long snaps and botched blocking.  He would just rather risk giving the other team the ball at the thirty, and he likes his chances of converting a fourth-and-whatever.  And in certain situations I can side with him.  I mean, how likely is it that an offense will convert a fourth and inches?  At the very least, I kind of like the wacky play where a team will fake going for it just to draw the other team offsides.  If they fail, they take the delay of game penalty.  The scene is a quarterback yelling like mad while all the players pretend to be involved.  I don't know, it's something.  Anyway, in this post I'll go into all the statistics I can think of for when a team should punt or go for it.  If anyone's curious, the data I'm using are from all NFL games from 2002 on, throwing out meaningless plays (taking a knee, hail marys at the end of halves, etc).

Below I've plotted up the average net change in yards (the distance of the punt minus the return yardage) based on field position.  My frame of reference is yards to go for a touchdown; so if a team's on their own 20, then the yards-to-go would be 80.
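In code terms the frame of reference looks something like the sketch below.  The function names and the 45-yard punt with a 5-yard return are made up for illustration, and touchbacks are ignored.

```python
def yards_to_go(own_yard_line):
    """Yards to go for a touchdown from a spot on your own side of the field."""
    return 100 - own_yard_line

def net_punt_yards(punt_distance, return_yards):
    """Net change in field position: punt distance minus return yardage."""
    return punt_distance - return_yards

def receiving_team_yards_to_go(punting_yards_to_go, net_yards):
    """Where the receiving team starts, in its own yards-to-go frame."""
    return 100 - (punting_yards_to_go - net_yards)

start = yards_to_go(20)                          # own 20 -> 80 yards to go
net = net_punt_yards(45, 5)                      # 40 net yards
after = receiving_team_yards_to_go(start, net)   # receiving team has 60 to go
```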

On average a team will gain roughly 40 yards in field position if they punt.  Things start to break down around 55 yards as touchbacks begin to occur.

So, if a team doesn't go for it, and punts instead, what are the odds that the opposing team would make it back to that original line of scrimmage anyway?  I've plotted out the odds that the opposing team makes it back to the original punting location, again broken down by the kicking team's yards-to-go.  This data is all based on average drive lengths from a given field position (in this case, the field position dictated by a punt).  I define "making it back" to the original line of scrimmage as the opposing (or receiving) team having a first down past the punting line of scrimmage, or scoring a touchdown.
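That definition translates into a short check per drive.  This is a sketch only: the drive layout (a list of first-down spots in the receiving team's yards-to-go, plus a touchdown flag) is a hypothetical format, and reading "past" as strictly closer to the end zone is my assumption.

```python
def made_it_back(punting_yards_to_go, first_down_spots, scored_touchdown):
    """True if the receiving team got a first down past the punt spot, or scored.

    first_down_spots: receiving team's yards-to-go at each first down it earned.
    """
    # The original line of scrimmage, seen from the receiving team's side.
    target = 100 - punting_yards_to_go
    return scored_touchdown or any(spot < target for spot in first_down_spots)

def make_it_back_rate(drives):
    """drives: list of (punting_yards_to_go, first_down_spots, scored_touchdown)."""
    hits = sum(1 for d in drives if made_it_back(*d))
    return hits / len(drives)

# Made-up drives following punts from the kicking team's own 20 (80 to go).
drives = [
    (80, [55, 42, 18], False),  # first down inside the 20: made it back
    (80, [55], False),          # stalled near midfield: did not
    (80, [], True),             # scored a touchdown: counts
]
rate = make_it_back_rate(drives)
```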

For the most part, there's about a 35% chance that punting (instead of failing on a fourth-down conversion attempt) will make no difference in net yardage, and a 65% chance that punting is worth it.  Things start to tick up right around mid-field (pretty soon after we start seeing touchbacks appear).  So, generally, the benefit a team sees in punting (instead of just going for the 4th down attempt) starts decaying at mid-field.

Now all the data above is predicated on failure.  What are the odds that a team will actually convert on fourth down?  There aren't really enough fourth down attempts to get a statistically significant result, so below I've plotted out the third down conversion rates for a number of different yards to go for a first down.  I looked at fourth down conversion rates, and when I exclude the red-zone area, there's not much of a difference from third down.  I went ahead with the third down numbers, as it's a bigger sample set.

 
So for anything a yard or shorter to go, there's nearly a 70% chance that the down is converted, and for anything shorter than 3 yards, that percentage drops to 50%.  If you kind of squint at the data, if a team only has a yard to go they should always go for it.  After that, it's a gray area; but if a team's on the opponent's side of the field, they should more or less never punt if there's only 1-3 yards to go.
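The conversion rates themselves are simple tallies.  Here's a minimal sketch, assuming each third-down play is reduced to a hypothetical (yards to go, yards gained) pair; the sample plays are made up.

```python
from collections import defaultdict

def conversion_rate_by_distance(third_downs):
    """Conversion rate per yards-to-go from (yards_to_go, yards_gained) pairs."""
    tally = defaultdict(lambda: [0, 0])  # distance -> [conversions, attempts]
    for distance, gained in third_downs:
        tally[distance][1] += 1
        if gained >= distance:
            tally[distance][0] += 1
    return {d: conv / att for d, (conv, att) in sorted(tally.items())}

# Made-up plays, for illustration only.
sample = [(1, 2), (1, 0), (1, 1), (3, 5), (3, 1)]
rates = conversion_rate_by_distance(sample)
```

Here the third-and-1 plays convert 2 times out of 3 and the third-and-3 plays 1 time out of 2.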



Saturday, March 5, 2011

Recent Developments

Obviously it's been a while since I updated anything.  And with the coming labor disputes, lock-outs, all that, it's an open question when I'll be able to produce any weekly power rankings.  So, with all this time on my hands, I'm going back to the beginning.

I've stumbled across a source of play-by-play data, and worked out a way to bring in all that data and manipulate it.  Average drives, scoring percentages, totaling up runs longer than 10 yards, negative yardage plays, all of that.  In the coming weeks, I'll be able to churn out a lot of different statistics, analyses, data models, etc.  So stay tuned.

Tuesday, January 4, 2011

Final Power Rankings (and Playoff Picks)

Below are my final power rankings of the year.  I decided to take week 17 games into account; a number of teams rested their starters at points, but there were no big swings in wins / losses (the Jets losing to Buffalo or Atlanta losing to Carolina, for instance).

Rank Team Power Ranking
1 New England Patriots 1.000
2 Atlanta Falcons 0.874
3 Chicago Bears 0.839
4 New York Jets 0.826
5 Baltimore Ravens 0.821
6 Pittsburgh Steelers 0.817
7 Green Bay Packers 0.797
8 Philadelphia Eagles 0.756
9 New Orleans Saints 0.708
10 Indianapolis Colts 0.701
11 Tampa Bay Buccaneers 0.687
12 New York Giants 0.684
13 Kansas City Chiefs 0.654
14 San Diego Chargers 0.654
15 Miami Dolphins 0.652
16 Tennessee Titans 0.643
17 Cleveland Browns 0.640
18 Oakland Raiders 0.629
19 Jacksonville Jaguars 0.629
20 Minnesota Vikings 0.614
21 Detroit Lions 0.613
22 St. Louis Rams 0.597
23 Washington Redskins 0.594
24 Seattle Seahawks 0.586
25 Dallas Cowboys 0.586
26 Houston Texans 0.583
27 San Francisco 49ers 0.558
28 Arizona Cardinals 0.535
29 Buffalo Bills 0.535
30 Cincinnati Bengals 0.529
31 Denver Broncos 0.519
32 Carolina Panthers 0.442


As for playoff picks: I'm basing this solely on power rankings.  No previous years' experience, no home field advantage, no resting of starters, etc.

Saints over Seahawks
Jets over Colts
Packers over Eagles
Ravens over Chiefs
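The picks above fall straight out of the table: in each game, take the team with the higher power ranking.  A quick Python sketch using the values from the rankings (the helper name and data layout are just for illustration):

```python
# Power rankings for the eight wild-card teams, copied from the table above.
power = {
    "New Orleans Saints": 0.708, "Seattle Seahawks": 0.586,
    "New York Jets": 0.826, "Indianapolis Colts": 0.701,
    "Green Bay Packers": 0.797, "Philadelphia Eagles": 0.756,
    "Baltimore Ravens": 0.821, "Kansas City Chiefs": 0.654,
}

# (visitor, home) pairs for the four games.
matchups = [
    ("New Orleans Saints", "Seattle Seahawks"),
    ("New York Jets", "Indianapolis Colts"),
    ("Green Bay Packers", "Philadelphia Eagles"),
    ("Baltimore Ravens", "Kansas City Chiefs"),
]

def pick(visitor, home):
    """Pick purely on power ranking; no home field advantage or anything else."""
    return visitor if power[visitor] > power[home] else home

picks = [pick(v, h) for v, h in matchups]
for (visitor, home), winner in zip(matchups, picks):
    loser = home if winner == visitor else visitor
    print(f"{winner} over {loser}")
```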

All visiting teams win, book it.