Many well-known fantasy analysts use statistical models or machine learning to forecast points per game (PPG) for both season-long and weekly predictions. While I’m not nearly as experienced as some of them, I have been putting together my own models over the last few years with meaningful results that can help us decide where to take and avoid risks for the upcoming year.
Why should you care? This process gives you another tool to check against your rankings, highlighting players with higher ceilings or lower floors than you might have expected. It takes a player’s stats from the previous year, applies two methods with significant correlation to PPG – multiple linear regression (MLR) and data binning with boundary conditions – and uses the average of both. Throughout my research, this has proven to be an effective way to neutralize outlier or “bad” projections from one model, because the other model’s often opposing projection balances them out. This improves accuracy and gives us a good range of outcomes for each current running back. Using these results, we can project for the upcoming year and note the resulting trends in each player’s career arc for dynasty.
Let’s take a look at what makes up each method to identify its strengths and weaknesses. You can skip to the 2020 projections section if the methods aren’t of interest to you.
Multiple Linear Regression (MLR)
A staple among most projection models, MLR identifies one-to-one trends between each stat and fantasy PPG. For my model, instead of using every available stat category, which can create noise in the data, I only used ones that carried statistical significance (a p-value less than 0.05) for over three years. The following categories made the cut, each with a direct, linear correlation to PPG:
- Age (Youth)
- Rushing Yards Per Game
- Rushing to Receiving Touch Ratio
How simple. With an R^2 ranging from 0.40 to 0.58 over the last few years, we have a proven predictive measure. While MLR is extremely useful, it quickly becomes apparent that more than three categories are needed. That’s where our next method comes in. Looking back on the MLR results, the model projects poorly for young, satellite receiving backs such as Tarik Cohen, James White, and Jalen Richard. Seeing this trend is advantageous, as we can keep it in mind when going through our 2020 results.
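For readers who want to see the mechanics, here is a minimal sketch of fitting an MLR on the three categories above. It is not the model used in this article: the player rows, the "true" coefficients, and every function name are hypothetical, and the targets are generated from a known rule purely so the fit can be checked. It uses plain Python and solves the normal equations directly.

```python
# Hypothetical rows: (age, rushing yards per game, rush-to-receiving touch ratio).
FEATURES = [
    (22, 75.0, 4.0), (24, 60.0, 3.0), (26, 90.0, 6.0),
    (28, 45.0, 2.0), (23, 30.0, 1.0), (30, 55.0, 5.0),
]
# Made-up rule used only to generate targets: [intercept, age, rush ypg, ratio].
TRUE_COEFS = [25.0, -0.6, 0.12, 0.3]


def predict(coefs, row):
    """Intercept plus the weighted sum of the stat categories."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back-substitution
        known = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - known) / m[i][i]
    return x


def fit_mlr(features, targets):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)


targets = [predict(TRUE_COEFS, f) for f in FEATURES]
coefs = fit_mlr(FEATURES, targets)
print([round(c, 3) for c in coefs])
```

With real data the fit would not be exact, and each category would still need its own significance check before being kept.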
Data Binning with Boundary Conditions (includes dampening for efficiency metrics)
This one is far harder to explain. Data binning involves assigning a certain value to a unique range of numbers. For example, running backs with at least four receptions a game would see a six-point boost to their projections, and that same boost would apply to any running back with four or more catches a game, no matter how many more he caught. In its simplest form, a chart of data binning would look like a staircase with steps up at each threshold marker.
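The staircase idea can be sketched in a few lines. Only the "four or more receptions, six points" step comes from the example above; the other threshold and boost are invented to show the shape, and the names are hypothetical.

```python
from bisect import bisect_right

# Receptions-per-game thresholds and the flat boost for each bin.
# Assumption: only the 4.0 -> 6.0 step reflects the example in the text.
THRESHOLDS = [2.0, 4.0]
BOOSTS = [0.0, 3.0, 6.0]  # below 2, from 2 up to 4, 4 or more


def staircase_boost(receptions_per_game):
    """Return the flat boost for whichever bin the value falls into."""
    return BOOSTS[bisect_right(THRESHOLDS, receptions_per_game)]


for rpg in (1.0, 3.9, 4.0, 6.5):
    print(rpg, staircase_boost(rpg))
```

Note that 4.0 and 6.5 receptions a game land on the same step, while 3.9 sits a full step lower.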
To correct this obvious oversight (a flat step treats every player inside a bin the same), boundary conditions were added to each bin. This means there’s a specific equation for each statistical bin. Some are linear, some are exponential, etc. This captures the data that matters more accurately than linear regression, as it tracks data trends year after year to see what matters, how much, and at what thresholds. It doesn’t look as sleek and sexy as MLR does, but it has proven to be more predictive.
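As a rough sketch of what "an equation per bin" could look like, the snippet below gives each receptions-per-game bin its own curve and makes the curves meet at the thresholds. The specific equations, slopes, and the choice to make them continuous are all assumptions for illustration, not the ones used in the model.

```python
import math

# Each bin: (lower bound, upper bound, equation for values inside the bin).
# Hypothetical equations, chosen so neighbouring bins agree at 2.0 and 4.0.
BINS = [
    (0.0, 2.0, lambda r: 0.5 * r),                    # gentle linear ramp
    (2.0, 4.0, lambda r: 1.0 + 2.5 * (r - 2.0)),      # steeper linear ramp
    (4.0, math.inf, lambda r: 6.0 + 2.0 * (1.0 - math.exp(-(r - 4.0)))),  # levels off
]


def binned_boost(receptions_per_game):
    """Apply the equation belonging to the bin that contains the value."""
    for lower, upper, equation in BINS:
        if lower <= receptions_per_game < upper:
            return equation(receptions_per_game)
    raise ValueError("receptions_per_game must be non-negative")


for rpg in (1.0, 3.0, 4.0, 6.0):
    print(rpg, round(binned_boost(rpg), 2))
```

Unlike the plain staircase, a back with 3.9 catches a game now projects almost the same as one with 4.0, and extra volume above the threshold still counts for something.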
Finally, we get to dampening efficiency metrics. This is one of the most useful parts of this method because it foresees outliers in efficiency regressing to league averages, as almost all do. These include yards per carry, yards per reception, and touchdown rates. Now some players are proven more efficient than others on a consistent basis, so it takes that into account as well and balances it on a per-player basis along with league averages.
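One simple way to express this kind of dampening is a weighted blend of last season's figure, the player's own career figure, and the league average. The sketch below assumes that form; the weights, the function name, and the sample yards-per-carry numbers are hypothetical.

```python
def dampen(observed, league_avg, career_avg=None, w_observed=0.4, w_career=0.3):
    """Pull an efficiency stat toward league average, tempered by career history.

    Assumed weights: whatever is not given to last season or to the career
    figure goes to the league average.
    """
    if career_avg is None:
        # No track record yet, so the career share moves to the league average.
        w_career = 0.0
        career_avg = 0.0
    w_league = 1.0 - w_observed - w_career
    return w_observed * observed + w_career * career_avg + w_league * league_avg


# Made-up yards-per-carry inputs: 5.5 last season, 4.8 career, 4.3 league average.
print(round(dampen(5.5, 4.3, career_avg=4.8), 2))
print(round(dampen(5.5, 4.3), 2))
```

A player with a consistently efficient history is pulled down less than one with the same outlier season and no track record.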
This method does indeed use nearly all basic stats from the season prior (touches, yards per game, yards per touch, touchdown rates, etc.), simply because we can classify them better than with a straight line as in MLR. The weakness of this model is players who land right at the defining thresholds in multiple stat categories.
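As noted in the introduction, the final projection is the average of the two models. A minimal sketch of that step follows; treating the two model outputs as the low and high ends of the range is an assumption made here for illustration, and the PPG inputs are made up.

```python
def blend(mlr_ppg, binning_ppg):
    """Average the two model outputs and keep both as a rough range."""
    low, high = sorted((mlr_ppg, binning_ppg))
    return {"low": low, "projection": (mlr_ppg + binning_ppg) / 2.0, "high": high}


example = blend(mlr_ppg=19.0, binning_ppg=13.0)
print(example)  # {'low': 13.0, 'projection': 16.0, 'high': 19.0}
```

When one model is badly off for a player, the average limits how far the final number can stray.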
These models are meant to be complementary to fantasy projections, not your foundation. Human interpretation is needed as backfield depth charts are constantly changing, rookies are being added, or offensive lines are falling apart. Let’s take last year’s results for example:
As you can see, the 2019 projected PPG tracked actual PPG fairly well, with an R^2 of 0.56. Nice.
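For anyone wanting to run the same check on their own projections, here is a minimal sketch. It assumes R^2 means the squared Pearson correlation between projected and actual PPG (other definitions exist), and the sample numbers are made up.

```python
def r_squared(projected, actual):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(projected)
    mean_p = sum(projected) / n
    mean_a = sum(actual) / n
    cov = sum((p - mean_p) * (a - mean_a) for p, a in zip(projected, actual))
    var_p = sum((p - mean_p) ** 2 for p in projected)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    return cov * cov / (var_p * var_a)


# Hypothetical projected vs. actual values, not taken from the 2019 results.
print(round(r_squared([1.0, 2.0, 3.0, 4.0], [1.0, 3.0, 2.0, 4.0]), 2))  # 0.64
```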
Looking at a few of the biggest model misses from last year, we come across guys like Todd Gurley, Damien Williams, Mark Ingram, and Jalen Richard. Let’s see if the misses were predictable.
Gurley saw a massive drop. There were fears over his arthritic knee from 2018 and they were soon realized. Even combined with a poor offensive line, a drop this big was still a little tough to see coming. For Williams, it was clear something was up when the Chiefs added Hyde (before trading him), Thompson, and McCoy in short order, foreshadowing the upcoming timeshare. Ingram signed a fat contract to be the featured back on a new team and it showed. Richard already had a defined role as a satellite back and fell victim to the addition of rookie Josh Jacobs.
So three out of four were rather predictable. Let’s keep this in mind as we look at next year’s projections.
2020 RB Model Projections
For our 2020 projections, I’ll use both the model projections and expert consensus rankings (ECR) from FantasyPros to break down the top values and highest-risk players among the top-40 ECR RBs. The difference in value is the biggest key, as that is what shows us how much better or worse the model thinks a player will perform versus consensus rankings. This breakdown is listed below:
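The value difference is simply the gap between the two ranks. A minimal sketch, using only the model and ECR ranks quoted in the player notes in this article, and assuming the gap is computed as ECR rank minus model rank (so a positive number means the model is higher on the player than consensus):

```python
# (model rank, ECR rank) pairs as quoted in the player notes.
RANKS = {
    "Austin Ekeler": (2, 11),
    "Leonard Fournette": (5, 9),
    "Devin Singletary": (12, 20),
    "Derrius Guice": (18, 29),
    "Aaron Jones": (15, 6),
    "Joe Mixon": (19, 8),
    "Todd Gurley": (26, 13),
    "Mark Ingram": (41, 19),
}


def value_gap(model_rank, ecr_rank):
    """Assumed sign convention: positive = model likes him more than ECR."""
    return ecr_rank - model_rank


gaps = sorted(
    ((name, value_gap(*ranks)) for name, ranks in RANKS.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, gap in gaps:
    label = "value" if gap > 0 else "risk"
    print(f"{name:18s} {gap:+3d}  {label}")
```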
Once you’ve had a chance to digest it, let’s go through the biggest takeaways. Also, as mentioned earlier, be wary of young, satellite backs. Take the projections for Tarik Cohen and Jamaal Williams with a grain of salt.
Austin Ekeler (LAC) – RB2 vs RB11 ECR
Just like he was last year, Ekeler strikes again as an undervalued gem. Assuming Gordon signs elsewhere and they don’t use a high pick on a rookie RB, little stands between him and fantasy dominance. The model likes his youth, touch count, receiving role, and just about everything else. RB2 may be rich, but the upside value is there.
Leonard Fournette (JAC) – RB5 vs RB9 ECR
His receiving role spiked as he saw 100 targets last year. That’s what fantasy PPR dreams are made of. His workhorse role is unwavering and he performed well despite operating behind one of the poorer run-blocking lines of the year. He’s strongly due for an increased touchdown rate to top it off. 2020 is Fournette season.
Devin Singletary (BUF) – RB12 vs RB20 ECR
Singletary is my situation to watch of the year. Will the Bills acquire RBs in free agency or the draft, or will they completely hand the reins over? We saw his usage spike as the season came to an end and he outplayed Gore by a wide margin. There’s a lot yet to be determined for his situation, but he has one of the largest ranges of outcomes for next year, including a top-12 finish.
Derrius Guice (WAS) – RB18 vs RB29 ECR
His low ECR is clearly from an injury risk standpoint and Guice just hasn’t caught a break. His talent was overwhelmingly evident last year and the model loves him on a per-game basis. Let’s pray the new personnel and medical staff in Washington are what he needs to make it through a year unscathed.
Other notables: Miles Sanders, James Conner, and Kareem Hunt
Aaron Jones (GB) – RB15 vs RB6 ECR
I love Aaron Jones the player. I’m timid on future Aaron Jones as a fantasy performer. You’ve heard the touchdown regression story a million times and he dismissed it all of 2019. If you are betting on Jones, you are betting heavily on a drastically increased workload, which very well could happen; it just hasn’t for years on end. Players with a touchdown-per-touch ratio over 6% regressed lower – usually significantly – the following season 80+% of the time over the last three years. Jones ended 2019 at 6.7%. Touchdowns are indeed volatile.
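The touchdown-per-touch check is easy to run on any player. A minimal sketch follows; the 6% cutoff is the one quoted above, while the function names are hypothetical and the touch and touchdown counts are illustrative inputs chosen to reproduce the 6.7% figure rather than numbers taken from this article.

```python
TD_RATE_THRESHOLD = 0.06  # the "over 6%" cutoff quoted in the text


def td_per_touch(touchdowns, touches):
    """Total touchdowns divided by total touches (carries plus receptions)."""
    return touchdowns / touches


def is_regression_candidate(touchdowns, touches):
    """Flag players whose touchdown rate sits above the cutoff."""
    return td_per_touch(touchdowns, touches) > TD_RATE_THRESHOLD


# Illustrative inputs: 19 touchdowns on 285 touches.
rate = td_per_touch(19, 285)
print(f"{rate:.1%}", is_regression_candidate(19, 285))  # 6.7% True
```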
Joe Mixon (CIN) – RB19 vs RB8 ECR
I’m only writing this in honor of the model and to be objective because I am personally such a hardcore Mixon truther that this actually hurt to type out. The model may think he has some risks, but I believe in his talent. The offensive line should rebound with returning players and watching him play shows that he’s the real deal. Interpret him being listed in the risk section at your own discretion.
Todd Gurley (LAR) – RB26 vs RB13 ECR
Gurley’s career looked like it went permanently pear-shaped last season. It seems like he’s lost a step and the Rams offense as a whole took a bleak turn in 2019. His name recognition and touchdown value could net you a decent return in some trade offers.
Mark Ingram (BAL) – RB41 vs RB19 ECR
Running backs over the age of 27 simply have a hard time hitting top-24 numbers. Over the last three years, only six RBs have done so; four of those six came in 2017, and two were Ingram himself. Ingram will be 31 next year, and while he’ll very likely outperform the model projection just like he did last year, just know that there are multiple factors working against him, with age being the biggest. He’s a prime sell candidate considering his recent performance and you might not see another opportunity to do so for the remainder of his career.
Other notables: Kerryon Johnson, David Johnson, and Latavius Murray
Thanks for reading and stay golden! If you like what you learned, follow me @DavidZach16 for more interesting stats and analysis throughout the year.