Team talent and results: grading coaches 2015-2023

Submitted by Blue@LSU on February 3rd, 2024 at 4:23 PM

Have you ever wondered which coaches maximize their talent, which ones do more with less, or which coaches just can’t seem to win despite stockpiling a shit-ton of talent? Are you tired of arguing with OSU fans about whether Harbaugh or Day is the better coach, as if the answer wasn’t already obvious? Do you just like looking at graphs? If you answered ‘Yes’ to any of these questions, then this diary’s for you.

This is a continuation of a previous diary where I compared ‘team talent’ with performance in 2023 to see which teams got more with less or less with more. There was some great feedback and questions in the comments and, since I’m the type of guy that generally aims to please, I thought I’d take a crack at addressing a few of them.

By far, the most common question was what these data would look like over a longer period of time. WestQuad also started a thread after Saban’s retirement wondering what the data could say about his performance over time and whether he was, indeed, the GOAT. 

So my plan in this diary is to combine these two questions by (1) extending the data back to 2015 (the first year available for the 247 Team Talent Composite) and (2) using it to evaluate the success of P5 coaches.

Let’s get started, shall we?

THE MAIN VARIABLES

  • 247’s Team Talent Composite is a team-wide composite measure of every player's recruiting rating, accounting for transfers (in and out). I readily admit that this is not a perfect measure of ‘team talent’: recruiting rankings tend to be flawed, development matters, freshman stars are added even though they might not see the field, etc. But it’s the best measure we have, so I’ll be using it.
  • FEI’s Win Differential measures a team’s win differential relative to a hypothetical elite team (defined as a team with an FEI score that is 2 standard deviations above the FEI average). Positive values mean that the team had more wins than would be expected from an ‘elite’ team, negative values mean it had fewer wins. 

Putting this all together, here’s what the data look like for all P5 teams from 2015 to 2023. 2020 is excluded for obvious reasons.

  • The red shaded area represents ‘elite results’: teams that had at least as many wins as an ‘elite’ team would be expected to have against the same competition. 
  • ‘Elite recruiting’ is represented by the green shaded area (get it?): teams with a talent composite score that is at least 2 standard deviations above the P5 average for this time period. 
  • The dashed lines along the x- and y-axis are the mean talent composite scores and FEI win differentials over the sample period.
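For anyone who wants to reproduce the shaded regions, here is a minimal sketch of the two cutoffs described above. The team names and numbers are made up for illustration (they are not the real 247 or FEI values), the function name is hypothetical, and I'm assuming the sample standard deviation; the population version would shift the cutoff slightly.

```python
from statistics import mean, stdev

# Hypothetical (talent composite, FEI win differential) values, NOT real data.
team_seasons = {
    "Team A": (700.0, -6.1),
    "Team B": (710.0, -5.0),
    "Team C": (720.0, -4.2),
    "Team D": (730.0, -5.5),
    "Team E": (740.0, -3.0),
    "Team F": (750.0, -2.4),
    "Team G": (760.0, 0.3),
    "Team H": (770.0, -1.9),
    "Team I": (780.0, -3.3),
    "Team J": (1000.0, 0.8),
}


def shade_regions(seasons, n_sd=2.0):
    """Flag each team-season as elite recruiting and/or elite results."""
    talents = [talent for talent, _ in seasons.values()]
    # Elite recruiting: talent at least n_sd standard deviations above the mean.
    cutoff = mean(talents) + n_sd * stdev(talents)
    return {
        name: {
            "elite_recruiting": talent >= cutoff,
            # Elite results: at least as many wins as the hypothetical elite team.
            "elite_results": win_diff >= 0.0,
        }
        for name, (talent, win_diff) in seasons.items()
    }


regions = shade_regions(team_seasons)
for name, flags in regions.items():
    print(name, flags)
```

Note that with a small sample a 2 SD cutoff is very hard to clear, which is part of why so few teams land in the green area.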

The plotted regression line shows the expected/predicted win differential at each corresponding level of team talent (thanks, RJWolvie, for recommending this). I calculate actual vs. predicted performance for each team in each year by subtracting the expected win differential (from the regression) from the actual win differential. 

  • Positive values indicate that the team had a higher win differential than would be expected based on its given talent. 
  • Negative values indicate that the team had a lower win differential than would be expected based on its given talent. 
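The actual vs. predicted calculation can be sketched roughly like this. It is a plain least-squares line fit by hand, assuming a simple one-variable linear regression of win differential on talent; the four data points and the function names are hypothetical and only there to show the arithmetic.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def actual_minus_predicted(xs, ys):
    """Actual win differential minus the value predicted from talent."""
    slope, intercept = fit_line(xs, ys)
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


# Hypothetical talent composites and FEI win differentials, NOT real data.
talent = [600.0, 700.0, 800.0, 900.0]
win_diff = [-6.0, -5.0, -2.0, -3.0]

slope, intercept = fit_line(talent, win_diff)
residuals = actual_minus_predicted(talent, win_diff)
for t, r in zip(talent, residuals):
    print(f"talent={t:.0f}  actual - predicted = {r:+.2f}")
```

In this toy example the third team (talent 800) is the overachiever, with a positive value, while the other three come in below the line.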

The graphs for individual years are at the end of this diary. Here’s the graph with the average actual vs. predicted win differential for all P5 coaches from 2015-2023:

Oof. That’s ugly. Let me try that again. To conserve space, I’ll also drop coaches that only have one year of data for the 2015-2023 time period.

That’s a bit better. 

Saban comes in at #11, Harbaugh at #14, and Day at #20 (behind PJ Fleck). I personally wasn’t expecting to see coaches like Mike Gundy, Kirk Ferentz, or Paul Chryst in the top 10, but I guess it makes sense. Lane Kiffin at #9 was also a surprise.

It’s also kind of interesting to compare coaches as they moved from one school to another. Riley at Oklahoma (the good) vs. Riley at USC (the bad) seems about right. Mario Cristobal’s tenure at Oregon puts him within the top-30, but his time at Miami so far has earned him a spot in the bottom-5 of all coaches in the sample. Mike Leach, on the other hand, was pretty successful at both WSU (#4) and Mississippi State (#29). 
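If you want to build the same kind of ranking, here is a rough sketch of the averaging step. I'm assuming the data are grouped by coach-and-school stint (so a coach who moved gets one entry per school) and that stints with a single season are dropped; the rows, field order, and function name are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rows: (coach, school, season, actual minus predicted win differential).
rows = [
    ("Coach A", "School X", 2021, 1.5),
    ("Coach A", "School X", 2022, 2.5),
    ("Coach A", "School Y", 2023, -1.0),  # single season, so it gets dropped
    ("Coach B", "School Z", 2021, -0.5),
    ("Coach B", "School Z", 2022, 0.5),
]


def rank_stints(rows, min_seasons=2):
    """Average the yearly values per stint, then sort from best to worst."""
    by_stint = defaultdict(list)
    for coach, school, _season, value in rows:
        by_stint[(coach, school)].append(value)
    averages = {
        stint: mean(values)
        for stint, values in by_stint.items()
        if len(values) >= min_seasons
    }
    return sorted(averages.items(), key=lambda item: item[1], reverse=True)


ranking = rank_stints(rows)
for rank, ((coach, school), avg) in enumerate(ranking, start=1):
    print(f"#{rank} {coach} ({school}): {avg:+.2f}")
```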

In case anyone is interested, here are the same data organized by conference.

Apologies for the messy presentation and the overlapping labels.

Anything you find interesting?

IMPORTANT CAVEATS

There is at least one thing to keep in mind about these data that limits their usefulness. The 247 Team Talent Composite only goes back to 2015, which unfortunately omits the early years of some coaches’ tenures. For example, Jimbo Fisher’s most successful seasons at FSU (2013 & 2014) are not included, which probably decreases his average score (his time at FSU is currently in ‘The Bad’ category). Saban had a number of elite seasons at ‘Bama before 2015 (same with Meyer at OSU) that are excluded, which may have weakened his overall score. Conversely, Leach’s 3 worst seasons at WSU (3-9, 6-7, 3-9) are not included in the data, which probably helps his average. Only a small fraction of Bob Stoops’ 18-year tenure (1999-2016) or Kirk Ferentz’s 25-year tenure (1999-) is included in the sample. Etcetera, etcetera, etcetera.

YEARLY GRAPHS

In case you’re interested, here are the results for each year, 2015-2023.

Thanks for reading. Go Blue!

Comments

sports fan

February 3rd, 2024 at 9:18 PM

Of course it is much more complicated than this.  For example, the head coach is important of course, but I would suggest that in this study "Harbaugh" is a stand-in for the entire coaching staff.  I would suggest that Harbaugh did not have the "right" assistant coaches on staff until 2021.  That ultimately led to his success in 2021-2023.  I would suggest that there are three things that have to line up: 1) overall philosophy/strategy/objectives, 2) coaches and assistant coaches who can teach/design plays/motivate the implementation of No. 1, and 3) players who can execute the strategy, etc. outlined in No. 1.  Harbaugh finally had them all lined up beginning in 2021.  You don't have to have 5 star players to win, if the players buy into and execute that philosophy, and the coaches can coach in that system.

Think John Beilein.  He had his system, he recruited players to fit his system, and coaches who could contribute to making the system a success.  He was very successful without recruiting 5 stars all the time.

Teams like Ohio State can't necessarily be more successful than UM just by recruiting more 5-star players.

befuggled

February 3rd, 2024 at 10:12 PM

I love this.

I'd like to see Harbaugh and other coaches of interest (e.g., Day and Meyer) in a different color--although I'm not sure how easy/possible that is to do.

Amazinblu

February 4th, 2024 at 10:20 AM

Thanks for putting this together.  Great information.

I’m a believer that Michigan develops talent as well, or better, than any other team.

Is there a way to correlate the 247 player ranking and draft results?  Or, chart a player’s development during their time in college?

Blue@LSU

February 5th, 2024 at 6:54 PM

Thanks, Amazinblu!

I tried to do something similar in a diary last year. I didn't look at which schools the players went to, but this was what the data looked like in terms of recruiting rankings/draft:

Another poster, Rappin Randle, also had a diary where he looked at schools' recruiting rankings and draft picks. 

I also think that Michigan develops talent better than other teams. The problem is that the portal makes this much more difficult to analyze statistically. If a player stays at one school for all 3-5 years, it's easy to say they were developed (or not developed) at that school. But think about a player like Josaiah Stewart, for example. He was already good at Coastal Carolina, so it's hard to say how much he was developed at Michigan. 

HighBeta

February 4th, 2024 at 9:41 PM

This is, again, superb work that is well presented. Kudos!

Wandering through these, I get a sense that there is another variable (or two) that need to be explored for the sake of being a better prediction model: longevity.

If we can moderately agree that late teens players improve as they mature (for various physical, emotional, and intellectual reasons), would adding in the "age of the team" or "age of the starters" create a better/tighter scatter pattern around the regression lines?

Or "maturity" of the coaching group, i.e., length of time the staff has worked together. My ancient brain tells me this is a minor predictor relative to starting player maturity, but I'm obligated to throw it at the wall to see what, if anything, sticks.

Perhaps you can create a class assignment and get some of your students to grind these out?

B@L? Again, superb work. Take a bow! And. Please consider further analyses?

Best Regards.

Blue@LSU

February 5th, 2024 at 6:25 PM

Thanks, HB!

It's funny that, whenever I post some of these analyses, it almost feels like I'm going through peer review. And I mean that in a good way. Y'all give great suggestions that often lead to new posts. So if you ever get sick of me posting diaries, just stop giving me ideas. 😊

You're right about incorporating some measure of team age (BYU?) and length of time that coaches have been together.

I was doing this more in terms of hypothesis testing to see the relationship between "talent" and outcome, and attributing the difference to coaching. But the problem is that there are a number of other variables that could account for the unexplained variation. The most appropriate way, as you suggest, would be to first build the model that best predicts success and then work from there.

Best,

HighBeta

February 5th, 2024 at 9:27 PM

Welcome! I will (try to) stop doing the "mental math" about ways to control for more variance in your models/numbers. Sorry, old habit from a prior career trajectory. BYU? Auto adding 24 months due to missions? Nice.

Fair warning: you keep creating diaries like these and you're going to get "hey, what about" comments, obviously not just from me.

And, you've done an excellent body of work representing the connections between talent and success. Repeat. Congrats on the excellent work.

 

Blue@LSU

February 5th, 2024 at 11:18 PM

Don't be sorry and please don't stop with the comments (I hope my reply above wasn't misunderstood). Like I said, it keeps giving me good ideas. I'm not as familiar with the CFB data as I am with the data I deal with at work so it's always great to get more suggestions. 

So please keep the suggestions coming! I appreciate all of them.

HighBeta

February 6th, 2024 at 2:18 AM

Ah, okay. I understand. You got it. 👍👍

If you've got some students who are looking to do some original research? Think about loading them up with some factor analysis work with a hope of forcing some good predictors to surface. Once you've got those, you can really start to do some nice curve fitting for win/lose predictions. It will be a challenge to account for weekly injuries, unique strengths versus unique weaknesses, etc. And your N will be quite low since you've only got 12 games with which to work. But, you might find a few variables that are decent to explore as significant predictors.

Have fun! Keep us posted, please? Thank you!

WestQuad

February 5th, 2024 at 11:18 AM

This is absolutely fantastic.  I am a Blue@LSU fan. It is too bad the team talent data only goes back to 2015, but this is great directionally. 

So the FEI win differential is the number of wins a team had relative to where a team with their talent level should lie on the linear regression.  So Washington is ~6 above the linear regression in this 2023 image and Michigan is ~4.3 above where they should be on the linear regression. That makes sense.  It is interesting to me that given Alabama, UGA and OSU's talent levels, which are significantly higher than most other teams that they are still above the line.  OSU had two losses and only really played two tough teams, PSU and Michigan (possibly ND).  Is the expectation of a team with that talent level to have 3 losses?  Oklahoma went 10-3 and is just about on the line.  Could you consider that line to be the expectation floor?  Michigan was expected to have 3.3 losses (~9.7 wins) and instead we had 15 wins and no losses.  (I'm off by one somehow.)  

I think this is an important concept.  When Harbaugh went 10-3 several times they were disappointing seasons, but they felt about right.  I forget if it was on here or on reddit, but people have been discussing what the expectation is for Sherrone Moore next year.  Given our talent level, what does the regression line say?  4 losses?

Blue@LSU

February 5th, 2024 at 6:08 PM

Thanks, WestQuad. The feeling's mutual. Your question about Saban got me thinking about what these data could say about individual coaches.

It is interesting to me that given Alabama, UGA and OSU's talent levels, which are significantly higher than most other teams that they are still above the line. OSU had two losses and only really played two tough teams, PSU and Michigan (possibly ND).  Is the expectation of a team with that talent level to have 3 losses? 

One thing to keep in mind is that the model is not predicting how many losses a team will have, but how many losses a team will have compared to an "elite" FEI team in that year. The rub, however, is that the precise values of an "elite" FEI team (2 SD above the average team in that year) change from year to year, as do the levels of talent and how they are spread across CFB teams. 

Now when I was doing the analysis, I kept wondering whether I should use the regression line (and confidence intervals) for the entire period or if I should estimate a year-specific regression to account for these seasonal differences. I chose to use the regression for the entire period because estimates are just more precise with a larger sample. But if I ran a separate regression for each year, this is what the 2023 data would look like:

In this case, we can't say that Alabama's, Georgia's, or OSU's win differentials are significantly higher (or lower) than what we would expect based on their level of talent alone because they fall within the confidence interval of the prediction. I think an argument could be made that this is the way I should have presented the yearly results.
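For the curious, here is a rough sketch of how a single-season regression and its band could be computed. Treat it as an illustration under assumptions, not the exact procedure behind the graph: it builds a confidence band for the fitted line (not a wider prediction interval for an individual team), it uses a normal approximation instead of the t distribution, and the numbers and function name are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def band_at(xs, ys, x0, level=0.95):
    """Return (low, predicted, high) for the fitted line at talent level x0."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    intercept = y_bar - slope * x_bar
    # Residual standard error with n - 2 degrees of freedom.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    s = sqrt(sse / (n - 2))
    # Standard error of the fitted value; it grows as x0 moves away from the mean.
    se = s * sqrt(1 / n + (x0 - x_bar) ** 2 / sxx)
    z = NormalDist().inv_cdf(0.5 + level / 2)  # normal approximation (assumption)
    predicted = intercept + slope * x0
    return predicted - z * se, predicted, predicted + z * se


# Hypothetical single-season data, NOT real values.
talent = [600.0, 700.0, 800.0, 900.0]
win_diff = [-6.0, -5.0, -2.0, -3.0]

low_mid, pred_mid, high_mid = band_at(talent, win_diff, 750.0)
low_top, pred_top, high_top = band_at(talent, win_diff, 900.0)
print(f"talent 750: {low_mid:.2f} to {high_mid:.2f}")
print(f"talent 900: {low_top:.2f} to {high_top:.2f}")
```

The band is narrowest near the average talent level and widens toward the extremes, which is exactly where teams like Alabama, Georgia, and OSU sit. With only one season of data the band is also wider overall than it is for the pooled 2015-2023 regression.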

In terms of what the model would predict for Sherrone next year, that's a good question. Right now we really can't say because we don't know where UM will fall on the talent composite. 

trueblueintexas

February 5th, 2024 at 1:35 PM

Maybe it is baked-in, but how is the difference in quantity of games played factored in? 

I.e. for the time period analyzed, Alabama averaged 12.77 wins and 1.33 losses for a total of 14.11 games played/season. Over the same period, Michigan averaged 9.88 wins and 2.77 losses for a total of 12.66 games played/season.  That is an average of 1.5 games played/season different. 

 

Blue@LSU

February 5th, 2024 at 6:31 PM

Damn, that's a good question and I hadn't thought about it. 

Part of me would say that the model is not predicting the number of wins/losses, but the number of wins/losses compared to what we would expect from an 'elite' team. In that sense, the number of games wouldn't really matter because it is standardized to a similar reference point (the hypothetical elite team).

The other part of me says that every extra game gives an additional opportunity to either win or lose that game. In that case, it would matter.

In short, I'm gonna have to give more thought to this question. Thanks for bringing it up. 

trueblueintexas

February 6th, 2024 at 1:08 PM

Thanks for the response. 

Under the old BCS model and the new playoff model, I would think the number of expected games played makes a difference. 

I.e., under the old BCS model, if you averaged the #2 team in talent for a season, the expectation should be that you are playing in 15 games that season (conference champ and two playoff games). The #4 team in talent, however, would only be expected to play in 14 games (a conference champ and only one playoff game), whereas the #8 team in talent would be expected to also play in 14 games (conference champ and a bowl game), and the #11 team would only be expected to play in 13 games (12 regular season and one bowl). The difference in talent between #1 and #11 is only 10 spots, but the difference in expected games played is 2. I think that shifts the value of expected wins. 

Sadly, I think this will become an exercise in showing how good Saban actually was. 

BTB grad

February 5th, 2024 at 10:15 PM

The Saban data point is super interesting. Even with all the talent he had, he was still significantly overperforming what would be expected of that talent level thru development & scheme.