Is Our O-Line Really That Young? Third Time is the Charm!!!

Submitted by Gameboy on

Just when I thought I was out, they pull me back in!

I don't know why I am such a glutton for punishment, but I am finding this topic interesting (and not just in a football sense, but statistically as well). I want to contribute one last time.

Many people on the threads have pointed out that just counting the class experience (basically age) is not enough, you need to count the actual games started as well.

I agree, games started should be part of this analysis.

AmazinBlue pointed out that Phil Steele has published a convenient list of all the games started by the players on the roster before the season began (http://www.philsteele.com/Blogs/2013/JUN13/DBJune08.html). Since the data is so handy, I figured I would go ahead and combine both sets of data and make a handy dandy XY Scatter chart. X-axis is the total combined number of Class Experience (i.e. Frosh=1, rs Frosh = 1.5) and Y-axis is the total number of previous games started.

As you can see from above, Michigan is in a better place than at least four teams (Auburn, UCLA, LSU, and Texas Tech), and surprisingly not that far away from Alabama.

Statistically, Michigan is within one standard deviation of the mean on Total Games Previously Started and just .16 away from one standard deviation for Total Class Experience. That, by definition, says Michigan's o-line is not an outlier.

Again, the data says Michigan o-line is young, but not "outlier" young. There are other teams in top 25 who are just as inexperienced and a few who are even in a worse position. Blaming all of our woes on o-line experience does not paint the entire picture.

Comments

In reply to by the unsilent m…

Yeoman

November 7th, 2013 at 9:21 AM ^

I think you've got this backwards and I think it's very much to the point.

Borges's predecessor at OC was something of a cult hero around here. Whoever followed him at OC was going to get a tough reception, especially if he tried to change the offensive scheme. It's exactly what most of us didn't want and people are going to pile on at any sign that it isn't working.

Webber's Pimp

November 7th, 2013 at 8:46 AM ^

Nice graphic. Two comments...

First, the results are skewed because we have two Senior prospects who have a lot of football under the belt. Those two have 5 games remaining in their college careers barring injury. Once you remove Lewan and Schofield from the equation you realize that the interior OL is in fact outlier young. And as all of us have seen by now the interior line has struggled mightily. I agree with you that we have to take a look at player development on the OLine. Clearly the interior line is behind the curve and perhaps Funk has something to do with it. Or maybe it's our strength and conditioning program. We may need changes but make no mistake - the interior line is outlier young.

For comparison's sake I think you'll find that while Alabama may be starting players that do not have many career starts, those linemen have been in the program for over two years now. In our case we're putting redshirt freshmen and true freshmen out on the field. It makes a very big difference in terms of the on-field result. O-Line is the most difficult position to play (much less master) as an incoming freshman. The physical demands of the position coupled with the schematic (playbook and film study) requirements make it nearly impossible for any freshman to come in and dominate. The only Michigan lineman that I can remember doing it is Steve Hutchinson and mind you Hutch started at OG as a redshirt freshman.

You correctly point out that other teams are just as young along the OL and have still managed to be in the Top 25. But are we that far off from being a top 25 team? Heck we've been in the polls 7 out of 9 weeks. Last weekend's game was a debacle and we've certainly struggled all season long. But we have 6 wins and we actually should have been undefeated going into last Saturday. I'm not saying we're world beaters. The team has regressed and this is the worst O-Line I've seen at Michigan in over 25 years. But I don't think we're that far off from being a very competitive football team.

 

Yeoman

November 7th, 2013 at 9:01 AM ^

is that more than half the teams are in one quadrant.

You've got a really skewed data set here. Standard arguments based on mean and standard deviation are going to lead to some not very useful results, especially with such a small sample size. There isn't a single team that's significantly (in the usual sense, which gameboy used in comments on the last thread)  below average on both measures, for example.

Space Coyote

November 7th, 2013 at 9:55 AM ^

21 of 26 teams are above average.

\argument

I said it before and I'll say it again. It's much easier to prove experience than it is inexperience, because no measure defines it well. That being said, I don't see how you can simply look at Michigan's line, even without the statistics above or in the 2nd diary (which are great, thank you for providing) and come to the conclusion that it's near normal. It is not near normal to start a FR, RS FR, and RS SO (that was a walk-on with one year of football in HS) and say, "that seems about right".

Profwoot

November 7th, 2013 at 11:20 AM ^

It's not subjective to say that Michigan's youngest 3 starters are younger (on average!) than those of any other team. That's the fact. Don't act like your simple-minded metrics are the only way to do things objectively, especially when you're so ornery about it.

Gameboy

November 7th, 2013 at 10:33 AM ^

mid·dle
  • [ mídd'l ]
  1. central and equidistant from limits: equidistant from the sides, edges, or ends of something
  2. being halfway between beginning and end: occurring or located halfway between the start and finish of a period of time, an event, or a series
  3. occupying intermediate position: situated in an intermediate position, e.g. in age or status

The data for class experience ranges from 11 to 19.5. I made the chart go from 10 to 20, with 15 as the middle [=((20-10)/2)+10]

The data for previous starts range from 27 to 97. I made the chart go from 20 to 100, with 60 (((100-20)/2)+20) as the middle.

I think both fall within the above definition. You are confusing middle with mean.
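
The distinction being argued here, the chart's "middle" (the midpoint of the axis limits) versus the mean of the data, can be sketched in a few lines of Python. The axis limits come from the comment above; the `sample` values are made up purely for illustration and are not the diary's data.

```python
from statistics import mean

def midrange(lo, hi):
    """Halfway point between two axis limits, as computed in the comment."""
    return (hi - lo) / 2 + lo

class_mid = midrange(10, 20)    # class-experience axis: 10 to 20
starts_mid = midrange(20, 100)  # previous-starts axis: 20 to 100

# Hypothetical class-experience totals (NOT the real data), bunched
# toward the high end with one low value.
sample = [11, 15, 16, 17, 17.5, 18, 18.5, 19.5]

print("chart middle (class experience):", class_mid)
print("chart middle (previous starts):", starts_mid)
print("mean of hypothetical sample:", mean(sample))
```

When the points bunch toward the high end of the axis, as in this made-up sample, the mean sits above the chart's middle, which is the situation the replies below describe.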

Yeoman

November 7th, 2013 at 10:43 AM ^

So "middle" is the mean of the two extremes (well, the extremes pushed to the next multiple of 10 in the direction of the extreme).

I'm not sure why any of this really matters, to be honest--what's relevant is why you think Michigan is essentially "normal" or "average" or whatever term you want to use. Is it that they're within a standard deviation or so of the "middle" as defined here?

Space Coyote

November 7th, 2013 at 11:04 AM ^

I understand that the middle of your plot doesn't necessarily mean average. But if you move the middle of the plot so that the average of all the data points is where the two axes intersect, you'll see that Michigan will be way down in the bottom quadrant. As is, you see a ton of teams that are in the upper quadrant, you see a group of teams that has experience in some ways and then doesn't in others, and then you see 3 teams that don't have much experience by either measure.

You keep bringing standard deviation into the mix, but what does that really mean here? You've assigned arbitrary values to a player of a certain class/eligibility. What if we multiply each number by ten, so that FR = 10, RS FR = 15... so on and so forth. Then let's subtract 10 from each number because your range is 10 to 45, not 0 to 45, so the range of possible data is only 35. Well, this makes your standard deviation 4.5 rather than .45 as you might suspect. The variance becomes 20+. And suddenly Michigan isn't 0.5 away from average, they are 5 away from average.

On top of that, the data is pretty clearly negatively skewed. You're approaching it like it's a non-skewed bell curve when it's not. Within the range of data, there are a significant number of teams above your average that are only averaged down to the mean by a few outliers. Because the mean of the data is so much higher than the mean of the range, you expect this.

Here's a decent representation of the data that I pulled from the internet

Michigan is on that line that gets very small towards the left of that picture, and they are on the left side for both ways you attempted to measure this data. So without even debating if simply taking a mean across the line is the best way to look at it (it's not, but I understand it's quick and simple), you can see that Michigan is very young compared to other groups. 

If you actually look at the data set and break it down into percentages, that are left or whatever the exact terminology is in statistics, Michigan is closer to 2 standard deviations away from the mean than they are to 1 standard deviation from the mean within the data set that actually exists (note: there are no teams that get a 1, and there are no teams that get a 4.5). So there is just a lot there that can be used to determine that, in fact, Michigan is on a very low end of the experience argument.

I think this analysis gives a quick and dirty way of kind of seeing where Michigan lands relative to others with a semi-reasonable method, and I do appreciate that. I just don't see how you pull from it that they are just barely below average when what I see is that they are on the fairly extreme end.

Gameboy

November 7th, 2013 at 11:34 AM ^

SC, I do not believe you understand what standard deviation means. Standard deviation is not defined by lines in the graph. Standard deviation is an equation. You can calculate it yourself. I have already said that Michigan falls well within the standard deviation for number of previous games started. That is a FACT that does not change no matter where you draw the middle line, it has nothing to do with middle. I am regretting starting this whole thing as the lack of basic understanding of statistics on this board is disappointing.

Space Coyote

November 7th, 2013 at 11:43 AM ^

I know the equation, I even calculated what the standard deviation of the data was if you multiplied every artificial number you assigned to class by 10 (guess what, the standard deviation also multiplies by 10 because you can pull the 10 out of the equation and move it out front). There are also rules for "normally distributed data". This data fits a negatively skewed bell curve fairly well. Now, within that data, there is an expected value, mu, if you will. Sigma then equals their distribution's standard deviation divided by the square root of the number of random variables, yada, yada, yada.

A common way of discussing this is if the data is within however many standard deviations of mu. Typically, what it means is that +/- one standard deviation from mu is equal to approximately +/- 34% of the entire data set. Michigan does not lie in this area. Michigan is much closer to -2*sigma than they are to -1*sigma. So yes, I know what the equation is for standard deviation, I also know it can easily be manipulated simply by spreading out your scale. I also know what "within n standard deviations of the expected value" means. Michigan is well outside of that.
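
The "approximately 34%" figure is a property of an exactly normal distribution, and it can be checked with Python's standard library. This is only a sketch of that textbook rule; whether the diary's data is close enough to normal for the rule to apply is exactly what the thread is arguing about.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
std_normal = NormalDist(mu=0.0, sigma=1.0)

# Share of a normal population between the mean and +1 sigma.
one_side = std_normal.cdf(1.0) - std_normal.cdf(0.0)

# Share within one sigma of the mean on either side.
within_one = std_normal.cdf(1.0) - std_normal.cdf(-1.0)

# Share lying more than one or two sigma BELOW the mean.
below_minus_one = std_normal.cdf(-1.0)
below_minus_two = std_normal.cdf(-2.0)

print(f"mean to +1 sigma: {one_side:.4f}")
print(f"within 1 sigma:   {within_one:.4f}")
print(f"below -1 sigma:   {below_minus_one:.4f}")
print(f"below -2 sigma:   {below_minus_two:.4f}")
```

For skewed data these shares can differ noticeably from the normal-curve values, so the percentages should be read as a reference point rather than a description of the 26 teams.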

Space Coyote

November 7th, 2013 at 11:59 AM ^

You multiply each number of yours by 10. This means the difference between each data point and the mean will also be multiplied by 10. That ten can be pulled out of both your data points and the mean, and then squared as seen above. But that square is then square rooted, so you can pull out a 10. Therefore, by multiplying each of your arbitrary numbers by 10, your standard deviation is also multiplied by 10. 
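
The algebra described in this comment can be written out explicitly. This is a sketch using the population standard deviation; the symbols $x_i$ (a team's score), $\bar{x}$ (the mean of those scores), $N$ (the number of teams), and $c$ (the scale factor, 10 here) are notation introduced for illustration and do not appear in the original diary.

```latex
\sigma_{cx}
  = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(c\,x_i - c\,\bar{x}\right)^2}
  = \sqrt{c^2 \cdot \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^2}
  = |c|\,\sigma_x
```

With $c = 10$ the standard deviation is ten times larger, but every team's distance from the mean is ten times larger too, so that distance measured in standard deviations does not change.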

If you don't believe my math, here is a link to a standard deviation calculator. Multiply each of your arbitrary data points by 10. So a FR = 10, RS FR = 15, SO = 20... It's the same linear trend you're using, so it's about equivalent in how reasonable it is. Now the standard deviation, lo and behold, is also multiplied by 10. Now the standard deviation of the data set is 4.522.

 

Gameboy

November 7th, 2013 at 12:10 PM ^

That is because you are calculating the standard deviation of the scale and not the actual data. I am unfamiliar with what usefulness you get from calculating the standard deviation of the scale. If you multiply the actual data by ten and calculate the standard deviation, I believe you will find that we are still within one standard deviation (aka sigma) of the mean on previous games started and just outside it for class experience (the data you do not like).
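
Both halves of this exchange can be checked numerically: multiplying the data by ten multiplies the standard deviation by ten, yet leaves every team's distance from the mean in standard-deviation units unchanged. The sketch below uses made-up `experience` values, not the diary's data, and the helper `z_scores` is a name introduced here for illustration.

```python
from statistics import mean, pstdev

def z_scores(values):
    """Distance of each value from the mean, in population standard deviations."""
    m = mean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]

# Hypothetical class-experience totals (NOT the real data).
experience = [11.0, 13.5, 15.0, 16.5, 17.0, 18.0, 19.5]
scaled = [10 * v for v in experience]

# The standard deviation grows by the same factor as the data.
sd_ratio = pstdev(scaled) / pstdev(experience)

# The z-scores are the same before and after rescaling.
z_original = z_scores(experience)
z_scaled = z_scores(scaled)

print("ratio of standard deviations:", round(sd_ratio, 6))
for before, after in zip(z_original, z_scaled):
    print(round(before, 4), round(after, 4))
```

So a statement of the form "within one standard deviation of the mean" does not depend on whether a freshman is scored as 1 or as 10, as long as the scoring stays linear.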

Space Coyote

November 7th, 2013 at 12:21 PM ^

Then I'll admit to having the type of standard deviation wrong, it's been a long time since I took my statistics class.

Does the fact that this data only, in fact, goes from 2.2 to 3.9 rather than from 1 to 4.5 have any effect on it, though? I mean, I guess a team could score a 1, but it's just not going to happen, so shouldn't/can't that somehow be taken into account as well? It just seems to me, when I think back a long time ago, and see the negatively skewed plot I put above, I see the 2.2 at the axis and I see Michigan more towards that small area beneath the curve that puts it n standard deviations from the mean if you are just looking at bulk percentages.

Gameboy

November 7th, 2013 at 10:18 AM ^

Now you are blaming me for how data is spread??? This is an unwinnable argument for me as if I use mean as the middle, you are going to point out how teams are bunched in the middle.

Whatever.

Yeoman

November 7th, 2013 at 10:49 AM ^

The technical details of where you drew the line are irrelevant, within reason anyway. If you use anything like a mean, which you sort of have done, most of the teams are going to fall into the above-average quadrant. It's an objective feature of your dataset that I, and others, are trying to point out. The distribution of points in your scatterplot is skewed.

That you're taking these comments personally is a pretty clear sign that you drew your conclusions first, then went after the data.

Gameboy

November 7th, 2013 at 10:54 AM ^

There is no such thing as "sorta mean". It is either a mean or not. I already said my middle is not the mean.

I am just frustrated because you keep pointing out flaws, or at least what you perceive to be flaws, without being helpful. I have spent a significant amount of my free time to address these flaws. I get very little acknowledgement from you on that. If that is not enough, and you don't like how it is done, then create your own diary and charts.

GoBlueInNYC

November 7th, 2013 at 10:53 AM ^

I'm not sure how you can reasonably accuse him of making his conclusion first then going after the data when he did what is the most obvious and simplest means of answering a pretty basic question. Are there teams that have lines that are as young or younger or as inexperienced or more inexperienced than UM? He took the average class and starting experience of a set of other teams. There isn't any cherry picking or weird data manipulation going on; it's literally as transparent, simple, and straightforward as it could get.

Yeoman

November 7th, 2013 at 11:26 AM ^

I guess it's that I point out an objective feature of the data (the distribution is such that most teams are above average) and it's received as an accusation. It's such a misreading of what I've written that I wonder what's up.

As to the main point: yes, if the point of the exercise is to find out if there are any successful teams out there with similar experience issues, it was a simple and straightforward effort that I've applauded more than once. If I hadn't thought it was worthwhile I wouldn't have wasted my own time spotchecking the data and finding out the Rivals database is bad-to-worthless.

It confirmed something we'd worked out on other threads, that UCLA has an enormous problem similar to Michigan's only more so. It added Auburn and LSU as young lines--Auburn's a different situation because four are returning starters, LSU as the one real success story.

That was great, and it wasn't cherrypicking because the point was to identify successful teams. It only becomes cherrypicking when people try to use it for something it wasn't designed for. If you want to know how Michigan compares to other teams with youth and inexperience in their line you have to compare them to all such teams*, not just teams that have been successful and are in the top 25. All we're getting here is that Michigan isn't as successful as teams that are more successful. That's not very useful.

 

 

*Well, not all such teams necessarily. You probably want to compare Michigan to its peers, not FAU. Maybe you look at all teams in AQ conferences, and maybe leave the American out of that. Or you look at teams that have finished in the top 25 at least once in the last decade or two. But whatever you use, the criterion can't be based on this year's performance.

Space Coyote

November 7th, 2013 at 12:15 PM ^

Move the middle of the plot to the average. I bet it will just move Michigan farther down on the chart, making them look more like an outlier and more inexperienced than your chart currently does. That would account for some of the skew in the data though, at least to a marginal degree.

Again, like you said, it doesn't really matter what the middle of your plot is. It's marginally misleading because it completely disregards any skew in the data by making Michigan look closer to the middle than they are to the average. I'm not asking you to fix it, because I don't really think it's necessary. But if someone just pulled up that plot, gave it a quick glance, they'd say "Michigan's close to the middle" rather than "Michigan's pretty far away from that large chunk of teams or from anywhere near what could be considered above average experience from either measure".

I know you spent a lot of your free time working on this, and again, I appreciate it. All statistics are flawed and certainly can be twisted to fit an agenda. As my statistics prof once said, "stats are like a girl wearing a bikini, it's not what you see that's all that interesting, it's what's underneath." Well, we're looking at the same girl wearing different bikinis and both trying to figure out what's underneath.

m83econ

November 7th, 2013 at 9:19 AM ^

First of all, a redshirt frosh is much more valuable, assuming other factors being equal, than a frosh.

 

Second, actual game experience matters. A redshirt freshman is still someone who hasn't played a down in an actual game.

 

Third, isn't this horse dead already?  Is there a need to keep this going?  The line is young, woefully inexperienced, but will hopefully improve.  That's all we have at this point.

Red is Blue

November 7th, 2013 at 9:51 AM ^

We've got our youth and blocking struggles all bunched together in the middle (so you can't really roll out away from the weakness). Combine that with TEs who are young themselves and struggle with blocking and the fact that we apparently don't have a running back on the roster that can pick up a blitz and you've got a recipe for doom. It is not just an age thing, but that doesn't help.

reshp1

November 7th, 2013 at 10:24 AM ^

If you assume age and experience reduce the chances a player will get beat or bust an assignment, you can multiply the individual odds together to get the collective likelihood your OL gets beat or messes something up. This accounts for the fact that you can't just average experience (a SR playing next to a FR doesn't make two SO or JRs).

I used these odds

A FR will successfully block his man only 60% of the time (probably conservative given UFR scores)

A RS FR will be successful 65%

SO or RS SO = 80%

JR = 85%

>JR = 90%

Multiply them together and our line is likely to bust an assignment or get beat 75% of the time. (1-(0.9*0.6*0.8*0.65*0.9))=0.75.

UCLA is the only team close at 74%. Everyone else is below 70% and 2/3 of the teams are in the 40%-60% range.

Data table is here:

http://mgoblog.com/diaries/lets-try-again-our-o-line-really-young#comme…
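
The calculation in this comment can be reproduced with a short Python sketch. The success rates are the commenter's own guesses listed above; the dictionary keys, the function name `bust_probability`, and the use of "SR" as a label for the ">JR" bucket are choices made here for illustration. Multiplying the rates together also assumes each lineman's result is independent of the others, which is a simplification.

```python
from math import prod

# Guessed per-snap block success rates by class, taken from the comment.
# "SR" stands in for the ">JR" bucket (anything beyond a junior).
BLOCK_SUCCESS = {
    "FR": 0.60,
    "RS FR": 0.65,
    "SO": 0.80,
    "RS SO": 0.80,
    "JR": 0.85,
    "SR": 0.90,
}

def bust_probability(line):
    """Chance that at least one lineman fails, assuming independent results."""
    return 1 - prod(BLOCK_SUCCESS[player_class] for player_class in line)

# The five classes used in the comment's formula: 0.9 * 0.6 * 0.8 * 0.65 * 0.9.
michigan = ["SR", "FR", "RS SO", "RS FR", "SR"]
print(f"Michigan bust chance: {bust_probability(michigan):.2f}")

# Why averaging misleads: a senior next to a freshman is worse than two sophomores.
print(f"SR + FR both succeed: {prod(BLOCK_SUCCESS[c] for c in ['SR', 'FR']):.2f}")
print(f"SO + SO both succeed: {prod(BLOCK_SUCCESS[c] for c in ['SO', 'SO']):.2f}")
```

Because the rates are multiplied rather than averaged, one very low rate drags the whole line down more than one high rate can lift it, which is the weakest-link effect the comment is after.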

reshp1

November 7th, 2013 at 11:02 AM ^

Yeah, I did, but it's no different than assigning a number to a year and averaging. We're doing some fairly back-of-the-envelope type stuff here to get a feel for experience, so yeah, it's not very rigorous. My thought process was there's only incremental difference between FR and RS FR, the redshirt guy gets more time in the practice and weight room, but neither have experience. The quantum leap occurs in the SO/RS SO year because presumably if you're starting you've gotten some PT prior (not always the case) and 2 years seems to be a generally agreed number for an OL guy to be considered developed. You get incremental increases after that with each year.

I'm completely open to using different numbers, it's all in a spreadsheet so it's easy to change.

 

GoBlueInNYC

November 7th, 2013 at 11:09 AM ^

Just wondering where the numbers come from. I'm sure there's a much more rigorous way of creating weights to apply to the age/experience numbers, but that sounds like a lot of work.

As to assigning numbers to years as arbitrary, I'd say that that's not true. It might not be a good metric, but it's pretty clear that the difference between two numbers directly corresponds to the difference in years.

Like I said, you seem to be the only one to back-up the "the average is bad" talk with an actual alternative, which I appreciate.

reshp1

November 7th, 2013 at 11:17 AM ^

Just wondering where the numbers come from.

Feelingsball, to borrow a term from Brian. I did play around with the numbers a bit just to get a feel for how different ratings would skew the results and it really doesn't matter too much. As you say, the point is to show that averaging doesn't account for the weakest link nature of OL, where this method inherently tends to make experience at some positions not enough to overcome youth at others.

Gameboy

November 7th, 2013 at 10:50 AM ^

I think this does have the potential to be superior to what I have presented so far.

However, for this to be of any use, you cannot just pull percentages out of thin air. The data you have here is basically fantasy. And you can't just base it on Michigan stats only since it could just be that our guys are really terrible and out of norm. You would have to figure out the exact percentages of success for each class by doing UFR on all 25 teams and their games.

I am not going to attempt such an endeavor. But I would highly encourage you or anyone else to do it.

reshp1

November 7th, 2013 at 11:10 AM ^

Agree it's a little hand waving. I wrote out my rationale for the numbers in the reply to the post above (saw his first) if you're interested. (EDIT: by the way, I played around with the numbers a little and it doesn't make too much difference. Even if you assume a linear progression every year, we're still on the top end of "bust %" with this methodology because it inherently punishes having young guys more and rewards having experienced guys less)

coastal blue

November 7th, 2013 at 2:03 PM ^

Okay, but...

Shouldn't guys like Kalis and Magnuson, RS FR who were supposedly two of the best players in the country at their position coming into college, be better than, say, true sophomores or RS SO who were not elite prospects?

 

Muttley

November 7th, 2013 at 11:24 AM ^

e.g., at Alabama, an all-league caliber senior may not have seen the field until his junior or senior year.  That's a good thing for the program, as it's a result of enviable depth.

CarrIsMyHomeboy

November 7th, 2013 at 11:25 AM ^

I would like to know the percentile in which Michigan falls nationwide for having a depth chart with 18.125% upperclass constitution:

10 freshmen

3 sophomores

1 junior

2 seniors

 

/I'm a broken record too

/but my record seems more broken

akim

November 7th, 2013 at 11:31 AM ^

I think your data is interesting and I think that you are bringing up some good data representing what we have.

The argument some people are making is that having two 5th year seniors at tackle is "skewing" the average.  I agree with your statement that the average is the average and it's not really up for debate - I think better arguments would be made that there are diminishing returns on years of experience, and that the idea that the line is really only as good as its weakest link.  

After 1 year on the field a lineman makes a big jump in experience and knowledge, however the difference between a 4th year lineman and a 5th year lineman is probably in the grand scheme of things not that much.

On the second point, you only need 1 guy to break through to the quarterback for bad things to happen, and 2 pretty much leads to a sack unless you are Denard Robinson.  While the tackles stop the DEs on the side if you get 1-2 guys up the middle, it's probably a lost cause.

I'm not saying you need to redo this data - you've put together good data that supports a point you are trying to make, and I wouldn't really know how to quantify the diminishing returns on experience (and it doesn't really help that there's no good measurable stat for an individual offensive lineman's progress).

Also, I don't think that because people aren't proposing an alternative it means that they are wrong, and you're not really helping yourself with the snarkiness in your responses.  I thought the original posts were fine and if people want to expand and draw their own conclusions you provided the data you had as well.