A week later, the whir of fax machines has subsided, relegating that particular bit of archaic technology to another 364 straight days of irrelevance. As the cacophonous roar of drive-by "analysis" rushes from one major sporting news event to the next and all eyes turn to the doldrums of February -- a month devoid of anything and everything resembling football-related activity -- we're left to ponder just what occurred on National Signing Day and what it means for our favorite team(s).
If you're a casual-to-rabid follower of college football recruiting at all, you've no doubt seen the following sentiment peppered across the Internet:
The people who most often find recruiting rankings irrelevant are the ones who's favorite team have a bad recruiting ranking. #SigningDay— Tim Watts (@TimWatts_BOL) February 5, 2013
This type of comment is not at all surprising given the particular allegiance of this gentleman, a fellow whose vested interests include selling and maintaining a subscriber base, among other things. It's a rather convenient statement for a man involved in a cottage industry adjunct to Alabama football (or any other helmet school, for that matter), since programs with many built-in advantages like the Tide generally don't have to worry about bad rankings.
Still, this mentality trickles down from the top programs in college football to the rest of us, at which point the Church of Star Ratings and the associated high priests of the various recruiting networks disseminate the gospel: to win, you must recruit well; your recruit ranking has a direct relationship with the results on the field. Given that, it's no surprise to read the following passed off as insightful commentary:
Gerry DiNardo: General impressions are that Ohio State and Michigan upgraded themselves pretty significantly, with Nebraska a little bit of a distant third. I think Minnesota and Purdue struggled the most, and everyone in between had about the same classes they've had the last six years or so, since I've been following it.
Myrick and running back Berkley Edwards provide a nice boost of athleticism, as does versatile prospect Donovahn Jones. Three-star outside linebacker Rayfield Dixon is a rangy athlete who should develop into a productive edge rusher. Junior college inside linebacker Damien Wilson will provide immediate depth at the position. Under Armour All-American Ryan Santoso is the seventh-ranked place-kicker nationally and easily the biggest as he measures 6-5 and 270 pounds.
ESPNU subsequently graded the Gophers' class at C-, and rated them 75th in the nation -- behind 9 non-BCS conference programs.
On first glance, the Gophers' newest class has not won over the major recruiting sites. Rivals.com and Scout.com ranked the team's latest additions as last in the Big Ten. Rivals has the Gophers slotted at No. 61 out of 125 FBS teams, while Scout has them ranked 77th overall. Based on Rivals' ratings, Minnesota and Iowa are the only teams in the conference without a four-star or three-star commit.
Wednesday is college football's national signing day, and for the second consecutive year, the Gophers have the lowest-ranked class in the Big Ten, according to Rivals.com.
Still, analysts following coach Jerry Kill on the recruiting trail see signs of progress beyond the team's improvement from 3-9 two years ago to 6-7 last season.
Judging recruiting classes that aren’t among the top 60 in the country is extremely difficult because it’s almost impossible to figure out which two-star or unrated prospects could end up making an impact. Many members of Gophers coach Jerry Kill’s 2013 class just haven’t been scouted.
The rankings prove it to be true, but do you think the Gophers are recruiting at a significantly lower level under Jerry Kill than Tim Brewster?
But can the Gophers ever expect to land some five-star and four-star recruits or are they just going to be relying on developing three-star guys and below under Kill?
I have to ask again if there’s any chance that the Gophers could break into the top half of the Big Ten in recruiting in the coming years?
Yes, it is a factual statement that, according to the four major Internet recruiting services, Minnesota's recruiting class ranks last in the Big Ten. What's less understood, however, is what that's supposed to mean. The reason behind this is simple: little real analysis has been produced to show that the ratings from the big recruiting services have any significant explanatory or predictive value (no, this doesn't count as real).
Luckily, Winthrop Intelligence did conduct such an analysis, and what it found was rather surprising... if you're a recruitnik:
To analyze this question, we focused on the last five seasons, 2006-2010, giving us nearly 600 teams and over 11,000 players to analyze. Looking at that group as a whole, we found there’s no direct relationship between the amount of 3+star players on a team and overall winning percentage.
However, when we review the same data but control for strength of schedule and strength of conference by comparing teams within each conference, we found that more star players on a team usually means that team will perform better in the conference than teams in the same conference with less star players. Specifically, 25 times out of 55 (or 45% of the time), the team with the most starred players in the conference ended up with the best or second-best conference winning percentage. Likewise, the team with the least amount of starred players in the conference finished in the bottom half of conference winning percentage 33 times out of 55 (60% of the time). Some conferences showed more of a tendency for the amount of star players to predict the conference rankings (the SEC and WAC), but not consistently and not all that significantly.
That's a start, though the conclusions of the last two sentences are dubious at first glance for those without a background in statistics. If you apply the concepts of conditional probability and assume a program in a conference of 12 is as likely to finish with the most star players as with the fewest (not a good assumption* in practice, but stick with me here), the chance that a given team has the best recruiting and finishes first or second in the conference is 3.8%. The chance that a given team has the fewest stars and still finishes in the top half of the conference? 3.3%.
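For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the two figures are joint probabilities: the 1-in-12 chance of holding a given talent rank multiplied by the rates Winthrop reported (25 of 55 and 33 of 55). The variable names are mine, purely for illustration.

```python
from fractions import Fraction

# Assumption from the text: in a 12-team conference, every program is equally
# likely to hold any given talent rank (most stars, fewest stars, etc.).
teams_in_conference = 12
p_talent_rank = Fraction(1, teams_in_conference)

# Rates reported by Winthrop Intelligence across 55 conference-seasons.
p_top_two_given_most_stars = Fraction(25, 55)
p_bottom_half_given_fewest_stars = Fraction(33, 55)
p_top_half_given_fewest_stars = 1 - p_bottom_half_given_fewest_stars

# Joint probability = P(talent rank) * P(finish | talent rank).
p_most_stars_and_top_two = p_talent_rank * p_top_two_given_most_stars
p_fewest_stars_and_top_half = p_talent_rank * p_top_half_given_fewest_stars

print(f"{float(p_most_stars_and_top_two):.1%}")     # 3.8%
print(f"{float(p_fewest_stars_and_top_half):.1%}")  # 3.3%
```

Swap in a different conference size or a different observed rate and the joint probabilities move accordingly, which is exactly why the equal-chance assumption matters so much.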
I decided to take Winthrop's methodology even further, making a few adjustments** to account for attrition (among other things) and creating a predictive model to determine whether star power can accurately predict a win or loss. In building these models, I placed teams in one of 32 buckets to represent talent levels according to Rivals.com ratings. As it turns out, the different models*** had an error rate between 30% and 35%, meaning they predicted the correct response (win or loss) based solely on home and visiting team star rating, plus whether the game took place in the home team's stadium or on a neutral field, roughly 7 out of 10 times.
That sounds like a confirmation of the star system on the surface, but I'm not so sure. If you take teams from the top five buckets (i.e. helmet schools****) against teams in buckets 6 through 32 and simply pick them to win every time, you'd hit on 70%; the bottom five buckets lose 60% of their games to programs with better ratings. However, the teams with that kind of star power (or lack thereof) are well known to the general public, regardless of the statistical index behind them; an individual in a pick 'em league would have the same accuracy in selecting winners based upon program prestige as these mathematical models would.
All of this produces the inevitable chicken vs. egg debate: are the top programs good because they get the best recruits, or do the best recruits simply gravitate towards the top programs? As a corollary, if top coaching is attracted to top talent and top talent is attracted to top coaching, which is the determinant? In essence, college football is a self-sustaining positive feedback loop: the rich get richer, as on-the-field success propels recruiting prowess, which produces more wins in the ledger. When the system is reversed, bad results lead to poor recruiting efforts, which ultimately get a coach fired. Depending upon the status of your favorite program, this can be fixed by hiring a non-underachiever to unlock the potential of your existing talent base and get you back on track relatively quickly.
In other situations (i.e. schools that don't possess structural advantages), the road to climb is much more difficult. A coach rebuilding a program will invariably need to improve the talent base, though it's unclear whether that will manifest itself first on the gridiron or on the recruiting trail. Frankly, the Internet recruiting era hasn't been around long enough to establish a control group of programs that have legitimately built themselves up either way: the sample size and populations are simply too small to define quantitatively.
Where does that leave us? Recruiting is crucial to the success of your program; otherwise coaches and staffs wouldn't go through the trouble each February of heralding the next crop of incoming players, stating how Johnny X is a great competitor or how "we've upgraded our team today," with the kind of pageantry that often inspires socials and a celebration of hopeful future prosperity. That's ultimately what recruiting and all the recruitnik zealotry is based upon: hope that these kids will contribute to turning the program around, winning a conference or national championship, or wiping the floor with your bitter rivals for a half-decade, whichever is the prerogative of your favorite team.
From a tactical and analytical perspective though, recruit ratings often miss the mark -- and in some cases, badly. As a much better writer than I stated regarding the recruiting dog and pony show:
A big part of the problem with recruit rankings as a predictor for future college football success is there's simply not enough data: team ratings on both Rivals and Scout only date back to 2002, which in turn pushes any true roster analysis out a minimum of four years given the nature of the sport. That's only seven roster cycles across 120+ teams of disparate talent levels and a lack of connectivity due to conference play.
The other problem, similar to hockey, is just how much of football isn't measured with any kind of readily available statistic. Unlike baseball, where virtually everything that happens on the field is accounted for down to the per-pitch level and sample sizes are huge, football -- especially college football -- doesn't have the depth and breadth of data collection that's required to make a truly accurate predictive method. Whereas the effectiveness of the suicide squeeze or a double-switch in late innings can not only be inferred but measured directly, deriving the strength of offensive line push, a defensive line's ability to shed blocks or whether a pass was the result of play-action (and if that matters from a data mining perspective) is not feasible given current data collection methods.
All of which raises the question: if we haven't reached a point where the available data can clearly illustrate those most fundamental but inherently implicit aspects of football, how on Earth can we predict incoming recruit success with any accuracy two to five years out?
The answer: you really can't, at least not definitively to the point where one could suggest an X increase in star rating will result in Y level of winning percentage. It simply doesn't work that way. And that is the Catch-22 of recruiting. Everyone agrees, including coaches, that you need better players in order to win. What no one seems to agree upon, other than the big recruiting services and their stakeholders, is just how quantifiable recruiting is and whether the major players in that field even come close to getting it objectively correct. Virtually every college football blogger who's involved in advanced and innovative statistics uses recruiting as a predictor variable (myself included), though the weight folks give to such ratings differs.
Personally, I favor on-the-field production over potential. That is apparently a foreign concept to some who follow "the word."
So, what are the takeaways from this discussion?
- if you follow a school in position to feast on a bounty of four and five stars, recruiting probably matters to you a whole hell of a lot, and the ratings/rankings do account for something.
- if you follow a team that consistently has the least amount of roster talent compared to the rest of the conference, said team needs to work on improving that.
- for everyone else, recruiting ratings/rankings are neither deterministic nor definitive. These types of schools likely can't recruit like a helmet school, and shouldn't try to either.
- rosters are cumulative, yet recruit ratings/rankings don't take that into account and rarely is that part of the discussion.
In Part 2, I'll expand on these thoughts relative to the Gophers' 2013 class and what exactly we're supposed to glean from the latest crop of Gopher recruits.
Appendix (for nerds only)
* It's not a good assumption in reality because only a few teams can realistically vie for that much talent consistently. In the Big Ten, for example, only Ohio State and Michigan have rated as the most talented rosters in the conference from 2006 to 2012. That's a 22.5% conditional probability that either Ohio State or Michigan will finish first or second in any given year. It just so happens to have occurred every year since 2006 except one.
On the flip side, Indiana finished with the lowest star power in the B1G every year of measure.
** My methodology is based upon a stat I derived called the "star index," which takes three things into consideration: 1) the number of 3+ star recruits on a roster over a 5-year period, according to total Rivals signees; 2) the "minimum attrition rate" of a program, found by taking the number of signees over that same 5-year period minus 85, divided by the total number of signees; and 3) the program's attrition-adjusted 3+ star recruits indexed against the mean and standard deviation of all FBS programs.
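For the curious, the three steps can be sketched roughly as follows. This is an illustration under assumptions, not the exact calculation: I'm assuming the attrition adjustment simply discounts the 3+ star count by the minimum attrition rate, that the rate is floored at zero for programs with fewer than 85 signees, and that "indexed" means a z-score. The function names and sample numbers are made up.

```python
from statistics import mean, pstdev

SCHOLARSHIP_LIMIT = 85  # the "85" in the minimum attrition rate


def minimum_attrition_rate(total_signees):
    """(signees - 85) / signees; flooring at zero is an assumption."""
    if total_signees <= 0:
        return 0.0
    return max(total_signees - SCHOLARSHIP_LIMIT, 0) / total_signees


def adjusted_three_plus_stars(three_plus_signees, total_signees):
    """Assumed adjustment: discount 3+ star signees by the attrition rate."""
    return three_plus_signees * (1 - minimum_attrition_rate(total_signees))


def star_index(programs):
    """Z-score each program's adjusted count against all programs supplied."""
    adjusted = {
        name: adjusted_three_plus_stars(three_plus, total)
        for name, (three_plus, total) in programs.items()
    }
    mu = mean(adjusted.values())
    sigma = pstdev(adjusted.values())
    if sigma == 0:
        # Every program is identical, so nobody is above or below the mean.
        return {name: 0.0 for name in adjusted}
    return {name: (value - mu) / sigma for name, value in adjusted.items()}


# Made-up five-year totals: (3+ star signees, total signees).
example = {"Program A": (90, 120), "Program B": (40, 110), "Program C": (15, 100)}
for name, index in star_index(example).items():
    print(f"{name}: {index:+.2f}")
```

A positive index means a roster with more attrition-adjusted 3+ star talent than the average program in the sample; a negative index means less.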
*** Using a process called "k-means binning," I transformed the star index of both the home and visiting teams from discrete to categorical variables, which produced 32 separate bins. Then, I used random forest, logistic regression and neural network models to produce a game result scoring file.
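To give a feel for what k-means binning does, here is a minimal one-dimensional sketch using plain Lloyd's algorithm. It is not the tool or the settings used for the models above: the function name, the starting centers and the sample values are all assumptions, and it uses 3 bins rather than 32 to keep the output readable.

```python
def kmeans_bins_1d(values, k, iterations=100):
    """Cluster 1-D values into k bins with Lloyd's algorithm.

    Returns (labels, centers); bins are numbered by ascending center.
    """
    ordered = sorted(values)
    # Deterministic start: spread the initial centers across the sorted values.
    centers = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        # Assignment step: each value joins its nearest center.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centers[j]))
            groups[nearest].append(v)
        # Update step: move each center to the mean of its group.
        updated = [sum(g) / len(g) if g else centers[j] for j, g in enumerate(groups)]
        if updated == centers:
            break
        centers = updated
    centers.sort()
    labels = [min(range(k), key=lambda j: abs(v - centers[j])) for v in values]
    return labels, centers


# Made-up star index values for nine programs.
star_indexes = [-1.2, -1.0, -0.9, 0.0, 0.1, 0.2, 1.8, 2.0, 2.1]
labels, centers = kmeans_bins_1d(star_indexes, k=3)
print(labels)  # [0, 0, 0, 1, 1, 1, 2, 2, 2]
```

Each program ends up with a bin number instead of a raw index, so programs with similar talent levels are treated as the same category by the downstream models.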
From a validation/accuracy perspective, the error rate of the 3 models ranges between 29.5% and 34.1%, while the area under the ROC curve is between .709 and .772. Generally, areas under the ROC curve in that range are acceptable for real world purposes -- and the closer the value gets to 1, the better.
**** Specific teams are Alabama, Auburn, Florida State, Florida, Georgia, LSU, Miami, Michigan, Notre Dame, Ohio State, Oklahoma, USC, Tennessee and Texas.
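If those two metrics are unfamiliar, this sketch shows one common way to compute them from a scoring file. The results and scores below are made up, the 0.5 cutoff is an assumption, and the function names are mine. An area under the ROC curve of 0.5 is a coin flip and 1.0 is a perfect ranking of wins over losses.

```python
def error_rate(actual, predicted_prob, threshold=0.5):
    """Share of games where the thresholded prediction misses the result."""
    misses = sum((p >= threshold) != bool(y) for y, p in zip(actual, predicted_prob))
    return misses / len(actual)


def roc_auc(actual, predicted_prob):
    """Probability that a random home win is scored above a random home loss.

    Ties count as half; this is equivalent to the area under the ROC curve.
    """
    wins = [p for y, p in zip(actual, predicted_prob) if y == 1]
    losses = [p for y, p in zip(actual, predicted_prob) if y == 0]
    pairs = len(wins) * len(losses)
    if pairs == 0:
        raise ValueError("need at least one win and one loss")
    concordant = sum((w > l) + 0.5 * (w == l) for w in wins for l in losses)
    return concordant / pairs


# Made-up results (1 = home win, 0 = home loss) and made-up model scores.
actual = [1, 1, 1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.7, 0.3, 0.2, 0.55, 0.5, 0.1]

print(f"error rate: {error_rate(actual, scores):.2f}")  # error rate: 0.30
print(f"AUC: {roc_auc(actual, scores):.2f}")            # AUC: 0.88
```

Note that the two numbers answer different questions: the error rate depends on where you draw the win/loss cutoff, while the AUC only cares about how well the scores rank wins above losses.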