The Good, the Bad, and the Rock Fights: Revisiting Cal Football's Four Categories of Games

Or, is a pillowfight still an actual pillowfight?

Oct 03, 2024

What has Substack’s AI image generator done to those bears’ faces?

Each week we use the previous game’s PFF grades to put it into one of four different categories: Good, Bad, Rockfight, or Pillowfight. This week we have no new game to examine, so what are we going to do? Time for some introspection!

These categories are generated by a machine learning clustering algorithm, and it’s up to us (well, mostly me) to decide what these categories mean. You see, the algorithm does not assign any specific meaning to any of the categories; it simply sorts through the data and puts games into coherent piles of similar games. By looking at how these statistics overlap and contrast across categories, we can interpret what each category means (hence the Good, Bad, Rockfights, and Pillowfights monikers).
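For readers curious how an algorithm can build "piles of similar games" without knowing what the piles mean, here is a minimal sketch of k-means (the algorithm named later in this post) in plain Python. It is not the code behind this series: the two-grade game data is invented for illustration, and the deterministic "farthest-first" seeding in the hypothetical `farthest_first` helper is my own choice to keep the example reproducible.

```python
import math

def farthest_first(points, k):
    """Deterministic seeding: start at the first point, then repeatedly add
    the point farthest from every centroid chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    return [list(c) for c in centroids]

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm. Returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: each game joins the pile with the nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:  # no game changed piles, so we are done
            break
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, labels

# Hypothetical (offense, defense) grades, invented for illustration only.
games = [
    (80, 78), (78, 80), (82, 76),   # strong on both sides
    (50, 52), (48, 50), (52, 48),   # weak on both sides
    (52, 80), (50, 78), (48, 82),   # weak offense, strong defense
    (80, 50), (78, 52), (82, 48),   # strong offense, weak defense
]
centroids, labels = kmeans(games, k=4)
print(labels)
```

Note that the output is just a pile number for each game. Nothing in the code knows the word "Rockfight"; the names only come afterward, when a human looks at what the games in each pile have in common.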

These categories are not stationary over time, however. Longtime readers of this series may remember that we originally started with three categories: Good, Bad, and Rockfights. As we accumulated more and more games with highly rated offense and poorly rated defense, a new category emerged: Pillowfights! It has been a while since I have looked into the data to see if we have a new category coalescing. Diagnostics of the clustering algorithm show that four categories still adequately represent the data, even as the boundaries of those categories evolve. But that doesn’t mean things aren’t changing within the categories. As we accumulate more data, the composition of each category may shift: Rockfights may become more or less offensively challenged, or the defensive criteria for inclusion in the Rockfights may become more or less strict. To ensure that our categories still represent what we think they represent, let’s look at what kinds of games belong to each category.
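I have not spelled out which diagnostics are involved, so take the following as one common option rather than the method used here: the silhouette score, which rewards groupings whose members sit close together and far from other groups. The sketch below is plain Python, the game data is hypothetical, and the function assumes at least two groups.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points (needs at least two clusters).
    Values near 1 mean tight, well-separated clusters."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # usual convention for a one-game cluster
            continue
        # a: mean distance to the other games in the same cluster
        # (the point's distance to itself is 0, so it adds nothing to the sum).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(
            sum(math.dist(p, q) for q in other) / len(other)
            for other_lab, other in clusters.items()
            if other_lab != lab
        )
        scores.append(0.0 if max(a, b) == 0 else (b - a) / max(a, b))
    return sum(scores) / len(scores)

# Hypothetical (offense, defense) grades, invented for illustration only.
games = [
    (80, 78), (78, 80), (82, 76),
    (50, 52), (48, 50), (52, 48),
    (52, 80), (50, 78), (48, 82),
    (80, 50), (78, 52), (82, 48),
]
four_groups = [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3]
two_groups = [0, 0, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0]  # split on offense only

score_four = silhouette_score(games, four_groups)
score_two = silhouette_score(games, two_groups)
print(score_four, score_two)
```

In practice you would compute a score like this for each candidate number of categories and check whether four still holds up against three or five.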

This review was inspired by a peculiar result that emerged after we added the PFF grades for the Florida State game. Earlier this year the wins over Auburn and SDSU were classified as Good, but now…

When we left off last week after adding Florida State, something unusual happened with the previous two games against SDSU and Auburn…

They’ve become Pillowfights! Just above the center of the plot, they have jumped into the lower left side of the Pillowfights. According to the k-means clustering algorithm they are now Pillowfights, even if the inter-ocular percussion test suggests they should have been Good or maybe even Rockfights (they certainly felt like Rockfights at times, didn’t they?).

With our worldview potentially collapsing upon itself, let’s look at what these categories currently represent. Is a pillowfight still a pillowfight?

Using our most recent set of clusters, I examined how much a cluster membership reflects strengths or weaknesses within a grade. In the plot below, each color-coded dot indicates whether a particular grade is higher than expected (above 0) or lower than expected (below 0) for that category. For example, the Overall grade is strongly biased upward for games in The Good and strongly biased downward in games in The Bad.

What effect does each category have on the PFF grade?
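One simple way to read "higher or lower than expected" is the average grade within a category minus the average grade across all games. The sketch below computes exactly that. It is an assumption about the idea, not necessarily how the plot was built (a standardized or model-based estimate would look a bit different), and the grades are hypothetical.

```python
def category_effects(games):
    """games: list of (category, {grade_name: value}) pairs.
    Returns {category: {grade_name: category mean minus overall mean}}."""
    grade_names = list(games[0][1])
    overall = {
        g: sum(grades[g] for _, grades in games) / len(games)
        for g in grade_names
    }
    by_category = {}
    for category, grades in games:
        by_category.setdefault(category, []).append(grades)
    return {
        category: {
            g: sum(row[g] for row in rows) / len(rows) - overall[g]
            for g in grade_names
        }
        for category, rows in by_category.items()
    }

# Hypothetical grades, invented for illustration only.
games = [
    ("Good", {"Overall": 78, "Tackling": 70}),
    ("Good", {"Overall": 82, "Tackling": 74}),
    ("Bad", {"Overall": 55, "Tackling": 50}),
    ("Bad", {"Overall": 59, "Tackling": 54}),
]
effects = category_effects(games)
for category, row in effects.items():
    print(category, row)
```

A positive number means that grade runs above the all-games average in that category, and a negative number means it runs below.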

What does this tell us about each category? And does that align with how we have traditionally understood these categories?

The Good: every grade is better than usual, and 7 of the 12 grades reach their highest values here. It looks like The Good accurately captures the best overall performances.
The Bad: every grade is worse than usual (except run blocking), and 6 of the 12 grades hit their lowest values here. The Bad seems to accurately capture some of the worst overall performances.
Rockfights: we have typically understood Rockfights to be all-defense, no-offense matchups. The patterns of the grades largely support this, as defensive categories excel, including 3 with their strongest scores. Offense mostly struggles but run blocking achieves normal grades and pass blocking achieves better-than-usual grades. Despite that, running and passing struggle mightily. So this adds a bit of nuance to our understanding of Rockfights. A game can qualify with some excellent defense and not-so-great offense, but it can also move towards the Rockfights with some decent blocking.
Pillowfights: opposite the Rockfights are Pillowfights, which have historically been all-offense, no-defense affairs. All defensive grades are worse than usual, including 3 that tend to achieve their worst grades (run defense, tackling, and coverage—which sounds like an awful combination of things to go wrong). The offensive grades are an interesting inverse from Rockfights. While Rockfights have decent blocking grades and terrible everything else on offense, Pillowfights have terrible blocking grades and solid offensive grades everywhere else.

Translating the abstract grade trends above into actual grades, the following chart shows the average grade across each game type for each category. This shows us the prototypical grades of a team belonging to each of the categories.

So how is it that the San Diego State and Auburn games became Pillowfights? It starts to make sense if we go through this category by category and compare the results to our average grades above.

Both Auburn and SDSU have decent grades for Offense, Passing, Receiving, and Running, which moves them towards The Good or Pillowfights. But their Pass Blocking and Run Blocking grades are bad, and much more closely aligned with Pillowfight blocking grades (55 for pass, 54 for run, on average) than Good blocking grades (71 for pass, 64 for run). So they both fit that odd Pillowfight profile where the offense excels despite O-line issues. Defensively, the grades for both games mostly fall in the mid-60s (except Tackling, which has been a serious problem this year). Defensive grades in The Good tend to be in the 70s, except for Pass Rush, which is usually in the mid-60s. Interestingly, those grades for Auburn and SDSU sit almost directly in the middle between typical grades in The Good and typical grades among Pillowfights. This likely explains why they sit just across the border from The Good; they’re defensively about 5 points short of Good but offensively they closely fit the unusual profile of a Pillowfight.
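To make the "which category is this game closer to" comparison concrete, here is a small sketch using only the two blocking averages quoted above. The real clustering weighs all 12 grades at once, so this covers just one slice of the decision, and the game's blocking grades below are hypothetical placeholders, not the actual Auburn or SDSU numbers.

```python
import math

# Average blocking grades quoted in the post: (pass blocking, run blocking).
centroids = {
    "Pillowfights": (55, 54),
    "Good": (71, 64),
}

def nearest_category(grades, centroids):
    """Return the category whose average grades are closest (Euclidean)."""
    return min(centroids, key=lambda name: math.dist(grades, centroids[name]))

# Hypothetical blocking grades for a single game, invented for illustration.
hypothetical_game = (57, 52)

distances = {
    name: math.dist(hypothetical_game, center)
    for name, center in centroids.items()
}
closest = nearest_category(hypothetical_game, centroids)
print(distances, closest)
```

A game with blocking grades in the mid-50s lands far closer to the Pillowfight averages than to the Good averages, which is the pull described above.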

So what have we learned? Rockfights and Pillowfights continue to be all-defense-no-offense and all-offense-no-defense affairs, respectively, but they’re increasingly diverging based on O-line play. The O-line turns in average-to-above-average performances in Rockfights and below-average performances in Pillowfights. For this reason we’re starting to see more and more games with good-but-not-great defense getting pulled into Pillowfights due to offensive line struggles. And as this team continues to struggle at O-line, we may continue to see more games that fall just short of Good for that reason. Such games obviously aren’t Bad, and they have too much offensive success to be Rockfights. So they increasingly end up as Pillowfights and slowly start to change what it means to be a Pillowfight.

Write For California
