MASTER NOTES: The Common Denominator, Part 2

Last week in Master Notes, I raised the subject of standardizing pitcher metrics to use Total Batters Faced (TBF) as a denominator. As it stands, those metrics use a bunch of different denominators, which gets in the way of combining them. And combining them is really useful sometimes.

So I downloaded the Baseball Info Solutions (BIS) pitching stats for the last 10 years or so, and started the grueling task of converting as many stats as I could to TBF denominators. In the interests of brevity and consistency, I’ll refer to them all as “percentages,” to take semantic advantage of the existing (and very useful) K% stat we already know. So every stat below is stated as occurrences of a particular outcome as a percentage of the pitcher’s TBF.
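For anyone who wants to try the conversion at home, here is a minimal Python sketch of the idea. The helper name `tbf_pct` and the season line are made up for illustration; they are not BIS data or the actual spreadsheet formulas.

```python
def tbf_pct(count, tbf):
    """Return an outcome count as a percentage of total batters faced."""
    if tbf <= 0:
        raise ValueError("tbf must be positive")
    return 100.0 * count / tbf

# Hypothetical season line (illustrative counts, not real BIS data).
season = {"TBF": 800, "K": 160, "BB": 56, "GBSft": 40}

# Convert every outcome count to a TBF-based percentage.
pcts = {
    name + "%": tbf_pct(count, season["TBF"])
    for name, count in season.items()
    if name != "TBF"
}
```

Because every stat now shares the same denominator, the resulting percentages can be added or subtracted directly.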

Besides K%, I also set up the spreadsheet to calculate TBF percentages for soft-, medium- and hard-hit grounders, flies and line drives. Then, I checked the Hit Rate (H%) chart I built last week…  

Year |   Ground Balls    |     Fly Balls     |    Line Drives
     |  Sft   Med   Hrd  |  Sft   Med   Hrd  |  Sft   Med   Hrd
=====|===================|===================|==================
2016 |  14%   23%   51%  |   8%    7%   49%  |  66%   73%   64%
2015 |  15%   23%   52%  |   8%    7%   48%  |  64%   71%   65%
2014 |  14%   23%   53%  |   7%    8%   45%  |  64%   72%   66%

… and identified outcomes as either “Good” or “Bad” from the pitcher’s perspective.

I chose several outcomes as “Good” (I know, what a wordsmith!) because they tend to do minimal damage to the pitcher’s ERA and WHIP. They keep batters from becoming baserunners, and baserunners from advancing to become runs:

  • Soft and medium fly balls (FBSft%, FBMed%) have hit rates around 7-8% and, as we know from watching the game, tend to prevent baserunners from advancing.
  • Soft grounders (GBSft%) have a 14% Hit Rate (H%) and also tend to prevent baserunners from advancing.
  • Infield flies (IF%) have H% and advance rates of zero.
  • And strikeouts (K%), which ditto, and which count as a category unto themselves (the ultimate good outcome).

IF% is already published at various sites, but there it is infield flies as a percentage of fly balls, not of all batters. That struck me as potentially quite misleading: a pitcher who allows few fly balls overall could show a higher IF% than a pitcher who gets far more IFs but also far more of every other kind of fly.
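A quick sketch shows how the two denominators can rank the same pair of pitchers in opposite order. Both pitchers and all their counts are hypothetical, and `if_rates` is just an illustrative helper name.

```python
def if_rates(infield_flies, fly_balls, tbf):
    """Return (IF per fly ball, IF per batter faced), both as percentages."""
    per_fb = 100.0 * infield_flies / fly_balls
    per_tbf = 100.0 * infield_flies / tbf
    return per_fb, per_tbf

# Hypothetical pitchers with the same workload (800 TBF each).
groundballer = if_rates(infield_flies=12, fly_balls=100, tbf=800)
flyballer = if_rates(infield_flies=30, fly_balls=300, tbf=800)

# The groundballer leads on the published per-fly-ball version...
groundballer_leads_per_fb = groundballer[0] > flyballer[0]
# ...but the flyballer gets more infield flies per batter faced.
flyballer_leads_per_tbf = flyballer[1] > groundballer[1]
```

Here the groundballer shows 12% against 10% on the per-fly-ball version, while the flyballer shows 3.75% against 1.5% per batter faced.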

I also chose several outcomes as “Bad,” because they put runners on and/or advance runners (also, it’s the opposite of “Good,” which helps me keep it all straight in my mind):

  • Walks (BB%) and HBP (HBP%) put runners aboard.
  • Line Drives (LD%) have much higher hit rates than other trajectories, which means more runners aboard via hits and more runners advancing.
  • Hard-hit fly balls (FBHd%) are hits about half the time and are well over-represented among extra-base hits, so batters get into scoring position right away and often push runners around, especially runners at third, who score even on outs.
  • And HR%, which, obviously.

Yes, I know there are some overlaps. IFs are obviously captured in fly balls, mostly of the soft variety, so they’re double-counted. I can live with that for now, because of the especially high goodness of the outcome on an IF. And yes, most HRs are also part of the hard-hit FB outcome, but again, a HR is a special kind of bad, so I’ll live with it for now.

Without getting too enmeshed in the details and numbers, it turned out that most established regular pitchers are pretty consistent year-to-year across the board in these outcomes. But it also turned out that many pitchers had some “Good” categories and a few not-so-good or even “Bad” categories. So I took the next step and added all the Goods into a Good% and all the Bads into a Bad%.
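In code, the two totals are plain sums over the category lists above. This is only a sketch: the dictionary keys follow the abbreviations used in this article, the pitcher is hypothetical, and the double-counting noted earlier (IFs within soft flies, HRs within hard flies) is left in on purpose.

```python
# Category lists as described in the article.
GOOD = ("K%", "IF%", "GBSft%", "FBSft%", "FBMed%")
BAD = ("BB%", "HBP%", "LD%", "FBHd%", "HR%")

def good_bad(pcts):
    """Sum the TBF-based percentages into (Good%, Bad%)."""
    good = sum(pcts[cat] for cat in GOOD)
    bad = sum(pcts[cat] for cat in BAD)
    return good, bad

# Hypothetical pitcher (illustrative percentages, not real data).
pitcher = {
    "K%": 22.0, "IF%": 3.0, "GBSft%": 6.0, "FBSft%": 7.0, "FBMed%": 8.0,
    "BB%": 7.0, "HBP%": 1.0, "LD%": 14.0, "FBHd%": 9.0, "HR%": 3.0,
}
good_pct, bad_pct = good_bad(pitcher)
```

For this made-up pitcher the totals come to a Good% of 46 and a Bad% of 34.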

The Top-10% of pitchers in the “Good%” group had combined percentages over 50%. The list reads like the top of the Cheat Sheet: Scherzer (#1), Darvish, Kershaw, Bumgarner and Strasburg. But what’s intriguing is the names that aren’t obvious top-rounders, like Drew Smyly, Marco Estrada, Rich Hill, Eduardo Rodriguez, Drew Pomeranz and Matt Boyd.

Boyd makes an interesting case in point. His 20% K% in 2016 was just league-average. But his 8% FBSft% and 10% FBMed% meant he generated more cans o’ corn than the Jolly Green Giant.

His more defined TBF profile also makes the 14% HR/F he rang up look even flukier. Remember, HR/F counts HR as a percentage of all fly balls, including the great many soft- and medium-hit FBs that could never be HRs. The true luck gauge should be the percentage of hard FBs that go yard. Gamewide in 2016, that number was 33%; Boyd’s number was 41%. Knock those eight percentage points off his HR/FBHd% and he’d have had three or four fewer HRs, with matching improvements in Strand Rate and, therefore, ERA.
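The arithmetic behind “three or four fewer HRs” can be sketched as follows. Only the 33% gamewide rate and the 41% pitcher rate come from the article; the hard-fly-ball and HR counts are illustrative values chosen to reproduce a rate near 41%, not Boyd’s actual totals.

```python
LEAGUE_HR_PER_HARD_FB = 0.33  # 2016 gamewide rate quoted in the article

def expected_hr(hard_fly_balls, rate=LEAGUE_HR_PER_HARD_FB):
    """HRs a pitcher would allow if hard flies left the yard at the given rate."""
    return hard_fly_balls * rate

# Illustrative counts only (assumed, not taken from the data set).
hard_fb = 41
actual_hr = 17

pitcher_rate = actual_hr / hard_fb            # roughly 0.41
hr_saved = actual_hr - expected_hr(hard_fb)   # HRs removed at the league rate
```

With these assumed counts, regressing to the gamewide rate removes between three and four home runs.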

After looking at these results for a while, I had a couple of realizations. First, “Hey, I don’t have to take out the garbage as long as I’m ‘working,’ ” and second, “Hey, if there are secretly good pitchers by TBF%, are there also secretly bad ones by Bad%?” We report, you decide: The worst 10% of Bad% starters last year included Jered Weaver, Aaron Blair, Jesse Hahn, Shelby Miller, James Shields, Phil Hughes and Jose Berrios. Keep those names in mind when you’re calculating likelihood of bouncebacks.

The best (lowest) 10% among the Bad% again included most of the big names (Kershaw was first) but also such chin-scratchers as Williams Perez, Jameson Taillon, James Paxton and Aaron Nola.

Finally, I thought, “If there are some pitchers who are good at being Good and others who are good at not being Bad, what about the ones who are good both ways?” So I subtracted every pitcher’s Bad% from his Good%. And whaddya know? Kershaw-Scherzer-Darvish-Strasburg-Verlander-Syndergaard-Bumgarner are all at the top of the Net Good% list, with scores of up to +31% (Kershaw). Corey Kluber and Kyle Hendricks also made the Top-10%.

But look who else is among the elite, with Net Good% scores at least double the gamewide level of +9%: Rich Hill (again), Drew Smyly (again), Drew Pomeranz (again), Steven Wright, Rick Porcello (whom everybody’s looking for reasons not to draft), Nola (again) and Estrada (again).
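Here is a rough sketch of the Net Good% step and the “at least double the gamewide level” cut. The +9% gamewide figure is the one quoted above; the three pitchers and their Good%/Bad% pairs are hypothetical stand-ins.

```python
GAMEWIDE_NET_GOOD = 9.0  # gamewide Net Good% quoted in the article

def net_good(good_pct, bad_pct):
    """Net Good% is simply Good% minus Bad%."""
    return good_pct - bad_pct

# Hypothetical (Good%, Bad%) pairs, not real pitchers.
pitchers = {
    "Pitcher A": (52.0, 21.0),
    "Pitcher B": (44.0, 35.0),
    "Pitcher C": (40.0, 38.0),
}

nets = {name: net_good(good, bad) for name, (good, bad) in pitchers.items()}

# Rank best to worst, then keep those at least double the gamewide level.
ranked = sorted(nets, key=nets.get, reverse=True)
elite = [name for name in ranked if nets[name] >= 2 * GAMEWIDE_NET_GOOD]
```

Only Pitcher A clears the bar in this made-up sample; Pitcher B sits right at the gamewide level.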

The next level of refinement here could be to weight the categories somehow, to tease out the double-counting and to more accurately reflect actual contributions to success or failure. Maybe Ks would be fully counted in the numerator, with those cans o’ corn getting a 0.92 weighting, GBSft% getting 0.85, and so on down.
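As a very rough sketch of what such a weighting might look like, the snippet below uses only the weights floated above. Applying 0.92 to both soft and medium flies is an assumption about what “cans o’ corn” covers, IF% is left out because no weight is given for it, and the pitcher is hypothetical.

```python
# Weights floated in the article; anything without a stated weight is omitted.
WEIGHTS = {"K%": 1.0, "FBSft%": 0.92, "FBMed%": 0.92, "GBSft%": 0.85}

def weighted_good(pcts, weights=WEIGHTS):
    """Weighted sum of the Good categories that have a weight."""
    return sum(weight * pcts[cat] for cat, weight in weights.items())

# Hypothetical pitcher (illustrative percentages, not real data).
pitcher = {"K%": 20.0, "FBSft%": 8.0, "FBMed%": 10.0, "GBSft%": 6.0}
score = weighted_good(pitcher)
```

This pitcher’s unweighted total of 44 drops to about 41.7 once the weights are applied, so pitchers who lean on strikeouts lose the least.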

But that’ll have to wait until next time. Once again I’ve reached my 1,000 words. And I still haven’t taken the garbage out.

BONUS: Download a PDF with the various TBF% metrics discussed above here


