RESEARCH: Sample size—When less is more

Introduction

In baseball forecasting, it is widely understood that more data is better when trying to model future performance. Today we examine that assumption for pitchers, and find that occasionally a smaller data set is actually better. We will also explore at what point the recent data becomes more significant than the historical data.

Methodology

We will use pitching data from 2010-2017, since 2010 is the season in which Baseball Info Solutions began using an algorithm to classify quality of contact. For annual data, we use pitchers with ≥ 120 IP. For monthly data, we use only pitchers with ≥ 25 IP in that month.

Throughout the article we will use R² as a measure of correlation between data sets. The R² value describes...
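
As a rough illustration of the statistic mentioned above, here is a minimal Python sketch. It assumes R² means the squared Pearson correlation between two paired series (for example, one pitcher metric in two different periods). The article does not state exactly how its values are computed, so the function name r_squared and the sample numbers are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean


def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length sequences.

    Assumption: this is one common definition of R^2 for paired data;
    the article may compute its values differently.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mx, my = mean(xs), mean(ys)
    # Sums of cross-products and squared deviations from the means.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("R^2 is undefined when either series is constant")
    return (sxy * sxy) / (sxx * syy)


# Made-up illustrative values, not data from the study.
period_a = [3.10, 3.80, 4.20, 4.90]
period_b = [3.40, 3.60, 4.50, 4.70]
print(round(r_squared(period_a, period_b), 3))
```

A value near 1 means one series tracks the other almost linearly, while a value near 0 means there is essentially no linear relationship between them.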
