Introducing Predict+
Measuring MLB and AAA pitcher unpredictability through machine learning analysis of pitch selection
Introduction
If you’re anything like me – which, if you’re reading an essay about a new pitching model, I have to assume you are – then you have a particular tic that comes out when you are watching a baseball game. It probably happens so often and so automatically that you fail to register it, thinking it’s something everyone does, but in fact it is not. I’m talking, of course, about guessing the next pitch that gets thrown. Whether you’re doing it out loud (and, like me, annoying your kids) or just in your head, guessing that next pitch is as much a part of the routine for baseball addicts as singing “Take Me Out to the Ballgame” in the middle of the seventh.
This habit – and the October 2025 government shutdown that left me furloughed from my day job leading child nutrition research at USDA – led me to investigate whether certain pitchers are more predictable in their pitch selection in any given situation than others. And so Predict+ was born: a machine learning model that brings probability to the age-old ballpark guessing game and produces a metric showing how unpredictable a pitcher’s pitch selection is in a given situation.
Methods
The model learns statistical patterns in a pitcher’s pitch selection – given the count, batter handedness, game situation, and so on – over a set training period. It then takes a test set – this could be an entire season, a certain number of days, or a single game – and does the same thing you or I do in the stands, which is guess at what pitch is likely to be thrown using the inputs available. Mathematically, it uses what’s called a multinomial logistic regression model to do this. In a way, it’s modeling what opposition and self-scouting have been doing for over a century: looking for patterns and biases in pitchers’ selections.
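To make that concrete, here’s a toy Python sketch of how a multinomial logit turns a game state into pitch-type probabilities. This is not the actual Predict+ code (which is written in R), and the features, pitch types, and coefficients below are invented for illustration – a real fit would learn the weights from the pitcher’s training-period data.

```python
import math

# Hypothetical feature vector for one pitch decision:
# [intercept, balls, strikes, batter_is_lefty]
x = [1.0, 3, 2, 1]

# Hypothetical coefficients, one row per pitch type. These are made-up
# numbers, not fitted values from Predict+.
coefs = {
    "four_seam": [0.4, -0.10, 0.30, 0.05],
    "slider":    [0.1, 0.05, 0.20, -0.15],
    "changeup":  [-0.3, 0.15, -0.10, 0.25],
}

def softmax_probs(x, coefs):
    """Multinomial-logit pitch probabilities: softmax over linear scores."""
    scores = {p: sum(w * xi for w, xi in zip(ws, x)) for p, ws in coefs.items()}
    m = max(scores.values())  # subtract the max for numerical stability
    exps = {p: math.exp(s - m) for p, s in scores.items()}
    z = sum(exps.values())
    return {p: e / z for p, e in exps.items()}

probs = softmax_probs(x, coefs)
# The model's "guess" in the stands is simply the highest-probability pitch,
# but Predict+ cares about the full probability distribution, not just the argmax.
```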
During the test period, the model registers its “surprise” at each actual pitch selection relative to what it predicted in that situation. It does this with two models. The first is a basic model that relies on the pitcher’s overall tendencies, conditioned only on count and batter handedness. The second brings in a lot of other data, including the game situation and the number of times the pitcher has already faced the batter, as well as the previous pitches in the sequence. In the aggregate, this produces a metric representing the overall surprise at a pitcher’s selections, computed by comparing the surprise registered by the more complex model against that of the basic one.
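“Surprise” here can be read as log loss: the negative log of the probability a model assigned to the pitch that was actually thrown. A minimal sketch of the two-model comparison, with made-up probabilities standing in for the fitted models:

```python
import math

def surprise(prob_of_actual):
    # Surprise at the actual pitch: negative log of the probability the
    # model assigned to what was actually thrown (i.e., log loss).
    return -math.log(prob_of_actual)

# Hypothetical test-period pitches: the probability each model gave the
# pitch that was actually thrown (numbers invented for illustration).
pitches = [
    {"basic": 0.50, "full": 0.70},  # richer context helped here
    {"basic": 0.30, "full": 0.25},  # ...but not here
    {"basic": 0.40, "full": 0.55},
]

basic_surprise = sum(surprise(p["basic"]) for p in pitches) / len(pitches)
full_surprise = sum(surprise(p["full"]) for p in pitches) / len(pitches)

# One plausible reading of the comparison: the more the richer model
# reduces average surprise relative to the tendencies-only model, the more
# pattern-driven -- and thus predictable -- the pitcher's selection is.
surprise_gap = basic_surprise - full_surprise
```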
I chose to convert this along the lines of Stuff+, where 100 represents league average in the timeframe in question. Scores above 100 represent a more unpredictable selection of pitches, i.e. the model, trained on the pitcher’s tendencies, was below average in guessing the pitcher’s actual choices. Scores below 100 represent a more predictable selection of pitches, i.e. the model was better than average at guessing the pitcher’s actual choices. Each 10 points above or below represents a single standard deviation off the mean for that time period, so pitchers with Predict+ above 110 are very unpredictable while those below 90 can be guessed more easily.
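The scaling described above is straightforward to sketch: standardize each pitcher’s raw score against the league, then map the mean to 100 and one standard deviation to 10 points. The raw scores below are invented; only the scaling convention comes from the text.

```python
import statistics

def predict_plus(league_raw_scores, pitcher_raw_score):
    """Scale a raw score to the Stuff+-style convention: league mean maps
    to 100, and each standard deviation maps to 10 points."""
    mean = statistics.fmean(league_raw_scores)
    sd = statistics.pstdev(league_raw_scores)
    return 100 + 10 * (pitcher_raw_score - mean) / sd

# A hypothetical five-pitcher league (made-up raw scores):
league = [0.8, 0.9, 1.0, 1.1, 1.2]

# An exactly-average pitcher lands at 100; one standard deviation above
# the mean lands at 110 ("very unpredictable" territory).
avg_score = predict_plus(league, 1.0)
```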
Results
Results from the model for the 2025 regular season are available in a few formats:
- A Tableau dashboard with the 2025 regular season Predict+ scores, which allows filtering by league, team, and number of pitches thrown, as well as plotting against a range of outcome and process data from FanGraphs and Savant.
- Underlying data from the 2025 regular season for MLB and AAA in Excel format, for your own analyses.
Here’s an example showing MLB starters in the 2025 regular season with Predict+ compared to whiff rates.
Discussion
A few things jumped out at me from this initial data run.
Position players are the most predictable. I didn’t expect position players like Kiké Hernandez to show up in the model, since I limited it to those with 250 or more pitches, but I guess their teams were particularly nonchalant about letting them rack up pitches in blowouts. The good news is that they’re way down at the bottom of the metric, so they had the most predictable pitch selection by a sizable margin. This isn’t surprising, of course, but it’s a nice confirmation that the model is showing something that reflects the real world. If Austin Hedges showed up somewhere in the middle of the pack, I probably would have had to throw out the model. The results of statistical modeling aren’t always immediately intuitive – see the next point – but if your model’s saying that it’s statistically likely that grass is red and the sky is green then you know you have an issue.
Predict+ shows some correlation with outcomes. I looked at the Pearson’s correlation coefficient between Predict+ and a number of other metrics, using the number of pitches in the test period as weights. To do this, I split the sample into starters and relievers, because the two are qualitatively different in what predictability means, and outcome metrics for relievers are subject to more noise than those for starters. For starters, there’s a modest correlation (a little over 0.1) between a higher Predict+ and a better xFIP, SIERA, strikeout rate, and induced whiff rate.
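A pitch-weighted Pearson correlation isn’t built into Python’s standard library, so here’s a small sketch of the computation: weighted means, a weighted covariance, and weighted variances. The Predict+ and strikeout-rate numbers below are invented for illustration, not from the actual data run.

```python
import math

def weighted_pearson(x, y, w):
    """Pearson correlation with observation weights (here, pitches thrown
    in the test period), via weighted means, covariance, and variances."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y)) / sw
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x)) / sw
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y)) / sw
    return cov / math.sqrt(vx * vy)

# Hypothetical example: Predict+ vs. strikeout rate for four starters,
# weighted by pitch counts (all numbers invented).
pp = [95, 102, 108, 115]
k_rate = [0.20, 0.22, 0.25, 0.27]
n_pitches = [2400, 2600, 2100, 2500]
r = weighted_pearson(pp, k_rate, n_pitches)
```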
Gimmicky pitchers are among the hardest to predict. Trevor Megill strictly throws two pitches. Tommy Kahnle throws a lot of changeups. But the model considers them two of the more unpredictable pitchers because they surprise the model when they don’t do what it expects. Being able to throw your limited arsenal in any situation is a different sort of unpredictability than having a wide arsenal, but it doesn’t make things easier for batters to guess ahead of time.
Predict+ can help explain how some pitchers succeed when you also consider stuff. Trevor Megill is a two-pitch pitcher who’s really hard to predict. Mason Miller is basically a two-pitch pitcher who’s easy to predict. Both are successful closers, but Megill’s success relies more on being able to switch between his two pitches in almost any situation (except when behind 3-0, which doesn’t happen much). Miller’s success relies on great stuff. He doesn’t care if you know what he’s going to do; he’s going to throw it by you anyway. Similarly, Dylan Cease is often cited for his lack of a viable third pitch, but he actually scores quite well in Predict+, suggesting that he varies his pitch selection in an unpredictable manner even if his arsenal is narrow.
Interpretation can be a little fraught for full season statistics. Pitchers who fiddle with their arsenal a lot are likely to have inflated Predict+ scores because they abandon old pitches or add new ones, which can fool the model. I consider this more a feature than a bug, since that fiddling does correspond to real-world unpredictability in some ways, but it’s important to consider this when interpreting the data. This is especially the case for minor leaguers, who are much more likely to change their pitch mix considerably over the course of a regular season.
Source code
If you’d like to fiddle with the R scripts or run your own, the source code is available for non-commercial use on GitHub.
The source code also contains an R helper that scrapes the Statcast API for a given range of dates. The Sabrmetrics package, which I used as a basis, has functionality for MLB, but I was able to extend it to AAA as well.
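The scraping details live in the R code, but as a rough illustration of the kind of request involved, here’s a Python sketch that builds a Baseball Savant search-export URL for a date range. The endpoint and parameter names below reflect the publicly known MLB statcast_search CSV export (as used by common scrapers), not the AAA variant, and should be verified against the live API before use.

```python
from urllib.parse import urlencode

def statcast_csv_url(start_date, end_date):
    """Build a Baseball Savant statcast_search CSV export URL for an
    inclusive date range (dates as YYYY-MM-DD strings). Parameter names
    are assumptions based on the public search export; double-check them
    against the live site."""
    base = "https://baseballsavant.mlb.com/statcast_search/csv"
    params = {
        "all": "true",
        "type": "details",          # request pitch-level rows
        "game_date_gt": start_date, # start of the date range
        "game_date_lt": end_date,   # end of the date range
    }
    return f"{base}?{urlencode(params)}"

url = statcast_csv_url("2025-04-01", "2025-04-07")
```

In practice you'd fetch the URL and parse the CSV, and pull narrow date windows at a time, since the export caps how many rows it returns per request.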
Further plans
I’d like to consider further refinement of the model, including adding more inputs to help determine pitch mix. I think there’s also the possibility of using the metric to evaluate catcher game-calling through looking at the effects individual catchers have on pitchers.