MLB Sabermetrics for Betting: FIP, WHIP, wOBA and the Stats That Move Lines

Table of Contents
  1. The night a 2.10 ERA cost me three units
  2. Why sabermetrics beat traditional stats
  3. FIP explained
  4. WHIP explained
  5. wOBA and offensive metrics
  6. Expected stats — xERA, xFIP, xwOBA
  7. Bullpen metrics
  8. Applying sabermetrics to a line
  9. Where UK punters find the data
  10. FAQ

The night a 2.10 ERA cost me three units

I learned the value of MLB sabermetrics for betting the night I backed a pitcher with a 2.10 ERA against a team I “should” have faded, and watched him give up seven runs in three innings. His ERA was a mirage. His FIP told the real story, but I had ignored it. That game cost me three units and taught me a lesson eleven years of pitching analysis has only deepened: traditional stats are a lagging indicator, and the closing line moves on the underlying numbers long before the public catches on.

Sabermetric stats are the data the market is actually pricing. A pitcher’s W-L record is meaningless — wins depend on bullpen support and offence behind him, neither of which he controls. ERA is better but still distorted by sequencing, defensive quality and ballpark. A batting average of .280 conceals whether a hitter walks 80 times or 25 times, and whether his contact is hard or soft. Sabermetric metrics — FIP, WHIP, wOBA, xFIP, xwOBA — strip out the noise and give you the true talent level driving the price.

This article covers the metrics that genuinely move closing lines, not every acronym FanGraphs lists. I will walk through the pitcher metrics first because pitching is the dominant variable in MLB price formation, then the hitter and offence metrics, then the expected-stats family that adjusts for luck, then bullpen-specific measures. Each section closes with the threshold values I use as cut-offs when I am scoring a matchup before placing a bet.

Why sabermetrics beat traditional stats

Consider this — in 2025, only 7 hitters across all of Major League Baseball finished the season with a batting average of .300 or higher. The decade-of-the-2010s average was 22.1 such hitters. In the 2000s it was 39.7. The league average AVG in 2025 sat at .245. That decline is not because hitters got worse. The strike zone, defensive shifts and pitcher quality have all shifted the run-scoring environment, and batting average has decayed as a useful signal in lockstep.

If batting average no longer reliably separates good hitters from bad, what does? On-base percentage, slugging, walk rate, strikeout rate, hard-hit rate, exit velocity, barrel rate — those are the inputs that survive the modern context. Most of them feed into wOBA, which I will get to in two sections.

The pitching side is even starker. A 2022 analytics study of the 2021 MLB season ranked the top ten team stats by correlation with winning percentage. Eight of the top ten were pitching stats — ERA, FIP, LOB%, WAR, WHIP, hits-per-nine, batting average against, and saves. The only pure offensive stat in the top ten was offensive WAR, and it ranked ninth. Pitching is not just the dominant variable in MLB. It is the variable. Everything else is a rounding error on top of pitching quality.

This is why I lead every matchup analysis with the two starting pitchers’ FIP and WHIP, then their last three outings, then bullpen ERA, then offence. The order matters. If you run that workflow from the bottom up — starting with which team has the better lineup — you will end up backing the wrong side three or four times out of ten and not understand why.

The market knows this. Closing lines in MLB move primarily on pitcher news — a starter scratched, a starter struggling in warm-ups, a swap in the rotation. They move secondarily on weather. Offence is the slowest input to move a price, because lineups vary game to game and the day-to-day signal is noisy. If you want to beat the closing line, you have to be where the closing line is going — and the closing line is going to wherever the pitching data points.

FIP explained

Fielding Independent Pitching — FIP — is the single most useful pitcher metric for betting purposes. The premise is simple but the implication is profound. ERA measures runs that crossed the plate. Those runs depend on defence, sequencing, and luck on balls in play, none of which the pitcher controls. FIP strips all of that out and asks — what did the pitcher actually do? Strikeouts, walks, hit-by-pitches, home runs. The four outcomes the pitcher fully owns.

The formula is straightforward: FIP = (13 × HR + 3 × (BB + HBP) - 2 × K) / IP + C, where C is a constant that calibrates FIP to the league-average ERA each season. The constant is published every year, usually around 3.10 or 3.20. The result is on the ERA scale, so a pitcher with a 3.10 FIP is doing top-of-rotation work, and a pitcher with a 5.50 FIP is below replacement.
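The formula translates directly into a few lines of Python. This is a minimal sketch rather than a finished tool: the function name is hypothetical, and the default constant of 3.10 is only a placeholder from the range quoted above, so swap in the published value for the season you are working with. Innings are passed as a true decimal (180.0, not box-score notation).

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Fielding Independent Pitching on the ERA scale.

    `constant` is the season-specific FIP constant. The 3.10 default is a
    placeholder, not a published figure for any particular season.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical season line: 20 HR, 45 BB, 5 HBP, 200 K over 180 innings.
print(round(fip(hr=20, bb=45, hbp=5, k=200, ip=180.0), 2))
```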

The thresholds I use as cut-offs — under 3.00 is elite, Cy Young contender territory. 3.00 to 3.50 is excellent, a top-of-rotation starter. 3.50 to 4.10 is solid, a middle-rotation arm. 4.10 to 4.50 is the league average, a back-end starter. Over 4.50 is below average, and you should be looking at the underdog or the Over depending on the matchup. Over 5.00 is a pitcher the market should be punishing, and if the line does not reflect that, you have found value.
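For anyone scoring a slate in bulk, the cut-offs above can be encoded as a simple lookup. A sketch only: the tier labels are shorthand, and how the exact boundary values (3.00, 3.50, 4.10) are assigned is an assumption, since the ranges in the text share their end points.

```python
def fip_tier(fip_value):
    """Map a FIP to the tiers described in the text.

    Assumption: each shared boundary belongs to the weaker tier's range
    (3.00 counts as "excellent", not "elite"), except that 4.50 and 5.00
    only tip over when exceeded, matching "over 4.50" and "over 5.00".
    """
    if fip_value < 3.00:
        return "elite"
    if fip_value < 3.50:
        return "excellent"
    if fip_value < 4.10:
        return "solid"
    if fip_value <= 4.50:
        return "league average"
    if fip_value <= 5.00:
        return "below average"
    return "market should punish"


# Hypothetical FIP values for a four-game slate.
for value in (2.85, 3.70, 4.30, 5.40):
    print(value, fip_tier(value))
```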

FIP is sensitive to home-park context. A pitcher in Coors Field will run a higher FIP than the same pitcher in Petco Park because of the home-run component — the altitude inflates HR totals. The xFIP variant adjusts for that, which I will cover in the expected-stats section. For day-to-day betting, raw FIP is the workhorse stat. xFIP is a check you run when you suspect home-park noise is distorting the picture.

FIP vs ERA — when they diverge

The most useful information FIP provides is its gap from ERA. When a pitcher’s ERA is significantly lower than his FIP, he has been lucky — runs that should have scored against him did not, usually because his defence was elite, because batted balls found gloves, or because the bullpen escaped his inherited-runner messes. That gap is unstable. Mean reversion brings the ERA up to meet the FIP, often violently, over the next four to six starts.

The reverse case is the value play. When a pitcher’s ERA is significantly higher than his FIP, he has been unlucky. The runs landing against him are the result of bad sequencing, bad defence, or balls finding holes. His underlying performance has been better than the surface stat shows, and the market price is anchored to ERA. That is the spot to back him.

The threshold I use for “significant” is half a run — an ERA more than 0.50 above or below the FIP is a flag worth investigating. The bigger the gap, the more aggressively the mean reversion is going to bite. I have made some of my best plays of the season by fading pitchers carrying 2.50 ERAs and 4.20 FIPs into their seventh start. The slip pays at full underdog price, and the underlying numbers say the next start is the one where the wheels come off.
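The half-run rule is easy to automate as a first-pass screen. The sketch below assumes you already have each pitcher's ERA and FIP to hand; the function name and return strings are illustrative, and a flag is a prompt to investigate, not a bet.

```python
def era_fip_signal(era, fip, threshold=0.50):
    """Flag pitchers whose ERA sits more than `threshold` runs from their FIP."""
    gap = era - fip
    if gap > threshold:
        # ERA well above FIP: results worse than the underlying performance.
        return "unlucky"
    if gap < -threshold:
        # ERA well below FIP: results better than the underlying performance.
        return "lucky"
    return "no flag"


# The fade candidate described above: 2.50 ERA carrying a 4.20 FIP.
print(era_fip_signal(era=2.50, fip=4.20))
```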

WHIP explained

Walks plus hits per inning pitched — WHIP — is the cleanest single-number measure of how many baserunners a pitcher allows. The formula is exactly what the name suggests — (walks + hits) divided by innings pitched. A WHIP of 1.20 means the pitcher allows 1.2 baserunners every inning, on average.
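One practical trap when computing WHIP yourself: box scores write innings as 6.1 and 6.2, meaning six and one third and six and two thirds, not decimals. The sketch below handles that conversion first. Both function names are hypothetical.

```python
def innings_to_decimal(ip_notation):
    """Convert box-score innings ('6.1' = six and one third) to a decimal."""
    whole, _, outs = str(ip_notation).partition(".")
    outs = int(outs) if outs else 0
    if outs not in (0, 1, 2):
        raise ValueError("the digit after the point must be 0, 1 or 2 outs")
    return int(whole) + outs / 3


def whip(walks, hits, ip_notation):
    """Walks plus hits per inning pitched."""
    innings = innings_to_decimal(ip_notation)
    if innings == 0:
        raise ValueError("innings pitched must be positive")
    return (walks + hits) / innings


# Hypothetical start: 2 walks and 6 hits over 6.2 innings.
print(round(whip(walks=2, hits=6, ip_notation="6.2"), 2))
```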

WHIP matters because runs need baserunners. A pitcher with a 0.95 WHIP is keeping fewer than one runner per inning on the bases, which means most innings end without a scoring threat. A pitcher with a 1.50 WHIP is putting one and a half runners on every inning, which means scoring threats develop more or less constantly. That difference is the bridge between great starts and disasters.

The threshold scale I use — WHIP under 1.00 is elite, Cy Young-level. 1.00 to 1.10 is excellent. 1.10 to 1.25 is good — solid front-rotation territory. The league average sits around 1.30, and that is also roughly the threshold between starters who help their team and starters who hurt it. Above 1.30 you are looking at a back-end arm or someone in a slump. Above 1.50 is genuinely poor, and the market should be pricing it accordingly.

WHIP is more stable than ERA from start to start, which makes it a better short-sample signal. A pitcher who has run a 1.15 WHIP across his last five starts has done so by limiting baserunners consistently. Bad sequencing might inflate his ERA, but the WHIP is honest. That stability is why I weight WHIP heavily when I am evaluating a pitcher whose ERA does not match the eye-test from the broadcasts.

The one caveat is that WHIP does not distinguish between walks and hits, and those two things have different downstream consequences. A pitcher who walks a lot but allows weak contact has a different risk profile than one who allows hard contact but rarely walks. For day-to-day matchup work, WHIP is the right single number. For deeper modelling, you split it into BB/9 and H/9 and look at each separately.
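To see why the split matters, here is a small sketch with two hypothetical pitchers who share a 1.20 WHIP but get there in different ways. The function name and the stat lines are invented for illustration.

```python
def whip_split(walks, hits, innings):
    """Break WHIP into its walk and hit components, expressed per nine innings."""
    return {
        "whip": (walks + hits) / innings,
        "bb9": 9 * walks / innings,
        "h9": 9 * hits / innings,
    }


# Two made-up pitchers over 150 innings, both with a 1.20 WHIP.
wild_arm = whip_split(walks=60, hits=120, innings=150.0)
contact_arm = whip_split(walks=30, hits=150, innings=150.0)

for label, line in (("wild arm", wild_arm), ("contact arm", contact_arm)):
    print(label, {name: round(value, 2) for name, value in line.items()})
```

The two components always add back up: (BB/9 + H/9) / 9 equals WHIP, so nothing is lost by looking at them separately.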

Using WHIP for over/under bets

WHIP feeds directly into totals work because baserunners are the precondition for runs. Two starters with WHIP figures of 1.05 and 1.10 will both keep traffic light, so even with average bullpens, a 7.5 total is in play. Two starters at 1.40 and 1.45 will allow a parade of baserunners, and even a modest 8.5 total looks vulnerable to the Over.

The mistake casual punters make is reading ERA for totals work. ERA is too noisy in any given six-start window — a pitcher with a 2.40 ERA and a 1.35 WHIP is one bad inning away from a five-run start, and that is exactly the start that flips an Under slip. WHIP forces you to look at the underlying traffic. When the WHIPs of both starters average above 1.25, I lean Over almost reflexively, weather permitting. When they average below 1.15, I lean Under and look for confirmation in the weather and park context.
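The averaging rule can be sketched as a small helper. It encodes only the two cut-offs given above (1.25 and 1.15); the function name is hypothetical, and weather and park still have to be checked by hand before the lean becomes a bet.

```python
def totals_lean(whip_home_starter, whip_away_starter,
                over_cutoff=1.25, under_cutoff=1.15):
    """First-pass totals lean from the average WHIP of the two starters."""
    average = (whip_home_starter + whip_away_starter) / 2
    if average > over_cutoff:
        return "lean Over"
    if average < under_cutoff:
        return "lean Under"
    return "no lean"


# The two pairings used as examples in the text.
print(totals_lean(1.05, 1.10))
print(totals_lean(1.40, 1.45))
```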

The deeper version of this analysis — splitting WHIP into expected vs actual using xFIP and BABIP — is covered in the FIP versus ERA deep dive. For most matchups, the raw WHIP read against the totals line is enough to flag the obvious mispricings.

wOBA and offensive metrics

On the hitter side, weighted on-base average — wOBA — is the most defensible single-number summary of offensive production. The premise is that not every offensive outcome is worth the same. A home run is worth more than a single. A walk is worth less than a double. Traditional OBP treats walks and singles as equivalent. wOBA assigns each outcome a linear weight derived from how much each outcome actually contributes to runs scored, then averages those weights across a hitter’s plate appearances.
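A sketch of the calculation shows how the weighting works. The weights below are round illustrative values in the typical range, not the published figures for any season (the real ones are recalculated every year), and the function and dictionary names are hypothetical. The denominator follows the usual convention of at-bats plus unintentional walks, sacrifice flies and hit-by-pitches.

```python
# Illustrative linear weights only; look up the season's published values.
ILLUSTRATIVE_WEIGHTS = {
    "ubb": 0.69, "hbp": 0.72, "single": 0.88,
    "double": 1.25, "triple": 1.58, "hr": 2.03,
}


def woba(ubb, hbp, singles, doubles, triples, hr, ab, sf,
         weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted on-base average; `ubb` is unintentional walks only."""
    numerator = (
        weights["ubb"] * ubb + weights["hbp"] * hbp
        + weights["single"] * singles + weights["double"] * doubles
        + weights["triple"] * triples + weights["hr"] * hr
    )
    denominator = ab + ubb + sf + hbp
    if denominator <= 0:
        raise ValueError("no plate appearances to average over")
    return numerator / denominator


# Hypothetical season line for one hitter.
print(round(woba(ubb=60, hbp=5, singles=100, doubles=30, triples=3,
                 hr=25, ab=550, sf=5), 3))
```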

The result is on a scale calibrated to OBP for readability. A wOBA of .320 is roughly the league average. .360 is excellent — All-Star territory. .400 is MVP-quality. Below .300 is below replacement, and below .280 is a hitter who should not be starting in the major leagues.

wOBA matters for betting because the offensive side of a matchup needs a single number to plug into your analysis, and most offensive single-numbers are misleading. Batting average ignores walks and power. OPS adds on-base percentage and slugging as if a point of each were worth the same, which undervalues getting on base. wOBA is the cleanest signal of true production.

The way I use wOBA in matchup work is to compare a starting pitcher’s recent FIP against the opposing lineup’s average wOBA. If the FIP is 4.20 and the lineup wOBA is .345, the matchup tilts toward the offence. If the FIP is 3.10 and the lineup wOBA is .305, the pitcher has the edge.

Two other offensive stats worth knowing — OPS+ and wRC+. Both are park-and-league-adjusted. OPS+ centres on 100 — anything above 100 is above league average, below 100 is below. 130 is excellent, 150 is MVP-tier. wRC+ uses the same centring convention but is derived from wOBA, which makes it slightly more reliable across different run-scoring environments. For team-level analysis — Yankees offence vs Red Sox offence — I look at team wRC+ rather than the more conventional batting-average-and-runs-scored figures because the park adjustment matters at Coors Field and Petco Park more than people credit.

Expected stats — xERA, xFIP, xwOBA

The “expected” family of stats — xERA, xFIP, xwOBA — is the second layer of sabermetric analysis, and it is where betting work separates from casual fandom. Each one asks the same question — given the underlying inputs we can measure with Statcast (exit velocity, launch angle, strike-zone command), what should this stat be over a representative sample, before luck on balls in play distorts the result?

xFIP takes FIP and replaces the actual home-run rate with the league-average home-run-per-fly-ball rate. The reason is that home runs are highly volatile in small samples. A pitcher who has allowed eight home runs in his first ten starts may have been unlucky — his fly-ball rate is normal, his location is normal, and the small sample is fooling FIP into looking worse than the true talent. xFIP corrects for that and tells you what FIP would be at a league-average HR/FB rate.
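The substitution is easier to see side by side. In this sketch the only difference between the two functions is the home-run term: xFIP swaps actual home runs for fly balls multiplied by the league HR/FB rate. The 0.11 league rate and the 3.10 constant are placeholders, and the stat line is hypothetical.

```python
def fip(hr, bb, hbp, k, ip, constant):
    """FIP using the home runs the pitcher actually allowed."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


def xfip(fly_balls, league_hr_per_fb, bb, hbp, k, ip, constant):
    """FIP with home runs replaced by fly balls times the league HR/FB rate."""
    expected_hr = fly_balls * league_hr_per_fb
    return (13 * expected_hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical ten-start line: 8 HR on 50 fly balls, 15 BB, 2 HBP, 60 K, 58 IP.
raw = fip(hr=8, bb=15, hbp=2, k=60, ip=58.0, constant=3.10)
adjusted = xfip(fly_balls=50, league_hr_per_fb=0.11,
                bb=15, hbp=2, k=60, ip=58.0, constant=3.10)
print(round(raw, 2), round(adjusted, 2))
```

With eight home runs on fifty fly balls, this pitcher's HR/FB rate is well above the placeholder league rate, so the adjusted figure comes out lower than raw FIP.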

xwOBA does the same job on the hitter side. It looks at exit velocity and launch angle on every batted ball, calculates what wOBA the league averages would produce on a ball with those characteristics, and aggregates across the hitter’s batted balls plus strikeouts and walks. The result is a wOBA figure that ignores what actually happened to the ball (caught or not, ricocheting off the wall or not) and instead tells you what should have happened given the quality of contact.

xERA is the most recent and least common of the three. It draws on Statcast’s full batted-ball model to estimate the ERA the pitcher’s inputs would produce in a neutral environment. Useful as a cross-check on FIP and xFIP, but I do not use it as a primary signal.

BABIP — batting average on balls in play — is the underlying volatility check. The league-wide BABIP sits around .295. A pitcher with a BABIP of .240 against him is benefiting from balls finding gloves at an unsustainable rate. A pitcher with a BABIP of .360 against is bleeding hits that mean reversion will eventually slow. When I see a big BABIP gap from .295, I expect the surface stats to converge over the next month, and I bet accordingly.
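BABIP itself is a one-line calculation, shown here with a simple deviation check for a pitcher's BABIP against. The .295 baseline is the figure quoted above; the 40-point band that defines a "big gap" is an assumption chosen so that the .240 and .360 examples both trigger, and the function names are hypothetical.

```python
def babip(hits, hr, ab, k, sf):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    balls_in_play = ab - k - hr + sf
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - hr) / balls_in_play


def babip_against_flag(value, league=0.295, band=0.040):
    """Flag a pitcher's BABIP against when it sits far from the league mark.

    `band` is an assumed tolerance, not a figure from the text.
    """
    if value < league - band:
        return "lucky: expect more hits to fall"
    if value > league + band:
        return "unlucky: expect fewer hits to fall"
    return "in range"


# Hypothetical line against a pitcher: 150 H, 20 HR, 600 AB, 150 K, 5 SF.
print(round(babip(hits=150, hr=20, ab=600, k=150, sf=5), 3))
print(babip_against_flag(0.240))
print(babip_against_flag(0.360))
```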

Bullpen metrics

If pitching is the dominant variable, the bullpen is the dominant sub-variable that nobody outside the pro-betting world weights properly. The starter pitches the first six innings, give or take. The bullpen covers the remaining three, sometimes more. If the bullpen ERA is two runs higher than the starter’s, you are buying a deteriorating product from the seventh inning on, every game.

Team bullpen ERA is the headline number, available on every major data site. Thresholds — under 3.50 is elite, 3.50 to 4.00 is good, 4.00 to 4.50 is average, above 4.50 is poor. The team-wide figure conceals depth, though, and depth matters more for betting than the average. A bullpen with three sub-3.00-ERA relievers and three above-5.00 looks “average” on paper but is fragile — if any of the elite arms is unavailable on a given night, the manager has to dip into the back of the bus, and that is where games are blown.

The metric that tells you whether the headline number will hold is bullpen FIP. Same logic as starter FIP: strip out defence and sequencing, and look at the outcomes the relievers themselves control. A bullpen with a 3.20 team ERA but a 4.10 team FIP is benefiting from luck that will not last. A bullpen with a 4.30 team ERA and a 3.50 team FIP is doing better than the surface reads suggest and is poised for positive regression.

Leverage index — LI — is more granular. It measures how high-stakes the situation was when a reliever entered the game, on a scale where 1.00 is neutral. Relievers with high LI (1.50 or higher) are the ones the manager trusts in close games. The team’s high-leverage arms are the ones that matter for backing favourites — if the team’s top-LI reliever has thrown the past two nights, the manager will not have him tonight, and the late innings get murky.
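The availability check can be sketched in code, assuming you keep a list of appearance dates for each reliever. Everything here is hypothetical: the data layout, the names, the LI values and the dates. The 1.50 cut-off and the two-nights rule are the ones described above.

```python
from datetime import date, timedelta

HIGH_LEVERAGE_LI = 1.50  # cut-off quoted in the text


def pitched_previous_days(appearances, game_date, days=2):
    """True if the reliever appeared on each of the `days` days before the game."""
    pitched = set(appearances)
    return all(game_date - timedelta(days=n) in pitched
               for n in range(1, days + 1))


def unavailable_high_leverage(bullpen, game_date):
    """Names of high-leverage relievers who threw the previous two nights."""
    return [
        reliever["name"]
        for reliever in bullpen
        if reliever["avg_li"] >= HIGH_LEVERAGE_LI
        and pitched_previous_days(reliever["appearances"], game_date)
    ]


# Made-up bullpen for illustration.
tonight = date(2025, 6, 15)
bullpen = [
    {"name": "Closer A", "avg_li": 1.85,
     "appearances": [date(2025, 6, 13), date(2025, 6, 14)]},
    {"name": "Setup B", "avg_li": 1.55,
     "appearances": [date(2025, 6, 14)]},
    {"name": "Long man C", "avg_li": 0.70,
     "appearances": [date(2025, 6, 13), date(2025, 6, 14)]},
]
print(unavailable_high_leverage(bullpen, tonight))
```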

Save conversion rate is the closer-specific metric. A closer converting 90%+ of save opportunities is reliable. Below 80% and you are in the danger zone: the lead is not safe, and the moneyline favourite is asking you to trust someone who has blown four saves this month. I check the closer’s recent usage and save conversion every time I lay a tight moneyline. Bullpen risk is the single most under-priced variable in MLB markets, especially on Sunday day games after three-game series, when both pens are on fumes.
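The save-conversion check is simple arithmetic: saves divided by save opportunities, where opportunities are saves plus blown saves. A minimal sketch with hypothetical function names follows. The label for the 80% to 90% band is an assumption, since the text only names the two ends.

```python
def save_conversion(saves, blown_saves):
    """Share of save opportunities converted."""
    opportunities = saves + blown_saves
    if opportunities == 0:
        raise ValueError("no save opportunities yet")
    return saves / opportunities


def closer_read(rate):
    """Apply the 90% and 80% cut-offs from the text."""
    if rate >= 0.90:
        return "reliable"
    if rate < 0.80:
        return "danger zone"
    return "in between"  # assumed label for the unnamed middle band


# Hypothetical closers: 27 saves with 3 blown, and 16 saves with 5 blown.
print(closer_read(save_conversion(27, 3)))
print(closer_read(save_conversion(16, 5)))
```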

Applying sabermetrics to a line

The Tuesday night I walked through this analysis with a colleague is the night I realised most punters do not have a workflow — they have a habit. So here is the workflow I run on every MLB matchup I bet, in order, with the threshold values laid out so you can apply it tonight.

Start with the starting pitchers. Pull their season FIP and their last-five-starts FIP separately. The season number tells you the baseline. The recent window tells you the current form. If season FIP is 3.50 but last-five FIP is 4.80, the pitcher is sliding and the market may not have priced it in. If the season is 4.50 and last-five is 3.20, he has found something and the price may not reflect it.
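The two-window comparison can be sketched as a helper that reports the direction of travel. The text gives examples rather than a cut-off, so the one-run `min_gap` default is an assumption picked so that both examples above register. Tune it to taste. The function name is hypothetical.

```python
def form_read(season_fip, last_five_fip, min_gap=1.00):
    """Compare recent FIP with the season baseline.

    `min_gap` is an assumed threshold, not a figure from the text.
    """
    delta = last_five_fip - season_fip
    if delta >= min_gap:
        return "sliding"
    if delta <= -min_gap:
        return "improving"
    return "steady"


# The two examples from the paragraph above.
print(form_read(season_fip=3.50, last_five_fip=4.80))
print(form_read(season_fip=4.50, last_five_fip=3.20))
```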

Then WHIP. Same two windows. WHIP is more stable from start to start than FIP, so a divergence between season and last-five is more meaningful — it usually signals an injury or a mechanical issue worth chasing.

Then bullpen ERA and FIP for both teams. If both bullpens are better than league average (lower ERA and FIP), lean Under unless the offences are elite. If both are worse, lean Over.

Then offence — lineup wOBA against the relevant handedness of the starter. Most lineup datasets break wOBA out by platoon — vs LHP and vs RHP. The pitcher’s handedness matters, especially for hitters with extreme splits.

Only then weather and park. Those are amplifiers: they push the conclusion you have already reached in one direction or the other. Twenty-eight degrees Celsius with the wind blowing out at Wrigley turns a “lean Over” into “back the Over hard.” Cold and damp at Citi Field turns a “lean Under” into “back the Under hard.”

Max Scherzer put it best in a 2025 interview — “We’re thinking like robots instead of thinking like a human, and trying to make decisions based on another human being in a box. That’s the challenge of pitching.” The sabermetric workflow gives you the numbers. The numbers are necessary. But pitching, at its core, is a contest between two human beings in a box, and the model will not capture every nuance of how that contest unfolds on the night. You take the numbers, then you read the start. That is the analyst’s job.

Where UK punters find the data

Three websites cover every metric in this article and most of what you need for daily MLB betting work. None of them require a subscription for the data I have described. None of them carry adverts that will mislead you. All three are accessible from the UK without a VPN.

FanGraphs is the workhorse. The player pages give you season-level FIP, WHIP, xFIP, wOBA, xwOBA, and every split you can imagine: vs LHP, vs RHP, home, road, last seven days, last fourteen, last thirty. The leaderboards let you filter pitchers and hitters by date range, which is essential for finding the recent-form information that the season totals smother. If I had to bet from a phone with access to only one site, it would be FanGraphs.

Baseball Savant is the Statcast home. This is where you get the raw inputs that drive xERA and xwOBA: exit velocity, launch angle, barrel rate, sprint speed. The interface takes a week to learn, but the data is the deepest you will find. Savant also publishes park-and-weather-adjusted numbers, which is useful for weather-sensitive totals work.

Baseball Reference is the historical archive. If you want to know how a pitcher has performed at Coors Field over his career, or what his FIP has been in day games vs night games over five seasons, this is the source. The interface is dated but the data is unimpeachable.

For UK punters, the practical workflow is — open FanGraphs for the matchup, cross-check anomalies on Baseball Savant, dip into Baseball Reference if you need historical context. The whole flow takes about ten minutes per game once you know what you are looking for, and it will give you a better read on the matchup than any tipster column you can find.

FAQ

Written by the editors at Betting Tips for Baseball.
