Methodology

How the model works, and how well it actually does.

Every number on this site comes out of one pipeline: public box scores and play-by-play go in, opponent-adjusted ratings and simulated games come out. Here's the whole thing, plus the receipts.

The pipeline

It starts with the same data anyone can pull: every WNBA box score and play-by-play feed back to 2020. From there it runs in four steps.

  1. Rate the teams, adjusted for who they played

    A ridge regression splits every game into an offense and a defense, then asks how many points each team adds or gives up per 100 possessions relative to its schedule. Beating a good defense counts for more than running up the score on a bad one. Recent games carry more weight — the model leans on the last 4 months and lets older results fade. The regression is shrunk hard toward league average, because a WNBA season is only ~40 games and small samples lie.

  2. Add home court and pace

    Home teams get a fixed edge worth about +2.5 points, estimated from the data. A separate model sets the game's expected pace — possessions per team — because the same two teams produce very different scores in a track meet than they do in a grind.

  3. Simulate the game thousands of times

    Instead of spitting out one predicted score, the model plays each matchup 20,000 times, drawing each team's efficiency from its rating with realistic game-to-game noise. Count how often each side wins and you get a win probability; average the scores and you get a projected final score and total.

  4. Project the players

    Player lines follow the same logic: minutes times a recency-weighted scoring, rebounding, and assist rate, adjusted for the opponent and the game's pace. Counts that pile up in small chunks — points, rebounds, assists — are modeled with a distribution that fits real box scores better than a plain bell curve.
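
For readers who think in code, here is a minimal sketch of steps 1 through 3. It is not the site's actual implementation. The α of 5, the 120-day half-life, the +2.5 home edge, and the 20,000 simulations come from this page; everything else is an assumption made for illustration: the function names, the additive offense-plus-defense form, the normal noise with a 12-point spread per 100 possessions, and splitting the home edge evenly between the two scores.

```python
import numpy as np

HALF_LIFE_DAYS = 120   # from this page
RIDGE_ALPHA = 5.0      # from this page
HOME_EDGE = 2.5        # points, from step 2
N_SIMS = 20_000        # from step 3


def recency_weights(days_ago, half_life=HALF_LIFE_DAYS):
    """Exponential decay: a game one half-life old counts half as much."""
    return 0.5 ** (np.asarray(days_ago, dtype=float) / half_life)


def fit_ratings(off_idx, def_idx, eff, weights, n_teams, alpha=RIDGE_ALPHA):
    """Weighted ridge fit of eff ~ mu + offense[off_idx] + defense[def_idx].

    Each row is one team's offense in one game; eff is points per 100
    possessions. A positive defense value means that team gives up more
    than average. The penalty alpha shrinks every rating toward zero,
    which here means toward league average.
    """
    y = np.asarray(eff, dtype=float)
    w = np.asarray(weights, dtype=float)
    mu = np.average(y, weights=w)                 # weighted league average
    rows = np.arange(len(y))
    X = np.zeros((len(y), 2 * n_teams))
    X[rows, np.asarray(off_idx)] = 1.0            # offense columns
    X[rows, n_teams + np.asarray(def_idx)] = 1.0  # defense columns
    XtW = X.T * w                                 # apply recency weights
    beta = np.linalg.solve(XtW @ X + alpha * np.eye(2 * n_teams),
                           XtW @ (y - mu))
    return mu, beta[:n_teams], beta[n_teams:]


def simulate(mu, off, dfn, home, away, pace, sd=12.0, n=N_SIMS, seed=0):
    """Play one matchup n times and summarize the results.

    sd (noise in points per 100 possessions) is an assumed value, and the
    even split of the home edge across both scores is also an assumption.
    """
    rng = np.random.default_rng(seed)
    home_eff = mu + off[home] + dfn[away] + rng.normal(0.0, sd, n)
    away_eff = mu + off[away] + dfn[home] + rng.normal(0.0, sd, n)
    home_pts = home_eff * pace / 100 + HOME_EDGE / 2
    away_pts = away_eff * pace / 100 - HOME_EDGE / 2
    return {
        "home_win_prob": float(np.mean(home_pts > away_pts)),
        "home_score": float(home_pts.mean()),
        "away_score": float(away_pts.mean()),
        "total": float((home_pts + away_pts).mean()),
    }
```

The shrinkage is easy to see on a toy schedule: if one team scores 110 per 100 and its only opponent scores 90, an unpenalized fit would credit the full 10-point gap, while α = 5 leaves each rating at 10/7, about 1.4 points.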

Calibrated, not confident

The model is tuned to be honest, not to look smart. A 70% means the favorite really should win about seven times in ten — including the three it doesn't. That's the whole goal: when it says 70%, teams in that bucket win 70% of the time.

Honesty also means the model will sometimes disagree with the consensus on a given game, and it's supposed to. Matching the room on every matchup would just make it a mirror. The value is in the disagreements that turn out right — and you can only trust those if the probabilities are calibrated in the first place.

Does it actually work?

To check, the model is run walk-forward: for every game in 2024 & 2025, it's trained only on games that finished earlier, then made to predict cold. No peeking at the result it's being graded on. That's 576 honest predictions.

Log-loss: 0.618 (lower is better; coin flip = 0.693)
Picked right: 65% (straight-up winner)
Margin error: 10.0 (avg pts off the final margin)
Total error: 12.9 (avg pts off the combined total)
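
The walk-forward idea fits in a few lines. This is a sketch under assumptions, not the site's code: `walk_forward`, `fit`, and `predict` are hypothetical names, each game is a plain dict, and "finished earlier" is approximated as "has a strictly earlier date".

```python
def walk_forward(games, fit, predict, test_seasons=(2024, 2025)):
    """Predict each test-season game using only strictly earlier games.

    games:   list of dicts with 'date' (any sortable value, e.g. an ISO
             date string), 'season', and 'home_won' (1 or 0).
    fit:     function(history) -> model
    predict: function(model, game) -> home win probability
    Returns a list of (predicted_probability, actual_outcome) pairs.
    """
    games = sorted(games, key=lambda g: g["date"])
    results = []
    for game in games:
        if game["season"] not in test_seasons:
            continue
        # Same-day games are excluded too, so nothing leaks from the slate.
        history = [h for h in games if h["date"] < game["date"]]
        model = fit(history)
        results.append((predict(model, game), game["home_won"]))
    return results
```

Refitting before every game is slow but keeps the check simple to reason about: the model never sees the game it is graded on, or anything played after it.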

Against the obvious alternatives

A score means nothing without something to beat. Here's the model next to three naive predictors on the same games — graded by log-loss, where a perfect call is 0 and a coin flip is 0.693.

Predictor          Log-loss   Picked right
dubmetrics model   0.618      65%
Elo rating         0.620      65%
Always home        0.687      55%
Coin flip          0.693      45%

Coin flip guesses 50% every time; always-home backs the home team every game; Elo is a bare-bones rating system with no opponent adjustment or pace. Beating Elo is the real bar — and it's a close, honest margin, not a blowout.
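
Log-loss is simple enough to compute by hand. The sketch below shows the standard formula and why a constant 50% guess always scores ln 2, about 0.693. The function names are illustrative, and the tie rule in `accuracy` (a 50% call counts as a pick for the away team) is an assumption, since the page doesn't say how ties are broken.

```python
import math


def log_loss(probs, outcomes, eps=1e-15):
    """Mean negative log-likelihood of the outcomes under the stated probs.

    probs are predicted home win probabilities; outcomes are 1 if the
    home team won, else 0. Probabilities are clipped to avoid log(0).
    """
    total = 0.0
    for p, y in zip(probs, outcomes):
        p = min(max(p, eps), 1 - eps)
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(probs)


def accuracy(probs, outcomes):
    """Share of games where the favored side won.

    Assumption: a pick goes to the home team only when p > 0.5, so an
    exact 0.5 counts as a pick for the away team.
    """
    hits = sum((p > 0.5) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(probs)


# A constant 50% guess scores ln 2 regardless of what happens.
coin_flip = log_loss([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 1])
```

Under that tie rule, a constant 50% never picks the home team, so its hit rate equals the away win rate. That would be consistent with a coin flip landing below 50% on picks in a league where home teams win more often than not, though this is only one possible reading.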

Calibration

Group every prediction by the win chance it stated, then check how often those teams actually won. A group that lands on the diagonal means the stated win chance was right.

[Calibration chart: Predicted win % against Actual win %, both axes running 0 to 100]
Each dot is a bucket of games grouped by predicted win chance. Dots on the diagonal mean the stated win chance matched what actually happened; dot size reflects how many games landed in that bucket.
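
The bucketing behind a chart like this can be sketched as follows. The ten equal-width buckets and the name `calibration_table` are assumptions. The page doesn't say how many buckets it uses.

```python
def calibration_table(probs, outcomes, n_bins=10):
    """Group predictions by stated win chance and compare to reality.

    Returns one (mean_predicted, actual_win_rate, n_games) tuple per
    non-empty bucket, in order of increasing predicted win chance.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the top bucket
        bins[i].append((p, y))
    rows = []
    for bucket in bins:
        if not bucket:
            continue
        mean_p = sum(p for p, _ in bucket) / len(bucket)
        win_rate = sum(y for _, y in bucket) / len(bucket)
        rows.append((mean_p, win_rate, len(bucket)))
    return rows
```

Each returned row corresponds to one dot: the first value is its horizontal position, the second its vertical position, and the third its size. A well-calibrated model has the first two values close together in every row.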

What it can't do

The data starts in 2020, so the model has no memory of the league before then. Seasons are short, rosters turn over, and injuries land late — a projection is only as fresh as the lineup it was built on. Treat it as a well-calibrated baseline, not a guarantee.

Ridge α 5 · half-life 120d · 20,000 sims/game · updated June 16, 2026