Over/under goals betting strategies using xG models for football bets

Why xG matters when you bet over/under goals

You already know that raw goals scored and conceded are noisy: a team can underperform or overperform for stretches because of finishing luck, injuries, or plain variance. Expected goals (xG) gives you a cleaner signal by estimating the quality of chances a team creates and concedes. When you use xG to inform over/under goals bets, you’re replacing noisy historical counts with a measure that better reflects the underlying likelihood of goals in a match.

Using xG models shifts your focus from simplistic stats to chance-quality and context. That helps you identify matches where bookmaker lines may be mispriced because the market is anchored to recent scorelines rather than underlying chance creation.

Key components you’ll use in an xG-based over/under strategy

1. Choosing or building an xG model

At a minimum, your xG model should account for shot location and shot type. More advanced models include assist type, body part, phase of play (open play, set piece), and defensive pressure. You can use public xG datasets or build a lightweight model that weights shot distance and angle — the goal is consistency and interpretability, not perfection.
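As a minimal sketch of the "lightweight model" idea, the snippet below scores each shot with a logistic function of distance and goal-mouth angle, then sums shots into a team xG. The coefficients and function names (`shot_xg`, `team_xg`) are hypothetical and chosen only for illustration; a real model would fit them on historical shot data.

```python
import math

GOAL_WIDTH_M = 7.32  # width of a standard goal in metres

# Hypothetical coefficients for illustration only. A real model would fit
# these with logistic regression on historical shot data.
INTERCEPT = -1.0
DISTANCE_COEF = -0.10  # effect per metre of distance to the goal centre
ANGLE_COEF = 1.5       # effect per radian of visible goal mouth


def shot_features(x, y):
    """Return (distance, angle) for a shot.

    x: distance from the goal line in metres (must be > 0)
    y: lateral offset from the centre of the goal in metres
    """
    half = GOAL_WIDTH_M / 2
    distance = math.hypot(x, y)
    # Angle subtended by the two posts as seen from the shot location.
    angle = math.atan2(y + half, x) - math.atan2(y - half, x)
    return distance, angle


def shot_xg(x, y):
    """Probability that a single shot is scored under the toy model."""
    distance, angle = shot_features(x, y)
    logit = INTERCEPT + DISTANCE_COEF * distance + ANGLE_COEF * angle
    return 1.0 / (1.0 + math.exp(-logit))


def team_xg(shots):
    """Sum per-shot xG over a list of (x, y) shot locations."""
    return sum(shot_xg(x, y) for x, y in shots)


# Example: three shots taken by the home side (made-up locations).
home_shots = [(11.0, 0.0), (25.0, 5.0), (8.0, -6.0)]
home_xg = team_xg(home_shots)
```

Keeping the feature set this small makes the model easy to inspect: closer, more central shots always score higher, which is the consistency and interpretability the approach is after.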

2. Translating xG into goal probabilities

Once you have expected goals for each team (home_xG and away_xG), you convert them into probability distributions for final scorelines. Commonly you’ll assume goals follow a Poisson or bivariate Poisson process:

Use Poisson(home_xG) and Poisson(away_xG) to get the probability of 0,1,2,… goals for each side.
Combine those distributions to produce probabilities for total goals (sum of both scores).
Adjust for correlation if you want to account for game-state effects (e.g., a red card or late chase).
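The first two steps can be sketched as follows, assuming independent Poisson scores for each side (so no correlation adjustment) and a half-goal line such as 2.5. The function names and the `max_goals` cut-off are illustrative choices, not part of any particular library.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def total_goals_pmf(home_xg, away_xg, max_goals=10):
    """Distribution of total goals, assuming independent Poisson scores.

    Returns a list where index t holds P(total goals == t). Entries for
    t <= max_goals are exact; higher totals are truncated.
    """
    home = [poisson_pmf(k, home_xg) for k in range(max_goals + 1)]
    away = [poisson_pmf(k, away_xg) for k in range(max_goals + 1)]
    totals = [0.0] * (2 * max_goals + 1)
    for h, p_home in enumerate(home):
        for a, p_away in enumerate(away):
            totals[h + a] += p_home * p_away
    return totals


def prob_over(home_xg, away_xg, line=2.5, max_goals=10):
    """P(total goals > line) for a half-goal line below max_goals."""
    totals = total_goals_pmf(home_xg, away_xg, max_goals)
    p_under = sum(p for goals, p in enumerate(totals) if goals < line)
    return 1.0 - p_under


# Example with made-up inputs: home_xG = 1.4, away_xG = 1.1.
p_over_25 = prob_over(1.4, 1.1, line=2.5)
```

Computing the over as one minus the under avoids losing probability mass to the truncated tail. Whole and quarter lines (2.0, 2.25) involve pushes or split stakes and would need separate handling.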

3. Finding value against bookmaker lines

Compare your model’s probability that total goals will be over a line (e.g., 2.5) with the bookmaker’s implied probability. Implied probability = 1 / decimal_odds, which still contains the bookmaker’s margin until you adjust for it. If your model gives a 55% chance of over 2.5 but the bookmaker implies 48%, you’ve found value. Keep a record of your edge and only act when the expected value (EV) is positive after costs such as exchange commission.
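A short sketch of the value check, using the 55% versus 48% example from the paragraph above. The helper names are illustrative, and costs such as commission are left out for simplicity.

```python
def implied_probability(decimal_odds):
    """Raw implied probability of a price (bookmaker margin still included)."""
    return 1.0 / decimal_odds


def expected_value(model_prob, decimal_odds, stake=1.0):
    """Expected profit of a bet at the offered odds.

    Win (decimal_odds - 1) * stake with probability model_prob,
    lose the stake otherwise.
    """
    win = model_prob * (decimal_odds - 1.0) * stake
    loss = (1.0 - model_prob) * stake
    return win - loss


# The bookmaker implies 48%, so the offered odds are 1 / 0.48.
offered_odds = 1.0 / 0.48
model_prob = 0.55

edge = model_prob - implied_probability(offered_odds)  # probability edge
ev_per_unit = expected_value(model_prob, offered_odds)  # profit per unit staked
is_value_bet = ev_per_unit > 0
```

EV is always computed against the odds you can actually get, since that is the price you are paid at.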

Focus on markets where variance is lower (e.g., 0.5–3.5 rather than extreme totals).
Prioritise matches with reliable xG inputs — leagues with shot-location data are better.
Be cautious around late injuries, team news, and weather, which can invalidate your model quickly.

With these building blocks — a consistent xG model, a method to convert xG into goal probabilities, and a value-check process against bookmaker odds — you’re ready to formalise rules for which lines to bet and how to size stakes. In the next section, you’ll learn practical calibration techniques, how to handle bookmaker margins, and sample staking strategies to manage risk and grow your bankroll responsibly.

Calibrating your xG probabilities to reality

Raw xG outputs are a great starting point, but every model has bias. Calibration means aligning your model’s predicted probabilities with observed outcomes so your edges against bookmakers are real and not artifacts of systematic over- or under-estimation.

Practical calibration steps:

Backtest on a holdout set: group matches by your predicted probability of the over (e.g., 0–10%, 10–20%, …) and compare the predicted vs actual frequency of overs in each bucket. Look for consistent over- or under-prediction.
Use a simple regression (e.g., logistic) or isotonic regression to map predicted probabilities to observed frequencies. A monotonic mapping (isotonic) preserves ranking but corrects scale.
Monitor calibration metrics: Brier score measures probabilistic accuracy; calibration slope/intercept diagnose bias. Aim to reduce Brier score and have slope close to 1, intercept close to 0.
Account for overdispersion: if your Poisson assumption consistently underestimates variance, consider a negative binomial distribution or apply a small variance-inflation factor when generating score distributions.
Recalibrate regularly (weekly/monthly) and separately for different competitions — lower leagues, cups and international fixtures often need different adjustments.
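The bucketing and Brier-score steps above can be sketched in a few lines of plain Python. The function names and the toy backtest data are assumptions made for illustration; a real backtest would use many more matches per bucket.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probability and result (0/1)."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def reliability_table(probs, outcomes, n_bins=10):
    """Group predictions into equal-width buckets.

    Returns one row per non-empty bucket:
    (bucket_low, bucket_high, count, mean_predicted, observed_frequency)
    """
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # keep p == 1.0 in the top bucket
        bins[idx].append((p, o))

    rows = []
    for i, items in enumerate(bins):
        if not items:
            continue
        mean_pred = sum(p for p, _ in items) / len(items)
        observed = sum(o for _, o in items) / len(items)
        rows.append((i / n_bins, (i + 1) / n_bins, len(items), mean_pred, observed))
    return rows


# Toy holdout set: predicted P(over 2.5) and whether the over landed (1) or not (0).
predicted = [0.55, 0.52, 0.58, 0.31, 0.35]
landed = [1, 0, 1, 0, 1]

table = reliability_table(predicted, landed)
score = brier_score(predicted, landed)
```

In a well-calibrated model the mean prediction and observed frequency in each row sit close together. A persistent gap in one direction is the bias that a regression or isotonic mapping is meant to correct.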

Finally, track model reliability over time. If edges evaporate after accounting for calibration, your model either needs more features (game state, lineup strength) or you’re bumping into sharp market efficiency.

Handling bookmaker margins and market nuances

Bookmakers include a margin (vig) that distorts implied probabilities. Before comparing your model’s over/under probabilities with market odds, remove that margin to get the fair-market odds.

Convert decimal odds to implied probabilities: implied = 1 / odds. Sum the implied probabilities of the competing outcomes (over and under). That sum is the overround (sum_implied); anything above 1 is the bookmaker’s margin.
Normalize to remove margin: fair_prob = implied / overround. These fair_probs are what you compare to your model.
Beware of market movement: lines can drift as sharp bettors or bookmakers react to information. Closing lines are best for evaluation but not always available for execution.
Shop lines: small differences in totals (2.25, 2.5, 2.75) and odds can matter. Use multiple bookmakers or exchanges to find the best market and reduce slippage when placing bets.
Consider liquidity and max stakes. A theoretical edge is useless if you can’t place a meaningful stake at those odds. Exchanges often allow larger stakes but carry their own risk that a bet is only partly matched or not matched at all.
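The conversion and normalisation steps at the top of this list can be sketched as below. This is the simple proportional method described in the text; the function name and the example prices are illustrative assumptions.

```python
def remove_margin(over_odds, under_odds):
    """Strip the bookmaker margin from a two-way over/under market.

    Returns (fair_over, fair_under, overround) using proportional
    normalisation: each implied probability is divided by their sum.
    """
    implied_over = 1.0 / over_odds
    implied_under = 1.0 / under_odds
    overround = implied_over + implied_under
    return implied_over / overround, implied_under / overround, overround


# Example: both sides of a 2.5 line priced at 1.90 (made-up prices).
fair_over, fair_under, overround = remove_margin(1.90, 1.90)
margin_pct = (overround - 1.0) * 100.0  # margin expressed as a percentage
```

Proportional normalisation spreads the margin evenly across both outcomes. Bookmakers may load it unevenly, so treat the resulting fair probabilities as an approximation.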

Staking strategies and practical risk controls

Once you’ve identified positive EV bets, decide how much to risk. Two practical approaches suit xG-based over/under strategies:

Flat staking: stake a fixed percentage of your bankroll (commonly 0.5–2%). This is simple and robust to model mis-specification.
Kelly criterion (fractional Kelly recommended): for decimal odds b and model probability p, the full Kelly fraction of bankroll is f = (b*p - 1) / (b - 1). Stake 10–25% of that fraction to reduce variance and protect against calibration errors.
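A minimal sketch of fractional Kelly sizing using the formula above. The function names and the 25% multiplier in the example are illustrative choices; negative Kelly values are floored at zero, meaning no bet.

```python
def kelly_fraction(model_prob, decimal_odds):
    """Full Kelly fraction of bankroll: (b*p - 1) / (b - 1), floored at 0."""
    if decimal_odds <= 1.0:
        raise ValueError("decimal odds must be greater than 1")
    fraction = (decimal_odds * model_prob - 1.0) / (decimal_odds - 1.0)
    return max(fraction, 0.0)


def fractional_kelly_stake(bankroll, model_prob, decimal_odds, multiplier=0.25):
    """Stake a chosen share (e.g. 10-25%) of the full Kelly amount."""
    return bankroll * multiplier * kelly_fraction(model_prob, decimal_odds)


# Example: 55% model probability at even money (decimal odds 2.0).
full_kelly = kelly_fraction(0.55, 2.0)
stake = fractional_kelly_stake(1000.0, 0.55, 2.0, multiplier=0.25)
```

Because the Kelly fraction scales directly with the estimated edge, any calibration error in the model probability feeds straight into the stake, which is the reason for using a small multiplier.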

Risk-controls to implement:

Only bet when your edge exceeds a minimum threshold (e.g., 3–5% after removing vig).
Limit the number of simultaneous bets and set a daily/weekly exposure cap.
Log every bet with model inputs, odds, stakes and outcomes. Analyze returns by market, league and edge size to refine thresholds and recalibration cadence.
In-play betting: if you use live xG, be conservative with latency and market reaction — edges often vanish quickly in-play.

With calibration, margin adjustments and disciplined staking you convert model insights into executable, risk-managed over/under strategies that can be reviewed and improved over time.

Putting the model to work

Building an xG-driven over/under workflow is an iterative process: start small, keep rigorous records, and let the market and your own tracking guide incremental changes. Maintain discipline around calibration, stake sizing and market selection, and treat every losing streak as information, not a verdict. If you want ongoing reference material or model examples, reputable analytics outlets such as StatsBomb publish methods and datasets that can help you refine features and validation practices.

Frequently Asked Questions

How reliable is xG for predicting total goals compared with past scorelines?

xG is generally more reliable than raw goals because it reflects chance quality rather than finished results, reducing variance from luck. However, xG is not perfect — it requires calibration, league-specific tuning, and occasional adjustments for game-state and lineup changes. Treat xG as a probabilistic input, not a certainty.

Should I use flat stakes or Kelly for xG-based over/under bets?

Both have pros and cons. Flat staking is simple and robust to model errors; fractional Kelly (e.g., 10–25% of full Kelly) can improve growth when your edge estimates are accurate but increases volatility and sensitivity to calibration mistakes. Many bettors start with flat stakes and move to conservative Kelly once their model and track record are stable.

Can I apply this approach to in-play markets using live xG?

Yes — live xG can highlight changing probabilities during a match, but in-play betting demands fast execution, low latency, and stricter risk controls because lines move quickly and market efficiency increases. Use smaller stakes, require larger edges, and account for delays between model signals and available odds.