I Built a 100-Match Soccer Analytics Dashboard in a Weekend. No Build Tools.
No npm. No Vite. No bundler. Just HTML, vanilla JavaScript, D3, and free StatsBomb data. Here's how, and what I picked up about xG, xT, and 2010-era web tech.
- A soccer analytics dashboard with 9 stat overlays and 100 historic matches, going back to Pelé's 1958 World Cup. Match-clock replay, player filter, two analytics charts.
- Built as a static site with no build step. You double-click index.html and it works.
- Uses StatsBomb Open Data for real match events, and Karun Singh's published xT grid for the Expected Threat layer.
- About 3,000 lines of source plus a 15 MB data file.
- Everything is on GitHub: https://github.com/armrosadev1991/soccer_analytics. Fork it, deploy it, or just open it and play with the WC 2022 Final.
The "why"
I wanted to look at a soccer match the way pro analysts do. Shot maps, pass networks, expected-goal arcs over time, replay-style scrubbing through a match. The existing tools all have a catch. Wyscout and StatsBomb IQ cost serious money. Opta won't talk to you unless you run a club. mplsoccer lives in Jupyter, which is fine for me but useless for showing friends.
So I built one. The constraint was tight: zero build tools. One folder of files, double-click to open, no setup, no Node, no Python at runtime. The starter-project market today is all Vite/Next.js scaffolds that take 10 minutes to install and produce 200 MB of node_modules to draw a pitch. There had to be a simpler way.
There is. Plain 2010-era web tech.
What you can do with it
Pick a match, then pick a story to tell.
The picker has 100 matches across 21 competitions. Every World Cup Final from 1958 (Pelé's debut) through 2022 (Argentina vs France). Every Champions League Final from 2004 (Monaco vs Porto) through 2019 (Tottenham vs Liverpool). Recent Euros, Women's Euros, and a deep selection of knockout matches. The Maradona Hand of God game is in there. So is the Istanbul Miracle from 2005.
Pick the 2022 World Cup Final, then start switching tabs.
Layer 1: Shots
Every shot is a circle. Size is proportional to √xG, so the bigger the dot, the better the chance. Color is binary: red for goal, grey for saved or missed or blocked. Hover any dot for player, team, xG value, and outcome. That's the shot map you've seen on every football analytics Twitter account.
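As a rough illustration of that sizing rule, here is a minimal sketch in plain JavaScript. The names shotRadius and shotColor, the maxRadius default, and the "Goal" outcome string are assumptions made for this example, not code lifted from the repo. Scaling the radius by √xG means the dot's area, not its radius, grows in proportion to xG.

```javascript
// Hypothetical helper: maps a shot's xG (0..1) to a circle radius.
// maxRadius is an assumed value in pitch units, not taken from the repo.
function shotRadius(xg, maxRadius = 2.5) {
  const clamped = Math.min(1, Math.max(0, xg)); // guard against out-of-range data
  return maxRadius * Math.sqrt(clamped);        // area ends up proportional to xG
}

// Hypothetical helper for the binary color rule: red for a goal, grey otherwise.
// The "Goal" outcome label is an assumption about the data.
function shotColor(outcome) {
  return outcome === "Goal" ? "red" : "grey";
}
```

With this scaling, a 0.25 xG chance gets half the radius of a 1.0 xG chance, so it covers a quarter of the area.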
Layer 2: Carries (with motion)
This is where it stops being a viz and starts being a thing you watch. Hit Play and the dots animate along their carry paths in real time. You see Messi dribbling. Mbappé sprinting. The ball moving across the pitch in time. Drop the speed to 1× (real-time) and you can re-watch the whole match in 90 minutes from the pure-event view.
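The animation boils down to interpolating each dot between a carry's start and end points as the match clock advances. The sketch below shows one way to do that. It assumes each carry has x, y, end_x, end_y plus a start time t and a duration in seconds; t, duration, and the function name carryPositionAt are hypothetical and may not match the repo.

```javascript
// Hypothetical carry shape: { x, y, end_x, end_y, t, duration }.
// t and duration (seconds on the match clock) are assumed field names.
function carryPositionAt(carry, clock) {
  if (clock < carry.t) return null;                 // carry has not started yet
  const span = Math.max(carry.duration, 1e-6);      // avoid dividing by zero
  const p = Math.min(1, (clock - carry.t) / span);  // progress, capped at 1
  return {
    x: carry.x + (carry.end_x - carry.x) * p,       // linear interpolation
    y: carry.y + (carry.end_y - carry.y) * p,
  };
}
```

Calling this once per animation frame with the current clock value gives the dot's position. Once the carry is over, the dot simply rests at its end point.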
Layer 3: Pass Network
Pick Barcelona vs Manchester United 2011 UCL Final, switch to Network. You see Xavi, Iniesta, and Busquets forming a tight triangle in midfield. The tiki-taka triangle from the Guardiola years, made visible by data. The thicker the line between two players, the more passes they exchanged. The bigger the dot, the more involved the player. No code in app.js knows what tiki-taka is. The data shows it anyway.
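The aggregation behind a view like this is small. Here is a minimal sketch, assuming each pass record carries a passer and a recipient as from and to; those field names and the function buildPassNetwork are made up for the example. Line thickness would come from the pair counts and dot size from the involvement counts.

```javascript
// Hypothetical pass shape: { from, to } holding player names.
// The real field names in the data file may differ.
function buildPassNetwork(passes) {
  const links = new Map();       // "A|B" -> passes exchanged in either direction
  const involvement = new Map(); // player -> passes made or received
  for (const { from, to } of passes) {
    if (!from || !to || from === to) continue;  // skip incomplete records
    const key = [from, to].sort().join("|");    // undirected pair key
    links.set(key, (links.get(key) || 0) + 1);
    involvement.set(from, (involvement.get(from) || 0) + 1);
    involvement.set(to, (involvement.get(to) || 0) + 1);
  }
  return { links, involvement };
}
```

Sorting the two names before building the key is what makes Xavi to Iniesta and Iniesta to Xavi count toward the same line.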
Layer 4: Expected Threat (xT)
I'll explain xT properly below. For now: each arrow is a pass or carry that pushed the team closer to scoring, colored by how much. The deep-red arrows are the through-balls and decisive carries that broke the defense.
Layer 5: Lineups
Each circle is a starter with their jersey number. Hover for full name and position. The coordinates come from StatsBomb's role labels (Right Wing, Center Defensive Midfield, and so on) mapped to canonical pitch zones. A 4-3-3 ends up looking visibly different from a 3-5-2, even though the code never explicitly detects the formation.
The two metrics that make it real
Three of the layers are built around one idea. Shots aren't all equal, and neither are passes. xG and xT are how analytics formalizes that.
xG (Expected Goals)
xG is a probability between 0 and 1 that an average shooter would score from a given situation. It factors in shot location, angle, body part, defenders nearby, and the pass that led to it.
It exists to fix the shot-count lie. A team taking 20 long-range hopeful shots looks more dominant than one taking 5 close-range chances, but the second team is way more likely to score. xG measures quality of chances, not just quantity.
The xG Timeline shows each team's cumulative xG climbing as the match goes on. Each step-up is a shot; the height of the step is its xG. Goals are marked with filled dots at the baseline. You can read the story arc at a glance: who was creating more, when momentum shifted, whether the scoreline matched the performance.
The xG Advantage chart is the same information, one step further. A single line of home_xG − away_xG over time. When the line is above zero, the home team was creating the better chances. The WC 2022 Final reads cleanly. Argentina dominant early (line above zero). A deep dip when Mbappé scored twice around the 80th minute. Back near zero through extra time. The 3-3 scoreline tells you the result. This chart tells you the match.
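Both xG charts come from one running sum. The sketch below is a minimal version, assuming shots arrive as { minute, team, xg } with team set to "home" or "away". That shape and the name xgAdvantage are assumptions for illustration, not the code in analytics.js.

```javascript
// Hypothetical shot shape: { minute, team: "home" | "away", xg }.
function xgAdvantage(shots) {
  const sorted = [...shots].sort((a, b) => a.minute - b.minute); // chronological
  let home = 0;
  let away = 0;
  const series = [{ minute: 0, home: 0, away: 0, diff: 0 }];     // kickoff point
  for (const s of sorted) {
    if (s.team === "home") home += s.xg; // each shot is one step up
    else away += s.xg;
    series.push({ minute: s.minute, home, away, diff: home - away });
  }
  return series;
}
```

The home and away fields give the two stepped lines of the timeline. The diff field is the single advantage line.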
xT (Expected Threat)
xT fills the gap xG can't. A perfect through-ball that breaks the defense gets zero xG credit; only the shot it leads to does. xT was invented to value the buildup.
The model divides the pitch into a 12×8 grid and assigns each cell a value: the probability that possession in this cell leads to a goal in the next ~5 actions. Move the ball from a low-xT zone to a high-xT zone, and you've added xT equal to the difference. A back-pass under pressure has negative xT.
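To make the lookup and the difference concrete, here is a minimal sketch. The grid values are a made-up placeholder that simply rises toward the attacking goal; they are not Karun Singh's published numbers. The names toyXtAt and xtAdded are hypothetical, and the sketch assumes coordinates on a 105×68 pitch.

```javascript
const PITCH_W = 105, PITCH_H = 68; // pitch size assumed for this sketch
const COLS = 12, ROWS = 8;         // grid shape described above

// Placeholder grid: NOT the published xT values. Threat just rises with x.
const TOY_GRID = Array.from({ length: ROWS }, () =>
  Array.from({ length: COLS }, (_, c) => (c + 1) / 100)
);

// Look up the grid cell containing (x, y), clamping points on the far edges.
function toyXtAt(x, y, grid = TOY_GRID) {
  const c = Math.min(COLS - 1, Math.max(0, Math.floor((x / PITCH_W) * COLS)));
  const r = Math.min(ROWS - 1, Math.max(0, Math.floor((y / PITCH_H) * ROWS)));
  return grid[r][c];
}

// xT added by a pass or carry: value at the end minus value at the start.
function xtAdded(move, grid = TOY_GRID) {
  return toyXtAt(move.end_x, move.end_y, grid) - toyXtAt(move.x, move.y, grid);
}
```

A forward move into a more dangerous cell returns a positive number, and the same move reversed returns the negative of it.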
Karun Singh published this grid in a 2019 blog post. He derived it from open StatsBomb data and made it free for anyone to use. It's the de-facto industry standard now. The xT layer in my dashboard uses his exact grid, bundled in config.js.
Why this matters in plain language: xG explains what teams did with their chances. xT explains how they built up those chances. Together they're a complete picture of attacking play.
The architecture (in three concepts)
1. Load order is the dependency graph
There are zero import statements anywhere in the project. Instead, index.html has 14 <script> tags in a specific order. Each script installs a global. Each subsequent script depends only on globals from above:
vendor/d3.min.js → d3
config.js → window.CONFIG, window.xtAt, window.filterEvents
flags.js → window.flagFor
data/matches.js → window.MATCHES, window.MATCH_INDEX
pitch.js → window.drawPitch
layers/*.js → window.LAYERS.{name} (each layer self-registers)
replay.js → window.Replay
analytics.js → window.Analytics
app.js → IIFE bootstrap
This is pre-bundler tech from ~2012. Still valid HTML. Still zero tooling. A new feature is a new <script> tag.
2. Self-registering layers
Each layer file installs itself into a global registry:
window.LAYERS = window.LAYERS || {};
window.LAYERS.your_layer = {
  render(g, data, opts) { /* draw inside g */ },
  legend(data) { return [/* legend items */]; }
};
app.js never knows layer names ahead of time. It iterates window.LAYERS based on the buttons in the DOM. Adding a 10th layer is three small edits: a new layer file, a new <script> tag, a new <button>. No other file changes. That's the cleanest extension model I've ever shipped.
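For a feel of the dispatch side, here is a minimal sketch of how a registry like this can be consumed. It is not the actual app.js code. The demo layer, the renderLayer function, and the use of a plain array in place of an SVG group are all stand-ins so the snippet runs outside a browser.

```javascript
// Works in a browser (window) or in Node (globalThis) for quick testing.
const root = typeof window !== "undefined" ? window : globalThis;
root.LAYERS = root.LAYERS || {};

// A stand-in layer; "demo" is a made-up name for this example.
// Here g is a plain array standing in for the SVG group a real layer draws into.
root.LAYERS.demo = {
  render(g, data, opts) { g.push(`drew ${data.length} items`); },
  legend(data) { return [{ label: "Demo", color: "grey" }]; },
};

// Hypothetical dispatcher: looks a layer up by the name carried on a button.
function renderLayer(name, g, data, opts = {}) {
  const layer = root.LAYERS[name];
  if (!layer) throw new Error(`No layer registered as "${name}"`);
  layer.render(g, data, opts);
  return layer.legend(data); // caller can use this to rebuild the legend
}
```

The dispatcher only ever sees a string, which is why a new layer needs no change to the code that calls it.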
3. The data file IS the bundle
data/matches.js is 15 MB of pretty-printed JSON wrapped in window.MATCHES = { ... };. It's loaded through a <script> tag because browsers block fetch('data/matches.json') on a page opened from file://. Such a page gets an opaque ("null") origin, so the request counts as cross-origin and is rejected, and with no server involved there are no CORS headers to fix that. The <script> tag has no such restriction.
This one decision is what unlocks the double-click-to-run property. Cost: 15 MB on disk. Benefit: zero infrastructure.
The data pipeline
The 15 MB doesn't appear by magic. There's a 366-line Python script (scripts/convert_statsbomb.py) that:
- Fetches StatsBomb's open dataset from GitHub. No API key needed; it's all in their public repo.
- Filters events to Shot, Pass, Carry, Ball Receipt*, Tackle, Interception, Block, Clearance, Ball Recovery.
- Normalizes StatsBomb's 120×80 pitch coords to our 105×68 (UEFA standard) via a 0.875 × 0.85 scale.
- Subsamples passes/touches/carries to per-match caps (150/350/120) to keep file size reasonable.
- Bundles lineup data and match metadata (referee, stadium, score, competition stage).
- Validates every match against a schema before writing.
The whole script is stdlib-only. No requests, no pandas. It runs in about 10 minutes for 100 matches. Most of that is downloading the ~5 MB events JSON per match from GitHub.
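The real converter is Python, but the two numeric steps in the list above are easy to show in JavaScript to match the rest of the code here. This is a sketch under assumptions. The function names are invented, and the evenly spaced sampling is just one plausible way to hit a cap; the script may pick events differently.

```javascript
const SX = 105 / 120; // 0.875, StatsBomb length to 105 m
const SY = 68 / 80;   // 0.85, StatsBomb width to 68 m

// Scale an event's start point, and its end point when it has one.
function normalizeCoords(ev) {
  const out = { ...ev, x: ev.x * SX, y: ev.y * SY };
  if (ev.end_x != null) out.end_x = ev.end_x * SX;
  if (ev.end_y != null) out.end_y = ev.end_y * SY;
  return out;
}

// Assumed strategy: keep evenly spaced events so the sample spans the whole match.
function subsample(events, cap) {
  if (events.length <= cap) return events.slice(); // already under the cap
  const step = events.length / cap;
  return Array.from({ length: cap }, (_, i) => events[Math.floor(i * step)]);
}
```

With the caps from the list, a match with 900 passes would be thinned with subsample(passes, 150), keeping every sixth pass.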
The bugs that taught me something
Bug 1: All shots on one side
For an embarrassing amount of time, the shots layer showed every shot from both teams on the right side of the pitch. The culprit:
StatsBomb stores every team's events as if they attack left-to-right.
So home and away shots both have x near 88-100 in the raw data, because that's how StatsBomb encodes "near the team's attacking goal." Render the raw coordinates and they all land on the same side.
The fix was a 4-line mirrorAwayEvents(match) function that flips away-team x and end_x to W − x at match-load time. Idempotent via a _mirrored: true flag, so switching matches doesn't double-flip. Every layer downstream automatically picked up the corrected coordinates.
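A reconstruction of that idea, not a copy of the repo's function, looks roughly like this. It assumes the match object holds an events array and that each event marks its side with team set to "home" or "away"; both are guesses about the data shape.

```javascript
const W = 105; // pitch length after normalization

// Flip away-team x coordinates once, so the two teams attack opposite goals.
// Assumed shape: match.events is an array of { team, x, end_x? } objects.
function mirrorAwayEvents(match) {
  if (match._mirrored) return match;       // idempotent: never flip twice
  for (const ev of match.events) {
    if (ev.team !== "away") continue;      // home events stay as they are
    ev.x = W - ev.x;
    if (ev.end_x != null) ev.end_x = W - ev.end_x;
  }
  match._mirrored = true;
  return match;
}
```

Because the flag lives on the match object itself, reloading the same match from the picker is a no-op.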
Lesson: read the data dictionary before you trust the field names. "x and y are the location," sure — but in whose frame of reference?
Bug 2: Tooltips clipping at the viewport edge
Hover a shot near the right edge of the pitch and the tooltip would overflow off-screen. Classic CSS positioning bug.
The fix: measure the tooltip with getBoundingClientRect() after its text is rendered, then clamp left/top so it can't escape the viewport. The positioning code was already in place but assumed fixed dimensions; switching to actual measurement made it work for tooltips of any length, including long accented player names like "Ángel Fabián Di María Hernández."
Lesson: with dynamic content, position after you know the dimensions, not before.
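The clamping step itself is pure arithmetic, so it can be pulled out into a helper that never touches the DOM. The sketch below is one way to write it. The name clampTooltip and the 12 px gap are assumptions, and in the browser the width and height would come from the tooltip's getBoundingClientRect().

```javascript
// Given the cursor position, the measured tooltip size, and the viewport size,
// return a left/top that keeps the tooltip fully on screen.
// gap is an assumed offset (px) between the cursor and the tooltip.
function clampTooltip(cursorX, cursorY, tipW, tipH, viewW, viewH, gap = 12) {
  const maxLeft = Math.max(0, viewW - tipW); // furthest left edge that still fits
  const maxTop = Math.max(0, viewH - tipH);  // furthest top edge that still fits
  return {
    left: Math.min(Math.max(0, cursorX + gap), maxLeft),
    top: Math.min(Math.max(0, cursorY + gap), maxTop),
  };
}

// Browser usage (commented out so the snippet also runs in Node):
// const { width, height } = tip.getBoundingClientRect();
// const pos = clampTooltip(e.clientX, e.clientY, width, height,
//                          window.innerWidth, window.innerHeight);
// tip.style.left = pos.left + "px";
// tip.style.top = pos.top + "px";
```

Keeping the math separate from the measurement makes the edge cases easy to check without a browser.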
Bug 3: Corners disappearing into the void
Corner-kick events are at exactly (0, 0), (0, 68), (105, 0), (105, 68). The corners of the pitch. The SVG viewBox was "0 0 105 68", meaning corner events rendered right at the boundary and half-clipped off-screen.
The fix was three changes. Extend the viewBox to "-4 -4 113 76". Add a darker-green "surround" rect for the grass apron. Throw in four yellow corner flags for charm. Now throw-ins, goal-line shots, and corners all render in full.
Lesson: the SVG viewBox is not the same as your data range. Padding is free.
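The padded viewBox is simple arithmetic on the pitch size. A tiny helper makes the relationship explicit; paddedViewBox is a name invented for this sketch.

```javascript
// Build an SVG viewBox string that extends the data range by pad on every side.
function paddedViewBox(width, height, pad) {
  return `${-pad} ${-pad} ${width + 2 * pad} ${height + 2 * pad}`;
}

console.log(paddedViewBox(105, 68, 4)); // prints "-4 -4 113 76"
```

The origin shifts by the padding once, while the width and height grow by it twice, once for each side.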
What it looks like in motion
Here's the moment I knew it was worth building. Scrub to minute 50 of Argentina vs England 1986, click Carries, hit Play. Watch Maradona's carry from the halfway line. The actual coordinate trail of the Goal of the Century, recovered from event data 39 years later.
That's why I built this. To watch matches that ended before you were born, with the data they generated, in a browser tab.
What's next
I skipped some Tier-2 items because each one is a multi-day job:
- TypeScript migration. It would need a build step, which breaks the no-build-tools constraint, but it would be worth it for the data schema.
- StatsBomb 360 freeze frames. They have player-position snapshots at every event for some matches. Adding a "pressure" layer using this would be a major upgrade.
- Cross-match aggregation. "Show me Messi's shots across all 4 finals he played" is a UI overhaul, not a layer addition.
- Team comparison mode. Side-by-side pitches.
What's there already does most of what I wanted. Pick a famous match, pick a story, watch it unfold.
Try it / Get the code
The whole project lives on GitHub — https://github.com/armrosadev1991/soccer_analytics. To run it:
- Clone the repo.
- Double-click index.html.
- That's it.
No build, no install, no Node. The data is bundled, D3 is vendored, the page works offline from the first load.
If you want to add your own matches, scripts/convert_statsbomb.py takes --match <id>, --comp <id>, and --season <id> flags to pull any specific match from StatsBomb's open data into your local data/matches.js.
Final thought
The most surprising thing wasn't the analytics. It was that 2010-era web tech still works for non-trivial projects. Plain <script> tags. Global namespaces. Vendored libraries. No bundler. 3,000 lines of source, every line one click from inspection in DevTools. Nothing compiled, nothing magic.
There's something refreshing about that, in 2026.
👉 Visit the project
The full source is on GitHub — go check it out, fork it, and try it yourself:
https://github.com/armrosadev1991/soccer_analytics
https://armrosadev1991.github.io/soccer_analytics/
If you found this useful, follow, clap, or share. If you want to fork it and add features, the GitHub repo has a contributor-friendly architecture doc and a CHANGELOG that explains what every session added.