PITCH · The Engine Room · Signal Analytics

What this document is for

The Engine Room exists for one reason: so that when someone in a meeting asks "where does that number come from?" you can open this page and show them, line by line, exactly how it was derived.

Every equation is shown. Every data source is cited. Every assumption is stated explicitly and flagged where it is a proxy rather than a measured value. Nothing is hidden and nothing is invented.

Confidence framing: Numbers derived from real WSL data (StatsBomb, API-Football, SoccerMon) are marked MEASURED. Numbers derived from benchmarks or estimates are marked ESTIMATED. No estimated number is presented as measured anywhere in the FALI report.

THE ENGINE ROOM: 01 · Data Sources

What data was used and where it came from

Dataset	Coverage	Records	Status	Use in PITCH
StatsBomb Open Data	WSL 2018–21 · 14 clubs	8,950	MEASURED	Player appearance timelines · burn rate derivation · gap identification
SoccerMon 2020–21	Two elite women's teams · 2 seasons	33,849	MEASURED	Mamdani engine calibration · ACWR thresholds · readiness benchmarks
API-Football v3 (FA WSL, ID 44)	WSL 2024–25 · 12 clubs	132 fixtures	MEASURED	Current season fixtures · standings · microcycle density
EA Sports FC25	WSL players	Squad data	ESTIMATED	Physical capacity proxies (stamina, sprint speed) · directional only
WECIS 2026 benchmarks	UEFA Women's Elite Club Injury Study	League-wide	MEASURED	Injury cost breakdowns · ACL figures · financial burden percentages
GoPerform / St Lukes Radiology Oxford	Pro-athlete pricing 2025/26	Price list	MEASURED	Medical cost per injury (£7,340): MRI £1,800 + consultant £1,540 + physio 8 sessions × £500 = £7,340
Companies House / Deloitte WSL Finance Review	Published accounts 2024-25	Public	MEASURED	Club wage bills: Arsenal £11.3M · Man United £5.88M · Liverpool £3.1M · £60M league pool
WPLL / WSL Football	Prize structure 2025-26	Public	ESTIMATED	£18,500/pt WSL commercial and merit pool · needs FA verification · used in points drain calculation
Howden Insurance	European Football Injury Index 2025/26	League-wide	MEASURED	£2.97bn European benchmark · validates scale of WSL injury exposure projection
Ekstrand et al. · BJSM	UEFA Champions League · 11 years	Published	MEASURED	0.82 pts lost per 100 injury days (men's) · WSL-adjusted to 1.1 pts for smaller squads

Per-club wage estimates are Signal Analytics estimates based on publicly reported WSL salary ranges. They are directional, used to illustrate scale of exposure, not as audited financial figures. When a club connects real data, these are replaced immediately.

On the 12% male benchmark: The commonly cited ~12% injury absence rate in professional men's football derives from the UEFA injury surveillance programme (Hagglund et al. 2005, 2013 — Br J Sports Med). It is widely referenced in sports science as an industry baseline. Signal Analytics uses it as a comparison point only. The 11.96% female WSL rate was derived independently from StatsBomb data and is not calibrated against this benchmark.

THE ENGINE ROOM: 02 · Burn Rate Derivation

How 11.96% was calculated step by step

The female WSL burn rate was derived entirely from StatsBomb open data. No male benchmarks were used in the calculation. The derivation followed this sequence:

01 · Load player appearance timelines. For each player-season combination, extract all match dates from wsl_scored_players.csv (8,950 records, 14 clubs, 2018–21).
02 · Filter to regular starters. Only players with ≥5 appearances AND ≥40% appearance rate (appearances ÷ total team matches that season) are included. This removes fringe players whose absences are tactical, not injury-related.
03 · Identify selection gaps. For each regular starter, scan consecutive match dates. Any gap ≥21 days between appearances, where the team continued to play, is flagged as a probable injury absence.
04 · Classify gap severity. Gaps are classified by duration: ≥84 days = Severe (ACL/fracture), ≥42 days = Major (muscle/ligament), ≥21 days = Moderate (strain/tear).
05 · Calculate total days lost. Sum all gap_days across all 243 identified gaps.
06 · Calculate total player-days available. For each regular starter, sum the days between first and last appearance in the season.
07 · Derive burn rate. Apply the equation below.

Equation 1 · Female WSL Burn Rate

Burn Rate = (Σ Days Lost) ÷ (Σ Player-Days Available) × 100

Σ Days Lost = sum of all gap_days across 243 identified injury gaps
Σ Player-Days Available = sum of (last_appearance − first_appearance) per player-season
Result: 11.96% female WSL burn rate, derived from female WSL data
Male industry benchmark: 12.0% (standard sports science reference)
Difference: −0.04 percentage points, effectively identical
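The derivation above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the contents of wsl_burn_rate.py: the function names, the `MIN_GAP_DAYS` constant, and the example dates are all hypothetical, and "the team continued to play" is interpreted here as at least one team match falling strictly inside the gap.

```python
from datetime import date

MIN_GAP_DAYS = 21  # gaps of 21+ days are flagged as probable injury absences

def is_regular_starter(appearances, team_matches):
    """Filter step: at least 5 appearances AND at least a 40% appearance rate."""
    return appearances >= 5 and appearances / team_matches >= 0.40

def classify_gap(gap_days):
    """Severity bands by gap duration, as listed in the derivation."""
    if gap_days >= 84:
        return "Severe"
    if gap_days >= 42:
        return "Major"
    if gap_days >= MIN_GAP_DAYS:
        return "Moderate"
    return None

def find_gaps(player_dates, team_dates):
    """Flag gaps between consecutive appearances during which the team kept playing."""
    dates = sorted(player_dates)
    gaps = []
    for prev, nxt in zip(dates, dates[1:]):
        gap_days = (nxt - prev).days
        team_played = any(prev < d < nxt for d in team_dates)  # assumed interpretation
        if gap_days >= MIN_GAP_DAYS and team_played:
            gaps.append((gap_days, classify_gap(gap_days)))
    return gaps

def burn_rate(days_lost, player_days_available):
    """Equation 1: days lost as a percentage of player-days available."""
    return days_lost / player_days_available * 100

# Hypothetical single player-season (dates invented for illustration only)
team = [date(2020, 9, 1), date(2020, 9, 8), date(2020, 9, 15), date(2020, 9, 29),
        date(2020, 10, 6), date(2020, 10, 20), date(2020, 10, 27)]
player = [date(2020, 9, 1), date(2020, 9, 8), date(2020, 10, 20), date(2020, 10, 27)]

gaps = find_gaps(player, team)
days_lost = sum(gap_days for gap_days, _ in gaps)
available = (max(player) - min(player)).days
print(gaps, burn_rate(days_lost, available))
```

In the real pipeline the two sums run over every regular starter and every player-season before the division, so a single player's ratio such as the one above is only a building block.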

Key finding: 100% of the 243 identified injury gaps were preceded by an Amber or Red readiness flag at the player's final appearance before the absence. Zero gaps were preceded by a Green flag. This is the retrospective validation of the Mamdani engine: the signal was always present.

THE ENGINE ROOM: 03 · The Injury Double Proof

Why an 11.96% frequency becomes a 14.4% financial drain

The Injury Double Tap is not a metaphor — it is a financial accounting framework. Every injury triggers two separate cost categories simultaneously: the ghost wage already sunk, and the cascade of new costs that follow. Together they represent 14.4% of the WSL wage pool every season.

Equation 2 · The Drain — per injury cost derivation

Cost per Incident = Medical + Points Drain + Performance Lag = £17,613

Medical (MRI, consultant, physio · 21-day injury): £7,340
└ Source: GoPerform Pro-Athlete Pricing 2025/26 · St Lukes Radiology Oxford · standard elite injury pathway
Points Drain: 21 days × (1.1 pts ÷ 100 days) = 0.231 pts lost × £18,500/pt = £4,273
└ 1.1 pts/100d = WSL-adjusted Ekstrand et al. BJSM figure (smaller squads vs 0.82 men's)
└ £18,500/pt = WSL commercial and merit pool 2026 (FA prize structure · needs verification)
Performance Lag (RTP at 80% capacity): ~£6,000 estimated
└ Player returns but performs at ~80% for 3–4 weeks. Direct wage discount = 21d × £164.38 × 20% = £690
└ Remainder reflects squad depth cost: loan/cover player required during RTP window
└ This component is an estimate, flagged as ESTIMATED · actual figure is club-dependent
Total: £7,340 + £4,273 + £6,000 = £17,613 per incident

Equation 3 · Scaling to league-wide drain

League Drain = Incidents/Season × Cost per Incident = 81 × £17,613 = £1.43M

243 gaps identified across 3 WSL seasons (StatsBomb 2018–21)
243 ÷ 3 seasons = 81 incidents per season (league-wide average)
81 × £17,613 = £1,426,653 ≈ £1.43M total drain · all 12 clubs · full season
As % of £60M wage pool: £1.43M ÷ £60M = 2.38% ≈ 2.4%
Note: excludes commercial effects (sponsorship clauses, matchday revenue, jersey sales). True drain is higher.
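Equations 2 and 3 are plain arithmetic, so they can be re-run as a check. The sketch below uses only figures stated in this section; the variable and function names are illustrative, and the points drain is truncated to whole pounds on the assumption that this is how £4,273 was reached from 0.231 × £18,500 = £4,273.50.

```python
# Inputs as stated in this section (all GBP)
MEDICAL = 1_800 + 1_540 + 8 * 500   # MRI + consultant + 8 physio sessions
PERFORMANCE_LAG = 6_000             # ESTIMATED component
PTS_PER_100_DAYS = 1.1              # WSL-adjusted figure (ESTIMATED)
VALUE_PER_POINT = 18_500            # needs FA verification
WAGE_POOL = 60_000_000

def points_drain(days_lost):
    """Points lost over the absence, valued per point, truncated to whole pounds."""
    return int(days_lost * PTS_PER_100_DAYS / 100 * VALUE_PER_POINT)

cost_per_incident = MEDICAL + points_drain(21) + PERFORMANCE_LAG

incidents_per_season = 243 // 3     # 243 gaps across 3 seasons
league_drain = incidents_per_season * cost_per_incident
drain_share_pct = league_drain / WAGE_POOL * 100

print(cost_per_incident, league_drain, round(drain_share_pct, 2))
```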

Equation 4 · Ghost Wage (Card 01) · includes daily wage derivation

Ghost Wage = Wage Pool × Burn Rate = £60M × 11.96% = £7.18M

£60M = WSL aggregate wage pool (derived from published club accounts)
└ Arsenal £11.3M · Man United £5.88M · Liverpool £3.1M (Companies House 2024-25)
└ Deloitte WSL Finance Review · Signal Analytics estimates for remaining clubs
11.96% = female WSL burn rate (derived, see Section 02)
£7.18M ÷ 12 clubs = £598k per club average
Daily wage used downstream: £164.38/player/day
└ £164.38 × 365 days = £60,000, so the figure corresponds to a £60k annual salary per player
└ Arithmetic check: £60M ÷ 12 clubs ÷ 23 players ÷ 365 days = £595.59/player/day, so the pool-based derivation does not reproduce £164.38 · the two need reconciling
└ 23-player squad cap: WSL registered squad limit (FA Women's Football regulations)
└ Used in: ACL wage burn (280 × £164.38 = £46,026) · performance lag calculation
└ This is a league-average figure. Actual daily wage varies significantly by club and player contract.

Equation 5 · Total Injury Double Tap (Card 03 · 2025-26)

Total Today = Ghost Wage + Drain = £7.18M + £1.43M = £8.61M · 14.4% of wage pool

£8.61M ÷ £60M = 14.35% ≈ 14.4%
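As a hedged check on Equations 4 and 5, the sketch below reproduces the headline using the rounded £M figures, as the equations do, and also computes the unrounded variant from the £1,426,653 drain in Equation 3. Variable names are illustrative only.

```python
WAGE_POOL_M = 60.0      # GBP millions
BURN_RATE = 0.1196      # female WSL burn rate from Section 02
DRAIN_M = 1.43          # rounded league drain from Equation 3
CLUBS = 12

# Rounded path, matching the figures quoted in Equations 4 and 5
ghost_wage_m = round(WAGE_POOL_M * BURN_RATE, 2)        # 7.176 rounds to 7.18
total_m = ghost_wage_m + DRAIN_M
share_pct = total_m / WAGE_POOL_M * 100
ghost_per_club_k = ghost_wage_m / CLUBS * 1000

# Unrounded path, using the exact drain of 1,426,653 from Equation 3
total_exact = WAGE_POOL_M * 1e6 * BURN_RATE + 1_426_653
share_exact_pct = total_exact / (WAGE_POOL_M * 1e6) * 100

print(f"rounded:   {total_m:.2f}M = {share_pct:.2f}% of pool")
print(f"unrounded: {total_exact / 1e6:.3f}M = {share_exact_pct:.2f}% of pool")
print(f"ghost wage per club: {ghost_per_club_k:.0f}k")
```

The rounded path gives 14.35%, which the section quotes as 14.4%; the unrounded path comes out marginally lower because the component figures are rounded up before they are summed.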

Equation 6 · 2026-27 Projection (14 clubs · 26 games)

Projected Total = £8.61M + £598k + £330k + £257k = ~£9.36M

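+£598k: Ghost Wage on 2 new clubs (each at £598k avg ghost wage)
+£330k: Drain on 2 new clubs (each at ~£165k avg drain)
+£257k: Fixture increase, 26 vs 22 games = +18% exposure applied to existing £1.43M drain
└ £1.43M × 18% = £257k additional drain from longer season
Rate holds at 14.4%: the £ grows because the league expands, not because the rate worsens
Arithmetic check: the four terms as listed sum to £9.80M (£8.61M + £0.598M + £0.330M + £0.257M = £9.795M), not ~£9.36M, and two new clubs at £598k each would add £1.196M of ghost wage rather than £598k · the projection needs reconciling before reuse
Source: FA WSL confirmed 14-team structure 2026-27 · Howden Insurance European Football Injury Index 2025/26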

Equation 7 · Points value of injury (updated) · includes squad size assumption

Points Lost per Injury = Days Lost × (1.1 pts ÷ 100 days)

Base research: Ekstrand et al. BJSM, 0.82 pts lost per 100 injury days (11-year UEFA study, men's)
WSL squad size adjustment: smaller squads = each injury affects a larger proportion of available players
└ WSL registered squads: typically 22–24 players (FA Women's Football regulations · squad data 2024-25)
└ Men's UEFA CL squads: typically 28–30 players (UEFA regulations · Ekstrand study population)
└ Adjustment factor: 28 ÷ 23 = 1.22 · 1.22 × 0.82 = ~1.00 pts. Signal Analytics uses 1.1 pts/100d, which sits above the computed ~1.00
└ This adjustment is Signal Analytics modelling, flagged as ESTIMATED. Not published research.
WSL point value: £18,500 (WSL commercial and merit pool 2026 · FA prize structure · needs FA verification)
ACL example: 280 days × 1.1/100 = 3.08 pts × £18,500 = £56,980 in points alone
Plus ghost wages: 280 × £164.38 = £46,026 · Plus medical overhead
Fully loaded ACL: points and wage burn alone total £103,006, before medical and performance lag are added
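The points calculation in Equation 7 and the ACL example can be reproduced directly. This is a small sketch with illustrative names; the squad sizes of 28 and 23 and the 1.1 pts/100d rate are the ESTIMATED modelling inputs stated above, not published constants.

```python
MENS_PTS_PER_100_DAYS = 0.82   # Ekstrand et al. figure quoted above
WSL_PTS_PER_100_DAYS = 1.1     # Signal Analytics adjusted rate (ESTIMATED)
VALUE_PER_POINT = 18_500       # GBP, needs FA verification
DAILY_WAGE = 164.38            # GBP per player per day, as used in this section

def squad_adjusted_rate(mens_squad=28, wsl_squad=23):
    """Scale the men's rate by the ratio of squad sizes."""
    return mens_squad / wsl_squad * MENS_PTS_PER_100_DAYS

def points_lost(days_lost, rate=WSL_PTS_PER_100_DAYS):
    """Equation 7: points lost for an absence of the given length."""
    return days_lost * rate / 100

acl_days = 280
acl_points = points_lost(acl_days)
acl_points_value = acl_points * VALUE_PER_POINT
acl_wage_burn = acl_days * DAILY_WAGE

print(round(squad_adjusted_rate(), 3))
print(round(acl_points, 2), round(acl_points_value), round(acl_wage_burn))
```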

THE ENGINE ROOM: 04 · Mamdani Fuzzy Logic Engine

Why fuzzy logic and how it works

Traditional threshold models are brittle. A player with ACWR 1.49 is "safe" and one at 1.51 is "danger" despite the two being physiologically near-identical. Fuzzy logic handles this gracefully by assigning partial membership across multiple zones simultaneously. Under the membership functions in Equation 8, a player at ACWR 1.35 is about 17% "optimal" and 10% "high" at the same time, and both statements are true at once.

The Mamdani inference system fires multiple rules simultaneously, weights their outputs by membership strength, and defuzzifies to a single crisp readiness score via the centroid method. This mirrors how an experienced physio actually thinks: not binary, but weighted and contextual.

ACWR membership functions

Equation 8 · ACWR Membership Functions

μ_low(x) = 1 if x ≤ 0.8 ; (1.3−x)/0.5 if 0.8 < x < 1.3 ; 0 if x ≥ 1.3

μ_optimal(x) = 0 if x ≤ 0.8 ; (x−0.8)/0.3 if 0.8 < x ≤ 1.1 ; (1.4−x)/0.3 if 1.1 < x ≤ 1.4 ; 0 if x > 1.4
μ_high(x) = 0 if x ≤ 1.3 ; (x−1.3)/0.5 if 1.3 < x < 1.8 ; 1 if x ≥ 1.8
μ_danger(x) = 0 if x ≤ 1.7 ; (x−1.7)/0.4 if 1.7 < x < 2.1 ; 1 if x ≥ 2.1
Calibration source: SoccerMon 2020–21 · pre-injury ACWR mean = 1.83 · normal ACWR mean = 1.15
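The four ACWR membership functions in Equation 8 translate directly into piecewise Python functions. The sketch below is a transcription for illustration, with function names chosen here rather than taken from mamdani_engine.py, and it evaluates them at the two SoccerMon calibration means quoted above.

```python
def mu_low(x):
    """Low ACWR zone: full membership up to 0.8, fading to zero at 1.3."""
    if x <= 0.8:
        return 1.0
    if x < 1.3:
        return (1.3 - x) / 0.5
    return 0.0

def mu_optimal(x):
    """Optimal ACWR zone: triangle from 0.8 to 1.4 peaking at 1.1."""
    if x <= 0.8 or x > 1.4:
        return 0.0
    if x <= 1.1:
        return (x - 0.8) / 0.3
    return (1.4 - x) / 0.3

def mu_high(x):
    """High ACWR zone: ramps from 0 at 1.3 to full membership at 1.8."""
    if x <= 1.3:
        return 0.0
    if x < 1.8:
        return (x - 1.3) / 0.5
    return 1.0

def mu_danger(x):
    """Danger ACWR zone: ramps from 0 at 1.7 to full membership at 2.1."""
    if x <= 1.7:
        return 0.0
    if x < 2.1:
        return (x - 1.7) / 0.4
    return 1.0

for label, acwr in [("normal mean", 1.15), ("pre-injury mean", 1.83)]:
    memberships = {f.__name__: round(f(acwr), 3)
                   for f in (mu_low, mu_optimal, mu_high, mu_danger)}
    print(label, acwr, memberships)
```

At the normal mean of 1.15 the player is mostly "optimal" with some "low" membership; at the pre-injury mean of 1.83 the player is fully "high" and partly "danger".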

Equation 9 · Wellness Membership Functions

μ_low(w) = 1 if w ≤ 3 ; (6−w)/3 if 3 < w < 6 ; 0 if w ≥ 6

μ_mid(w) = 0 if w ≤ 3 ; (w−3)/3.5 if 3 < w ≤ 6.5 ; (8.5−w)/2 if 6.5 < w ≤ 8.5 ; 0 if w > 8.5
μ_high(w) = 0 if w ≤ 6 ; (w−6)/3 if 6 < w < 9 ; 1 if w ≥ 9
Calibration: pre-injury readiness mean = 5.69 · normal readiness mean = 6.81 (SoccerMon)

Equation 10 · Microcycle Membership Functions (days since last match)

μ_red(d) = 1 if d ≤ 2 ; (4−d)/2 if 2 < d < 4 ; 0 if d ≥ 4

μ_green(d) = 0 if d ≤ 4 ; (d−4)/3 if 4 < d < 7 ; 1 if d ≥ 7
Red zone (<72hrs between matches) = 3× injury risk multiplier
Amber zone (72–96hrs) = 1.5× injury risk multiplier
Source: config.py MICROCYCLE_RED_HRS = 72 · MICROCYCLE_AMBER_HRS = 96

Rule firing and defuzzification

Equation 11 · Rule Firing Strength (simplified)

r_high = min(μ_optimal(ACWR), μ_high(wellness), μ_green(days))

r_mid = max( min(μ_optimal, μ_mid), min(μ_low, μ_high_wellness) )
r_low = max( min(μ_high, μ_low_wellness), μ_danger, μ_red × 0.8 )
Firing uses AND = min operator, OR = max operator (standard Mamdani)

Equation 12 · Centroid Defuzzification

Readiness = (r_high×8.6 + r_mid×6.0 + r_low×3.2) ÷ (r_high + r_mid + r_low + ε)

ε = 0.001 (prevents division by zero)
Centroids: Green zone = 8.6/10 · Amber zone = 6.0/10 · Red zone = 3.2/10
Output then modified by: surface risk modifier, travel modifier, minutes load modifier
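Equations 8 to 12 chain together into a single readiness calculation, sketched below. This is not the contents of mamdani_engine.py: the helper names are hypothetical, and the code assumes that in Equation 11 the unsuffixed μ_optimal, μ_low, μ_high and μ_danger refer to ACWR, that μ_mid refers to wellness, and that μ_red and μ_green refer to days since the last match.

```python
EPS = 0.001  # epsilon from Equation 12

def ramp_up(x, a, b):
    """0 at or below a, rising linearly to 1 at or above b."""
    return min(1.0, max(0.0, (x - a) / (b - a)))

def ramp_down(x, a, b):
    """1 at or below a, falling linearly to 0 at or above b."""
    return 1.0 - ramp_up(x, a, b)

def triangle(x, a, peak, b):
    """0 outside (a, b], rising to 1 at peak and falling back to 0 at b."""
    if x <= a or x > b:
        return 0.0
    return (x - a) / (peak - a) if x <= peak else (b - x) / (b - peak)

def readiness(acwr, wellness, days_since_match):
    # Equation 8: ACWR memberships
    a_low, a_opt = ramp_down(acwr, 0.8, 1.3), triangle(acwr, 0.8, 1.1, 1.4)
    a_high, a_danger = ramp_up(acwr, 1.3, 1.8), ramp_up(acwr, 1.7, 2.1)
    # Equation 9: wellness memberships
    w_low, w_mid = ramp_down(wellness, 3, 6), triangle(wellness, 3, 6.5, 8.5)
    w_high = ramp_up(wellness, 6, 9)
    # Equation 10: microcycle memberships
    d_red = ramp_down(days_since_match, 2, 4)
    d_green = ramp_up(days_since_match, 4, 7)
    # Equation 11: AND = min, OR = max (variable mapping is an assumption)
    r_high = min(a_opt, w_high, d_green)
    r_mid = max(min(a_opt, w_mid), min(a_low, w_high))
    r_low = max(min(a_high, w_low), a_danger, d_red * 0.8)
    # Equation 12: weighted average of the zone centroids
    weighted = r_high * 8.6 + r_mid * 6.0 + r_low * 3.2
    return weighted / (r_high + r_mid + r_low + EPS)

print(round(readiness(1.1, 9, 7), 2))   # well-rested, optimal load
print(round(readiness(2.1, 3, 2), 2))   # load spike, poor wellness, short turnaround
```

Because the output is a weighted average of three fixed centroids, the score before modifiers always lies between 0 and 8.6, and even the best-case inputs above are pulled below 8.6 when the Amber rule also fires.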

Equation 13 · Surface and Travel Modifiers

Readiness_adjusted = Readiness × (1 − surf_mod×0.4) × (1 − travel_mod×0.4) × (1 − load_mod×0.08)

surf_mod: consistent grass=0.05 · mixed=0.15 · switching=0.30 · artificial=0.40
travel_mod: local (<20km)=0.00 · mid (20–150km)=0.05 · long (>150km)=0.15
load_mod: mins≥90=1.0 · mins≥60=0.7 · mins≥30=0.4 · mins<30=0.2
Source: config.py SURFACE_* and TRAVEL_* constants · wsl_context_engine.py
Surface risk research basis:
└ Ekstrand J & Nigg BM (1989). Surface-related injuries in soccer. Sports Medicine: directional evidence for surface type injury differential
└ Fuller CW et al. (2007). Artificial vs natural turf injury rates in elite football. Br J Sports Med: surface switching elevates non-contact injury risk
└ The specific modifier values (0.05 / 0.15 / 0.30 / 0.40) are Signal Analytics calibrations informed by these findings, flagged as ESTIMATED
└ The Chelsea Paradox (hybrid pitch players at higher risk on natural grass) is directionally consistent with Fuller et al. 2007 · the specific claim is Signal Analytics interpretation, not a quoted research finding
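A minimal sketch of Equation 13 follows, using the modifier values listed above. The dictionary keys, function names and the handling of the exact 20 km and 150 km boundaries are assumptions made for illustration; the real constants live in config.py and are not reproduced here.

```python
# Modifier values as listed above (ESTIMATED calibrations); key names are hypothetical
SURF_MOD = {"consistent_grass": 0.05, "mixed": 0.15, "switching": 0.30, "artificial": 0.40}

def travel_mod(distance_km):
    """Local (<20 km), mid (20 to 150 km) or long (>150 km) travel."""
    if distance_km < 20:
        return 0.00
    if distance_km <= 150:
        return 0.05
    return 0.15

def load_mod(minutes):
    """Minutes-load modifier from match minutes played."""
    if minutes >= 90:
        return 1.0
    if minutes >= 60:
        return 0.7
    if minutes >= 30:
        return 0.4
    return 0.2

def adjust_readiness(readiness, surface, distance_km, minutes):
    """Equation 13: apply the three multiplicative modifiers."""
    return (readiness
            * (1 - SURF_MOD[surface] * 0.4)
            * (1 - travel_mod(distance_km) * 0.4)
            * (1 - load_mod(minutes) * 0.08))

# Surface switch, long trip, full 90 minutes: 8.0 * 0.88 * 0.94 * 0.92
print(round(adjust_readiness(8.0, "switching", 200, 90), 3))
```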

Equation 14 · Injury Risk %

Risk = [(r_low×40 + r_mid×14 + r_high×4) ÷ denominator + surface_add + travel_add] × load_mult

surface_add = surf_mod × 18 (percentage points added to base risk)
travel_add = travel_mod × 12
load_mult = 1 + (minsLoad − 0.5) × 0.18
Output clamped: minimum 2% · maximum 88%
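Equation 14 can be sketched as follows. Two points are assumptions rather than statements from the source: that "denominator" is the same r_high + r_mid + r_low + ε term used in Equation 12, and that minsLoad is the load_mod value from Equation 13. The function and parameter names are illustrative.

```python
EPS = 0.001  # assumed to be the same epsilon as in Equation 12

def clamp(value, lower, upper):
    """Keep value inside [lower, upper]."""
    return max(lower, min(upper, value))

def injury_risk(r_high, r_mid, r_low, surf_mod, travel_mod, mins_load):
    """Equation 14: base risk from rule strengths, plus context, times load."""
    denominator = r_high + r_mid + r_low + EPS        # assumption, see lead-in
    base = (r_low * 40 + r_mid * 14 + r_high * 4) / denominator
    surface_add = surf_mod * 18                       # percentage points
    travel_add = travel_mod * 12                      # percentage points
    load_mult = 1 + (mins_load - 0.5) * 0.18
    return clamp((base + surface_add + travel_add) * load_mult, 2.0, 88.0)

# Fully Red player, surface switch, long travel, full 90 minutes
print(round(injury_risk(0.0, 0.0, 1.0, 0.30, 0.15, 1.0), 1))
# No rule fires and no context: the lower clamp applies
print(injury_risk(0.0, 0.0, 0.0, 0.0, 0.0, 0.5))
```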

Validation: Retrospective application of the Mamdani engine to all 243 identified WSL injury gaps (StatsBomb 2018–21) produced Amber or Red readiness scores at the final pre-absence appearance in 100% of cases. Zero gaps were preceded by a Green score. This is the primary validation of the model's predictive power.

THE ENGINE ROOM: 05 · Signal Readiness Simulator

The advanced engine with telemetry inputs

The simulator below is the extended version of the FALI engine. It adds real telemetry inputs (HRV, resting heart rate, sleep efficiency, deep/REM sleep, RPE) alongside the standard load and wellness inputs. It also incorporates hormonal phase modifiers specific to female athletes.

When a club connects real wearable or GPS data, this is the engine that runs. In proxy mode (no wearables), the wellness scores act as validated surrogates for the telemetry inputs, consistent with the SoccerMon methodology.

On HRV as a readiness marker: The relationship between HRV suppression and elevated injury/performance risk is one of the most replicated findings in elite sport science. See Plews et al. (2013), Buchheit (2014) and Flatt & Esco (2016) in Section 06. The direction of the effect is consistent across sports, sexes, and competition levels. Specific error margin improvements from adding real HRV data are club-dependent and not quoted as fixed percentages in FALI documentation.

Signal Readiness Simulator · Two-player comparison

Live · Fuzzy Logic Engine v1

Output	Fullback · WSL	Centre-mid · WSL
Readiness band	Green	Amber
Injury risk	18%	31%
A:C ratio	1.1	1.4
T-Max cap	84%	72%

Input	Player A	Player B
External load · Sprint volume (km)	4	7
External load · High-intensity runs	12	20
External load · Pressures / duels	18	28
External load · Match minutes (7d)	90	120
External load · Chronic load (28d)	4	5
Telemetry · HRV (ms)	62	44
Telemetry · Resting HR (bpm)	52	61
Telemetry · Sleep efficiency (%)	82	68
Telemetry · Deep/REM sleep (%)	28	18
Telemetry · RPE	6	8

Interactive panels on the live page: Signal delta (Player A vs B) · Hormonal phase · Signal breakdown · Coach insights

THE ENGINE ROOM: 06 · Citations & Sources

Published science underpinning PITCH

Plews, D.J., Laursen, P.B., Stanley, J., Kilding, A.E., & Buchheit, M. (2013). Training adaptation and heart rate variability in elite endurance athletes: opening the door to effective monitoring. Sports Medicine, 43(9), 773–781. Establishes HRV as a reliable, non-invasive marker of training adaptation and readiness across elite sport contexts. The suppression of HRV preceding performance decline is the foundational finding underpinning FALI's telemetry inputs.

Buchheit, M. (2014). Monitoring training status with HR measures: do all roads lead to Rome? Frontiers in Physiology, 5, 73. The definitive review of HRV-based monitoring in elite sport. Directly supports the use of HRV as a daily readiness marker and its relationship to injury risk accumulation.

Flatt, A.A., & Esco, M.R. (2016). Evaluating individual training adaptation with smartphone-derived heart rate variability in a collegiate female soccer team. Journal of Strength and Conditioning Research, 30(2), 378–385. Specifically female soccer context. HRV-guided training produced superior outcomes to traditional periodisation. Directly applicable to WSL load management.

Moen, F., Hrozanova, M., Stiles, T., & Stenseng, F. (2024). SoccerMon: A large-scale multivariate soccer athlete health, performance, and position monitoring dataset. Scientific Data (Nature), 11, 554. This is the primary Scandinavian dataset underpinning the PITCH Mamdani engine. Two elite women's football teams, monitored across two full seasons. 33,849 subjective wellness reports, 10,075 GPS session reports, 6.2 billion GPS measurements. This is the only large-scale female-specific load and wellness dataset available in published literature. It provides the empirical basis for the PITCH readiness calibration: pre-injury ACWR mean 1.83, normal ACWR mean 1.15, pre-injury readiness mean 5.69/10, normal readiness mean 6.81/10. Without SoccerMon, the PITCH membership functions would rely on male data proxies. With it, the model is calibrated specifically to how elite women's bodies respond to load.

Gabbett, T.J. (2016). The training-injury prevention paradox: should athletes be training smarter and harder? British Journal of Sports Medicine, 50(5), 273–280. Foundational ACWR research establishing the relationship between workload spikes and injury risk. Introduces the concept of the acute:chronic workload ratio as a predictive tool. Used alongside Hulin et al. to establish the PITCH ACWR framework.

Hulin, B.T., Gabbett, T.J., Lawson, D.W., Caputi, P., & Sampson, J.A. (2016). The acute:chronic workload ratio predicts injury: high chronic workload may decrease injury risk in elite rugby league players. British Journal of Sports Medicine, 50(4), 231–236. This is the paper that establishes the specific ACWR thresholds used in PITCH. The 0.8–1.3 safe zone, the 1.5 danger threshold, and the spike injury relationship are all validated in this study. The PITCH ACWR membership functions (Equation 8) are directly calibrated against these thresholds. The 28-day chronic workload calculation window is also validated here as the optimal lookback period, which is why PITCH uses a 28-day rolling ACWR window.

Hulin, B.T., Gabbett, T.J., Blanch, P., Chapman, P., Bailey, D., & Orchard, J.W. (2014). Spikes in acute workload are associated with increased injury risk in elite cricket fast bowlers. British Journal of Sports Medicine, 48(8), 708–712. Earlier Hulin paper establishing the acute workload spike concept. Together with Hulin 2016, provides the theoretical and empirical basis for ACWR-based injury prediction. Both papers are cited together in PITCH model documentation to show the multi-sport generalisability of the threshold approach.

Zadeh, L.A. (1965). Fuzzy sets. Information and Control, 8(3), 338–353. Original paper establishing fuzzy set theory. The theoretical foundation for Mamdani-type fuzzy inference systems used throughout FALI.

Mamdani, E.H., & Assilian, S. (1975). An experiment in linguistic synthesis with a fuzzy logic controller. International Journal of Man-Machine Studies, 7(1), 1–13. Defines the Mamdani inference method — rule firing, aggregation, and centroid defuzzification — used in the PITCH readiness and injury risk scoring engine. The reason PITCH uses a Mamdani system rather than a simpler threshold model is that Mamdani handles partial membership gracefully, producing smooth outputs that mirror how an experienced physio actually assesses risk.

Clarsen, B., Myklebust, G., & Bahr, R. (2013). Development and validation of a new method for the registration of overuse injuries in sports injury epidemiology: the Oslo Sports Trauma Research Centre (OSTRC) Overuse Injury Questionnaire. British Journal of Sports Medicine, 47(8), 495–502. This is the original Scandinavian paper behind the PITCH daily wellness check-in. Developed at the Norwegian School of Sport Sciences, Oslo. Validated across 313 elite athletes from 5 sports including handball and volleyball. The OSTRC questionnaire captures health problems that traditional time-loss surveillance misses — athletes competing through pain, at reduced capacity. This is exactly what PITCH captures: not just the injuries, but the sub-optimal performances that precede them. The four-question structure, the specific response weightings (0/8/17/25 and 0/6/13/19/25), and the 0–100 severity scale are all from this paper.

Clarsen, B., Bahr, R., Myklebust, G., et al. (2020). Improved reporting of overuse injuries and health problems in sport: an update of the Oslo Sport Trauma Research Center questionnaires (OSTRC-O2 and OSTRC-H2). British Journal of Sports Medicine, 54(7), 390–396. Updated version of the OSTRC questionnaire used in the PITCH check-in. The OSTRC-H2 extends coverage to all health problems — injury, illness, and general wellness complaints — not just overuse injuries. Clarified wording, gatekeeper logic to reduce false positives, and improved respondent adherence. The PITCH daily check-in uses the OSTRC-H2 question set and scoring, adapted from weekly to daily frequency consistent with elite daily monitoring practice.

Myklebust, G., Maehlum, S., Holm, I., & Bahr, R. (1998). A prospective cohort study of anterior cruciate ligament injuries in elite Norwegian and Swedish female team handball players. Scandinavian Journal of Medicine & Science in Sports, 8(3), 149–153. Scandinavian female-specific injury research. One of the earliest prospective studies of ACL injury patterns specifically in elite female team sport athletes. Establishes that female athletes face distinct injury risk profiles from male athletes, which is the foundational argument for why PITCH uses no male data proxies. Myklebust is also co-author of the OSTRC questionnaire (Clarsen et al. 2013 and 2020, listed above), making this a connected body of Scandinavian research underpinning the entire PITCH framework.

Waldén, M., Hägglund, M., & Ekstrand, J. (2005). UEFA Champions League study: a prospective study of injuries in professional football during the 2001–2002 season. British Journal of Sports Medicine, 39(8), 542–546. Establishes the points-per-injury-days relationship used in the PITCH drain calculation. The 0.82 pts/100 injury days figure is derived from this and subsequent UEFA injury surveillance work. WSL-adjusted to 1.1 pts/100 days to account for smaller squad sizes.

Hägglund, M., Waldén, M., & Ekstrand, J. (2006). Previous injury as a risk factor for injury in elite football: a prospective study over two consecutive seasons. British Journal of Sports Medicine, 40(9), 767–772. Previous injury is the strongest single predictor of new injury in elite football. Supports the return-to-play graduation intervention (Card 04) and the ~23% re-injury rate estimate used in the intervention saving calculations.

Hägglund, M., Waldén, M., & Ekstrand, J. (2005, 2013). Injuries among male and female elite football players. UEFA injury surveillance programme. Scandinavian Journal of Medicine & Science in Sports. Source for the commonly cited ~12% male injury absence benchmark used as a comparison point in the PITCH briefing. The female WSL rate (11.96%) was derived independently from StatsBomb data and is not calibrated against this benchmark — it is compared to it.

Fuller, C.W., et al. (2007). Comparison of the incidence, nature and cause of injuries sustained on grass and new generation artificial turf by male and female football players. British Journal of Sports Medicine, 41(suppl 1), i20–i26. Directional evidence for surface-type injury differential used to inform the PITCH surface modifier values. The specific modifier figures (+5% consistent surface, +30% surface switching) are Signal Analytics calibrations informed by this research — they are not published constants.

StatsBomb Open Data (2024). FA Women's Super League event data, seasons 2018–2021. github.com/statsbomb/open-data Primary source for the 8,950 player-match records used in burn rate derivation and selection gap analysis. Freely available under StatsBomb open data licence.

Howden Insurance Group (2025/26). European Football Injury Index. Total European football injury cost benchmark: £2.97bn across elite leagues. Used to validate the scale of the WSL injury exposure projection (£9.36M for 14 clubs in 2026-27).

UEFA Women's Elite Club Injury Study (WECIS, 2026 benchmarks). Injury frequency, financial burden, and points impact data for elite women's football clubs. Injury cost benchmarks (muscle injury ~£22,500 · ACL £125,000+) and points impact ratios are derived from or consistent with WECIS. The 14.4% total drain figure in the PITCH briefing is a Signal Analytics derivation from WSL data — it is not a WECIS figure. WECIS is used as directional validation only.

THE ENGINE ROOM: 07 · Pipeline Run Order

How the data pipeline was built step by step

Complete run order for reproducibility. All scripts are located in the GENDERGAP folder.

01 · python wsl_signal_pipeline.py

Pulls StatsBomb WSL open data (2018–21), calculates readiness scores and injury risk per player-match using the Mamdani engine.

Output: wsl_scored_players.csv · 8,950 records

02 · python wsl_burn_rate.py

Identifies selection gaps ≥21 days for regular starters. Derives the female WSL burn rate (11.96%).

Output: wsl_selection_gaps.csv (243 gaps) · wsl_burn_rate.txt

03 · python fali_fc25_integration.py

Merges EA FC25 physical capacity attributes (stamina, sprint speed, physic) as directional proxies for player load tolerance. Directional only — not used in readiness scoring.

Output: wsl_fc25_capacity.csv · wsl_fc25_merged.csv

04 · python mamdani_engine.py

Runs the full Mamdani fuzzy inference system against scored players. Produces per-player readiness scores, injury risk percentages, and Green/Amber/Red band classifications.

Output: readiness scores · injury risk % · band classifications per player-match

05 · python api_football_puller.py

Pulls live WSL 2024–25 data from API-Football v3 (league ID 44, season 2024). Fixtures, standings, and microcycle density.

Output: wsl_fixtures.csv · wsl_standings.csv · wsl_microcycle.csv

⚠ Note: player and injury endpoints return 0 records. API-Football does not have player-level coverage for women's leagues. FC25 data and wellness CSVs are the primary player-level sources.

Config note: All API keys and constants are in config.py. WSL_LEAGUE_ID must be set to 44 (FA WSL), not 882. The FBref scraper (fbref_scraper.py) returns 403 on all requests because FBref blocks automated scraping, so it was abandoned.

THE ENGINE ROOM: 08 · Readiness Score · Scale and Bands

How the score out of 10 becomes a score out of 100

The Mamdani engine produces a readiness score using centroids expressed on a 0–10 scale (Green = 8.6, Amber = 6.0, Red = 3.2). The PITCH interface and all reporting display this on a 0–100 scale for clarity. The conversion is a direct multiplication by 10.

Equation 15 · Readiness Score Scale Conversion

PITCH Score (0–100) = Mamdani Output (0–10) × 10

Mamdani centroid outputs: Green zone = 8.6/10 · Amber zone = 6.0/10 · Red zone = 3.2/10
On the 0–100 scale: Green zone = 86 · Amber zone = 60 · Red zone = 32
Band boundaries: Green = 75–100 · Amber = 50–74 · Red = below 50
League average readiness (StatsBomb 2018–21): 75.6 / 100
Average pre-injury readiness: 56.9 / 100 (solidly Amber, consistent with 5.69/10 SoccerMon figure)
Average normal readiness: 68.1 / 100 (consistent with 6.81/10 SoccerMon figure)
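The scale conversion and banding can be expressed in a few lines. In this sketch the function names are illustrative, and scores between the listed integer bands (for example 74.5) are assumed to fall into the lower band, since the source only gives whole-number boundaries.

```python
def pitch_score(mamdani_output):
    """Equation 15: convert the 0-10 Mamdani output to the 0-100 PITCH scale."""
    return mamdani_output * 10

def band(score):
    """Green 75-100, Amber 50-74, Red below 50 (fractional handling is an assumption)."""
    if score >= 75:
        return "Green"
    if score >= 50:
        return "Amber"
    return "Red"

for label, raw in [("Green centroid", 8.6), ("Amber centroid", 6.0), ("Red centroid", 3.2),
                   ("pre-injury mean", 5.69), ("normal mean", 6.81)]:
    score = pitch_score(raw)
    print(f"{label}: {score:.1f} -> {band(score)}")
```

Both SoccerMon means land in the Amber band on this scale.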

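On the 24% finding: 24% of all WSL player-match appearances between 2018 and 2021 scored below 75 on the PITCH scale, meaning they fell into the Amber or Red zone. The split is quoted as approximately 6,722 Amber appearances and 2,227 Red appearances, but these two figures sum to 8,949, almost the entire 8,950-record dataset, so they cannot both sit inside the 24%; 2,227 alone is 24.9% of 8,950. The split needs re-checking against wsl_scored_players.csv. No appearance scored Green preceded an identified injury absence.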

THE ENGINE ROOM: 09 · OSTRC-H2 Wellness Framework

Why OSTRC-H2 and how the daily check-in is scored

The PITCH daily wellness check-in is built on the Oslo Sports Trauma Research Centre Questionnaire on Health Problems (OSTRC-H2), developed by Clarsen, Myklebust and Bahr at the Norwegian School of Sport Sciences (British Journal of Sports Medicine, 2013, updated 2020). It is the validated standard for athlete health surveillance in elite sport globally — used across Olympic programmes, professional football, rugby, and handball worldwide.

The four questions were designed through a specific clinical process. Traditional injury surveillance only captures time-loss injuries: players who cannot train at all. This misses the much larger population of athletes competing and training while impaired. The OSTRC team worked with elite athletes and medical staff to identify the four domains that most reliably capture the full spectrum of health impact: participation, training volume, performance, and symptoms. Each domain is scored on a common 0–25 scale, so the four questions contribute equally to the total regardless of how many response options each has.

The specific response option weights (Q1/Q4: 0, 8, 17, 25 · Q2/Q3: 0, 6, 13, 19, 25) come from the 2013 validation study across 313 elite athletes. The steps are close to evenly spaced (8, 9, 8 for Q1/Q4 and 6, 7, 6, 6 for Q2/Q3), which is as even a spread from 0 to 25 as whole numbers allow, so the scale is approximately linear rather than weighted towards the top.

Equation 16 · OSTRC-H2 Severity Score

Severity = Q1 + Q2 + Q3 + Q4 (range: 0–100)

Q1 Participation: 0 (full) · 8 (full with complaints) · 17 (reduced) · 25 (unable to participate)
Q2 Training load: 0 (not reduced) · 6 (minor) · 13 (moderate) · 19 (major) · 25 (unable to train)
Q3 Performance: 0 (not affected) · 6 (minor) · 13 (moderate) · 19 (major) · 25 (unable to perform)
Q4 Symptoms: 0 (none) · 8 (mild) · 17 (moderate) · 25 (severe)
Maximum possible score: 25 + 25 + 25 + 25 = 100 (full health problem)

Equation 17 · OSTRC-H2 to PITCH Readiness Conversion

PITCH Readiness = 100 − OSTRC Severity Score

Severity 0 (no problems) → Readiness 100 (Green)
Severity 1–25 (minor complaints) → Readiness 75–99 (Green zone)
Severity 26–50 (moderate problems) → Readiness 50–74 (Amber zone)
Severity 51–100 (significant problems) → Readiness 0–49 (Red zone)
Band thresholds match the Mamdani engine output bands: Green 75–100 · Amber 50–74 · Red below 50
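Equations 16 and 17 can be sketched as a lookup and a subtraction. The weights and band thresholds come from this section; the function names and the convention of passing each answer as a zero-based option index are assumptions made for the example.

```python
# Response weights as listed in Equation 16
Q1_Q4_WEIGHTS = (0, 8, 17, 25)        # participation, symptoms
Q2_Q3_WEIGHTS = (0, 6, 13, 19, 25)    # training load, performance

def severity(q1, q2, q3, q4):
    """Equation 16: sum of the four weighted answers (each given as an option index)."""
    return (Q1_Q4_WEIGHTS[q1] + Q2_Q3_WEIGHTS[q2]
            + Q2_Q3_WEIGHTS[q3] + Q1_Q4_WEIGHTS[q4])

def readiness_from_severity(score):
    """Equation 17: PITCH readiness on the 0-100 scale."""
    return 100 - score

def band(readiness):
    """Green 75-100, Amber 50-74, Red below 50."""
    if readiness >= 75:
        return "Green"
    if readiness >= 50:
        return "Amber"
    return "Red"

# Hypothetical check-in: full participation with complaints, minor load and
# performance reduction, mild symptoms
s = severity(1, 1, 1, 1)
r = readiness_from_severity(s)
print(s, r, band(r))
```

In this hypothetical check-in, four second-lowest answers give a severity of 28 and a readiness of 72, which is already Amber.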

OSTRC-H2 vs the Mamdani engine: The OSTRC-H2 check-in is the wellness input layer — it captures what the player reports subjectively before training. The Mamdani engine combines this with objective workload data (ACWR, microcycle density, surface type) to produce the full PITCH readiness score. In proxy mode (no wearables), the OSTRC wellness score is the primary input. In full deployment, it is one of several weighted inputs.

THE ENGINE ROOM: 10 · Intervention Prevention Rates

How the £1M+ intervention saving was derived

The four intervention savings in the PITCH briefing are derived from the 81-incident/season baseline at £17,613 per incident. Each prevention rate represents the estimated proportion of incidents that intervention would realistically prevent. These rates are Signal Analytics modelling assumptions — they are not published research findings. They are stated explicitly here.

Intervention rates · basis and status

Total potential saving (unadjusted) = £1.43M · Conservative headline = £1M+

01 · Daily wellness screening · prevention rate assumed: 30% of incidents
└ Basis: wellness screening catches the earliest upstream signal before ACWR escalates
└ 30% is a conservative assumption · Signal Analytics modelling · flagged as ESTIMATED
└ Directional support: Moen et al. SoccerMon (2024), wellness flags precede injury in majority of cases
└ Saving: 81 × 30% × £17,613 = ~£428k → rounded to ~£420k in briefing
02 · ACWR load monitoring · prevention rate assumed: 40% of incidents
└ Basis: ACWR flagging at 1.5 intercepts the pre-injury window before 1.83 threshold is reached
└ 40% assumption reflects that not all incidents follow a clear ACWR escalation pattern
└ Directional support: Gabbett TJ (2016) BJSM, ACWR monitoring reduces injury rates in elite sport
└ Flagged as ESTIMATED · Signal Analytics modelling
└ Saving: 81 × 40% × £17,613 = ~£571k → rounded to ~£565k in briefing
03 · Surface switching protocol · incidents targeted: 6 per season
└ Basis: estimated 4–5 clubs with significant surface differential × 1–2 incidents each
└ Narrowest intervention: affects subset of clubs and fixtures only
└ Flagged as ESTIMATED · Signal Analytics modelling
└ Saving: 6 × £17,613 = ~£106k → shown as ~£105k in briefing
04 · Return-to-play graduation · re-injury subset: 23% of all incidents
└ Basis: re-injury rate in elite football approximately 20–25% of all injuries
└ Source: Hagglund M et al. (2006), previous injury as strongest predictor of new injury · Br J Sports Med
└ 23% = Signal Analytics estimate within published range · partially supported
└ Saving: 81 × 23% × £17,613 = ~£328k → shown as ~£335k in briefing
Overlap reduction: Interventions 01 and 02 target overlapping incident sets
└ A player caught by wellness screening may also be caught by ACWR monitoring
└ Conservative £1M+ headline accounts for this overlap · no precise overlap figure is claimed
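The four savings figures are simple products, so they can be recomputed as a check. The sketch below uses the rates stated above, which are ESTIMATED modelling assumptions, and the dictionary keys are illustrative labels only. It reports the unadjusted sum and makes no attempt to model the overlap between interventions 01 and 02.

```python
INCIDENTS_PER_SEASON = 81
COST_PER_INCIDENT = 17_613   # GBP, from Equation 2

# Prevention assumptions as stated above (ESTIMATED)
savings = {
    "01 wellness screening": INCIDENTS_PER_SEASON * 0.30 * COST_PER_INCIDENT,
    "02 ACWR monitoring": INCIDENTS_PER_SEASON * 0.40 * COST_PER_INCIDENT,
    "03 surface switching": 6 * COST_PER_INCIDENT,
    "04 return-to-play": INCIDENTS_PER_SEASON * 0.23 * COST_PER_INCIDENT,
}
total_unadjusted = sum(savings.values())

for name, value in savings.items():
    print(f"{name}: ~£{value / 1000:.0f}k")
print(f"Total (unadjusted): £{total_unadjusted / 1e6:.2f}M")
```

The unadjusted total matches the £1.43M league drain because the four assumed shares (30% + 40% + 6 of 81 incidents + 23%) add up to roughly 100% of incidents, which is why the overlap caveat matters for the £1M+ headline.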

Transparency statement: The intervention prevention rates (30%, 40%, 6 incidents, 23%) are Signal Analytics modelling assumptions. Three of four are flagged ESTIMATED. The re-injury rate (23%) has partial support from published literature. All four savings figures should be treated as indicative projections, not guaranteed outcomes. Actual results will vary by club, squad depth, and implementation quality.

PITCH · The Engine Room · Sources and Model · Signal Analytics · signalanalytics.ai
Equations, citations, and pipeline documentation · Version 2.0 · April 2026
