Hit Rate Matrix — What percentage of historical prospects at each DMX decile produced a Star or Starter season? The default Star/Starter metric requires a peak season at or above a position threshold within the eval window: QB ≥200 PPR pts, RB/WR ≥120, TE ≥75. Stars require higher peaks (QB 300, RB/WR 200, TE 150). Eval windows: 5 yrs (QB/RB/WR), 7 yrs (TE, reflecting structurally longer development). Each cell shows the hit rate and, on hover, the Wilson 95% confidence interval (the honest precision given that decile's sample size). Cohort: 2001–2020 (TE through 2018). Toggle the metric to Star Only, Top-20% PPG (legacy percentile-based definition), Opportunity (% who got NFL snaps), or Conditional (Star/Starter rate among those who played). Star/Starter thresholds are absolute PPR points, not era-adjusted, and are marked for future review. For out-of-sample model validation see the Model Performance tab.
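The Wilson interval shown on hover can be computed directly from a decile's hit count and sample size. The following is a minimal sketch, assuming a two-sided 95% level (z ≈ 1.96); the function name and the example counts are illustrative and are not taken from the DMX codebase.

```python
import math


def wilson_interval(hits: int, n: int, z: float = 1.959964) -> tuple:
    """Wilson score interval for a binomial proportion (hits out of n)."""
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= hits <= n:
        raise ValueError("hits must be between 0 and n")
    p = hits / n
    denom = 1 + z * z / n
    # The Wilson center is the observed rate pulled toward 0.5.
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half_width), min(1.0, center + half_width)


# Hypothetical decile: 6 hits among 12 prospects.
low, high = wilson_interval(6, 12)
print(f"hit rate 50.0%, 95% CI {low:.1%} to {high:.1%}")
```

With 6 hits in 12 prospects the interval runs from roughly 25% to 75%, which illustrates how wide the bands become at small-sample deciles.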
Hit Rate by Decile RB/WR/QB: 5-yr window · TE: 7-yr window · 2001–2020 classes
Metric: % with a Star or Starter peak season (QB ≥200, RB/WR ≥120, TE ≥75 PPR pts)
Offense Curves hit rate by decile · TE shows 7-yr window
IDP Curves hit rate by decile
Model Performance — How well does DMX predict NFL careers when validated out-of-sample? Predictions for each draft class are built from training data strictly before that class (time-series cross-validation, expanding window). Test cohort: 2011–2020 prospects (2011–2018 for TE). For each (position, decile), the predicted Star/Starter probability is the historical hit rate in earlier classes; the predicted tier is the historical mean tier rounded to the nearest integer. For empirical hit-rate lift by decile, see the Hit Rate Matrix tab. This tab focuses on whether the model is honest about itself.
Position
Out-of-Sample Performance Time-series cross-validation, all positions
Calibration Plot predicted vs observed Star/Starter rate
Each dot is a DMX decile in the held-out test set. Dots above the 45° line mean the observed hit rate exceeded the training-era prediction; below means it fell short. Vertical bars are 95% confidence intervals on the observed rate. Systematic above-line deviation indicates an era effect (modern hit rates are higher than training-era rates).
Tier Confusion Matrix predicted × actual, out-of-sample
Rows show what the model predicted; columns show what actually happened. Each cell is the percentage of that predicted-tier row landing in each actual tier. Diagonal cells = correct predictions. Cohen’s κ is the chance-corrected agreement rate (0 = random, 1 = perfect). Bust–Contributor and Starter–Star confusions are common — the model separates the extremes well but adjacent tiers blur.
Methodology & Limitations

Time-series cross-validation (TSCV): For each prospect in the test set (draft years 2011–2020), we predict their outcome using only training data from earlier draft years. This expanding-window approach mimics how the model would have been used in real time and gives the most honest out-of-sample estimate.
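The expanding-window scheme combined with the decile-based predictor can be sketched as follows. This is an illustration under stated assumptions, not the platform's code: the record layout (`year`, `position`, `decile`, `hit`) and the function name are hypothetical, and a bucket with no earlier history simply yields `None`.

```python
def expanding_window_predictions(prospects, first_test_year):
    """For each test prospect, predict the Star/Starter probability as the
    historical hit rate of the same (position, decile) bucket, using only
    draft classes strictly earlier than the prospect's own class."""
    predictions = []
    for p in prospects:
        if p["year"] < first_test_year:
            continue  # training-only class, never scored
        history = [
            q["hit"]
            for q in prospects
            if q["year"] < p["year"]
            and q["position"] == p["position"]
            and q["decile"] == p["decile"]
        ]
        rate = sum(history) / len(history) if history else None
        predictions.append((p["year"], rate))
    return predictions


# Hypothetical records for one bucket (WR, decile 1).
sample = [
    {"year": 2009, "position": "WR", "decile": 1, "hit": True},
    {"year": 2010, "position": "WR", "decile": 1, "hit": False},
    {"year": 2011, "position": "WR", "decile": 1, "hit": True},
    {"year": 2012, "position": "WR", "decile": 1, "hit": False},
]
print(expanding_window_predictions(sample, first_test_year=2011))
```

The 2012 prediction draws on one more class than the 2011 prediction, which is the expanding-window property: the training set grows with each test year and never includes the class being scored.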

Why decile-based prediction: Each (position, decile) bucket is the canonical DMX grouping used throughout the platform. Predicting the historical hit rate of that bucket avoids parametric assumptions (linearity, normality) that don’t hold for censored 5-yr VBD outcomes. It’s the same approach a statistician would use for empirical-Bayes calibration.

Brier skill score: Compares the model’s Brier score (mean squared error of predicted probability vs observed outcome) against a naive baseline of always predicting the overall hit rate. Positive means the model adds predictive value beyond the base rate; the magnitude indicates how much.
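As a worked illustration of the definition above, the sketch below computes the Brier score and the skill score against a base-rate forecast. It assumes the baseline rate is taken from the same outcomes being scored; the source does not say which sample the "overall hit rate" comes from, so treat that choice as an assumption. Function names and the sample numbers are hypothetical.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)


def brier_skill_score(probs, outcomes):
    """1 - BS_model / BS_baseline, where the baseline always predicts the
    overall hit rate (assumed here to be the rate in `outcomes`)."""
    base_rate = sum(outcomes) / len(outcomes)
    baseline = brier_score([base_rate] * len(outcomes), outcomes)
    if baseline == 0:
        raise ValueError("baseline Brier score is zero (all outcomes identical)")
    return 1 - brier_score(probs, outcomes) / baseline


# Hypothetical predictions for four prospects (1 = Star/Starter, 0 = not).
outcomes = [1, 0, 1, 0]
probs = [0.8, 0.2, 0.7, 0.3]
print(round(brier_skill_score(probs, outcomes), 3))
```

A forecast equal to the base rate for every prospect scores exactly 0, and a forecast worse than the base rate goes negative.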

Cohen’s κ interpretation (Landis & Koch 1977): below 0 poor, 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect. Tier classification at this prediction horizon (a single composite score predicting a five-year outcome) lands in the fair-to-moderate range. That’s realistic for the problem.
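For reference, κ is computed from observed agreement and the agreement expected by chance given each side's tier frequencies: κ = (p_o − p_e) / (1 − p_e). The sketch below is an unweighted version; whether the platform uses a weighted variant for ordered tiers is not stated, so this is an assumption, as are the function name and sample labels.

```python
from collections import Counter


def cohens_kappa(predicted, actual):
    """Unweighted Cohen's kappa for two equal-length label sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(predicted)
    observed = sum(p == a for p, a in zip(predicted, actual)) / n
    pred_counts = Counter(predicted)
    actual_counts = Counter(actual)
    # Chance agreement: product of marginal frequencies, summed over tiers.
    tiers = set(pred_counts) | set(actual_counts)
    expected = sum(pred_counts[t] * actual_counts[t] for t in tiers) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical predicted and actual tiers for four prospects.
predicted = ["Bust", "Bust", "Star", "Star"]
actual = ["Bust", "Star", "Star", "Star"]
print(cohens_kappa(predicted, actual))
```

Here raw agreement is 75%, but half of that is expected by chance from the marginals, so κ comes out at 0.5.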

Limitations:

  • Era effect: hit rates roughly doubled from 2001–2010 to 2011–2020 due to higher offensive output league-wide. Predictions trained on older data systematically under-predict modern hit rates — visible as upward drift on the calibration plot.
  • Sample size: QB deciles average ~12 prospects each, generating wide confidence intervals. Conclusions at QB depend on smaller cohorts than WR/RB.
  • Star/Starter thresholds are absolute (QB 200, RB/WR 120, TE 75 PPR pts). Era-adjusted thresholds (position rank within season) would be more rigorous — marked for future review.
  • Ceiling on predictability: 60–80% of career outcome variance comes from post-draft factors (coaching, scheme, injury, QB play, team success) that no pre-draft model can capture. Brier skill scores in the 0.08–0.27 range are consistent with this ceiling.
Draft Capital Efficiency — VBD ROI by draft round. 1st-round RBs (picks 1–10) produce 220.8 average 5-year VBD at an 85.7% hit rate — the single best dynasty investment by both metrics. Round 2-3 QBs have a 2.9% star rate from 34 prospects — the worst.
85.7%
RB Rd1 Hit Rate
picks 1-10 · n=63
220.8
RB Rd1 Avg VBD
5-yr cumulative PPR
73.8%
WR Rd1 Hit Rate
picks 1-10 · n=126
2.9%
QB Rd2-3 Star Rate
1 star from 34 prospects
Metric
Draft Capital Efficiency by Position & Round Tier
Full Breakdown VBD · Hit% · Star%
Positional Scarcity Waves — % above replacement-level fantasy value year-by-year post-draft. Dynasty sell windows are visible in the slope of each curve. RBs cliff. WRs plateau. TEs slow-burn. QBs are bimodal — all-or-nothing.
Position Deciles
% Above Replacement RB
Avg VBD by Year Post-Draft
Position Insights
Select position and click Update.
Player Profiles — descriptive archetype taxonomy from k-means clustering on DMX-orthogonal residuals. Profiles describe how a prospect earned their score (playing style, college role) rather than predicting career outcomes. Use SM and DM deciles for hit-rate signal. RBs split into 4 profiles, Edge defenders into 3. Beta — v1.0
Position group
RB · Edge
Find a player
Profiles in group
Prospects classified
Largest profile
Features per player
Methodology why these are descriptive, not predictive

Profiles are k-means cluster assignments computed on DMX-orthogonal residuals — the DMX score axis is mathematically removed before clustering, so profiles describe shape and composition rather than overall quality. Two players with the same DMX can land in different profiles, and two players in the same profile can have very different DMX scores.

Cluster counts (k) were determined empirically per position group via silhouette analysis: 4 for RB (n=158 cluster-ready) and 3 for Edge (n=125). Validation against career outcomes confirmed that these clusters are descriptively interpretable but show no statistically significant difference in hit rate (at the p<0.05 level), which is why we use them as taxonomy rather than prediction.

Players are assigned via centroid distance in residual space, with a minimum-features-present threshold (7 of 9 RB features; 5 of 7 Edge features). Players missing too many features are intentionally left unassigned rather than mislabeled.
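The two steps described above (removing the DMX axis, then assigning by centroid distance with a minimum-features rule) can be sketched in plain Python. Several details here are assumptions rather than documented behavior: residuals come from a per-feature simple linear regression on DMX, distance is the mean squared difference over the features that are present, and all names and numbers are hypothetical.

```python
from typing import Dict, List, Optional


def residualize(feature: List[float], dmx: List[float]) -> List[float]:
    """Remove the linear DMX component from one feature (OLS with intercept)."""
    n = len(dmx)
    mean_x = sum(dmx) / n
    mean_y = sum(feature) / n
    sxx = sum((x - mean_x) ** 2 for x in dmx)
    if sxx == 0:
        raise ValueError("DMX has no variance")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(dmx, feature)) / sxx
    return [y - (mean_y + slope * (x - mean_x)) for x, y in zip(dmx, feature)]


def assign_profile(
    residuals: List[Optional[float]],
    centroids: Dict[str, List[float]],
    min_present: int,
) -> Optional[str]:
    """Nearest centroid in residual space, or None if too many features are missing."""
    present = [i for i, value in enumerate(residuals) if value is not None]
    if len(present) < min_present:
        return None  # left unassigned rather than mislabeled

    def distance(center: List[float]) -> float:
        return sum((residuals[i] - center[i]) ** 2 for i in present) / len(present)

    return min(centroids, key=lambda name: distance(centroids[name]))


# Hypothetical three-feature example with one missing value.
centroids = {"Profile A": [1.0, 1.0, 1.0], "Profile B": [-1.0, -1.0, -1.0]}
print(assign_profile([0.9, None, 1.2], centroids, min_present=2))
print(assign_profile([0.9, None, 1.2], centroids, min_present=3))
```

Because each residual series is orthogonal to DMX by construction, two players with identical DMX scores can still sit far apart in this space, which is the property the profiles rely on.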

Cross-Era Analysis — how has DMX evolved across draft classes? Track the D1 threshold, class composition, and component drift from 2000 to 2026. Also compare any two draft classes head-to-head on ATH/DPOS/AWP profiles.
Position Component
WR Avg DMX by Draft Year
Class Comparison Tool side-by-side ATH · DPOS · AWP radar
Class A vs Class B
Combine-to-Career Regression — which combine tests actually predict NFL success, controlling for draft capital? Key findings: DPOS (draft capital) outpredicts every individual combine test at every position. The 40-yard dash is the most overrated metric. Broad jump is the best single athleticism predictor for QBs and TEs. AWP (college production) outperforms every combine test and trails only DPOS.
~1.5%
ATH Alone R²
all positions — athleticism barely predicts
-0.25
TE 40-Time Corr
best single combine test correlation
0.202
QB Broad Jump Corr
best combine test for quarterbacks
13.4%
RB DPOS R²
draft capital vs VBD — strongest signal
View
Combine Test Correlation with 5-Yr VBD by position
AWP Predictive Premium standalone R² vs ATH vs DPOS vs DMX
Key Insight

DPOS (draft capital z-score) is the single most predictive variable at every position — explaining 5.6–13.4% of 5-year VBD variance on its own. AWP (college production) adds 3.2–8.2% of additional standalone signal, making it the most valuable athleticism-independent predictor. ATH alone explains only 1.4–3.0% — raw athleticism from the combine has minimal direct predictive power when separated from draft position and college context. The 40-yard dash is particularly overrated: it explains near-zero variance for WRs and RBs. Broad jump and vertical are better single-test predictors for most positions. The DMX composite is superior to any individual test by design — this data confirms the architecture choice.

Distribution Analysis — DMX score distributions within and across deciles. Shows how cleanly the model separates quality tiers. Also includes individual prospect ATH/DPOS/AWP radar profiles for scouting context — select any draft year to see the full class breakdown.
Position View Year (Radar)
WR DMX Distribution by Decile
Draft Board Heat Map ATH · DPOS · AWP strength by draft class
Interactive Scatter — plot any DMX component pair against career outcomes. Hover for player names and stats. Color = career tier. Reveals where predictive signal concentrates and where it breaks down entirely.
Position · X Axis · Y Axis
DMX vs 5-Yr VBD — WR hover for player · color = career tier
Star · Starter · Contributor · Bust
AI Analytics Assistant — natural language queries grounded in the actual DFF dataset. Full context injected automatically: DMX/DPX methodology, hit rates, R² values, combine regressions, draft capital efficiency, and residual analysis findings. Saves queries to your library for future reference.
Natural Language Analytics Powered by Claude
Cmd+Enter to submit
Saved Query Library
Context: DMX formula, all position hit rates, R² values, combine correlations, draft capital data, residual analysis, 25-yr historical scope.