Analytics Workbench — DynastyFootballFan.com

Hit Rate Matrix — What percentage of historical prospects at each DMX decile produced a Star or Starter season? Star/Starter (default) requires a peak-season threshold within the eval window: QB ≥200 PPR pts, RB/WR ≥120, TE ≥75. Stars require higher peaks (QB 300, RB/WR 200, TE 150). Eval windows: 5 yrs (QB/RB/WR), 7 yrs (TE — structurally longer development). Each cell shows hit rate and on hover, the Wilson 95% confidence interval (the honest precision given that decile's sample size). Cohort: 2001–2020 (TE through 2018). Toggle the metric to Star Only, Top-20% PPG (legacy percentile-based definition), Opportunity (% who got NFL snaps), or Conditional (Star/Starter rate among those who played). Star/Starter thresholds are absolute PPR points and not era-adjusted — marked for future review. For out-of-sample model validation see the Model Performance tab.
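
As a rough illustration of the hover interval described above, here is a minimal sketch of the standard Wilson score interval for a binomial hit rate. The function name and the example counts are hypothetical and are not taken from the DFF codebase or dataset.

```python
import math

def wilson_interval(hits, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p_hat = hits / n
    denom = 1 + z * z / n
    # The interval is centered on a shrunken estimate, not on p_hat itself.
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return max(0.0, center - half), min(1.0, center + half)

# Hypothetical decile: 6 hits out of 12 prospects.
low, high = wilson_interval(6, 12)
print(f"{low:.3f} to {high:.3f}")
```

With only 12 prospects in a cell the interval spans roughly half the probability scale, which is why small-sample deciles should be read with caution.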

Hit Rate by Decile RB/WR/QB: 5-yr window · TE: 7-yr window · 2001–2020 classes

Metric: % with a Star or Starter peak season (QB ≥200, RB/WR ≥120, TE ≥75 PPR pts)

Offense Curves: hit rate by decile · TE shows 7-yr window

IDP Curves: hit rate by decile

Model Performance — How well does DMX predict NFL careers when validated out-of-sample? Predictions for each draft class are built from training data strictly before that class (time-series cross-validation, expanding window). Test cohort: 2011–2020 prospects (2011–2018 for TE). For each (position, decile), the predicted Star/Starter probability is the historical hit rate in earlier classes; the predicted tier is the historical mean tier rounded to the nearest integer. For empirical hit-rate lift by decile, see the Hit Rate Matrix tab. This tab focuses on calibration and accuracy: whether the probabilities and tiers the model predicts match what actually happened.

Position

Out-of-Sample Performance: time-series cross-validation, all positions

Calibration Plot predicted vs observed Star/Starter rate

Each dot is a DMX decile in the held-out test set. Dots above the 45° line mean the observed hit rate exceeded the training-era prediction; below means it fell short. Vertical bars are 95% confidence intervals on the observed rate. Systematic above-line deviation indicates an era effect (modern hit rates are higher than training-era rates).

Tier Confusion Matrix predicted × actual, out-of-sample

Rows show what the model predicted; columns show what actually happened. Each cell is the percentage of that predicted-tier row landing in each actual tier. Diagonal cells = correct predictions. Cohen’s κ is the chance-corrected agreement rate (0 = random, 1 = perfect). Bust–Contributor and Starter–Star confusions are common — the model separates the extremes well but adjacent tiers blur.
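
Cohen's κ as described above can be computed directly from paired predicted and actual tiers. The sketch below uses the standard definition (observed agreement corrected by the agreement expected from the marginal tier frequencies); the function name and the toy labels are illustrative assumptions only.

```python
from collections import Counter

def cohens_kappa(predicted, actual):
    """Chance-corrected agreement between two equal-length label sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(predicted)
    observed = sum(p == a for p, a in zip(predicted, actual)) / n
    pred_counts = Counter(predicted)
    actual_counts = Counter(actual)
    # Agreement expected if predictions and outcomes were independent.
    expected = sum(pred_counts[t] * actual_counts[t] for t in pred_counts) / (n * n)
    if expected == 1.0:
        return float("nan")  # undefined when everything falls in a single tier
    return (observed - expected) / (1 - expected)

# Toy example with hypothetical tiers, not real prospects.
pred = ["Bust", "Bust", "Star", "Star"]
real = ["Bust", "Star", "Star", "Star"]
print(cohens_kappa(pred, real))
```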

Methodology & Limitations

Time-series cross-validation (TSCV): For each prospect in the test set (draft years 2011–2020), we predict their outcome using only training data from earlier draft years. This expanding-window approach mimics how the model would have been used in real time and gives the most honest out-of-sample estimate.
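
The expanding-window idea can be sketched as follows. This is a simplified illustration, not the platform's pipeline: the record fields (`year`, `position`, `decile`, `hit`) and the handling of buckets with no training history (returning `None`) are assumptions made for the example.

```python
from collections import defaultdict

def expanding_window_predictions(records, test_years):
    """Predict each test prospect's hit probability from earlier classes only.

    records: list of dicts with keys year, position, decile, hit (0 or 1).
    Returns a list of (year, position, decile, predicted_prob, actual_hit).
    """
    results = []
    for test_year in sorted(test_years):
        totals = defaultdict(lambda: [0, 0])  # bucket -> [hits, prospects]
        for r in records:
            if r["year"] < test_year:  # strictly earlier classes only
                bucket = totals[(r["position"], r["decile"])]
                bucket[0] += r["hit"]
                bucket[1] += 1
        for r in records:
            if r["year"] == test_year:
                hits, count = totals.get((r["position"], r["decile"]), (0, 0))
                prob = hits / count if count else None  # no history: no prediction
                results.append((r["year"], r["position"], r["decile"], prob, r["hit"]))
    return results

# Hypothetical records for a single (position, decile) bucket.
data = [
    {"year": 2009, "position": "WR", "decile": 1, "hit": 1},
    {"year": 2010, "position": "WR", "decile": 1, "hit": 0},
    {"year": 2011, "position": "WR", "decile": 1, "hit": 1},
    {"year": 2012, "position": "WR", "decile": 1, "hit": 0},
]
for row in expanding_window_predictions(data, [2011, 2012]):
    print(row)
```

The training pool grows by one class at each step, so the 2012 prediction uses the 2011 outcome while the 2011 prediction does not.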

Why decile-based prediction: Each (position, decile) bucket is the canonical DMX grouping used throughout the platform. Predicting the historical hit rate of that bucket avoids parametric assumptions (linearity, normality) that don’t hold for censored 5-yr VBD outcomes. It’s the same approach a statistician would use for empirical-Bayes calibration.

Brier skill score: Compares the model’s Brier score (mean squared error of predicted probability vs observed outcome) against a naive baseline of always predicting the overall hit rate. Positive means the model adds predictive value beyond the base rate; the magnitude indicates how much.
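
A minimal sketch of the Brier skill score calculation follows. It assumes the naive baseline is the overall hit rate of the same evaluated sample; the text does not say whether the baseline rate comes from the test or the training classes, so treat that choice as an assumption.

```python
def brier_score(probs, outcomes):
    """Mean squared error between predicted probabilities and 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(outcomes)

def brier_skill_score(probs, outcomes):
    """1 - BS_model / BS_baseline, where the baseline always predicts the base rate."""
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    if reference == 0:
        raise ValueError("baseline Brier score is zero; skill score is undefined")
    return 1 - brier_score(probs, outcomes) / reference

# Hypothetical predictions and outcomes.
probs = [0.8, 0.2, 0.8, 0.2]
hits = [1, 0, 1, 0]
print(round(brier_skill_score(probs, hits), 2))
```

A score of 0 means the model does no better than quoting the base rate, and negative values mean it does worse.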

Cohen’s κ interpretation (Landis & Koch 1977): 0.00–0.20 slight, 0.21–0.40 fair, 0.41–0.60 moderate, 0.61–0.80 substantial, 0.81–1.00 almost perfect. Tier classification at this prediction horizon (a single composite score predicting a five-year outcome) lands in the fair-to-moderate range. That’s realistic for the problem.

Limitations:

Era effect: hit rates roughly doubled from 2001–2010 to 2011–2020 due to higher offensive output league-wide. Predictions trained on older data systematically under-predict modern hit rates — visible as upward drift on the calibration plot.
Sample size: QB deciles average ~12 prospects each, generating wide confidence intervals. Conclusions at QB depend on smaller cohorts than WR/RB.
Star/Starter thresholds are absolute (QB 200, RB/WR 120, TE 75 PPR pts). Era-adjusted thresholds (position rank within season) would be more rigorous — marked for future review.
Ceiling on predictability: 60–80% of career outcome variance comes from post-draft factors (coaching, scheme, injury, QB play, team success) that no pre-draft model can capture. Brier skill scores in the 0.08–0.27 range are consistent with this ceiling.

Draft Capital Efficiency — VBD ROI by draft round. 1st-round RBs (picks 1–10) produce 220.8 average 5-year VBD at an 85.7% hit rate — the single best dynasty investment by both metrics. Round 2-3 QBs have a 2.9% star rate from 34 prospects — the worst.

85.7%

RB Rd1 Hit Rate

picks 1-10 · n=63

220.8

RB Rd1 Avg VBD

5-yr cumulative PPR

73.8%

WR Rd1 Hit Rate

picks 1-10 · n=126

2.9%

QB Rd2-3 Star Rate

1 star from 34 prospects

Metric

Draft Capital Efficiency by Position & Round Tier

Full Breakdown: VBD · Hit% · Star%

Positional Scarcity Waves — % above replacement-level fantasy value year-by-year post-draft. Dynasty sell windows are visible in the slope of each curve. RBs cliff. WRs plateau. TEs slow-burn. QBs are bimodal — all-or-nothing.

Position Deciles

% Above Replacement: RB

Avg VBD by Year Post-Draft

Position Insights

Player Profiles — descriptive archetype taxonomy from k-means clustering on DMX-orthogonal residuals. Profiles describe how a prospect earned their score (playing style, college role) rather than predicting career outcomes. Use DMX and DPX deciles for hit-rate signal. RBs split into 4 profiles, Edge defenders into 3. Beta — v1.0

Position group

RB Edge

Methodology: why these are descriptive, not predictive

Profiles are k-means cluster assignments computed on DMX-orthogonal residuals — the DMX score axis is mathematically removed before clustering, so profiles describe shape and composition rather than overall quality. Two players with the same DMX can land in different profiles, and two players in the same profile can have very different DMX scores.
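
One common way to remove a score axis before clustering is to regress each feature on that score and keep only the residuals. The sketch below shows this for a single feature using ordinary least squares. Whether DFF residualizes in exactly this way is not stated, so the method, the function name, and the numbers are all assumptions for illustration.

```python
def residualize(feature, dmx):
    """Return the part of `feature` that is linearly uncorrelated with `dmx`."""
    n = len(dmx)
    if n != len(feature) or n < 2:
        raise ValueError("need two or more paired observations")
    mean_x = sum(dmx) / n
    mean_y = sum(feature) / n
    sxx = sum((x - mean_x) ** 2 for x in dmx)
    if sxx == 0:
        raise ValueError("dmx has no variance")
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(dmx, feature)) / sxx
    # Subtract the fitted line; what remains is orthogonal to the centered DMX.
    return [(y - mean_y) - slope * (x - mean_x) for x, y in zip(dmx, feature)]

# Hypothetical DMX scores and one hypothetical feature.
dmx_scores = [1.0, 2.0, 3.0, 4.0]
feature = [2.0, 4.0, 6.0, 9.0]
print(residualize(feature, dmx_scores))
```

Because the residuals carry no linear DMX signal, two prospects with the same overall score can still differ sharply in residual space, which is what lets profiles capture composition rather than quality.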

Cluster counts (k) were determined empirically per position group via silhouette analysis: 4 for RB (n=158 cluster-ready) and 3 for Edge (n=125). Validation against career outcomes confirmed these clusters are descriptively interpretable, but their hit rates do not differ significantly at the 0.05 level, which is why we use them as taxonomy rather than prediction.

Players are assigned via centroid distance in residual space, with a minimum-features-present threshold (7 of 9 RB features; 5 of 7 Edge features). Players missing too many features are intentionally left unassigned rather than mislabeled.
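
The assignment rule above can be sketched as a nearest-centroid lookup with a missing-feature guard. The way missing features enter the distance (ignored, with the distance averaged over the features that are present) is an assumption, as are the profile names and values in the example.

```python
import math

def assign_profile(residuals, centroids, min_present):
    """Return the nearest profile name, or None if too many features are missing.

    residuals: list of floats, with None marking a missing feature.
    centroids: dict mapping profile name -> list of floats (same length).
    """
    present = [i for i, value in enumerate(residuals) if value is not None]
    if len(present) < min_present:
        return None  # left unassigned rather than mislabeled

    def distance(centroid):
        # Root-mean-square distance over the features that are present.
        total = sum((residuals[i] - centroid[i]) ** 2 for i in present)
        return math.sqrt(total / len(present))

    return min(centroids, key=lambda name: distance(centroids[name]))

# Hypothetical three-feature example with two made-up profiles.
profiles = {"Profile A": [0.0, 0.0, 0.0], "Profile B": [1.0, 1.0, 1.0]}
print(assign_profile([0.9, None, 1.1], profiles, min_present=2))
print(assign_profile([0.9, None, 1.1], profiles, min_present=3))
```

For RBs the threshold in the text corresponds to `min_present=7` over 9 features, and for Edge to `min_present=5` over 7.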

Cross-Era Analysis — how has DMX evolved across draft classes? Track the D1 threshold, class composition, and component drift from 2000 to 2026. Also compare any two draft classes head-to-head on ATH/DPOS/AWP profiles.

Position Component

WR Avg DMX by Draft Year

Class Comparison Tool: side-by-side ATH · DPOS · AWP radar

Class A vs Class B

Combine-to-Career Regression — which combine tests actually predict NFL success, controlling for draft capital? Key findings: DPOS dominates all individual tests at every position. 40 time is the most overrated metric. Broad jump is the best single athleticism predictor for QBs and TEs. AWP (college production) outperforms every individual combine test and trails only DPOS.

~1.5%

ATH Alone R²

all positions — athleticism barely predicts

-0.25

TE 40-Time Corr

best single combine test correlation

0.202

QB Broad Jump Corr

best combine test for quarterbacks

13.4%

RB DPOS R²

draft capital vs VBD — strongest signal

View

Combine Test Correlation with 5-Yr VBD by position

AWP Predictive Premium: standalone R² vs ATH vs DPOS vs DMX

Key Insight

DPOS (draft capital z-score) is the single most predictive variable at every position, explaining 5.6–13.4% of 5-year VBD variance on its own. AWP (college production) explains 3.2–8.2% on a standalone basis, making it the most valuable predictor after draft capital that does not depend on athletic testing. ATH alone explains only 1.4–3.0%: raw athleticism from the combine has minimal direct predictive power when separated from draft position and college context. The 40-yard dash is particularly overrated: it explains near-zero variance for WRs and RBs. Broad jump and vertical are better single-test predictors for most positions. The DMX composite is designed to outperform any individual test, and this data confirms the architecture choice.

Distribution Analysis — DMX score distributions within and across deciles. Shows how cleanly the model separates quality tiers. Also includes individual prospect ATH/DPOS/AWP radar profiles for scouting context — select any draft year to see the full class breakdown.

Position View Year (Radar)

WR DMX Distribution by Decile

Draft Board Heat Map: ATH · DPOS · AWP strength by draft class

Interactive Scatter — plot any DMX component pair against career outcomes. Hover for player names and stats. Color = career tier. Reveals where predictive signal concentrates and where it breaks down entirely.

Position X Axis Y Axis

DMX vs 5-Yr VBD — WR: hover for player · color = career tier

Star Starter Contributor Bust

AI Analytics Assistant — natural language queries grounded in the actual DFF dataset. Full context injected automatically: DMX/DPX methodology, hit rates, R² values, combine regressions, draft capital efficiency, and residual analysis findings. Saves queries to your library for future reference.

Natural Language Analytics · Powered by Claude

Saved Query Library

Context: DMX formula, all position hit rates, R² values, combine correlations, draft capital data, residual analysis, 25-yr historical scope.