Drift Detection
Drift detection catches statistical shift in model inputs or outputs before it shows up as accuracy loss. Each drift monitor compares a recent window of production data against a reference baseline using one of three methods.
Find it under MLOps > Drift. The feature is gated by REACT_APP_ENABLE_DRIFT_DETECTION.
Concepts
- Drift Monitor: a configuration that periodically compares a feature set against a reference baseline.
- Method: the statistical test used:
  - PSI (Population Stability Index): bucket-based comparison, common for tabular features
  - KS Test (Kolmogorov–Smirnov): distribution-free test on continuous features
  - JS Divergence (Jensen–Shannon): symmetric divergence between probability distributions
- Threshold: the score above which drift is considered detected (per-method).
- Drift Report: one comparison run with an overall_score and a per-feature breakdown.
- Status (derived):
  - Healthy: no drift detected
  - Warning: score is approaching the threshold (within 70% of it) but has not exceeded it
  - Drift Detected: score exceeds the threshold
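To make the score and status concepts concrete, here is a minimal sketch of a PSI calculation and the status derivation. It is not the product's implementation: the function names, the equal-width bucketing, the `eps` smoothing, and the reading of "within 70% of the threshold" as `score >= 0.7 * threshold` are all assumptions made for illustration.

```python
import math
from typing import List, Sequence


def psi(reference: Sequence[float], current: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index over equal-width buckets of the reference range."""
    lo, hi = min(reference), max(reference)
    width = (hi - lo) / bins or 1.0  # avoid a zero width when all values are equal

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for v in values:
            # Production values outside the reference range fall into the edge buckets.
            idx = min(max(int((v - lo) / width), 0), bins - 1)
            counts[idx] += 1
        # Floor empty buckets at eps so the logarithm below stays defined.
        return [max(c / len(values), eps) for c in counts]

    ref_p, cur_p = proportions(reference), proportions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref_p, cur_p))


def drift_status(score: float, threshold: float, warning_ratio: float = 0.7) -> str:
    """Map a score to a status (assumed interpretation of the 70% rule)."""
    if score > threshold:
        return "Drift Detected"
    if score >= warning_ratio * threshold:
        return "Warning"
    return "Healthy"


if __name__ == "__main__":
    baseline = [i / 100 for i in range(100)]
    production = [0.5 + v / 2 for v in baseline]  # mass shifted to the upper half
    score = psi(baseline, production)
    # 0.2 is an illustrative threshold only, not a documented platform default.
    print(score, drift_status(score, threshold=0.2))
```

Identical samples give a PSI of exactly zero, and the score grows as mass moves between buckets, which is why a single per-method threshold can separate the three statuses.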
Creating a Drift Monitor
- Go to MLOps > Drift → Create Monitor.
- Configure:
  - Name and model
  - Method: PSI / KS Test / JS Divergence
  - Reference dataset: the baseline distribution (typically the training distribution)
  - Threshold (defaults are method-specific)
  - Run frequency (how often to score current production data against the baseline)
- Save.
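The fields in the Create Monitor form can be pictured as a small configuration record. The sketch below is hypothetical: the class name, field names, method identifiers, and the `"daily"` frequency value are illustrative assumptions, not the platform's API or schema.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical identifiers for the three methods offered in the form.
METHODS = ("psi", "ks_test", "js_divergence")


@dataclass(frozen=True)
class DriftMonitorConfig:
    name: str
    model: str
    method: str
    reference_dataset: str            # baseline, typically the training distribution
    run_frequency: str                # e.g. "daily" (format is an assumption)
    threshold: Optional[float] = None  # None means "use the method-specific default"

    def __post_init__(self) -> None:
        if self.method not in METHODS:
            raise ValueError(f"unknown method: {self.method!r}")
        if self.threshold is not None and self.threshold <= 0:
            raise ValueError("threshold must be positive")


if __name__ == "__main__":
    config = DriftMonitorConfig(
        name="churn-input-drift",
        model="churn-classifier",
        method="psi",
        reference_dataset="churn-training-snapshot",
        run_frequency="daily",
    )
    print(config)
```

Leaving `threshold` unset mirrors the form's behavior of falling back to a method-specific default; the actual default values are not shown here because they depend on the platform.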
Dashboard
The drift dashboard lists every monitor with status (Healthy / Warning / Drift Detected), method, and latest score. Click a monitor to see:
- Score history over time
- Per-feature drift contribution (which features are shifting most)
- Links to individual drift reports
Drift Reports
Each scoring run produces a Drift Report with:
- Overall score and method used
- Per-feature scores ranked by contribution
- Boolean drift_detected flag
- Timestamp
Use the report detail view to identify which inputs are shifting and decide whether to retrain.
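As a rough picture of what a report carries, the sketch below models it as a small record. Only `overall_score` and `drift_detected` are names taken from this page; the class, the other field names, and the choice to derive the flag from a stored threshold are assumptions. How the platform aggregates per-feature scores into the overall score is not specified here, so the sketch takes it as an input.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict, List, Tuple


@dataclass
class DriftReport:
    method: str
    overall_score: float
    feature_scores: Dict[str, float]   # per-feature breakdown
    threshold: float
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    @property
    def drift_detected(self) -> bool:
        # Assumed rule: drift is flagged when the overall score exceeds the threshold.
        return self.overall_score > self.threshold

    def ranked_features(self) -> List[Tuple[str, float]]:
        """Features ordered from largest to smallest score."""
        return sorted(self.feature_scores.items(),
                      key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    report = DriftReport(
        method="psi",
        overall_score=0.31,
        feature_scores={"age": 0.04, "income": 0.42, "tenure": 0.11},
        threshold=0.2,
    )
    print(report.drift_detected, report.ranked_features())
```

Ranking features this way is what lets a reviewer start with the largest contributor when deciding whether a retrain is justified.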
Wiring Drift to Retraining
Pair a drift monitor with an on_drift Pipeline to retrain automatically whenever the monitor detects drift. A typical pipeline:
- Train on the latest data
- Evaluate against the previous champion
- (Optional) require an approval
- Deploy if better
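The steps above can be sketched as a single function. This is an illustration of the control flow only: the function and parameter names are hypothetical, the pipeline stages are passed in as plain callables rather than real platform steps, and it assumes a higher evaluation score is better.

```python
from typing import Any, Callable, Optional


def on_drift_retrain(
    train: Callable[[], Any],
    evaluate: Callable[[Any], float],
    champion_score: float,
    deploy: Callable[[Any], None],
    approve: Optional[Callable[[Any, float], bool]] = None,
) -> str:
    """Retrain, compare with the champion, optionally gate on approval, then deploy."""
    candidate = train()                      # 1. train on the latest data
    candidate_score = evaluate(candidate)    # 2. evaluate against the previous champion
    if candidate_score <= champion_score:
        return "kept_champion"
    if approve is not None and not approve(candidate, candidate_score):
        return "rejected"                    # 3. optional approval gate
    deploy(candidate)                        # 4. deploy only if better
    return "deployed"


if __name__ == "__main__":
    outcome = on_drift_retrain(
        train=lambda: "model-v2",
        evaluate=lambda model: 0.91,
        champion_score=0.88,
        deploy=lambda model: print(f"deploying {model}"),
    )
    print(outcome)
```

Placing the comparison before the approval step means a reviewer is only asked to sign off on candidates that already beat the champion.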
Choosing a Method
| Method | When to use |
|---|---|
| PSI | Tabular features, especially in finance / risk where PSI is the convention |
| KS Test | Continuous numeric features without a natural bucketing |
| JS Divergence | Categorical features or already-bucketed distributions |
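To show why the last two rows suit different data, here is a minimal standard-library sketch of both scores. The two-sample KS statistic works directly on raw continuous values with no bucketing, while JS divergence expects two probability vectors over the same categories or buckets. Function names are illustrative, and whether the platform scores KS by the statistic shown here or by a p-value is an assumption.

```python
import bisect
import math
from typing import Sequence


def ks_statistic(reference: Sequence[float], current: Sequence[float]) -> float:
    """Two-sample KS statistic: largest gap between the two empirical CDFs."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    # Both empirical CDFs are step functions, so the largest gap occurs at a sample point.
    for x in ref + cur:
        f_ref = bisect.bisect_right(ref, x) / len(ref)
        f_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(f_ref - f_cur))
    return gap


def js_divergence(p: Sequence[float], q: Sequence[float]) -> float:
    """Jensen-Shannon divergence in bits; inputs must each sum to 1."""
    def kl(a: Sequence[float], b: Sequence[float]) -> float:
        # Terms with a zero probability in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    mixture = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


if __name__ == "__main__":
    # Continuous feature: compare raw values.
    print(ks_statistic([1.0, 2.0, 3.0, 4.0], [2.5, 3.5, 4.5, 5.5]))
    # Categorical feature: compare category proportions.
    print(js_divergence([0.7, 0.2, 0.1], [0.4, 0.4, 0.2]))
```

With base-2 logarithms the JS divergence is bounded between 0 and 1, and the KS statistic is also bounded between 0 and 1, so both lend themselves to a fixed threshold.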
Use Cases
- Detect upstream data-pipeline changes
- Catch seasonal or event-driven distribution shift (e.g. holiday traffic, a new product launch)
- Trigger retraining when input distribution diverges from training
- Audit a model before promoting it from staging to production
Next Steps
- Pipelines — wire drift into automatic retraining
- Model Monitoring — combine input drift with runtime metrics
- Experiment Tracking — log retraining runs triggered by drift