Silent Model Updates

Models drift because they retrain. Each retrain shifts the worldview. PromptPit shows the shift.

Hosted AI models can change behind the same API name without notice. PromptPit runs fixed probe prompts against each hosted model every 6 hours and tracks tokenization, response length, latency, format adherence, and refusal posture. A statistically significant behavioral shift appears below once an admin has verified and published it.

These entries are behavioral shifts. They are strongly suggestive of a silent model update, but we never claim that as fact. The underlying statistics are shown for each entry so you can draw your own conclusion.

deepseek

CRITICAL · DeepSeek: DeepSeek V3 · CN · Closed · deepseek-chat
Latency p50 shifted (z=-3.50)
5/17/2026, 10:36:41 PM
Metric: latency_p50_ms
Baseline: 1280.667
Current: 705.000
Δ: -45.0%
z-score: -3.50

CRITICAL · DeepSeek: DeepSeek V3 · CN · Closed · deepseek-chat
Response length shifted -21.4% (z=-3.42) vs prior 3-day window
5/17/2026, 10:36:41 PM
Metric: length_mean
Baseline: 108.944
Current: 85.638
Δ: -21.4%
z-score: -3.42
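The Δ values in the two entries above are consistent with a plain relative change of the current value against the baseline. The sketch below reproduces that arithmetic from the published numbers. It also backs out the baseline spread each z-score would imply, under the assumption (not stated on this page) that z = (current - baseline) / standard deviation. The function names are hypothetical and are not PromptPit's code.

```python
def relative_change_pct(baseline: float, current: float) -> float:
    """Percent change of `current` relative to `baseline`."""
    return (current - baseline) / baseline * 100.0


def implied_std(baseline: float, current: float, z: float) -> float:
    """Standard deviation implied by a z-score.

    Assumes z = (current - baseline) / std, which is an assumption
    about how the published z-scores were computed.
    """
    return (current - baseline) / z


# (baseline, current, z-score) exactly as published in the entries above
entries = {
    "latency_p50_ms": (1280.667, 705.000, -3.50),
    "length_mean": (108.944, 85.638, -3.42),
}

for name, (baseline, current, z) in entries.items():
    delta = relative_change_pct(baseline, current)
    sigma = implied_std(baseline, current, z)
    print(f"{name}: delta={delta:.1f}% implied_std={sigma:.3f}")
```

Rounded to one decimal place, the computed deltas match the published -45.0% and -21.4%.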

Methodology. Every 6 hours we run a fixed set of 20 probe prompts against each hosted model at temperature 0. Results are aggregated daily into a fingerprint: tokenizer hash, output length distribution, latency p50/p95, format adherence, refusal rate. The current day is compared against a rolling 7-day baseline. Alerts require either a tokenizer-hash change or ≥2 simultaneous metric shifts exceeding configured thresholds.
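The alert rule described above can be sketched roughly as follows. This is a minimal illustration, not PromptPit's implementation: the `Fingerprint` class, the `should_alert` function, the single `Z_THRESHOLD` value of 3.0, and the choice to compare the tokenizer hash against the most recent baseline day are all assumptions. The page only says that thresholds are "configured".

```python
import math
from dataclasses import dataclass
from statistics import mean, stdev

Z_THRESHOLD = 3.0  # hypothetical value; the real thresholds are not published


@dataclass
class Fingerprint:
    """One day's aggregated probe results (hypothetical structure)."""
    tokenizer_hash: str
    metrics: dict[str, float]  # e.g. {"latency_p50_ms": 1280.0}


def z_score(history: list[float], current: float) -> float:
    """z-score of `current` against the sample mean and stdev of `history`."""
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A flat baseline: any deviation at all is treated as infinite.
        return 0.0 if current == mu else math.copysign(math.inf, current - mu)
    return (current - mu) / sigma


def should_alert(baseline: list[Fingerprint], today: Fingerprint):
    """Return (alert, shifted), where shifted maps metric name to z-score.

    Alert on a tokenizer-hash change, or when at least two metrics
    exceed the threshold on the same day.
    """
    hash_changed = baseline[-1].tokenizer_hash != today.tokenizer_hash
    shifted = {}
    for name, value in today.metrics.items():
        z = z_score([fp.metrics[name] for fp in baseline], value)
        if abs(z) >= Z_THRESHOLD:
            shifted[name] = z
    return hash_changed or len(shifted) >= 2, shifted
```

Requiring two simultaneous shifts means a single noisy metric, such as one slow day of latency, does not raise an alert on its own.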

Provider responses are welcome. Contact security@vulnex.com.