Silent Model Updates

Models can drift when providers retrain them. Each retrain can shift behavior. PromptPit shows the shift.

Hosted AI models can change behind the same API name without notice. PromptPit runs fixed probe prompts against each hosted model every 6 hours and tracks tokenization, response length, latency, format adherence, and refusal posture. When we detect a statistically significant behavioral shift and an admin has verified and published it, it appears below.

These entries are behavioral shifts. They are strongly suggestive of a silent model update, but we never claim that as fact. The underlying statistics are shown for each entry so you can draw your own conclusion.

deepseek

  • CRITICAL · DeepSeek: DeepSeek V3 · CN · Closed · deepseek-chat

    Latency p50 shifted (z=-3.50)

    5/17/2026, 10:36:41 PM
    Metric:   latency_p50_ms
    Baseline: 1280.667
    Current:  705.000
    Δ:        -45.0%
    z-score:  -3.50
  • CRITICAL · DeepSeek: DeepSeek V3 · CN · Closed · deepseek-chat

    Response length shifted -21.4% (z=-3.42) vs prior 3-day window

    5/17/2026, 10:36:41 PM
    Metric:   length_mean
    Baseline: 108.944
    Current:  85.638
    Δ:        -21.4%
    z-score:  -3.42
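
The Δ figures in the two entries above can be checked from their Baseline and Current values. The sketch below assumes Δ is the plain relative change of Current against Baseline; the helper name percent_change is ours, not part of PromptPit.

```python
def percent_change(baseline: float, current: float) -> float:
    """Relative change of `current` versus `baseline`, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (current - baseline) / baseline * 100.0


# Values copied from the two deepseek-chat entries above.
latency_delta = percent_change(1280.667, 705.000)
length_delta = percent_change(108.944, 85.638)

print(f"latency_p50_ms: {latency_delta:.1f}%")
print(f"length_mean:    {length_delta:.1f}%")
```

Rounded to one decimal place, this gives -45.0% and -21.4%, matching the published Δ values.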

Methodology. Every 6 hours we run a fixed set of 20 probe prompts against each hosted model at temperature 0. Results are aggregated daily into a fingerprint: tokenizer hash, output length distribution, latency p50/p95, format adherence, refusal rate. The current day is compared against a rolling 7-day baseline. Alerts require either a tokenizer-hash change or ≥2 simultaneous metric shifts exceeding configured thresholds.
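
The alert rule can be sketched roughly as follows. This is an illustration under stated assumptions, not PromptPit's implementation: the Fingerprint shape, the function names, the use of a sample standard deviation over daily aggregates, and the |z| ≥ 3.0 cutoff are all hypothetical stand-ins for the "configured thresholds" mentioned above, and the sample data is synthetic.

```python
import math
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class Fingerprint:
    """One day's aggregated probe results for a model (hypothetical shape)."""
    tokenizer_hash: str
    metrics: dict[str, float]


def z_score(history: list[float], current: float) -> float:
    """Standard score of `current` against the mean and stdev of `history`."""
    mu = mean(history)
    sigma = stdev(history)  # requires at least two baseline days
    if sigma == 0:
        # Flat baseline: no departure scores 0, any departure is unbounded.
        return 0.0 if current == mu else math.copysign(math.inf, current - mu)
    return (current - mu) / sigma


def shifted_metrics(baseline: list[Fingerprint], today: Fingerprint,
                    z_threshold: float = 3.0) -> list[str]:
    """Names of metrics whose |z| meets the (assumed) threshold."""
    return [
        name
        for name, value in today.metrics.items()
        if abs(z_score([day.metrics[name] for day in baseline], value)) >= z_threshold
    ]


def should_alert(baseline: list[Fingerprint], today: Fingerprint,
                 z_threshold: float = 3.0, min_shifted: int = 2) -> bool:
    """Alert on a tokenizer-hash change or on >= min_shifted simultaneous shifts."""
    tokenizer_changed = today.tokenizer_hash != baseline[-1].tokenizer_hash
    shifted = shifted_metrics(baseline, today, z_threshold)
    return tokenizer_changed or len(shifted) >= min_shifted


# Synthetic 7-day baseline and current day, for illustration only.
latencies = [1250.0, 1300.0, 1270.0, 1310.0, 1260.0, 1290.0, 1285.0]
lengths = [108.0, 110.0, 109.0, 107.0, 111.0, 109.0, 108.5]
baseline = [
    Fingerprint("abc123", {"latency_p50_ms": lat, "length_mean": ln, "refusal_rate": 0.0})
    for lat, ln in zip(latencies, lengths)
]
today = Fingerprint(
    "abc123", {"latency_p50_ms": 705.0, "length_mean": 85.6, "refusal_rate": 0.0}
)

print(shifted_metrics(baseline, today))
print(should_alert(baseline, today))
```

Requiring two simultaneous shifts means a single noisy metric, such as a one-off latency spike, does not raise an alert on its own, while a tokenizer-hash change is treated as sufficient by itself.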

Provider responses are welcome. Contact security@vulnex.com.