Speech Models Fail Safety Tests That Text Passes
VoxSafeBench reveals that speech language models recognize social norms in text but ignore them when the cues arrive through voice, speaker identity, or environment.
Speech language models degrade on safety, fairness, and privacy when contextual cues shift from text to audio.
- VoxSafeBench tests SLMs across safety, fairness, and privacy using matched text and audio pairs.
- Tier 1 evaluates identical content in text and speech; Tier 2 tests benign transcripts paired with risky acoustic context (see the evaluation sketch after this list).
- Models detect speaker identity, tone, and environment but fail to apply the safeguards these cues call for.
- Safety awareness drops when speaker or scene context arrives through speech rather than a text description.
- Fairness erodes when demographic differences are conveyed vocally instead of stated explicitly.
- Privacy protections weaken when contextual information must be grounded in acoustic signals.
- A speech grounding gap exists: models recognize norms in text but do not enforce them in speech.
- The benchmark covers 22 tasks with bilingual coverage to validate findings across languages.
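To make the two-tier, matched-pair design concrete, here is a minimal evaluation sketch in Python. It assumes a hypothetical harness: `MatchedPair`, `safeguard_applied_text`, and `safeguard_applied_audio` are illustrative names, not the paper's actual API, and the real benchmark's task format and judging protocol may differ.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class MatchedPair:
    """One benchmark-style item: the same scenario delivered two ways."""
    text_prompt: str   # cue stated explicitly in text
    audio_path: str    # same cue carried acoustically (speaker, tone, scene)
    tier: int          # 1 = identical content; 2 = benign transcript, risky acoustics
    dimension: str     # "safety" | "fairness" | "privacy"

def grounding_gap(
    pairs: list[MatchedPair],
    safeguard_applied_text: Callable[[str], bool],
    safeguard_applied_audio: Callable[[str], bool],
) -> dict[tuple[str, int], float]:
    """Safeguard rate in text minus safeguard rate in audio, per (dimension, tier).

    A positive gap means the model enforces a norm when it is written down
    but not when the same cue must be heard.
    """
    buckets: dict[tuple[str, int], list[tuple[bool, bool]]] = {}
    for p in pairs:
        key = (p.dimension, p.tier)
        buckets.setdefault(key, []).append(
            (safeguard_applied_text(p.text_prompt),
             safeguard_applied_audio(p.audio_path))
        )
    return {
        key: (sum(t for t, _ in xs) - sum(a for _, a in xs)) / len(xs)
        for key, xs in buckets.items()
    }
```

In practice the two callables would wrap the SLM under test plus an automatic or human judge of its responses; the gap statistic itself is the only part this sketch commits to.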
Astrobobo tool mapping
- Knowledge Capture: Record the speech grounding gap concept and the three-tier risk model (content, speaker, environment) as a reusable checklist for safety audits of multimodal systems.
- Focus Brief: Summarize the 22 tasks and their failure modes into a one-page reference for your team's next safety review or model evaluation cycle.
- Reading Queue: Queue the full paper and the VoxSafeBench dataset documentation for deeper study of Tier 2 task design and perception probe methodology.
Frequently asked
- What is the speech grounding gap? It is the failure of speech language models to apply safety, fairness, and privacy rules when the decisive cue arrives through voice rather than text. Models recognize a social norm when it is stated explicitly in text but ignore the same norm when it must be inferred from speaker identity, tone, accent, or environment. This creates a systematic vulnerability in shared-space voice systems.
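One way to operationalize this definition is to probe perception and enforcement separately on the same audio items. The sketch below assumes a hypothetical probe interface (`perceived_cue` and `applied_safeguard` are illustrative names); it shows the shape of the comparison, not the paper's actual perception-probe protocol.

```python
from typing import Callable

def perception_vs_enforcement(
    audio_items: list[str],
    perceived_cue: Callable[[str], bool],      # did the model identify the cue, e.g. "speaker is a child"?
    applied_safeguard: Callable[[str], bool],  # did it act on that cue appropriately?
) -> tuple[float, float]:
    """Return (perception rate, enforcement rate among perceived items).

    The grounding-gap signature is a high first number with a low second:
    the model hears the cue but does not apply the norm it implies.
    """
    if not audio_items:
        return 0.0, 0.0
    perceived = [a for a in audio_items if perceived_cue(a)]
    perception_rate = len(perceived) / len(audio_items)
    enforcement_rate = (
        sum(applied_safeguard(a) for a in perceived) / len(perceived)
        if perceived else 0.0
    )
    return perception_rate, enforcement_rate
```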
Cite
Yuxiang Wang, Hongyu Liu, Yijiang Xu, Qinke Ni, Li Wang, Wan Lin, Kunyu Feng, Dekun Chen, Xu Tan, Lei Wang, Jie Shi, Zhizheng Wu. (2026, April 17). Speech Models Fail Safety Tests That Text Passes. Astrobobo Content Engine (rewrite of arxiv/cs.LG). https://astrobobo-content-engine.vercel.app/article/speech-models-fail-safety-tests-that-text-passes-210565
Yuxiang Wang, Hongyu Liu, Yijiang Xu, Qinke Ni, Li Wang, Wan Lin, Kunyu Feng, Dekun Chen, Xu Tan, Lei Wang, Jie Shi, Zhizheng Wu. "Speech Models Fail Safety Tests That Text Passes." Astrobobo Content Engine, 17 Apr 2026, https://astrobobo-content-engine.vercel.app/article/speech-models-fail-safety-tests-that-text-passes-210565. Based on "arxiv/cs.LG", https://arxiv.org/abs/2604.14548.
@misc{astrobobo_speech-models-fail-safety-tests-that-text-passes-210565_2026,
author = {Yuxiang Wang and Hongyu Liu and Yijiang Xu and Qinke Ni and Li Wang and Wan Lin and Kunyu Feng and Dekun Chen and Xu Tan and Lei Wang and Jie Shi and Zhizheng Wu},
title = {Speech Models Fail Safety Tests That Text Passes},
year = {2026},
url = {https://astrobobo-content-engine.vercel.app/article/speech-models-fail-safety-tests-that-text-passes-210565},
note = {Astrobobo rewrite of arxiv/cs.LG, https://arxiv.org/abs/2604.14548},
}