
Safe-Child-LLM: A Developmental Benchmark for Evaluating LLM Safety in Child-AI Interactions

Link: http://arxiv.org/abs/2506.13510v1

PDF Link: http://arxiv.org/pdf/2506.13510v1

Summary: As Large Language Models (LLMs) increasingly power applications used by children and adolescents, ensuring safe and age-appropriate interactions has become an urgent ethical imperative.

Despite progress in AI safety, current evaluations predominantly focus on adults, neglecting the unique vulnerabilities of minors engaging with generative AI.

We introduce Safe-Child-LLM, a comprehensive benchmark and dataset for systematically assessing LLM safety across two developmental stages: children (7-12) and adolescents (13-17).

Our framework includes a novel multi-part dataset of 200 adversarial prompts, curated from red-teaming corpora (e.g., SG-Bench, HarmBench), with human-annotated labels for jailbreak success and a standardized 0-5 ethical refusal scale.

Evaluating leading LLMs -- including ChatGPT, Claude, Gemini, LLaMA, DeepSeek, Grok, Vicuna, and Mistral -- we uncover critical safety deficiencies in child-facing scenarios.

This work highlights the need for community-driven benchmarks to protect young users in LLM interactions.

To promote transparency and collaborative advancement in ethical AI development, we are publicly releasing both our benchmark datasets and evaluation codebase at https://github.com/The-Responsible-AI-Initiative/Safe_Child_LLM_Benchmark.git

Published on arXiv on: 2025-06-16T14:04:54Z