Link: http://arxiv.org/abs/2501.09431v1
PDF Link: http://arxiv.org/pdf/2501.09431v1
Summary: While large language models (LLMs) present significant potential for supporting numerous real-world applications and delivering positive social impacts, they still face significant challenges in terms of the inherent risk of privacy leakage, hallucinated outputs, and value misalignment, and can be maliciously used to generate toxic content or for unethical purposes after being jailbroken.
Therefore, in this survey, we present a comprehensive review of recent advancements aimed at mitigating these issues, organized across the four phases of LLM development and usage: data collection and pre-training, fine-tuning and alignment, prompting and reasoning, and post-processing and auditing.
We elaborate on recent advances for enhancing the performance of LLMs in terms of privacy protection, hallucination reduction, value alignment, toxicity elimination, and jailbreak defenses.
In contrast to previous surveys that focus on a single dimension of responsible LLMs, this survey presents a unified framework that encompasses these diverse dimensions, providing a comprehensive view of enhancing LLMs to better serve real-world applications.
Published on arXiv on: 2025-01-16T09:59:45Z