A collection of resources on the trustworthiness of large models (LMs) across multiple dimensions (e.g., safety, security, and privacy), with a special focus on multi-modal LMs (e.g., vision-language models and diffusion models).
This repo is a work in progress 🌱 (manually collected).
🔥🔥🔥 Help us update the list! 🔥🔥🔥
- First, check papers through our database: Metadata of LM-SSP.
- If you want to update the information for a paper (e.g., an arXiv paper has been accepted by a venue), search for the paper title in our metadata table and leave a message in the corresponding cell.
- If you would like to add a paper, please fill in the following table through an ISSUE:
| Title | Link | Code | Venue | Classification | Model | Comment |
|---|---|---|---|---|---|---|
| This is a title | paper.com | github | bb'23 | A1. Jailbreak | LLM | Agent |
- [2025.01.09] 🎂 Happy 1st Birthday to Awesome-LM-SSP! Keep Going! 💪
- [2024.01.09] 🚀 LM-SSP is released!
- Book (3)
- Competition (5)
- Leaderboard (5)
- Toolkit (13)
- Survey (40)
- Paper (2327)
- A. Safety (1175)
- A0. General (30)
- A1. Jailbreak (528)
- A2. Alignment (145)
- A3. Deepfake (92)
- A4. Ethics (8)
- A5. Fairness (60)
- A6. Hallucination (116)
- A7. Prompt Injection (110)
- A8. Toxicity (86)
- B. Security (451)
- B0. General (16)
- B1. Adversarial Examples (105)
- B2. Agent (130)
- B3. Poison & Backdoor (175)
- B4. Side-Channel (1)
- B5. System (24)
- C. Privacy (701)
- C0. General (54)
- C1. Contamination (17)
- C2. Data Reconstruction (63)
- C3. Membership Inference Attacks (65)
- C4. Model Extraction (14)
- C5. Privacy-Preserving Computation (128)
- C6. Property Inference Attacks (7)
- C7. Side-Channel (10)
- C8. Unlearning (68)
- C9. Watermark & Copyright (275)
Organizers: Tianshuo Cong (丛天硕), Xinlei He (何新磊), Zhengyu Zhao (赵正宇), Yugeng Liu (刘禹更), Delong Ran (冉德龙)
This project is inspired by LLM Security, Awesome LLM Security, LLM Security & Privacy, UR2-LLMs, PLMpapers, and EvaluationPapers4ChatGPT.

