4 Matching Annotations
  1. Nov 2025
    1. This work demonstrates for the first time that poisoning attacks instead require a near-constant number of documents regardless of dataset size. We conduct the largest pretraining poisoning experiments to date, pretraining models from 600M to 13B parameters on chinchilla-optimal datasets (6B to 260B tokens). We find that 250 poisoned documents similarly compromise models across all model and dataset sizes, despite the largest models training on more than 20 times more clean data.

      The paper shows that a successful attack requires not a fixed percentage of the training data but a near-constant number of poisoned documents (just 250!), which suffices even for the largest models.

    2. Existing work has studied pretraining poisoning assuming adversaries control a percentage of the training corpus. However, for large models, even small percentages translate to impractically large amounts of data.

      It was previously assumed that attackers needed to 'poison' a fixed percentage of the training data to compromise an LLM. At LLM scale, this quickly becomes impractical.
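
    3. The gap between the two threat models is easy to see with back-of-the-envelope arithmetic. The sketch below compares a hypothetical percentage-based assumption (0.1% of the corpus, a rate I chose for illustration, not taken from the paper) against the paper's near-constant 250-document finding, using the paper's smallest (6B tokens) and largest (260B tokens) pretraining runs; the ~1,000 tokens per poisoned document is also an assumed round number.

      ```python
      # Illustrative arithmetic comparing two poisoning threat models.
      # Assumptions (not from the paper): 0.1% poison rate, ~1,000 tokens/doc.
      TOKENS_PER_DOC = 1_000   # assumed average length of a poisoned document
      FIXED_DOCS = 250         # near-constant count reported in the paper
      PERCENT_RATE = 0.001     # hypothetical 0.1% percentage-based model

      for name, total_tokens in [("smallest run", 6e9), ("largest run", 260e9)]:
          pct_tokens = total_tokens * PERCENT_RATE       # tokens under % model
          fixed_tokens = FIXED_DOCS * TOKENS_PER_DOC     # tokens under fixed model
          print(f"{name}: percentage model needs {pct_tokens:,.0f} poisoned "
                f"tokens vs. {fixed_tokens:,} for the fixed-count model "
                f"({pct_tokens / fixed_tokens:,.0f}x more)")
      ```

      At the 260B-token scale, the percentage assumption implies injecting on the order of a thousand times more poisoned data than the fixed-count finding, which is why the paper's result makes poisoning far more practical than previously believed.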