https://github.com/computationalprivacy/copyright-traps
Copyright traps (see Meeus et al. (ICML 2024)) are unique, synthetically generated sequences who have been included into the training dataset of CroissantLLM. This dataset allows for the evaluation of Membership Inference Attacks (MIAs) using CroissantLLM as target model, where the goal is to infer whether a certain trap sequence was either included in or excluded from the training data.
This dataset contains non-member (label=0
) and member (label=1
) trap sequences, which have been generated using this code and by sampling text from LLaMA-2 7B while controlling for sequence length and perplexity. The dataset contains splits according to seq_len_{XX}_n_rep_{YY}
where sequences of XX={25,50,100}
tokens are considered and YY={10, 100, 1000}
number of repetitions for member sequences. Each dataset also contains the ‘perplexity bucket’ for each trap sequence, where the original paper showed that higher perplexity sequences tend to be more vulnerable.
Note that for a fixed sequence length, and across various number of repetitions, each split contains the same set of non-member sequences (n_rep=0
). Also additional non-members generated in exactly the same way are provided here, which might be required for some MIA methodologies making additional assumptions for the attacker.