amayuelas/multi-agent-attack

Multi Agent Attack

Code for the paper MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate

Attack Description

Installation

Main libraries required (can be installed with pip):

transformers
datasets
pandas
numpy
openai

To use OpenAI models, the code reads the API key from the environment variable OPENAI_API_KEY. If you use conda, you can add it to your environment with the following command:

conda env config vars set OPENAI_API_KEY='your key'
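
Before launching a long run, it can help to confirm that the key is actually visible to Python. The snippet below is a minimal sketch, not code from this repository: `get_openai_api_key` is a hypothetical helper, and the only thing taken from the README is the variable name OPENAI_API_KEY.

```python
import os


def get_openai_api_key(env=None):
    """Return the API key stored in OPENAI_API_KEY, or raise a clear error.

    Hypothetical helper for illustration; `env` defaults to os.environ and
    can be replaced by a plain dict when testing.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it or set it in your environment."
        )
    return key
```

Note that variables set with `conda env config vars set` only take effect after the environment is reactivated.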

Datasets

The following datasets have been used in the experiments:

  • TruthfulQA
  • MMLU
  • MedMCQA
  • Scalr

Datasets can be downloaded from the following link: data download
The folder is expected to be saved in the directory: multiagent_debate/data

Running instruction

Debate

  • main.py: It generates the general debate for all datasets

Adversary

  • advers.py: It generates the debate for the adversaries (currently OpenAI)
  • advers_optim.py: It generates the debate for the optimized attacker

Evaluation

  • evaluate.py: It runs the evaluation for all the files. Mode: [majority/judge]
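
To illustrate what a majority-style evaluation generally looks like, here is a minimal sketch. It is an assumption about the idea behind the majority mode, not the code in evaluate.py: the function names, the answer normalization, and the input layout (one list of final answers per debate) are all hypothetical.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most common final answer among the agents, or None.

    Answers are normalized (stripped, upper-cased); empty or missing
    answers are ignored. Ties go to the answer seen first.
    """
    valid = [a.strip().upper() for a in answers if a]
    if not valid:
        return None
    return Counter(valid).most_common(1)[0][0]


def accuracy(debates, gold):
    """Fraction of debates whose majority answer matches the gold label."""
    if not gold:
        return 0.0
    correct = sum(majority_vote(d) == g for d, g in zip(debates, gold))
    return correct / len(gold)
```

For example, `accuracy([["A", "A", "B"], ["C", "D", "D"]], ["A", "C"])` scores the first debate as correct and the second as incorrect.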

Citations

@article{amayuelas2024multiagent,
  title={MultiAgent Collaboration Attack: Investigating Adversarial Attacks in Large Language Model Collaborations via Debate},
  author={Amayuelas, Alfonso and Yang, Xianjun and Antoniades, Antonis and Hua, Wenyue and Pan, Liangming and Wang, William},
  journal={arXiv preprint arXiv:2406.14711},
  year={2024}
}
