Paper Reviews (10)
[Paper Review] Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey
Liu, X., Cui, X., Li, P., Li, Z., Huang, H., Xia, S., Zhang, M., Zou, Y., & He, R. (2024). Jailbreak Attacks and Defenses against Multimodal Generative Models: A Survey. https://arxiv.org/abs/2411.09259
"The rapid evolution of multimodal foundation models has led to significant advancements in cross-modal understanding a…"
2024.12.31
[Paper Review] BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks
Zhao, Y., Zheng, X., Luo, L., Li, Y., Ma, X., & Jiang, Y. (2024). BlueSuffix: Reinforced Blue Teaming for Vision-Language Models Against Jailbreak Attacks. https://arxiv.org/abs/2410.20971
"Despite their superb multimodal capabilities, Vision-Language Models (VLMs) have been shown to be v…"
2024.12.30
[Paper Review] White-box Multimodal Jailbreaks Against Large Vision-Language Models
Wang, R., Ma, X., Zhou, H., Ji, C., Ye, G., & Jiang, Y. (2024). White-box Multimodal Jailbreaks Against Large Vision-Language Models. ACM Multimedia. https://arxiv.org/abs/2405.17894
"Recent advancements in Large Vision-Language Models (VLMs) have underscored their superiority in various multimodal tasks. However, the adversarial…"
2024.12.27
[Paper Review] Visual Adversarial Examples Jailbreak Aligned Large Language Models
Qi, X., Huang, K., Panda, A., Henderson, P., Wang, M., & Mittal, P. (2024). Visual Adversarial Examples Jailbreak Aligned Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21527-21536. https://doi.org/10.1609/aaai.v38i19.30150 https://arxiv.org/abs/2306.13213
2024.12.26
[Paper Review] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Zhang, Z., Yang, J., Ke, P., Mi, F., Wang, H., & Huang, M. (2024). Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 8865–8887, Bangkok, Thailand. Association for Computational Linguistics. https://aclanthology.org/2024...
2024.12.24
[Paper Review] RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture
Gupta, A., Shirgaonkar, A., Balaguer, A. D. L., Silva, B., Holstein, D., Li, D., ... & Benara, V. (2024). RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture. https://arxiv.org/abs/2401.08406
"There are two common ways in which developers are incorporating proprietary and…"
2024.10.16