
JailBreaking (3)

[Paper Review] White-box Multimodal Jailbreaks Against Large Vision-Language Models
Wang, R., Ma, X., Zhou, H., Ji, C., Ye, G., & Jiang, Y. (2024). White-box Multimodal Jailbreaks Against Large Vision-Language Models. ACM Multimedia. https://arxiv.org/abs/2405.17894
Recent advancements in Large Vision-Language Models (VLMs) have underscored their superiority in various multimodal tasks. However, the adversarial .. 2024. 12. 27.
[Paper Review] Visual Adversarial Examples Jailbreak Aligned Large Language Models
Qi, X., Huang, K., Panda, A., Henderson, P., Wang, M., & Mittal, P. (2024). Visual Adversarial Examples Jailbreak Aligned Large Language Models. Proceedings of the AAAI Conference on Artificial Intelligence, 38(19), 21527-21536. https://doi.org/10.1609/aaai.v38i19.30150 https://arxiv.org/abs/2306.13213
Recently, there has been a .. 2024. 12. 26.
[Paper Review] Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
Zhang, Z., Yang, J., Ke, P., Mi, F., Wang, H., & Huang, M. (2024). Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), 8865-8887, Bangkok, Thailand. Association for Computational Linguistics. https://aclanthology.org/2024... 2024. 12. 24.