본문 바로가기
728x90

LLM17

[논문 리뷰] RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture Gupta, A., Shirgaonkar, A., Balaguer, A. D. L., Silva, B., Holstein, D., Li, D., ... & Benara, V. (2024). RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture. arXiv preprint arXiv:2401.08406.https://arxiv.org/abs/2401.08406 RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on AgricultureThere are two common ways in which developers are incorporating proprietary and.. 2024. 10. 16.
[논문리뷰] Fine-Tuning or Retrieval? Comparing Knowledge Injections in LLMs Ovadia, O., Brief, M., Mishaeli, M., & Elisha, O. (2023). Fine-tuning or retrieval? comparing knowledge injection in llms. arXiv preprint arXiv:2312.05934.https://arxiv.org/abs/2312.05934 Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMsLarge language models (LLMs) encapsulate a vast amount of factual information within their pre-trained weights, as evidenced by their ability to an.. 2024. 10. 15.
[논문리뷰] Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning Link: https://aclanthology.org/2024.findings-acl.958/ ACL2024 findings에 accept된 paper이다. Teacher-student model.. 모델의 distillation에서 자주 보던 용어다. 여기서도 같은 의미로 사용되는데 모델의 경량화와 함께 따라오는 학습 속도의 개선, 그러면서도 성능 유지를 위해서 이러한 방식을 채택한다. 이 논문에서는 data의 효율성을 위해 새로운 데이터의 수집 없이도 student 모델의 성능향상이 가능하다고 주장한다.내용이 엄청 쉽지는 않았어서 나름 이해하기 쉽게 작성해봤습니다.Distillation은 잘 모르는 분야기도 하고, 열심히 이해해봤는데 틀린 부분이 있을 수 있어요..ㅠㅠAbstractProblem: .. 2024. 10. 15.
[논문리뷰] LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-tuning of Large Language Models Zhiqiang Hu, Lei Wang, Yihuai Lan, Wanyu Xu, Ee-Peng Lim, Lidong Bing, Xing Xu, Soujanya Poria, and Roy Lee. 2023. LLM-Adapters: An Adapter Family for Parameter-Efficient Fine-Tuning of Large Language Models. In Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing, pages 5254–5276, Singapore. Association for Computational Linguistics. https://arxiv.org/abs/2304... 2024. 10. 14.
[논문리뷰] Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark Chanjun Park, Hyeonwoo Kim, Dahyun Kim, SeongHwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, and Hwalsuk Lee. 2024. Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark. In Proceedings of the 62nd Annual Meeting of the Association for Computational Liguistics (Volume 1: Long Papers), pages 3220-3234, Bangkok, Thailand. Association for Computational Linguistics.. 2024. 10. 7.
[논문리뷰] What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization? What Language Model Architecture and Pretraining Objectvie Work Best for Zero-Shot Generalization, International Conference on Machine Learning, PMLR (Proceedings of Machine Learning Research), 2022.https://arxiv.org/abs/2204.05832 What Language Model Architecture and Pretraining Objective Work Best for Zero-Shot Generalization?Large pretrained Transformer language models have been shown to exhi.. 2024. 9. 30.
728x90