Evaluating Chunking Strategies For Retrieval-Augmented Generation in Oil and Gas Enterprise Documents

arXiv:2603.24556v1 Announce Type: cross

Retrieval-Augmented Generation (RAG) has emerged as a framework for addressing the constraints of Large Language Models (LLMs). Yet its effectiveness hinges on document chunking, an often-overlooked determinant of RAG quality. This paper presents an empirical study quantifying performance differences across four chunking strategies: fixed-size sliding window, recursive, breakpoint-based semantic, and structure-aware. We evaluated these methods using a.
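
As a point of reference for the first of the four strategies, fixed-size sliding-window chunking can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `sliding_window_chunks`, the `size` and `overlap` parameters, and the whitespace tokenization in the usage lines are all assumptions, since the abstract does not state the configuration used.

```python
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def sliding_window_chunks(tokens: Sequence[T], size: int, overlap: int) -> List[List[T]]:
    """Split `tokens` into windows of at most `size` items.

    Consecutive windows share `overlap` items. All names and parameters
    here are hypothetical; the source does not specify them.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")

    step = size - overlap  # how far the window advances each time
    chunks: List[List[T]] = []
    for start in range(0, len(tokens), step):
        chunks.append(list(tokens[start:start + size]))
        # Stop once a window has reached the end of the input, so no
        # trailing chunk is emitted that only repeats overlapped items.
        if start + size >= len(tokens):
            break
    return chunks


# Usage: naive whitespace tokenization (an assumption for illustration).
words = "the pump pressure exceeded the rated limit".split()
for chunk in sliding_window_chunks(words, size=4, overlap=2):
    print(" ".join(chunk))
```

The overlap trades index size for context continuity: a larger overlap makes it less likely that a sentence is split across chunks with no window containing it whole, at the cost of storing more redundant text.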