How RAG Works

1. Documents are split into chunks.
2. Each chunk is converted to a vector embedding.
3. On query, similar chunks are retrieved.
4. The LLM generates an answer using the retrieved context.
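
The four steps above can be sketched end to end in plain Python. This is a minimal illustration under stated assumptions, not a production pipeline: the function names (`split_into_chunks`, `embed`, `retrieve`, `build_prompt`), the sample documents, and the chunk size are all hypothetical. The "embedding" is a simple bag-of-words count vector standing in for a learned embedding model, and the final LLM call is left out, so the sketch stops at the prompt that would be sent to the model.

```python
import math
import re
from collections import Counter


def split_into_chunks(text, chunk_size=8):
    """Step 1: split a document into chunks of at most chunk_size words."""
    words = text.split()
    return [" ".join(words[i:i + chunk_size]) for i in range(0, len(words), chunk_size)]


def embed(text):
    """Step 2 (toy stand-in): a sparse bag-of-words count vector.

    Assumption: a real system would call a learned embedding model here.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, index, top_k=2):
    """Step 3: return the top_k chunks most similar to the query."""
    query_vec = embed(query)
    ranked = sorted(
        index,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [chunk for chunk, _ in ranked[:top_k]]


def build_prompt(query, context_chunks):
    """Step 4 (input side): assemble the prompt that would be sent to an LLM."""
    context = "\n".join(f"- {chunk}" for chunk in context_chunks)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


# Hypothetical sample documents, used only for illustration.
documents = [
    "Paris is the capital of France. It is known for the Eiffel Tower.",
    "Python is a programming language. It is popular for data science.",
]

# Index every chunk alongside its vector (steps 1 and 2).
index = [(chunk, embed(chunk)) for doc in documents for chunk in split_into_chunks(doc)]

query = "What is the capital of France?"
top_chunks = retrieve(query, index, top_k=1)
prompt = build_prompt(query, top_chunks)
print(prompt)
```

Fixed-size word chunks can cut a sentence in half, as they do here. Real systems often split on sentence or paragraph boundaries and overlap adjacent chunks so that context is not lost at the edges.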