This project uses a Retrieval-Augmented Generation (RAG) model to answer questions based on a custom corpus created from PDF and DOCX files.
- Clone the repository.
- Install dependencies: `pip install -r requirements.txt`.
- Run the extraction: `sh scripts/run_extraction.sh`.
- Generate answers using the RAG model.
- `data/`: Contains input files.
- `corpus/`: Stores processed corpus data.
- `src/`: Source code.
- `models/`: Trained models.
- `notebooks/`: Jupyter notebooks for experimentation.
- `tests/`: Unit tests.
- `scripts/`: Helper scripts.
- `requirements.txt`: Dependencies list.
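
To illustrate the retrieval half of the "generate answers" step, here is a minimal sketch. It is not this project's implementation: it assumes the processed corpus is available as a list of plain-text passages and swaps in a simple bag-of-words cosine similarity for whatever retriever the RAG model actually uses. The function names (`tokenize`, `cosine`, `retrieve`, `build_prompt`) and the sample passages are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(question, passages, k=2):
    """Return the k passages most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        passages,
        key=lambda p: cosine(q, Counter(tokenize(p))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, passages, k=2):
    """Assemble a prompt that places the retrieved context before the question."""
    context = "\n".join(f"- {p}" for p in retrieve(question, passages, k))
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    # Hypothetical passages standing in for text extracted into corpus/.
    corpus = [
        "The extraction script converts PDF and DOCX files into plain text.",
        "Unit tests live in the tests directory.",
        "Trained models are stored in the models directory.",
    ]
    print(build_prompt("Which files does the extraction script convert?", corpus, k=1))
```

The prompt returned by `build_prompt` would then be passed to a generator model, which is the part this sketch leaves out.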