Your First RAG Pipeline
Installation
macOS/Linux:

# you can use a Conda environment
pip install --extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" python-dotenv gllm-core gllm-generation gllm-inference gllm-pipeline gllm-retrieval gllm-datastore

Windows (CMD):

FOR /F "tokens=*" %T IN ('gcloud auth print-access-token') DO pip install --extra-index-url "https://oauth2accesstoken:%T@glsdk.gdplabs.id/gen-ai-internal/simple/" python-dotenv gllm-core gllm-generation gllm-inference gllm-pipeline gllm-retrieval gllm-datastore

Project Setup
<project-name>/
├── data/
│ ├── <index>/... # preset data index folder
│ ├── chroma.sqlite3 # preset database file
│ ├── imaginary_animals.csv # sample data
├── modules/
│ ├── retriever.py
│ └── response_synthesizer.py
├── .env
├── indexer.py
└── pipeline.py
EMBEDDING_MODEL="text-embedding-3-small"
LANGUAGE_MODEL="openai/gpt-5-nano"
OPENAI_API_KEY="<YOUR_OPENAI_API_KEY>"

1) Index Your Data
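The SDK's own indexing API is not reproduced here. As a dependency-free illustration of what the indexing step does, the sketch below turns each CSV row (like those in imaginary_animals.csv) into a document-plus-vector record, the same shape a real indexer writes into a vector store such as Chroma. All names are hypothetical, and a toy bag-of-words counter stands in for the text-embedding-3-small model:

```python
import csv
import io
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def build_index(csv_text: str) -> list[dict]:
    # Each CSV row becomes one document plus its vector, mirroring the
    # records an indexer would persist into a vector store.
    index = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        doc = " ".join(row.values())
        index.append({"text": doc, "vector": embed(doc)})
    return index

sample = "name,habitat\nglimmerfox,crystal caves\nmossback turtle,fern swamps\n"
index = build_index(sample)  # two indexed documents, ready for retrieval
```

In the real pipeline, the vectors would come from the embedding model named in .env and be persisted to the preset Chroma database under data/, rather than held in a Python list.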
2) Build Core Components of Your Pipeline
Create the Retriever
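The gllm-retrieval API itself is not shown here; conceptually, a retriever embeds the query and returns the top-k most similar indexed documents. A minimal, self-contained sketch of that idea, using cosine similarity over the same toy bag-of-words vectors (all names illustrative, not the SDK's):

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy stand-in for a real embedding model: bag-of-words token counts.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Standard cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query: str, index: list[dict], top_k: int = 1) -> list[dict]:
    # Embed the query, rank documents by similarity, return the best k.
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d["vector"]), reverse=True)
    return ranked[:top_k]

index = [
    {"text": "glimmerfox lives in crystal caves",
     "vector": embed("glimmerfox lives in crystal caves")},
    {"text": "mossback turtle lives in fern swamps",
     "vector": embed("mossback turtle lives in fern swamps")},
]
hits = retrieve("where does the glimmerfox live", index)
```

A production retriever would query the Chroma store built during indexing instead of an in-memory list, but the ranking logic is the same in spirit.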
Create the Response Synthesizer
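Again without reproducing the SDK's API: a response synthesizer grounds the language model's answer in the retrieved chunks by assembling them into a prompt. A hedged sketch of that assembly, with the model call stubbed out (in the real module it would go to the model named by LANGUAGE_MODEL):

```python
def build_prompt(query: str, contexts: list[str]) -> str:
    # Ground the answer: the model may only use the retrieved context.
    joined = "\n".join(f"- {c}" for c in contexts)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{joined}\n"
        f"Question: {query}\n"
        "Answer:"
    )

def synthesize(query: str, contexts: list[str], call_lm=lambda p: "(model output)") -> str:
    # call_lm is a stub; the real synthesizer invokes the configured LM.
    return call_lm(build_prompt(query, contexts))

prompt = build_prompt(
    "Where does the glimmerfox live?",
    ["glimmerfox lives in crystal caves"],
)
```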
3) Build the Pipeline
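Whatever the SDK's builder syntax looks like, a RAG pipeline is, at its core, retrieval composed with synthesis. This sketch (hypothetical names, not gllm-pipeline's API) shows that composition:

```python
def make_pipeline(retrieve, synthesize):
    # Compose the two core components into a single callable:
    # query -> retrieved contexts -> grounded answer.
    def run(query: str) -> str:
        contexts = retrieve(query)
        return synthesize(query, contexts)
    return run

# Stub components so the composition itself is visible end to end.
pipeline = make_pipeline(
    lambda q: ["glimmerfox lives in crystal caves"],
    lambda q, ctx: f"Based on: {ctx[0]}",
)
answer = pipeline("Where does the glimmerfox live?")
```

Real pipelines typically add more stages (query rewriting, reranking, citation), but each is just another step threaded through the same composed flow.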
4) Run the Pipeline
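As a final illustration, assuming the assembled pipeline is an awaitable (a common pattern in async Python frameworks, though the SDK's actual invocation API may differ), running it might look like:

```python
import asyncio

async def run_pipeline(pipeline, query: str) -> str:
    # Await the pipeline; in an async framework each stage can overlap I/O.
    return await pipeline(query)

async def toy_pipeline(query: str) -> str:
    # Stand-in pipeline: canned retrieval plus a trivial "synthesis".
    contexts = ["glimmerfox lives in crystal caves"]
    return f"{query} -> {contexts[0]}"

answer = asyncio.run(run_pipeline(toy_pipeline, "Where does the glimmerfox live?"))
```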