Getting Started
Introduction
This tutorial walks you step by step through installing the GenAI Evaluator SDK and running your first evaluation.
Installation
Run the following command to install with pip:
pip install --extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" "gllm-evals[deepeval,langchain,ragas]"
Alternatively, install with Poetry:
Step 1: Add the gen-ai-internal source to your pyproject.toml
poetry source add gen-ai-internal "https://asia-southeast2-python.pkg.dev/gdp-labs/gen-ai-internal/simple/" --priority supplemental
Step 2: Configure the authentication
poetry config http-basic.gen-ai-internal oauth2accesstoken "$(gcloud auth print-access-token)"
Step 3: Add the package to your project
poetry add "gllm-evals[deepeval,langchain,ragas]"
To install the gllm-evals-binary package with Poetry instead:
Step 1: Add the gen-ai source to your pyproject.toml
poetry source add --priority=explicit gen-ai https://glsdk.gdplabs.id/gen-ai/simple
Step 2: Configure the authentication
poetry config http-basic.gen-ai oauth2accesstoken "$(gcloud auth print-access-token)"
Step 3: Add the package to your project
poetry add --source gen-ai "gllm-evals-binary[deepeval,langchain,ragas]"
Environment Setup
Set a valid language model credential as an environment variable.
In this example, let's use an OpenAI API key.
Get an OpenAI API key from the OpenAI Console.
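Before running an evaluation, it can help to confirm that the credential is actually visible to Python. The sketch below assumes the key is stored in an environment variable named OPENAI_API_KEY, which is the name the OpenAI client libraries read by convention; the SDK itself may expect the credential to be passed in a different way. The helper name get_openai_api_key is hypothetical and not part of the SDK.

```python
import os


def get_openai_api_key(env=os.environ):
    """Return the key stored in OPENAI_API_KEY, or raise if it is missing or blank.

    Assumption: the credential lives in OPENAI_API_KEY. This helper is a
    hypothetical convenience, not part of the GenAI Evaluator SDK.
    """
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set; export it before running the evaluation."
        )
    return key
```

Calling get_openai_api_key() at the top of a script fails fast with a clear message instead of surfacing an authentication error partway through an evaluation run.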
Running Your First Evaluation
In this tutorial, we will evaluate RAG pipeline output.
Create a script called eval.py
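As a rough illustration of what evaluating RAG pipeline output involves, the sketch below scores one generated answer against an expected answer. It deliberately does not use the SDK's classes, whose names and signatures are not shown here: RAGRecord and token_overlap_score are hypothetical stand-ins, and the token-overlap score is a toy metric chosen only to keep the example self-contained. It is not one of the SDK's evaluators.

```python
import re
from dataclasses import dataclass


@dataclass
class RAGRecord:
    """Hypothetical container for one RAG pipeline output (not an SDK type)."""
    query: str
    retrieved_context: str
    generated_response: str
    expected_response: str


def _tokens(text: str) -> set:
    """Lowercase the text and return its alphanumeric word tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def token_overlap_score(record: RAGRecord) -> float:
    """Fraction of expected-response tokens that also appear in the generated response.

    Toy metric for illustration only; the retrieved context is carried on the
    record but is not used by this particular score.
    """
    expected = _tokens(record.expected_response)
    if not expected:
        return 0.0
    generated = _tokens(record.generated_response)
    return len(expected & generated) / len(expected)


record = RAGRecord(
    query="What is the capital of France?",
    retrieved_context="Paris is the capital and largest city of France.",
    generated_response="The capital of France is Paris.",
    expected_response="Paris",
)
print({"score": token_overlap_score(record)})  # prints {'score': 1.0}
```

A real evaluator would typically also take the query and the retrieved context into account, for example to judge whether the answer is grounded in the retrieved passages.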
Run the script, for example with python eval.py.
The evaluator will generate a response for the given input, e.g.:
Congratulations! You have successfully run your first evaluation.
Recommendation
If you want to run an end-to-end evaluation, use the evaluate() convenience function instead of the step-by-step commands above.
It automatically handles experiment tracking (via the Experiment Tracker) and integrates results into your existing Dataset, so you don't have to wire these pieces together manually.
Next Steps
You're now ready to start using our evaluators. We offer several prebuilt evaluators to get you started:
Looking for something else? Build your own custom evaluator here.
*All fields are optional and can be adjusted depending on the chosen metric.