# 🚀 Getting Started
## Introduction

This tutorial guides you step by step through installing the GenAI Evaluator SDK and running your first evaluation.
## Installation

Run the following command to install the SDK:

```shell
pip install --extra-index-url "https://oauth2accesstoken:$(gcloud auth print-access-token)@glsdk.gdplabs.id/gen-ai-internal/simple/" "gllm-evals[deepeval,langchain,ragas]"
```
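To confirm the installation succeeded, you can check that the package is importable. This is an optional sketch (not part of the SDK) using the standard library's `importlib.util.find_spec`:

```python
import importlib.util


def is_installed(package: str) -> bool:
    """Return True if `package` can be imported in the current environment."""
    return importlib.util.find_spec(package) is not None


# After a successful install, is_installed("gllm_evals") should return True.
```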
## Environment Setup

Set a valid language model credential as an environment variable. In this example, let's use an OpenAI API key, which you can get from the OpenAI Console:

```shell
export OPENAI_API_KEY="sk-..."
```
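Before running an evaluation, it helps to fail fast if the credential is missing. A minimal sketch (the helper name `require_env` is our own, not part of the SDK):

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise a clear error."""
    value = os.getenv(name)
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


# require_env("OPENAI_API_KEY") raises immediately if the export above was skipped,
# instead of failing later with an opaque authentication error.
```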
## Running Your First Evaluation

In this tutorial, we will evaluate the output of a RAG pipeline.
1. Create a script called `eval.py`:
```python
import asyncio
import os

from gllm_evals.evaluator.geval_generation_evaluator import GEvalGenerationEvaluator
from gllm_evals.types import RAGData


async def main():
    evaluator = GEvalGenerationEvaluator(
        model="openai/gpt-4.1",
        model_credentials=os.getenv("OPENAI_API_KEY"),
    )
    data = RAGData(
        query="What is the capital of France?",
        expected_response="Paris",
        generated_response="New York",
        retrieved_context="Paris is the capital of France.",
    )
    result = await evaluator.evaluate(data)
    print(result)


if __name__ == "__main__":
    asyncio.run(main())
```
2. Run the script:

```shell
python eval.py
```
3. The evaluator will print the evaluation result for the given input, e.g.:
```
{
    'geval_generation_evaluator': {
        'relevancy_rating': 'bad',
        'possible_issues': ['Retrieval Issue', 'Generation Issue'],
        'score': 0,
        'completeness': {
            'score': 1,
            'reason': "The expected output contains one substantive statement: 'Paris' as the capital of France. The actual output, 'New York', does not match this statement and is a critical error regarding the key information."
        },
        'groundedness': {
            'score': 3,
            'reason': "The response 'New York' is not mentioned in the retrieved context and is factually incorrect since the context clearly states that Paris is the capital of France. This is a critical factual mistake, rendering the answer fully unsupported by the context or the question."
        },
        'redundancy': {
            'score': 1,
            'reason': 'The response provides a single, incorrect answer without any repetition, restatement, or elaboration. Only one idea is presented, making it concise and to the point, regardless of accuracy.'
        }
    }
}
```
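If you want to act on the result programmatically, you can treat it as a nested dict. A sketch, assuming the shape shown in the example output (the key names come from that example, not a guaranteed schema):

```python
# Example result, shaped like the output above (reasons abbreviated).
result = {
    "geval_generation_evaluator": {
        "relevancy_rating": "bad",
        "possible_issues": ["Retrieval Issue", "Generation Issue"],
        "score": 0,
        "completeness": {"score": 1, "reason": "..."},
        "groundedness": {"score": 3, "reason": "..."},
        "redundancy": {"score": 1, "reason": "..."},
    }
}

# The evaluator's name is the top-level key; its value holds the scores.
report = result["geval_generation_evaluator"]
print(f"overall score: {report['score']} ({report['relevancy_rating']})")

# Per-criterion scores live in nested dicts alongside their reasons.
for criterion in ("completeness", "groundedness", "redundancy"):
    print(f"{criterion}: {report[criterion]['score']}")
```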
Congratulations! You have successfully run your first evaluation.
Next Steps
You're now ready to start using our evaluators. We offer several prebuilt evaluators to get you started.

Looking for something else? Build your own custom evaluator here.