Google Generative AI Evaluation Service
A service for evaluating the performance of generative AI models with metrics such as BLEU and ROUGE.
Nov 28, 2023
The evaluation service lets you evaluate PaLM 2 (text-bison) foundation models and tuned models. It computes a set of metrics for the model's output against an evaluation dataset that you provide.
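To give a feel for what such a metric measures, here is a minimal sketch of a ROUGE-1 style F1 score based on unigram overlap. It is a simplified illustration under my own assumptions (lowercased whitespace tokenization, no stemming), not the implementation the evaluation service uses, and the function name `rouge1_f1` is hypothetical.

```python
from collections import Counter


def rouge1_f1(candidate: str, reference: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap between two texts.

    Assumption: tokens are lowercased whitespace-separated words.
    Real ROUGE implementations usually add stemming and other normalization.
    """
    cand_tokens = candidate.lower().split()
    ref_tokens = reference.lower().split()
    if not cand_tokens or not ref_tokens:
        return 0.0

    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((Counter(cand_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0

    precision = overlap / len(cand_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    score = rouge1_f1("the cat sat on the mat", "the cat is on the mat")
    print(f"ROUGE-1 F1: {score:.4f}")
```

A score near 1 means the model's response shares most of its words with the ideal response; a score near 0 means almost no overlap.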
The process involves creating an evaluation dataset containing prompts and their ideal responses (ground truth pairs).
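As a rough sketch, such a dataset can be written as a JSON Lines file with one prompt and ideal-response pair per line. The field names `prompt` and `ground_truth`, the helper `write_eval_dataset`, and the file name below are assumptions for illustration; check the service documentation for the exact schema it expects.

```python
import json
from pathlib import Path


def write_eval_dataset(pairs, path):
    """Write (prompt, ideal response) pairs to a JSON Lines file.

    Assumption: each record uses the keys "prompt" and "ground_truth".
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for prompt, ground_truth in pairs:
            record = {"prompt": prompt, "ground_truth": ground_truth}
            # One JSON object per line, as the JSON Lines format requires.
            f.write(json.dumps(record, ensure_ascii=False) + "\n")
    return path


if __name__ == "__main__":
    # Hypothetical example pairs; replace with your own prompts and ideal responses.
    examples = [
        ("Summarize: The meeting was moved from Monday to Tuesday.",
         "The meeting is now on Tuesday."),
        ("Translate to French: Good morning.",
         "Bonjour."),
    ]
    out = write_eval_dataset(examples, "eval_dataset.jsonl")
    print(f"Wrote {len(examples)} records to {out}")
```

Keeping one record per line makes the file easy to append to and to inspect before you upload it for evaluation.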