Vertex AI Grounding Large Language Models
Grounding allows Google’s large language models to use your specific data to produce more accurate and relevant responses.
Grounding is particularly useful for reducing hallucinations and answering questions based on specific information the model wasn’t trained on. This approach is also called RAG (Retrieval-Augmented Generation).
Implementing a Grounding architecture can take some time. In fact, I have written a dedicated article on how you can implement your own custom grounding solution.
Generative AI - Document Retrieval and Question Answering with LLMs: Apply LLMs to your domain-specific data
Thanks to Google, you can now rely on Vertex AI Grounding instead of implementing a custom solution (at least for many standard use cases).
Grounding with Vertex AI
Grounding in Vertex AI is based on Vertex AI Search. Before processing your prompt, your PaLM model queries Vertex AI Search and receives the relevant documents.
That means two different products are involved (a short code sketch after this list shows how they work together):
- Vertex AI PaLM API, which provides the large language model, either for text or chat.
- Vertex AI Search, which serves our grounding data efficiently.
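The following is a minimal sketch of how the two products connect, assuming the Vertex AI Python SDK (google-cloud-aiplatform) at a version that supports grounding for the PaLM text models; the project ID, data store ID, and prompt are placeholders.

```python
import vertexai
from vertexai.language_models import GroundingSource, TextGenerationModel

# Placeholders: replace with your own project and region.
vertexai.init(project="your-project-id", location="us-central1")

model = TextGenerationModel.from_pretrained("text-bison")

# Point the model at the Vertex AI Search data store that holds your documents.
grounding_source = GroundingSource.VertexAISearch(
    data_store_id="your-data-store-id",  # placeholder
    location="global",
)

response = model.predict(
    "What does our internal handbook say about remote work?",
    grounding_source=grounding_source,
    temperature=0.2,
    max_output_tokens=256,
)

print(response.text)
# For grounded responses the SDK also exposes citation information,
# e.g. via response.grounding_metadata.
```

The model first retrieves relevant documents from the data store and then answers the prompt based on them, so the only change compared to an ungrounded call is the additional grounding_source argument.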
Jump Directly to the Notebook and Code
All the code for this article is ready to use in a Google Colab notebook. If you have questions, don’t hesitate to contact me via LinkedIn.
Get Grounding Working
To combine grounding with your PaLM Bison or Unicorn model, you first need to create a Vertex AI Search data store.
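You can create the data store directly in the Vertex AI Search console. If you prefer to do it in code, the following is a minimal sketch using the google-cloud-discoveryengine client library; the project ID, display name, and data store ID are placeholders, and the settings assume an unstructured-document search data store.

```python
from google.cloud import discoveryengine_v1 as discoveryengine

project_id = "your-project-id"          # placeholder
data_store_id = "grounding-data-store"  # placeholder

client = discoveryengine.DataStoreServiceClient()

# Data stores live inside a collection; the default collection in the
# global location is the usual choice for a search data store.
parent = f"projects/{project_id}/locations/global/collections/default_collection"

data_store = discoveryengine.DataStore(
    display_name="Grounding data store",
    industry_vertical=discoveryengine.IndustryVertical.GENERIC,
    solution_types=[discoveryengine.SolutionType.SOLUTION_TYPE_SEARCH],
    # CONTENT_REQUIRED marks the store as holding unstructured documents
    # (PDF, HTML, and so on) that you import afterwards.
    content_config=discoveryengine.DataStore.ContentConfig.CONTENT_REQUIRED,
)

operation = client.create_data_store(
    request=discoveryengine.CreateDataStoreRequest(
        parent=parent,
        data_store=data_store,
        data_store_id=data_store_id,
    )
)

# create_data_store is a long-running operation; wait for it to finish.
print(operation.result())
```

Once the data store exists, import your documents into it and pass its ID as the grounding source when calling the model.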