Generative AI - How to Fine Tune LLMs

How to tune LLMs in Generative AI Studio. Vertex AI allows you to fine-tune PaLM models for text, chat, code, and embeddings intuitively and easily.

Sascha Heyer


With all its amazing techniques, Prompt Engineering is a fantastic method that works for many use cases. And I always recommend starting with prompt engineering before thinking about fine-tuning.

Still, fine-tuning has some amazing benefits of its own:

  • To reduce the context length and therefore use fewer tokens, which helps reduce overall costs. E.g. no few-shot examples are needed in your prompt anymore.
  • To reproduce a behavior or output that is hard to achieve with prompt engineering. This is particularly useful if you need highly consistent responses.

What are we building in this article?

Let’s face it: sarcasm and I have a complicated relationship. More often than not, I find myself scratching my head, trying to figure out if someone’s being humorous or just plain serious.

That’s why it’s finally time to fine-tune two large language models with Vertex AI Generative AI using Google's PaLM 2 model to master the art of sarcasm classification and generation.

Fine-tuning an LLM with Vertex AI Generative AI is an easy process overall. I hope you enjoy it as much as I do.

Fine Tuning Methods

Vertex AI has built-in support for two different fine-tuning methods.

  1. Supervised fine-tuning
  2. Reinforcement Learning from Human Feedback (RLHF)

I always recommend starting with the supervised approach, unless you already have the type of data required for reinforcement learning fine-tuning. RLHF can be useful for more complex use cases.
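
To see why that data is harder to come by, here is a rough sketch of what RLHF preference data can look like: for each prompt, human raters compare two candidate answers and record which one they prefer. The field names below follow Google’s documented preference-dataset format at the time of writing, but treat them as illustrative and check the current Vertex AI docs.

{"input_text": "Write a sarcasm answer for the following text. Text: Mondays are my favorite.",
"candidate_0": "Oh sure, nothing says joy like a full inbox at 8 am.",
"candidate_1": "Many people do enjoy Mondays.",
"choice": 0}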

Supervised

Supervised fine-tuning is the most accessible approach, as the required training data structure is relatively straightforward to create.

Let me show you an example from the sarcasm generation use case we are fine-tuning for in this article.

{"input_text": "Write a sarcasm answer for the following text. 
Text: The only time the office is quiet is when I'm not there.",
"output_text": "Perfect, it's…

