RLHF – The secret sauce of ChatGPT

This technical talk covers how OpenAI improved the quality of LLM completions, moving from base models to instruction-tuned models and finally to assistant models. I will cover the basics of reinforcement learning and how it turned out to be an essential component of LLM creation. I will then explain the PPO algorithm used by OpenAI, along with more recent techniques that let anyone benefit from RLHF without the hassle of dealing with RL directly.

Grab your ticket for a unique experience of inspiration, meetings, and networking for the AI and data science industry.

Book your tickets as early as possible. We have a hard cap of 1200 passes.

Note: Ticket pricing may change at any time.

  • Early Bird Passes

    Available from 12th Apr to 11th Jul 2025
      • All access, 3 day passes
      • Conference lunch on all 3 days
      • Group discount available
      • Price: 15000

  • Regular Passes

    Available from 12th Jul to 22nd Aug 2025
      • All access, 3 day passes
      • Conference lunch on all 3 days
      • Group discount available
      • Price: 25000

  • Late Passes

    Available from 23rd Aug 2025
      • All access, 3 day passes
      • Conference lunch on all 3 days
      • No group discount available
      • Price: 35000
