RLHF Without Humans: Training LLMs Using Automated Feedback From Another LLM

Author: Jon-Paul Walton