File:RLHF diagram.svg

From Wikimedia Commons, the free media repository

Original file (SVG file, nominally 512 × 366 pixels, file size: 177 KB)

Captions

High-level overview of reinforcement learning from human feedback

Summary

Description
English: A high-level overview of reinforcement learning from human feedback (RLHF), covering four stages: training an initial supervised model, collecting human feedback, training a reward model, and using the reward model to align the initial model.
Date
Source Own work
Author PopoDameron

Licensing

I, the copyright holder of this work, hereby publish it under the following license:
This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.
You are free:
  • to share – to copy, distribute and transmit the work
  • to remix – to adapt the work
Under the following conditions:
  • attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
  • share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

File history

Date/Time | Dimensions | User | Comment
20:20, 1 April 2024 (current) | 512 × 366 (177 KB) | PopoDameron | Clarified relationship between RM and aligned model & added description to the aligned model
04:13, 14 March 2024 | 512 × 366 (160 KB) | PopoDameron | Uploaded own work with UploadWizard

