File:RLHF diagram.svg

Original file (SVG file, nominally 512 × 366 pixels, file size: 177 KB)

Captions

English

High-level overview of reinforcement learning from human feedback

Summary

Description	English: A high-level overview of reinforcement learning from human feedback: training an initial supervised model, collecting human feedback, training a reward model, and using the reward model to align the initial model.
Date	14 March 2024
Source	Own work
Author	PopoDameron

Licensing

I, the copyright holder of this work, hereby publish it under the following license:

This file is licensed under the Creative Commons Attribution-Share Alike 4.0 International license.

You are free:

to share – to copy, distribute and transmit the work
to remix – to adapt the work

Under the following conditions:

attribution – You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
share alike – If you remix, transform, or build upon the material, you must distribute your contributions under the same or compatible license as the original.

File history

	Date/Time	Dimensions	User	Comment
current	20:20, 1 April 2024	512 × 366 (177 KB)	PopoDameron	Clarified relationship between RM and aligned model & added description to the aligned model
	04:13, 14 March 2024	512 × 366 (160 KB)	PopoDameron	Uploaded own work with UploadWizard

File usage on Commons

There are no pages that use this file.

File usage on other wikis

The following other wikis use this file:

Usage on en.wikipedia.org
- Reinforcement learning from human feedback
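Usage on fa.wikipedia.org
- پیش‌نویس:تقویت یادگیری از بازخورد انسانی (Draft: Reinforcement learning from human feedback)
Usage on ru.wikipedia.org
- Обучение с подкреплением на основе отзывов людей (Reinforcement learning from human feedback)
Usage on sr.wikipedia.org
- Podržano učenje iz ljudskih povratnih informacija (Reinforcement learning from human feedback)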

Metadata

Width	100%
Height	100%

Structured data

Items portrayed in this file

depicts	Reinforcement Learning from Human Feedback
creator	some value
copyright status	copyrighted
copyright license	Creative Commons Attribution-ShareAlike 4.0 International
inception	14 March 2024
media type	image/svg+xml
source of file	original creation by uploader
