• 3 Posts
  • 11 Comments
Joined 1 year ago
cake
Cake day: July 7th, 2023

help-circle








  • Can SFT be used on partial generations? What I mean by a “steer” is a correction to only a portion, and not even the end, of model output.

    For example, a “bad” partial output might be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. Low-quality example 2
    

    and the “steer” might be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. High-quality example 2
    

    but the full response will eventually be:

    <assistant> Here are four examples:
    1. High-quality example 1
    2. High-quality example 2
    3. High-quality example 3
    4. High-quality example 4
    

    The corrections don’t include the full output.