Logit_bias is an optional parameter that modifies the likelihood of specified tokens appearing in a model generated output.
This parameter accepts a JSON object that maps tokens to an associated bias value from -100 (which will in most cases block that token from being generated) to 100 (exclusive selection of the token which makes it more likely to be generated). Moderate values like -1 and 1 will change the probability of a token being selected to a lesser degree.
Since the parameter takes in tokens, not text, you’ll want to use a tokenizer tool to convert text to token IDs. Let’s go through a few examples.
Example 1: Remove 'time'
If we call the Completions endpoint with the prompt “Once upon a,” the completion is very likely going to start with “ time.”
The word “time” tokenizes to the ID 2435 and the word “ time” (which has a space at the start) tokenizes to the ID 640. We can pass these through logit_bias with -100 to ban them from appearing in the completion, like so:
completion = client.chat.completions.create(
model="gpt-3.5-turbo",
messages=[{"role": "system", "content": "You finish user's sentences."},
"role": "user", "content": "Once upon a"} ]
logit_bias={2435:-100, 640:-100}
)
Now, the prompt “Once upon a” generates the completion “midnight dreary, while I pondered, weak and weary.”
Notice that the word “time” is nowhere to be found, because we’ve effectively banned that token using logit_bias
.
Example 2: Give direction with targeted logit bias values
Let’s go through another example, using a recipe generator prompt.
Many recipes suggest using pots, but suppose we don't have a pot. We'll want to remove the word pot from being generated as part of our completion. 'Pot' tokenizes to 1787 so we can remove it from our generation by setting our logit_bias
as below.
logit_bias={1787:-100}
Now, our completion might include the word “saucepan” instead. Perfect!
Example 3: Increase the odds of a word appearing
Suppose that we want to increase the likelihood of a word appearing.
For example, maybe we’re running a site that offers recipes that you can make with a microwave, so we want to be sure that the word “microwave” appears in the recipe. Microwave tokenizes to ID 27000. We can increase the likelihood that this token appears by setting a positive logit_bias, like so:
logit_bias={27000:5}
Now, our completion is more likely to include the word 'microwave'.
We set the logit_bias
to 5, as we found that setting logit_bias
to 1 often did not result in the word “microwave” appearing in the completion, while higher logit_bias
values like 10 resulted in the word “ microwave” appearing in the completion too often.