Skip to main content

Using logit bias to alter token probability with the OpenAI API

Learn how to use the logit bias parameter to modify model outputs

Updated over a year ago

Logit_bias is an optional parameter that modifies the likelihood of specified tokens appearing in a model generated output.

This parameter accepts a JSON object that maps tokens to an associated bias value from -100 (which will in most cases block that token from being generated) to 100 (exclusive selection of the token which makes it more likely to be generated). Moderate values like -1 and 1 will change the probability of a token being selected to a lesser degree.

Since the parameter takes in tokens, not text, you’ll want to use a tokenizer tool to convert text to token IDs. Let’s go through a few examples.

Example 1: Remove 'time'

If we call the Completions endpoint with the prompt “Once upon a,” the completion is very likely going to start with “ time.”

The word “time” tokenizes to the ID 2435 and the word “ time” (which has a space at the start) tokenizes to the ID 640. We can pass these through logit_bias with -100 to ban them from appearing in the completion, like so:

completion = client.chat.completions.create( 
model="gpt-3.5-turbo",
messages=[{"role": "system", "content": "You finish user's sentences."},
"role": "user", "content": "Once upon a"} ]
logit_bias={2435:-100, 640:-100}
)


Now, the prompt “Once upon a” generates the completion “midnight dreary, while I pondered, weak and weary.”

Notice that the word “time” is nowhere to be found, because we’ve effectively banned that token using logit_bias.

Example 2: Give direction with targeted logit bias values

Let’s go through another example, using a recipe generator prompt.

Many recipes suggest using pots, but suppose we don't have a pot. We'll want to remove the word pot from being generated as part of our completion. 'Pot' tokenizes to 1787 so we can remove it from our generation by setting our logit_bias as below.

logit_bias={1787:-100}

Now, our completion might include the word “saucepan” instead. Perfect!

Example 3: Increase the odds of a word appearing

Suppose that we want to increase the likelihood of a word appearing.

For example, maybe we’re running a site that offers recipes that you can make with a microwave, so we want to be sure that the word “microwave” appears in the recipe. Microwave tokenizes to ID 27000. We can increase the likelihood that this token appears by setting a positive logit_bias, like so:

logit_bias={27000:5}

Now, our completion is more likely to include the word 'microwave'.

We set the logit_bias to 5, as we found that setting logit_bias to 1 often did not result in the word “microwave” appearing in the completion, while higher logit_bias values like 10 resulted in the word “ microwave” appearing in the completion too often.

Did this answer your question?