Controlling the length of OpenAI model responses

Learn how to influence OpenAI models to generate outputs of a specific length


The main way to control the length of a model's response is the max_tokens parameter, which sets a hard upper bound on the number of tokens generated. In the Playground, this setting is called "Maximum length". The largest value you can set depends on the specific model used for the request.
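As a minimal sketch, the request below shows where max_tokens fits in a Chat Completions-style payload. The `build_request` helper and the prompt text are hypothetical; the payload shape mirrors the API's request body, and no request is actually sent here.

```python
def build_request(prompt: str, max_tokens: int) -> dict:
    """Build a Chat Completions-style payload that caps output length."""
    return {
        "model": "gpt-4",
        "messages": [{"role": "user", "content": prompt}],
        # Hard upper bound on generated tokens; the response may be
        # cut off mid-sentence if it reaches this limit.
        "max_tokens": max_tokens,
    }

payload = build_request("Summarize the plot of Hamlet.", 100)
```

Note that max_tokens is a cap, not a target: the model may stop well before the limit, and hitting the limit truncates the output rather than asking the model to wrap up.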

Give instructions

Providing instructions to generate the desired output length, such as a specific number of items in a list, can influence how long the model response is. This can be done in the user message or system message.
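For example, a length instruction can be placed in the system message so it applies to every turn. The message contents below are illustrative, not prescribed by the API:

```python
# Sketch: steering output length with an explicit instruction.
# The system message constrains length; the user message carries the task.
messages = [
    {"role": "system", "content": "Answer in exactly three sentences."},
    {"role": "user", "content": "Explain how photosynthesis works."},
]
```

Unlike max_tokens, instructions are a soft constraint: the model usually follows them, but compliance is not guaranteed, so combining an instruction with a max_tokens cap is a common pattern.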

Add examples of a specific length

OpenAI models like GPT-4 are great at recognizing patterns and will consider the length of examples given when generating responses. By providing an example, or multiple examples, with the desired output length, you can give needed context about the expected length.
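One way to do this in a chat request is to include prior user/assistant turns whose answers model the target length (few-shot prompting). The example content below is hypothetical:

```python
# Sketch: a few-shot message list where the example assistant answer
# demonstrates the desired output length (one short sentence).
messages = [
    {"role": "system", "content": "Answer in one short sentence."},
    # Example turn showing the expected length:
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "The capital of France is Paris."},
    # The actual question; the model tends to match the example's length.
    {"role": "user", "content": "What is the capital of Japan?"},
]
```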

Use stop sequences

Another way to control the length of outputs is to use stop sequences. For example, with the stop sequences "###" and "6.": if the model attempts to generate a sixth list item, it will emit "6.", hit the stop sequence, and stop generating text, so the response contains at most five items.
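The truncation itself happens server-side when you pass the stop parameter in a request. The sketch below reproduces that behavior locally in plain Python so the effect is easy to see; the helper function and sample text are illustrative assumptions, not part of the API.

```python
def apply_stop_sequences(text: str, stops: list[str]) -> str:
    """Truncate text at the earliest stop sequence, mimicking what the
    API does server-side when the `stop` parameter is set."""
    cut = len(text)
    for stop in stops:
        index = text.find(stop)
        if index != -1:
            cut = min(cut, index)
    # The stop sequence itself is not included in the returned text.
    return text[:cut]

raw = "1. Apples\n2. Bananas\n3. Cherries\n4. Dates\n5. Elderberries\n6. Figs"
result = apply_stop_sequences(raw, ["###", "6."])
# `result` keeps items 1-5 and drops everything from "6." onward.
```

In a real request, you would pass `stop=["###", "6."]` alongside your other parameters and the API would return the already-truncated text.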

Note: There is currently no way to set a minimum number of tokens; instructions and examples are the only levers for encouraging longer responses.
