What is the difference between prompt tokens and completion tokens?


Prompt tokens are the tokens that you input into the model. This is the number of tokens in your prompt.

Completion tokens are any tokens that the model generates in response to your input. For a standard request, this is the number of tokens in the completion.
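As a rough illustration, the two counts can be read separately from a request's usage data. The sketch below uses a hand-written dictionary rather than a real API call. The field names prompt_tokens, completion_tokens, and total_tokens are assumed for the example, the numbers are made up, and summarize_usage is a hypothetical helper.

```python
# Hypothetical usage data written by hand for illustration.
# This is not a real API response; the field names are an assumption.
usage = {"prompt_tokens": 12, "completion_tokens": 30, "total_tokens": 42}


def summarize_usage(usage: dict) -> str:
    """Report input (prompt) and output (completion) token counts separately."""
    prompt = usage["prompt_tokens"]          # tokens you sent to the model
    completion = usage["completion_tokens"]  # tokens the model generated
    return f"input: {prompt} tokens, output: {completion} tokens, total: {prompt + completion}"


print(summarize_usage(usage))
```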

Most models we offer have separate limits on the number of tokens they can take in (prompt tokens) and on the number of tokens they can output (completion or sampled tokens).

Completion tokens also include any tokens generated when using a higher value of best_of or n. For example, if you generate 3 candidate completions using best_of = 3, the number of sampled tokens will be at most 3 * max_tokens.
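The upper bound in that example is simple arithmetic, and a minimal sketch of it might look like the following. The helper max_sampled_tokens is hypothetical, and taking the larger of n and best_of as the number of candidates generated is an assumption made for this illustration.

```python
def max_sampled_tokens(max_tokens: int, n: int = 1, best_of: int = 1) -> int:
    """Upper bound on sampled tokens for one request.

    Assumption for this sketch: the number of candidate completions
    generated is the larger of n and best_of, and each candidate can
    use at most max_tokens tokens.
    """
    candidates = max(n, best_of)
    return candidates * max_tokens


# The example from the text: 3 candidates via best_of = 3.
print(max_sampled_tokens(max_tokens=100, best_of=3))  # 300
```

This is only an upper bound: a candidate that stops early uses fewer than max_tokens tokens.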

You can read more about managing tokens in our text generation guide.
