Prompt tokens are the tokens you send to the model as input, i.e. the number of tokens in your prompt.
Sampled tokens are any tokens that the model generates in response to your input. For a standard request, this is the number of tokens in the completion.
This also includes any tokens generated when using a higher value of best_of or n. For example, if you generate 3 candidate completions with best_of = 3, the number of sampled tokens will be at most 3 * max_tokens.
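As a concrete illustration, here is a minimal sketch using the openai Python package against the legacy Completions endpoint, which accepts best_of and max_tokens; the model name, prompt, and parameter values are illustrative assumptions, not part of the original text.

```python
# A minimal sketch, assuming the openai Python package (v1+) and the
# legacy Completions endpoint, which accepts best_of and max_tokens.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

response = client.completions.create(
    model="gpt-3.5-turbo-instruct",  # illustrative model choice
    prompt="Write a tagline for an ice cream shop.",
    best_of=3,      # sample 3 candidate completions server-side
    max_tokens=16,  # per-candidate cap on sampled tokens
)

usage = response.usage
print(usage.prompt_tokens)      # tokens in the prompt
print(usage.completion_tokens)  # all sampled tokens, at most 3 * max_tokens
```

Note that even though only one completion is returned here (n defaults to 1), completion_tokens counts every token sampled across all three candidates.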