What is it?
GPT-4 Turbo is our latest generation model. It’s more capable, has an updated knowledge cutoff of April 2023, and introduces a 128k context window (the equivalent of 300 pages of text in a single prompt). Input tokens are also 3x cheaper and output tokens 2x cheaper than the original GPT-4 model. The maximum number of output tokens for this model is 4096.
How can I get access to it?
Anyone with an OpenAI API account and existing GPT-4 access can use this model. The most recent version of the model can be accessed by passing gpt-4-turbo as the model name in the API. You can read more about the differences across GPT-4 Turbo dated models in our developer documentation.
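For example, here is a minimal sketch of such a call, assuming the official openai Python SDK (v1+) and an OPENAI_API_KEY environment variable; the prompt text and max_tokens value are illustrative:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request the latest GPT-4 Turbo model by name; max_tokens stays within
# the model's 4096 output-token limit mentioned above.
response = client.chat.completions.create(
    model="gpt-4-turbo",
    max_tokens=1024,
    messages=[
        {"role": "user", "content": "Summarize the plot of Hamlet in two sentences."},
    ],
)

print(response.choices[0].message.content)
```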
What are the rate limits? Can I get an increase?
Rate limits are dependent on your usage tier. You can find which usage tier you are in on the Limits settings page.
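If you want to check your remaining headroom programmatically, each API response also carries rate-limit headers. Below is a minimal sketch, assuming the openai Python SDK's with_raw_response helper; the header names are the x-ratelimit-* values documented for the API:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Request the raw HTTP response so the rate-limit headers are visible.
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4-turbo",
    messages=[{"role": "user", "content": "ping"}],
)

# Remaining requests and tokens in the current window for this model family.
print(raw.headers.get("x-ratelimit-remaining-requests"))
print(raw.headers.get("x-ratelimit-remaining-tokens"))

completion = raw.parse()  # the usual ChatCompletion object
print(completion.choices[0].message.content)
```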