How can I improve latencies using OpenAI text generation models like GPT-4?

The latency of a completion request is largely determined by two factors: the model you choose and the number of tokens generated. Please read our updated documentation for guidance on improving latencies.
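As a rough illustration of those two factors, the sketch below builds request parameters that favor latency: a smaller model, a cap on generated tokens, and streaming so the first tokens arrive early. The helper name, default values, and model choice are illustrative assumptions, not a prescribed configuration; the resulting dict would be passed to the `openai` client's `client.chat.completions.create(**params)`.

```python
# Sketch: request parameters targeting the two latency factors
# (model choice and number of tokens generated).
# The helper and defaults are illustrative, not an official recipe.

def low_latency_params(prompt: str, max_output_tokens: int = 128) -> dict:
    """Build chat-completion kwargs that favor latency over length.

    - Prefer a smaller, faster model when quality allows.
    - Cap max_tokens: generation time grows with tokens produced.
    - Stream so the first tokens arrive before the full completion.
    """
    return {
        "model": "gpt-4o-mini",           # assumed smaller model; swap as needed
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_output_tokens,  # fewer generated tokens -> faster finish
        "stream": True,                   # reduces perceived (time-to-first-token) latency
    }

params = low_latency_params("Summarize this paragraph in one sentence.")
```

Streaming does not shorten total generation time, but it cuts the wait before the user sees output, which is usually what latency complaints are about.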