OpenAI Rate Limit: Understanding And Optimizing Your API Usage

OpenAI has implemented rate limits on API usage to ensure fair usage and to prevent abuse of their services.

These limits vary based on the plan and usage, with different limits for different types of endpoints.

For Free trial users, the organization-level rate limit is set to 20 requests per minute and 150,000 tokens per minute.

This means that a user can make up to 20 API requests per minute, and the total number of tokens consumed by those requests (prompt and completion combined) cannot exceed 150,000 per minute.
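To make the two budgets concrete, here is a minimal client-side sketch that tracks requests and tokens over a rolling 60-second window. The class name MinuteWindowLimiter and its parameters are hypothetical, and the sliding-window accounting is an assumption: the article does not say how OpenAI measures a minute on the server, so treat this as an approximation for pacing your own calls.

```python
import time
from collections import deque


class MinuteWindowLimiter:
    """Hypothetical helper: tracks requests and tokens in a rolling window."""

    def __init__(self, max_requests=20, max_tokens=150_000,
                 window=60.0, clock=time.monotonic):
        self.max_requests = max_requests  # Free trial figure from the article
        self.max_tokens = max_tokens      # Free trial figure from the article
        self.window = window              # window length in seconds
        self.clock = clock                # injectable so it can be tested
        self.events = deque()             # (timestamp, tokens) per request

    def _evict_old(self, now):
        # Drop entries that have aged out of the window.
        while self.events and now - self.events[0][0] >= self.window:
            self.events.popleft()

    def try_acquire(self, tokens):
        """Record the request and return True only if both budgets allow it."""
        now = self.clock()
        self._evict_old(now)
        if len(self.events) >= self.max_requests:
            return False
        used_tokens = sum(t for _, t in self.events)
        if used_tokens + tokens > self.max_tokens:
            return False
        self.events.append((now, tokens))
        return True
```

A caller would check try_acquire(estimated_tokens) before each request and wait briefly when it returns False. Either budget can be the one that blocks a request, which is why both are checked.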

For Pay-as-you-go users, the rate limits are higher. In the first 48 hours after signing up for the plan, users are limited to 60 requests per minute and 250,000 davinci tokens per minute.

After the first 48 hours, the limit increases to 3,000 requests per minute and 250,000 davinci tokens per minute.
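The figures above translate into simple pacing arithmetic. The sketch below uses hypothetical helper names and only the numbers quoted in this article. It computes the minimum spacing between evenly paced requests, and the average token budget per request if you use the full request rate.

```python
def min_interval_seconds(requests_per_minute):
    """Smallest gap between evenly spaced requests that stays within the limit."""
    return 60.0 / requests_per_minute


def avg_tokens_per_request(tokens_per_minute, requests_per_minute):
    """Average token budget per request when sending at the full request rate."""
    return tokens_per_minute / requests_per_minute


# Limits as quoted in the article: (requests per minute, tokens per minute).
LIMITS = {
    "free_trial": (20, 150_000),
    "pay_as_you_go_first_48h": (60, 250_000),
    "pay_as_you_go_after_48h": (3_000, 250_000),
}

for plan, (rpm, tpm) in LIMITS.items():
    gap = min_interval_seconds(rpm)
    budget = avg_tokens_per_request(tpm, rpm)
    print(f"{plan}: one request every {gap:.2f} s, about {budget:.0f} tokens each")
```

On the pay-as-you-go plan after 48 hours, sending 3,000 requests per minute leaves only about 83 tokens per request on average. For most workloads the token limit, not the request limit, will therefore be the binding constraint.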

Additionally, there are different limits for specific endpoints such as Text Completion, Codex, Edit, and Image endpoints.

For example, the Codex endpoint limit is 20 requests per minute and 40,000 tokens per minute.

If a user exceeds the rate limit, the API returns a rate limit error, and further requests are rejected until the limit resets.
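One common way to cope with a rate limit error is to retry with exponential backoff. The sketch below is a generic illustration, not necessarily the approach OpenAI recommends. RateLimitError is a hypothetical stand-in for whatever exception your client library raises, and call_with_backoff and its parameters are assumptions made for this example.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the rate limit exception your client raises."""


def call_with_backoff(func, max_retries=5, base_delay=1.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponentially growing waits."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            # Double the wait on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in step.
            sleep(delay + random.uniform(0, delay * 0.1))
```

To use it, wrap the API call in a function or lambda, for example call_with_backoff(lambda: send_request(prompt)), where send_request is whatever function of yours performs the call. The sleep parameter exists only so the waiting can be replaced in tests.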

To help users navigate these limits, OpenAI has created a helpful notebook that outlines the rate limits and offers best practices for handling them.

It’s worth noting that the limit for the DALL·E API is 50 images per minute for all user types.

If you have any questions about the rate limits, you can visit the OpenAI Community Forum or contact the team via on-site chat for more information.

Below is the table summarizing the OpenAI rate limits for each plan:

Plan                  | Requests/minute | Tokens/minute
Free Trial            | 20              | 150,000
Pay-as-you-go (48hrs) | 60              | 250,000
Pay-as-you-go (after) | 3,000           | 250,000
Codex                 | 20              | 40,000
Edit                  | 20              | 150,000
Image                 | 50              | N/A
DALL·E API            | 50              | N/A

Note: the Codex, Edit, and Image limits apply only to those endpoints and do not affect the overall limits for the plan.

It’s also worth noting that the token limits are expressed in Davinci tokens; smaller models get proportionally higher token limits.

Please keep in mind that these limits are subject to change and may be updated by OpenAI.
