Hey everyone! I'm new to the OpenAI API and I'm trying to wrap my head around how the pricing works, especially regarding the 'Price per 1M tokens' model. I want to make sure I don't burn through all my tokens while testing, whether in the API or the playground. So, if I take the GPT-4.1 model, where the pricing is: Input: $2.00, Cached: $0.50, Output: $8.00, does that mean after using 1M tokens, I'd be charged a total of $10.50? Is there a specific webpage that explains this better? I haven't found one yet. Thanks for any clarification!
2 Answers
Not quite! Each rate applies separately to its own token type, and each price is per million tokens of that type. So for GPT-4.1-nano (Input: $0.10, Cached: $0.025, Output: $0.40, each per 1M tokens), your cost is: (input tokens ÷ 1,000,000 × $0.10) + (cached tokens ÷ 1,000,000 × $0.025) + (output tokens ÷ 1,000,000 × $0.40). Summing the three rates to $0.525 only describes the special case where you use a full million tokens of each type.
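The formula above can be sketched as a small calculator. The rates are the GPT-4.1-nano numbers quoted in this thread; actual prices can change, so check the official pricing page before relying on them.

```python
# USD per 1,000,000 tokens, taken from the GPT-4.1-nano rates in this thread.
RATES_PER_M = {
    "input": 0.10,
    "cached": 0.025,
    "output": 0.40,
}

def estimate_cost(input_tokens: int, cached_tokens: int, output_tokens: int) -> float:
    """Each token type is billed separately at its own per-million rate."""
    return (
        input_tokens / 1_000_000 * RATES_PER_M["input"]
        + cached_tokens / 1_000_000 * RATES_PER_M["cached"]
        + output_tokens / 1_000_000 * RATES_PER_M["output"]
    )

# The summed rate ($0.525) only applies if you use 1M tokens of EACH type:
print(estimate_cost(1_000_000, 1_000_000, 1_000_000))  # 0.525

# A more typical small test call: 500 input, 0 cached, 200 output tokens.
print(round(estimate_cost(500, 0, 200), 6))  # 0.00013
```

As the second call shows, experimenting with short prompts costs fractions of a cent, so testing in the playground or API is usually cheap.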
Think of tokens as pieces of words; as a rough rule of thumb, one token is about four characters of English text, so "Hi how are you?" might use around 5 input tokens. The model's reply is billed as output tokens. Cached tokens aren't a separate kind of context: they're the portion of your input that the API has recently processed and can reuse (prompt caching), billed at the discounted rate. When you call the API, each response includes a usage breakdown, which helps you track consumption so you won't waste tokens while you experiment!
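A minimal sketch of reading that usage breakdown. The field names (`usage.prompt_tokens`, `usage.completion_tokens`, `usage.prompt_tokens_details.cached_tokens`) follow the Chat Completions response shape at the time of writing; verify them against the current API reference. A captured response dict stands in for a real API call here.

```python
# Example response payload; in real code this would come from the API client.
response = {
    "usage": {
        "prompt_tokens": 42,
        "completion_tokens": 17,
        "total_tokens": 59,
        "prompt_tokens_details": {"cached_tokens": 0},
    }
}

usage = response["usage"]
# Cached tokens are counted inside prompt_tokens, so subtract them
# to see how much input was billed at the full (uncached) rate.
cached = usage["prompt_tokens_details"]["cached_tokens"]
uncached_input = usage["prompt_tokens"] - cached

print(f"input (uncached): {uncached_input}")
print(f"input (cached):   {cached}")
print(f"output:           {usage['completion_tokens']}")
```

Logging these three numbers per request is an easy way to keep a running total while testing.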
Got it, thanks! Just to clarify: with GPT-4.1-nano's rates, I'd pay $0.525 only if I used a full million tokens of each type (input, cached, and output), right?