I've noticed that OpenAI slashed the pricing for their O3 API by 80%, bringing it in line with the cost of GPT-4.1. Since O3 appears to be based on the 4.1 model, does this suggest they were previously overcharging? OpenAI claims the price drop comes from improved inference efficiency. If that's the case, why not apply the same efficiency improvements to their other models?
2 Answers
If they keep seeking more funding, that suggests they're still in a growth phase and likely not yet profitable. If they were genuinely making money, they wouldn't need to keep raising billions from investors; they could fund operations themselves.
One possibility is that the continued fundraising goes toward scaling hardware capacity through larger GPU purchases. It's surprising how often this detail is overlooked: people jump to conclusions without considering the compute and data-storage needs involved.
But how would buying more GPUs decrease the cost of running them? That operating cost is what determines the per-token price of the API. It's also worth noting that OpenAI doesn't buy GPUs outright; it leases capacity from Azure, which handles the actual purchasing.
That logic doesn't hold up. Major companies like Tesla and Google attract investors while being profitable. Moreover, Deepseek has indicated that its inference serving is profitable. So why would OpenAI's O3, which performs comparably to R1, be losing money? It may simply be a case of OpenAI lagging behind Deepseek's inference technology.