I've been trying to understand the term 'high model' in the context of AI. Does it imply that the model uses more compute and potentially takes longer to respond? I'm also curious why smaller models, like the mini versions, sometimes seem to outperform larger ones. Is test-time reasoning with these smaller models the best approach?
3 Answers
You kind of answered your own question there! Smaller models can outperform larger ones by using their resources more efficiently, especially in reasoning tasks.
OpenAI's naming conventions can be a bit odd. 'High' doesn't mean the model is larger; it means the model is allowed to spend more effort reasoning before it answers. That's why smaller models labeled 'high' can perform surprisingly well: they make up for their size with extra thinking at inference time.
Yeah, that makes sense. Thanks for clarifying!
When we talk about a 'high model,' we're not talking about a different model at all; it means applying more compute to the same model at inference time. By contrast, a mini model, like o4-mini, is a smaller, condensed version of a larger model. So 'o4-mini-high' spends more compute (and more reasoning tokens) per query, but it's still the smaller model underneath.
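One intuition for why extra test-time compute helps: instead of committing to a single reasoning pass, the model can sample several independent attempts and aggregate them. Here's a toy sketch of that idea (a majority vote over noisy attempts); this is an illustration of the general principle, not how OpenAI's models actually implement reasoning effort, and `noisy_solver` is just a hypothetical stand-in for one reasoning pass.

```python
import random
from collections import Counter

def noisy_solver(correct_answer: int, accuracy: float) -> int:
    # Stand-in for a single reasoning pass: right with probability
    # `accuracy`, otherwise returns a nearby wrong answer.
    if random.random() < accuracy:
        return correct_answer
    return correct_answer + random.choice([-2, -1, 1, 2])

def majority_vote(correct_answer: int, accuracy: float, n_samples: int) -> int:
    # Spend more test-time compute by sampling several independent
    # attempts and taking the most common answer.
    votes = Counter(noisy_solver(correct_answer, accuracy)
                    for _ in range(n_samples))
    return votes.most_common(1)[0][0]

def success_rate(n_samples: int, trials: int = 2000) -> float:
    # Fraction of trials where the aggregated answer is correct.
    random.seed(0)
    hits = sum(majority_vote(42, 0.6, n_samples) == 42
               for _ in range(trials))
    return hits / trials

# A solver that is right only ~60% of the time per pass becomes far
# more reliable when many passes are aggregated.
print(success_rate(1))   # roughly 0.6
print(success_rate(15))  # much closer to 1.0
```

The trade-off is exactly what the question suspected: more samples (or longer reasoning chains) mean more compute and latency per query, in exchange for higher accuracy.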
That's interesting! But do you know why spending more compute at test time leads to better results?

True, just wanted to confirm that. Thanks a lot! XD