Hey everyone! I wanted to share my experience with the newly announced GPT-Image-1 model from OpenAI. Initially, their announcement made it sound like this model brings all the amazing image features that GPT-4o offers, which have become pretty popular. However, after diving into the API, I've found some limitations that aren't immediately obvious.
While you can create cool images from scratch, the editing capabilities are quite basic. For example, I realized that you can't upload an image and ask it to do transformations like creating a Studio Ghibli version of it or changing it to look like a character from The Simpsons or the Muppets. The API just doesn't support that right now.
I spent hours figuring things out, and eventually got an explanation from ChatGPT about why the API feels limited compared to the chat app. Essentially, the public API right now doesn't replicate the full experience of what you get with the ChatGPT-4o app, particularly with uploading images for stylization.
So I'm left wondering when we might see these features in the API. Am I alone in feeling misled by the announcement? It seemed like a more limited version of what GPT-4o can do, especially when the DALL-E API seems to miss out on the magic we see in ChatGPT-4o. Any thoughts or rumors on this?
3 Answers
You do realize that asking AI about its own capabilities can lead to confusion, right? If it doesn't answer thoroughly the first time, it might just be 'hallucinating.' Let's be real here. How many times do we need to understand this?
Have you actually tried including an image in the messages object when you call the API? I was wondering if that might help unlock some of its capabilities, though I haven't had much luck myself.
I'm definitely waiting for these features too! It feels like there was a lot of hype and now a letdown. The API just doesn't match up to what they advertised, and I'm frustrated trying to do things I expected to be possible.
Fair point! I did start by following the API documentation from OpenAI, not ChatGPT. It was only after struggling for a while that I double-checked with ChatGPT and found out the limitations. But if the API work matches what’s in the ChatGPT app, I’d love to know how you got it to work!