I've been using Bedrock for Anthropic's models like Sonnet and Opus, and I've noticed they run significantly slower—between 2 and 10 times slower—than the same models on Azure, Google Cloud, or Anthropic's own API. This makes Bedrock unsuitable for many of my projects. Is there any information available on the expected performance of these models on Bedrock?
2 Answers
Actually, I've found that the Anthropic models on Bedrock can be faster than other providers, including Anthropic's own API. Keep in mind that Bedrock exposes both a non-streaming API (`InvokeModel`) and a streaming one (`InvokeModelWithResponseStream`), while Anthropic's own SDK streams by default. If you're comparing a non-streaming Bedrock call against a streamed Anthropic one, switching your Bedrock code to the streaming API should close much of the perceived gap, since time to first token drops even when total generation time is similar. Also make sure you're using cross-region (global) inference profiles if you aren't already!
That's interesting to hear, but what latency and token throughput are you actually seeing on Bedrock compared to the other platforms? I've noticed my models sometimes get a bit "congested" too, and AWS seems reluctant to acknowledge an issue unless you're a big spender. Just keep in mind that on-demand Bedrock comes with no performance or latency guarantees.
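Whatever numbers come back, it helps to compare providers on the same two metrics this thread keeps circling: time to first token (TTFT) and sustained tokens per second. Below is a minimal, provider-agnostic sketch for measuring both from any streamed-token iterator; the `fake_stream` generator is a hypothetical stand-in for a real provider's response stream, not any vendor's API.

```python
import math
import time
from typing import Iterable, Tuple


def measure_stream(tokens: Iterable[str]) -> Tuple[float, float, int]:
    """Consume a token stream and return (ttft_seconds, tokens_per_second, token_count)."""
    start = time.perf_counter()
    ttft = math.nan
    count = 0
    for _ in tokens:
        if count == 0:
            ttft = time.perf_counter() - start  # time to first token
        count += 1
    total = time.perf_counter() - start
    tps = count / total if total > 0 else 0.0
    return ttft, tps, count


def fake_stream(n: int = 50, delay: float = 0.002):
    """Hypothetical stand-in for a streamed model response."""
    for i in range(n):
        time.sleep(delay)  # simulated per-token generation/network delay
        yield f"tok{i}"


ttft, tps, n = measure_stream(fake_stream())
print(f"TTFT: {ttft * 1000:.1f} ms, throughput: {tps:.0f} tok/s over {n} tokens")
```

Wrapping each provider's streamed chunks in the same iterator interface lets you compare them apples to apples, which is more persuasive in a support ticket than "it feels 2–10x slower."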
