I'm in the process of switching our Kimi K2.5 implementation from Fireworks.ai to Bedrock because of a company requirement. So far, Fireworks.ai has been great: fast and reliable. Bedrock, however, has been a major headache. The documentation had incorrect information about setting up Thinking, but I found the right parameters through the AWS console chat playground. Even so, the Kimi deployment is highly inconsistent; it often stops outputting tokens well before reaching the configured maximum. Is anyone else seeing this? It really feels like we're dealing with a beta version here.
1 Answer
I totally get your frustration! I've been testing with a script that makes the same complex call to Kimi K2.5 ten times in a row. On Fireworks.ai it succeeds every single time, but on Bedrock I'm only seeing a success rate of around 30-40%. It's really disappointing, and AWS definitely needs to address this soon!
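For anyone who wants to reproduce this, a minimal version of that repeat-call harness might look like the sketch below. The harness itself is generic; the Bedrock wiring shown in the comment (client name, model ID, and checking `stopReason`) is my assumption about how you'd plug it in, not something from the original script.

```python
def success_rate(call, n=10):
    """Run the same model call n times and return the fraction that
    complete normally. `call` should return True for a full response
    and False (or raise) when the output is truncated or the request
    fails outright."""
    successes = 0
    for _ in range(n):
        try:
            if call():
                successes += 1
        except Exception:
            # Count request errors (throttling, timeouts, etc.) as failures.
            pass
    return successes / n

# Hypothetical Bedrock wiring (model ID is a placeholder, not verified):
#
#   import boto3
#   client = boto3.client("bedrock-runtime")
#
#   def bedrock_call():
#       resp = client.converse(
#           modelId="your-kimi-k2.5-model-id",  # placeholder
#           messages=[{"role": "user", "content": [{"text": "..."}]}],
#           inferenceConfig={"maxTokens": 4096},
#       )
#       # A run that stops early would typically not report "end_turn".
#       return resp["stopReason"] == "end_turn"
#
#   print(success_rate(bedrock_call, n=10))
```

Logging `stopReason` on the failing runs would also help narrow down whether Bedrock is truncating mid-generation or erroring out entirely.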

Is it still performing poorly like that?