I'm new to working with large language models (LLMs), so I could be missing something obvious here. I've been using Anthropic's Claude 3.5 on AWS Bedrock via a Python Lambda function, asking it to return structured JSON output. The standard model invocation has a 4096-token limit, and some of my prompts produce responses that get cut off at that limit; I can tell this is happening because the response comes back with the max_tokens stop reason.

I've come up with a few potential strategies to handle this:

1) Optimize my prompts to keep the responses under the limit, though I'm not sure how reliable that will be.
2) Switch to a conversational method, which raises the limit to 8192 tokens, but there's still a chance of exceeding that in rare cases.
3) Use the conversational method in a loop: whenever the response stops because it hit the token limit, request a continuation, then combine the results (roughly as sketched below).

I'm curious whether anyone has additional suggestions or ways to improve these approaches. Thanks in advance!
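Here's a minimal sketch of what I mean by option 3, assuming the "conversational method" is Bedrock's Converse API; the model ID, prompt, and continuation wording are just placeholders for illustration:

```python
import boto3

# Sketch of option 3: keep calling the Converse API until the model stops for
# a reason other than max_tokens, then stitch the pieces together.
client = boto3.client("bedrock-runtime")
MODEL_ID = "anthropic.claude-3-5-sonnet-20240620-v1:0"  # placeholder, adjust for your region/model

messages = [{"role": "user", "content": [{"text": "Return the report as JSON: ..."}]}]
chunks = []

while True:
    response = client.converse(
        modelId=MODEL_ID,
        messages=messages,
        inferenceConfig={"maxTokens": 4096},
    )
    assistant_message = response["output"]["message"]
    text = "".join(block.get("text", "") for block in assistant_message["content"])
    chunks.append(text)

    if response["stopReason"] != "max_tokens":
        break

    # Feed the partial answer back and ask the model to pick up where it left off.
    messages.append(assistant_message)
    messages.append({"role": "user", "content": [{"text": "Continue exactly where you stopped."}]})

full_output = "".join(chunks)
```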
1 Answer
Have you considered prompting the LLM to call a function (tool use) to manage the JSON generation? This could streamline things and avoid some of those token issues.
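As a rough sketch of what that could look like with the Bedrock Converse API: define a tool whose input schema is the JSON shape you want, and read the structured output from the tool call's arguments. The tool name, schema, and model ID below are illustrative, not your actual ones:

```python
import boto3

client = boto3.client("bedrock-runtime")

tool_config = {
    "tools": [
        {
            "toolSpec": {
                "name": "record_summary",  # hypothetical tool name
                "description": "Record the structured summary of the document.",
                "inputSchema": {
                    "json": {
                        "type": "object",
                        "properties": {
                            "title": {"type": "string"},
                            "key_points": {"type": "array", "items": {"type": "string"}},
                        },
                        "required": ["title", "key_points"],
                    }
                },
            }
        }
    ],
    # Force the model to answer via the tool instead of free-form text.
    "toolChoice": {"tool": {"name": "record_summary"}},
}

response = client.converse(
    modelId="anthropic.claude-3-5-sonnet-20240620-v1:0",  # placeholder
    messages=[{"role": "user", "content": [{"text": "Summarise the attached report."}]}],
    toolConfig=tool_config,
    inferenceConfig={"maxTokens": 4096},
)

# The structured JSON arrives as the tool call's input, already parsed into a dict.
for block in response["output"]["message"]["content"]:
    if "toolUse" in block:
        structured_json = block["toolUse"]["input"]
```

This doesn't remove the output-token ceiling, but it tends to cut wasted tokens (no prose wrapping, no markdown fences) and saves you from parsing truncated JSON by hand.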
