Struggling to Generate Large Datasets Using Bedrock, Any Tips?

Asked By CuriousCoder92 On

Hey everyone! I recently took part in a hackathon and received $300 in credits. I'm building a synthetic data generator, but I'm hitting some snags. I want to generate a dataset with thousands of rows, but with Claude 3.7 on Bedrock I can only get about 100 rows per request. Batching in groups of 80 got me to 1000 rows, but it took around 13 minutes. Is there a way to speed this up? Maybe an async option or a different model? I've tried aioboto3, but it didn't pan out, possibly due to limitations of Claude 3.7.

Also, I ran my code earlier and it worked fine for generating the rows, but now I'm facing a read timeout error. Can anyone shed some light on what might be going wrong? I'd really appreciate any help!
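
For the speed and timeout issues described above, here is a minimal sketch of one possible approach, not a confirmed fix. It assumes the batches are independent of each other, so several requests can wait on the network at the same time in a thread pool, and that the read timeout comes from botocore's default of 60 seconds being too short for a long generation. The helper names (`make_bedrock_client`, `generate_rows_concurrently`, `generate_batch`) and the timeout, retry, and worker values are hypothetical choices for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def make_bedrock_client(read_timeout_s=600):
    """Build a bedrock-runtime client with a longer read timeout (assumed value)."""
    # Imported lazily so the batching helper below can be used without boto3.
    import boto3
    from botocore.config import Config

    config = Config(
        read_timeout=read_timeout_s,  # botocore's default read timeout is 60 seconds
        connect_timeout=10,
        retries={"max_attempts": 5, "mode": "adaptive"},
    )
    return boto3.client("bedrock-runtime", config=config)


def generate_rows_concurrently(generate_batch, total_rows, batch_size=80, max_workers=4):
    """Split total_rows into batches and run generate_batch(n) on a thread pool.

    generate_batch is a caller-supplied function (hypothetical here) that takes
    a row count and returns a list of rows, e.g. by calling the model once.
    """
    sizes = [batch_size] * (total_rows // batch_size)
    if total_rows % batch_size:
        sizes.append(total_rows % batch_size)

    rows = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map keeps batch order; each call mostly waits on network I/O,
        # so the threads overlap that waiting time.
        for batch in pool.map(generate_batch, sizes):
            rows.extend(batch)
    return rows
```

In practice `generate_batch` would wrap a single model call made with the client from `make_bedrock_client()` and parse the response into rows. Raising `max_workers` too far may run into account throttling limits, so it is worth starting small.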

1 Answer

Answered By DataGenie101 On

Have you considered generating the column values separately instead of making each row dependent on the LLM? It might help to develop some basic rules or mappings for your data. For example, generate the unique values for each column with the LLM, then combine those with random values or formulas for the numeric data. Using tools like Claude Code, Cline, or Roo Code through Bedrock might simplify the process too!
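
A rough sketch of that idea, assuming the categorical values were already obtained from the LLM (one small request per column) and are hardcoded here for illustration. The column names, the price table, and the `build_rows` helper are all hypothetical. The rows themselves are assembled locally, so thousands of them need no further model calls.

```python
import random


def build_rows(categorical_values, numeric_rules, n_rows, seed=None):
    """Combine pre-generated categorical values with rule-based numeric columns."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_rows):
        # Pick one value per categorical column.
        row = {col: rng.choice(values) for col, values in categorical_values.items()}
        # Apply numeric rules in insertion order; a rule may read earlier columns.
        for col, rule in numeric_rules.items():
            row[col] = rule(rng, row)
        rows.append(row)
    return rows


# Hypothetical values standing in for what the LLM would return per column.
categorical_values = {
    "city": ["Austin", "Denver", "Seattle"],
    "plan": ["free", "pro", "enterprise"],
}
base_price = {"free": 0.0, "pro": 20.0, "enterprise": 200.0}

# Rules tie the numeric columns to the categorical ones so the data is not
# purely independent noise.
numeric_rules = {
    "seats": lambda rng, row: 1 if row["plan"] == "free" else rng.randint(2, 50),
    "monthly_spend": lambda rng, row: round(
        base_price[row["plan"]] * row["seats"] * rng.uniform(0.9, 1.1), 2
    ),
}

rows = build_rows(categorical_values, numeric_rules, n_rows=5000, seed=42)
```

Making a numeric rule depend on other columns, as `monthly_spend` does on `plan` and `seats`, is one way to keep some structure in the data rather than sampling every column independently.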

TechSavvy_88 -

That sounds interesting, but won't the data end up being too generic? I really want it to reflect realistic patterns.
