How can I keep my context window lean while scraping multiple websites?

Asked By CuriousCoder93

I'm currently on the Claude Max plan and working on a project to scrape data from 350 websites. Each site is unique, but I use similar Python scripts across the board, relying on Playwright for navigation and the Gemini API for parsing. Claude is impressive and usually handles the work efficiently, but it struggles with specific cases like async loads and iframes.

As I work through the sites, the context window occasionally fills up and auto-compacts, sometimes several times in a short period, so I'm considering changing my approach. I've read about using CLAUDE.md files for documentation. Should I let Claude write its own notes as it works? I'd like it to remember how it interacted with previous sites, especially the ones with iframes, rather than starting from scratch each time. Would summarizing and taking notes after each site, perhaps enforced through some rules, be beneficial?
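For context, each site's script follows roughly this shape (the URL, iframe selector, and model name below are placeholders rather than my real values):

```python
from playwright.sync_api import sync_playwright
import google.generativeai as genai

genai.configure(api_key="...")                      # loaded from an env var in practice
model = genai.GenerativeModel("gemini-1.5-flash")   # placeholder model name

def scrape_site(url: str) -> str:
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")    # async loads are the flaky part

        # Sites that render their data inside an iframe need an extra hop:
        frame = page.frame_locator("iframe#content")  # placeholder selector
        text = frame.locator("body").inner_text()

        browser.close()

    # Hand the raw page text to Gemini to pull out the structured fields
    response = model.generate_content(f"Extract the listing data from:\n{text}")
    return response.text
```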

1 Answer

Answered By DataDiver42

You can definitely use CLAUDE.md files for this. They work in subfolders too, so you could keep a dedicated CLAUDE.md in the folder that holds the iframe-heavy sites. As Claude works through each site, it can refer to the notes you've set up for that folder.
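For example, something along these lines (the folder names and note contents are just illustrative):

```
project/
├── CLAUDE.md            # general conventions: Playwright setup, Gemini prompt format, output schema
├── static_sites/
│   └── CLAUDE.md        # notes for straightforward pages
└── iframe_sites/
    ├── CLAUDE.md        # which iframe selector worked per site, what wait strategy each one needed
    ├── site_017.py
    └── site_042.py
```

Claude reads the CLAUDE.md closest to the files it's working on (plus the ones above it), so the iframe-specific notes should only take up context when it's actually working in that folder, which also helps with the auto-compacting you mentioned.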

CuriousCoder93 -

Thank you! Should I manually edit the CLAUDE.md files, or is it better to let Claude handle that? I’d prefer it to take its own notes and reference them at the start of each website script. Is that the right strategy?

