Hey folks! I'm building a train tracking website on Cloudflare Pages, and I've run into a bit of a snag. I can fetch the next 27 hours of train schedules from the train company's API, but the response is an unwieldy 88,000-line JSON file. I want to save a trimmed version of this data so my website can read it locally when users visit. My plan is to refresh the data once a day, around 1-2am, and keep only what's necessary. I'm still building my understanding of Cloudflare web development, so I'm looking for advice on how to handle such a large JSON file effectively and integrate it into my project. So far, I've set up a cron trigger for the API token and am getting the hang of that, but now I need to process this massive JSON file without slowing things down. Thanks in advance for any tips!
4 Answers
Yeah, I'd pull the JSON, iterate over it, and extract the data you really need. Then, transform it into a simpler structure and store that in a database. You can set up a daily cron job for this and just insert the data into your DB.
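A minimal sketch of that daily trim step, assuming hypothetical field names (`departures`, `station`, `time`, `line`), since the real API's response shape isn't shown here:

```javascript
// Trim the raw API response down to the fields the site actually renders.
// The field names here are assumptions -- adjust to the real API shape.
function trimSchedule(raw) {
  return raw.departures.map((d) => ({
    station: d.station,
    time: d.time,
    line: d.line,
  }));
}

// Example: the full response carries far more than the site needs.
const raw = {
  departures: [
    {
      station: "Central",
      time: "2024-05-01T06:15:00Z",
      line: "IC 4",
      operator: "ExampleRail",
      coaches: 8,
    },
  ],
  metadata: { disclaimer: "..." },
};

console.log(JSON.stringify(trimSchedule(raw)));
```

In a scheduled Worker, you'd write the trimmed result to your storage of choice (KV, D1, etc.) instead of logging it; the cron trigger you've already set up is the right place to call this.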
Why does everyone suggest using a database? For your case, it sounds like you just need to transform the JSON into something more usable without updating it frequently. Setting up a GraphQL service just for reading one JSON file might be overkill.
If you're unsure about handling such a massive JSON, tools like `jq` can help you "query" your JSON and split it into more manageable files. Just write a script to convert the original into your required format and add that to your cron job.
I suggest consuming the API response and then transforming it to store only the needed data in a database. This way, querying it from your app will be a lot easier.
But the point is to avoid hitting the API too often. Having your own database allows you to pull updates daily without repeated queries.