Hey everyone, I'm currently working on a college project that involves creating a political map website. The idea is to allow users to click on a state to select their senator or representative and view their recent fundraising totals.
The main challenge I'm facing is importing data. I need to handle two files: one is 120,000 KB (roughly 120 MB) and the other is a whopping 10.7 GB. Until now, I've only worked with datasets of about 10,500 rows, so this is a big leap for me!
Right now, I'm using Cloudflare D1, but I've run into some issues with importing the data. The workers are timing out, especially with the large file, so I'm considering whether to switch to a different platform despite liking Cloudflare.
Has anybody done something similar with Cloudflare? If not, what alternatives would you recommend for managing this data effectively?
7 Answers
Before you begin importing, take some time to clean your data. If it really is 10 GB, check for columns and rows your site will never display and prune them before loading anything. This could save you a lot of hassle later on!
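To make the pruning idea concrete, here's a rough sketch in Python that streams a CSV and keeps only the columns you care about. The column names (`candidate_id`, `amount`) are made up for illustration, so swap in whatever your files actually contain.

```python
import csv
import io


def prune_columns(src, dst, keep, required):
    """Stream rows from src to dst, keeping only the columns listed in `keep`.

    Rows whose `required` column is empty are dropped. Both column
    arguments are hypothetical names; adjust them to the real header.
    """
    reader = csv.DictReader(src)
    writer = csv.DictWriter(
        dst, fieldnames=keep, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    kept = 0
    for row in reader:
        # A row without the required value cannot contribute to a total.
        if not (row.get(required) or "").strip():
            continue
        writer.writerow(row)
        kept += 1
    return kept


# Tiny in-memory demo; for real files, pass handles from open(path, newline="").
sample = io.StringIO(
    "candidate_id,amount,memo,employer\n"
    "C1,250,,Acme\n"
    "C2,,note,\n"
    "C1,100,x,Y\n"
)
pruned = io.StringIO()
rows_kept = prune_columns(sample, pruned, ["candidate_id", "amount"], "amount")
```

Since it reads one row at a time, memory use stays flat no matter how big the input file is.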
For big imports like this, Cloudflare Workers might not be the way to go since they are designed for quick tasks. Consider using a temporary VPS for the data import, break your 10GB file into smaller chunks, and import those sequentially. That way, if something crashes, you won't lose everything.
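Here's a minimal sketch of the chunk-and-resume approach, assuming the big file is a CSV with a header row. `import_one` is a hypothetical callback standing in for whatever actually loads one chunk into your database, and the file names are placeholders.

```python
import csv
from pathlib import Path


def split_csv(src_path, out_dir, rows_per_chunk=50_000):
    """Split a CSV into numbered chunk files, repeating the header in each."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    chunk_paths = []
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input, nothing to split
        out = writer = None
        try:
            for i, row in enumerate(reader):
                if i % rows_per_chunk == 0:
                    # Start a new chunk file every rows_per_chunk rows.
                    if out:
                        out.close()
                    path = out_dir / f"chunk_{i // rows_per_chunk:05d}.csv"
                    out = open(path, "w", newline="", encoding="utf-8")
                    writer = csv.writer(out)
                    writer.writerow(header)
                    chunk_paths.append(path)
                writer.writerow(row)
        finally:
            if out:
                out.close()
    return chunk_paths


def import_chunks(chunk_paths, import_one, done_file):
    """Import chunks in order, logging finished ones so a rerun skips them."""
    done_file = Path(done_file)
    done = set(done_file.read_text().splitlines()) if done_file.exists() else set()
    for path in chunk_paths:
        if path.name in done:
            continue
        import_one(path)  # hypothetical loader; raises if the import fails
        with open(done_file, "a") as log:
            log.write(path.name + "\n")
```

If the import crashes partway, rerunning `import_chunks` with the same `done_file` picks up at the first chunk that was not logged as finished.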
Totally with you! On the Cloudflare side, you could also use R2 for file storage and then migrate to something like Supabase for database management.
SQLite could really handle this task well, especially since it's lightweight and easy to set up. You could import the data locally and then push it to Cloudflare when it's more manageable.
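A quick sketch of what the local import could look like with Python's built-in `sqlite3` module. The `contributions` table and its two columns are assumptions for illustration, not your real schema.

```python
import csv
import io
import sqlite3
from itertools import islice


def import_csv(conn, rows, batch_size=10_000):
    """Load CSV rows into SQLite in batches, one transaction per batch."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS contributions ("
        "candidate_id TEXT NOT NULL, amount REAL NOT NULL)"
    )
    reader = csv.DictReader(rows)
    total = 0
    while True:
        # Pull at most batch_size rows so memory use stays bounded.
        batch = [
            (r["candidate_id"], float(r["amount"]))
            for r in islice(reader, batch_size)
        ]
        if not batch:
            break
        with conn:  # commits the batch, or rolls it back on error
            conn.executemany("INSERT INTO contributions VALUES (?, ?)", batch)
        total += len(batch)
    return total


def totals_by_candidate(conn):
    """Return {candidate_id: summed amount}, the figure the map would show."""
    query = (
        "SELECT candidate_id, SUM(amount) FROM contributions "
        "GROUP BY candidate_id ORDER BY candidate_id"
    )
    return dict(conn.execute(query))


# In-memory demo; use sqlite3.connect("fundraising.db") for a file on disk.
demo_conn = sqlite3.connect(":memory:")
demo_rows = io.StringIO("candidate_id,amount\nC1,250\nC2,40\nC1,100\n")
loaded = import_csv(demo_conn, demo_rows, batch_size=2)
```

Once the totals are aggregated locally, what you push to Cloudflare can be the small summary table rather than the raw 10 GB.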
That's a great idea! I think starting local will definitely help you avoid the timeouts you're experiencing.
If you're looking for cheap options, consider setting up an open-source database server on a Virtual Private Server (VPS). There are services that can host a database for around $10/month, which might work out better for your needs.
Digital Ocean has affordable plans starting at like $4-6! That's great for getting a simple setup running.
I highly recommend PostgreSQL for your project. It's robust and can manage large data sets like yours without breaking a sweat. Just format your data as CSV files, and PostgreSQL's COPY command can bulk-load them, so importing should be straightforward.
Check out this Python script that uses the Cloudflare D1 API for importing: [GitHub Gist Link]. It can help you batch import your data without hitting those annoying timeouts. Make sure to process in smaller sets too!
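This is not the linked gist, just a rough sketch of the batching idea behind it: group rows into multi-row INSERT statements that each stay under a size cap, then send the statements one at a time. The `max_bytes` default is a placeholder, so check the current D1 limits before picking a value. The table and column names must come from your own code, never from user input.

```python
def sql_literal(value):
    """Render a Python value as a SQL literal (strings get quotes doubled)."""
    if value is None:
        return "NULL"
    if isinstance(value, bool):
        return "1" if value else "0"
    if isinstance(value, (int, float)):
        return repr(value)
    return "'" + str(value).replace("'", "''") + "'"


def batched_inserts(table, columns, rows, max_bytes=50_000):
    """Yield multi-row INSERT statements, each at most about max_bytes long.

    A single row larger than max_bytes still gets its own statement.
    """
    prefix = f"INSERT INTO {table} ({', '.join(columns)}) VALUES "
    values, size = [], len(prefix)
    for row in rows:
        item = "(" + ", ".join(sql_literal(v) for v in row) + ")"
        item_size = len(item.encode("utf-8")) + 2  # +2 for the ", " separator
        if values and size + item_size > max_bytes:
            # Adding this row would exceed the cap, so flush what we have.
            yield prefix + ", ".join(values) + ";"
            values, size = [], len(prefix)
        values.append(item)
        size += item_size
    if values:
        yield prefix + ", ".join(values) + ";"
```

Each yielded string is a complete statement, so a failed request only costs you one batch and you can retry from there.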
Honestly, 10 GB is not a huge deal. You can use various databases like MySQL or SQLite. The key is to format your data wisely and import it in batches. It might take a little time, but you'll get it sorted!

Agreed! If you can process your data beforehand, you'll save a ton of time and resources during the import.