I'm in a tough situation and really need help with some Python code for web scraping. I'm not a coder at all, but I have a website that's going to be taken down in two days, and the only way to save the information is through web scraping. Manually saving the data would take way too long. I've got all the necessary details mapped out from A to Z; I just don't know how to write the actual code. I've tried getting help from ChatGPT, but there's always some small mistake in the code it gives me, like missing or extra parentheses. Any advice would be appreciated!
3 Answers
I understand you’re in a tight spot. Just a heads up, many forums won’t provide code if it might go against a website's Terms of Service. Make sure you have the right to scrape that data! You might want to check if APIs are available for that site instead; it's usually a cleaner solution if supported and can give you the data without all the scraping hassle.
I get that you're in a rush! If the website is set up like a static page, one option could be using `wget` to mirror it. It won’t give you database content, though, just the HTML. But if you need to scrape data behind logins or sessions, that's a different ball game! You might need a more complex solution for that.
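If you'd rather stay in Python than use `wget`, here's a rough sketch of the same mirroring idea using only the standard library. It is not what `wget` does internally, just a simplified stand-in: it saves HTML pages only (no images or CSS) and follows links that stay on the same host. The function names, the `mirror` output folder, and the `max_pages` limit are all made up for illustration.

```python
import pathlib
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def same_site_links(html, base_url):
    """Return absolute http(s) links from `html` that stay on base_url's host."""
    collector = LinkCollector()
    collector.feed(html)
    host = urlparse(base_url).netloc
    links = []
    for href in collector.hrefs:
        # Resolve relative links and drop any "#fragment" part.
        absolute, _fragment = urldefrag(urljoin(base_url, href))
        parts = urlparse(absolute)
        if parts.scheme in ("http", "https") and parts.netloc == host:
            if absolute not in links:
                links.append(absolute)
    return links


def mirror(start_url, out_dir="mirror", max_pages=200):
    """Save up to max_pages same-site pages as numbered .html files (hypothetical layout)."""
    out = pathlib.Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    queue, seen = [start_url], set()
    while queue and len(seen) < max_pages:
        url = queue.pop(0)
        if url in seen:
            continue
        seen.add(url)
        try:
            with urllib.request.urlopen(url, timeout=30) as response:
                html = response.read().decode("utf-8", errors="replace")
        except OSError:
            continue  # skip pages that fail to load instead of stopping the whole run
        (out / f"page_{len(seen):04d}.html").write_text(html, encoding="utf-8")
        queue.extend(same_site_links(html, url))
    return seen
```

You'd call it with something like `mirror("https://example.com/")`, swapping in the real address. Like `wget`, this only sees pages anyone can open without logging in.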
You mentioned needing to extract backend data like invoices and order history. For that, a plain page mirror won't be enough. If you have to log in to reach the data, you'll probably need to simulate browser behavior with a library like Selenium, or at least keep a logged-in session in your script. BeautifulSoup is useful for pulling the data out of the HTML once you have it, but on its own it doesn't log in or run JavaScript. Keep in mind it gets trickier if the site builds its pages with a lot of JavaScript. Good luck!
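For the simple case, where the site has an ordinary login form and the order history is a plain HTML table, a sketch could look like the one below. To keep it free of extra installs it uses Python's standard library in place of Selenium and BeautifulSoup. Everything site-specific is an assumption: the `username` and `password` form field names, the URLs you pass in, and the idea that the data sits in `<tr>`/`<td>` rows. It won't work as-is if the login needs a hidden token or JavaScript.

```python
import csv
import http.cookiejar
import urllib.parse
import urllib.request
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collects the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None
        self._cell = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_table(html):
    """Return table rows found in `html` as lists of cell strings."""
    parser = TableRows()
    parser.feed(html)
    return parser.rows


def open_logged_in_session(login_url, username, password):
    """Post the login form and return an opener that remembers the session cookies."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    # Field names are hypothetical; check the real form's input names.
    form = urllib.parse.urlencode({"username": username, "password": password}).encode()
    opener.open(login_url, data=form, timeout=30).close()
    return opener


def save_orders(opener, orders_url, csv_path):
    """Download one page with the logged-in opener and write its table rows to a CSV file."""
    with opener.open(orders_url, timeout=30) as response:
        html = response.read().decode("utf-8", errors="replace")
    rows = parse_table(html)
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        csv.writer(handle).writerows(rows)
    return rows
```

The idea is to call `open_logged_in_session(...)` once, then `save_orders(...)` for each page of invoices or orders. If the pages only fill in after JavaScript runs, this approach will come back with empty tables, and that's the point where something like Selenium becomes necessary.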

Yeah, that's the issue. I need to grab data from a database, not just the public content.