I'm looking for some advice on using Python specifically for data gathering tasks in my job. What are the essential Python skills or libraries I should focus on to effectively gather data? I'd really appreciate any insights or tips you all have.
2 Answers
I've worked with Beautiful Soup 4 for parsing older websites. For tasks that involve more interaction, I switched to Selenium and Playwright. Nowadays, I tend to focus more on working through APIs, which can be much more straightforward.
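To give a rough idea of what the API route can look like, here's a minimal sketch using only the standard library. The endpoint URL, the `page`/`per_page` parameters, and the assumption that each page comes back as a JSON list are all made up for illustration; a real API will document its own pagination scheme.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Hypothetical endpoint, used only for illustration.
BASE_URL = "https://api.example.com/v1/items"


def fetch_page(page, per_page=100):
    """Fetch one page of JSON from the (hypothetical) endpoint."""
    query = urlencode({"page": page, "per_page": per_page})
    with urlopen(f"{BASE_URL}?{query}", timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def collect_all(fetch=fetch_page, max_pages=50):
    """Request pages until an empty one comes back or max_pages is reached."""
    records = []
    for page in range(1, max_pages + 1):
        batch = fetch(page)
        if not batch:  # assumed convention: an empty list means no more data
            break
        records.extend(batch)
    return records
```

Passing the fetch function in as an argument means you can try the paging logic against canned data before pointing it at a live service.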
I use Python to scrape data from a vendor's website that's supposed to deliver business-critical information to us. I've been doing this for almost 2 years, and Python handles the whole process: scraping, cleaning, and pushing the data to our database. Right now, I'm using Playwright, which has been great!
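For anyone wondering what a scrape, clean, push flow might look like end to end, here's a small sketch. It is not the setup described above: it swaps Playwright for the standard library's `html.parser` and uses an in-memory SQLite database so it runs anywhere. The sample HTML, the `prices` table, and all the function names are invented for the example.

```python
import sqlite3
from html.parser import HTMLParser

# Made-up sample standing in for a page fetched by Playwright or similar.
SAMPLE_HTML = """
<table>
  <tr><td class="sku"> A-100 </td><td class="price">$1,250.00</td></tr>
  <tr><td class="sku">B-200</td><td class="price"> $99.50 </td></tr>
</table>
"""


class CellParser(HTMLParser):
    """Collect the text of every <td>, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._in_td = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag == "td" and self.rows:
            self._in_td = True
            self.rows[-1].append("")

    def handle_endtag(self, tag):
        if tag == "td":
            self._in_td = False

    def handle_data(self, data):
        if self._in_td:
            self.rows[-1][-1] += data


def scrape(html):
    """Step 1: pull raw cell text out of the HTML."""
    parser = CellParser()
    parser.feed(html)
    return [row for row in parser.rows if row]


def clean(rows):
    """Step 2: trim whitespace and turn '$1,250.00' into a float."""
    cleaned = []
    for sku, price in rows:
        amount = float(price.strip().lstrip("$").replace(",", ""))
        cleaned.append((sku.strip(), amount))
    return cleaned


def push(conn, records):
    """Step 3: upsert the cleaned records into a (hypothetical) prices table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS prices (sku TEXT PRIMARY KEY, price REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO prices (sku, price) VALUES (?, ?)", records
    )
    conn.commit()


def run_pipeline(html, conn):
    push(conn, clean(scrape(html)))
```

Keeping the three steps as separate functions makes it easier to change the scraping tool later without touching the cleaning or database code.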
In my previous job, we mainly used Beautiful Soup for straightforward web pages and Selenium for more complex, dynamic content. Recently, we've transitioned to Playwright for all new projects and even started updating older ones.