I'm new to programming. I tried to learn before but got lost without a specific project. Now I'm excited about a new idea! I'd like to build something that scans websites for books on certain themes. For example, I'd like to search Kakuyomu for novels that mention サッカー in their summaries or chapters, and do the same on Goodreads for books that mention 'soccer' in their summaries or reviews. The results should show each book's title along with a link to where the word appears. How difficult would this be to create, and which programming language would be best for it? Thanks!
1 Answer
It’s actually not too hard! Web scraping is quite beginner-friendly. I recommend Python with a library like BeautifulSoup (for parsing the HTML of a page) or Scrapy (for crawling many pages), since they take most of the work out of extracting text from websites. Just keep in mind that large sites like Goodreads may have anti-scraping measures, so check first whether they offer an official API you can use. For a site like Kakuyomu, basic scraping of the summary and chapter pages will probably work fine, as long as you keep your requests slow and infrequent.
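To give a feel for the core step, here is a minimal sketch of the "does this page mention the keyword" check. It uses Python's built-in `html.parser` instead of BeautifulSoup so it runs with nothing installed; the names `TextExtractor`, `page_text`, `find_matches`, and the `SAMPLE_PAGES` data (including the example.com URLs) are all made up for illustration, and the real page layout of Kakuyomu or Goodreads is not modeled here.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects visible text, skipping <script> and <style> contents."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def page_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def find_matches(pages, keyword):
    """Return (title, url) for every page whose visible text contains keyword.

    `pages` is a list of dicts with "title", "url", and "html" keys
    (a hypothetical shape chosen for this sketch).
    """
    needle = keyword.casefold()
    return [
        (page["title"], page["url"])
        for page in pages
        if needle in page_text(page["html"]).casefold()
    ]


# Hypothetical sample data standing in for pages you have already downloaded.
SAMPLE_PAGES = [
    {
        "title": "Example Novel A",
        "url": "https://example.com/works/1",
        "html": "<html><body><p>高校サッカー部の物語。</p></body></html>",
    },
    {
        "title": "Example Novel B",
        "url": "https://example.com/works/2",
        "html": "<html><body><p>A story about baking.</p>"
                "<script>var x = 'サッカー';</script></body></html>",
    },
]

if __name__ == "__main__":
    for title, url in find_matches(SAMPLE_PAGES, "サッカー"):
        print(title, url)
```

In a real project the `html` values would come from downloading each page, and BeautifulSoup's `get_text()` would replace the hand-written extractor. The matching logic stays about this simple either way.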

Robots.txt is interesting to me.
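On robots.txt: it is a plain-text file at the root of a site that tells crawlers which paths they may fetch. Python's standard library can read it through `urllib.robotparser`. The sketch below parses a made-up robots.txt from a string so it runs offline; `SAMPLE_ROBOTS_TXT`, the `is_allowed` helper, the `my-book-finder` user agent, and the example.com URLs are assumptions for illustration, not the rules of any real site.

```python
from urllib.robotparser import RobotFileParser

# A made-up robots.txt, not copied from any real site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Allow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for url in ("https://example.com/works/1",
                "https://example.com/private/notes"):
        print(url, is_allowed(SAMPLE_ROBOTS_TXT, "my-book-finder", url))
```

Against a live site you would call `set_url()` with the site's robots.txt address and then `read()` instead of `parse()`, and run the check before each download.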