I've been really interested in creating my own AI project lately, focusing more on the programming side rather than research. Instead of just using APIs, I want to dive in and actually code, train, and play around a bit. For anyone who's gone down this path: did you choose a specific framework like PyTorch or TensorFlow, or did you go for something a bit higher-level? Also, what's a realistic size for a project if I want to get interesting outcomes? Lastly, any advice on how to handle datasets and preprocessing without getting completely overwhelmed?
2 Answers
When you say 'from scratch', what do you mean? If you're looking to gather your own training data but use existing ML algorithms, that's a solid beginner project. Keep in mind, though, that how much data you need depends heavily on the problem. For example, classifying images by gender may require millions of examples, whereas a simpler problem, like classifying books by their dimensions, could work with just 100 examples and still give good results.
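To make the small-data case concrete, here's a rough sketch of the book idea using only the Python standard library. Everything specific in it is my own assumption for illustration: the two classes ("paperback" and "hardcover"), the made-up size profiles, the synthetic noisy data, and the choice of a nearest-centroid classifier as a stand-in for "existing ML algorithms". With real measurements the classes would likely overlap more and the score would be lower.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Made-up size profiles (width, height, thickness in cm); not real measurements.
PROFILES = {
    "paperback": (11.0, 18.0, 2.0),
    "hardcover": (16.0, 24.0, 4.0),
}

def make_examples(n_per_class):
    """Generate noisy synthetic books around each profile."""
    examples = []
    for label, dims in PROFILES.items():
        for _ in range(n_per_class):
            features = tuple(random.gauss(d, 1.0) for d in dims)
            examples.append((features, label))
    random.shuffle(examples)
    return examples

def fit_centroids(train):
    """Average the feature vectors of each class."""
    sums, counts = {}, {}
    for features, label in train:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, features):
    """Return the label whose centroid is closest (squared Euclidean distance)."""
    def dist(label):
        return sum((f - c) ** 2 for f, c in zip(features, centroids[label]))
    return min(centroids, key=dist)

data = make_examples(50)            # 100 examples in total
train, test = data[:80], data[80:]  # hold out 20 for evaluation
centroids = fit_centroids(train)
accuracy = sum(predict(centroids, f) == y for f, y in test) / len(test)
print(f"held-out accuracy: {accuracy:.2f}")
```

The point is the workflow rather than the model: collect a small labeled set, hold some of it out, fit something simple, and check the held-out score before reaching for PyTorch or TensorFlow.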
There's a great set of tutorials that I recommend. It's a series from Twitch that walks you through coding in real time, so you see the whole process unfold. It's super helpful for beginners!

I disagree! You don't need that many examples for a gender classification model, especially if you're just starting out. I found a paper where they achieved over 97% accuracy with fewer than 2,000 images. The key is making sure your data is consistent.