I've been working in database development for about five years, and I want to close a knowledge gap. I learned the principles of database normalization: breaking data down into well-structured tables linked by relationships, including junction (cross-reference) tables for many-to-many cases. But I'm starting to question whether splitting everything into multiple tables is really beneficial. For instance, if I have separate tables for invoice headers and invoice lines, I need more joins to pull the information back together, which could slow things down. Plus, doesn't having more tables mean more statements (and more careful transaction handling) when saving data? What impact do all these joins have on read performance? I'm starting to suspect that having too many small tables might actually be an anti-pattern.
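To make it concrete, here's a minimal sketch of the kind of schema and join I mean (table and column names are just illustrative, using SQLite for brevity):

```python
import sqlite3

# Hypothetical invoice schema: header and lines live in separate tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice_header (
        invoice_id INTEGER PRIMARY KEY,
        customer   TEXT NOT NULL
    );
    CREATE TABLE invoice_line (
        line_id    INTEGER PRIMARY KEY,
        invoice_id INTEGER NOT NULL REFERENCES invoice_header(invoice_id),
        product    TEXT NOT NULL,
        amount     REAL NOT NULL
    );
    -- Saving one invoice touches both tables (though it can still be
    -- a single transaction).
    INSERT INTO invoice_header VALUES (1, 'ACME');
    INSERT INTO invoice_line VALUES (1, 1, 'Widget', 9.99),
                                    (2, 1, 'Gadget', 4.50);
""")

# Reading the full invoice back always requires a join:
rows = conn.execute("""
    SELECT h.customer, l.product, l.amount
    FROM invoice_header h
    JOIN invoice_line l ON l.invoice_id = h.invoice_id
    ORDER BY l.line_id
""").fetchall()
print(rows)
```

Multiply that join pattern across a dozen related tables and you can see why I'm wondering about read performance.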
Any insights on this?
2 Answers
When it comes to normalizing your data, think of it like organizing a storage room. If you go too far, you end up with a separate box for each individual item, which is impractical. On the flip side, if everything is jumbled together in one pile, that doesn't help either. The key is balance: normalization should be the default, but sometimes you have to strategically undo some of that organization to improve efficiency. Just make sure the denormalization is intentional rather than a lucky accident!
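Here's what the "box for each item" extreme can look like in practice: a one-to-one attribute split into its own table, so even the most basic read pays for a join. (Schema names are made up for illustration, using SQLite.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Over-normalized: email is one-to-one with customer, yet it lives
# in its own table, i.e. its own "box".
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    );
    CREATE TABLE customer_email (
        customer_id INTEGER PRIMARY KEY
                        REFERENCES customer(customer_id),
        email       TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada');
    INSERT INTO customer_email VALUES (1, 'ada@example.com');
""")

# Every read of basic customer info now needs a join; folding email
# back into the customer table would cost nothing in redundancy here.
row = conn.execute("""
    SELECT c.name, e.email
    FROM customer c JOIN customer_email e USING (customer_id)
""").fetchone()
print(row)
```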
It's really situational. If your database is geared toward reporting, a bit of denormalization can genuinely help. Sure, it means repeating some data from your lookup tables, but cutting down the number of joins may be worth it in read performance. Just benchmark against your own data to see what works best. Denormalization is common practice in the database world; you could even say it's the norm, pun intended!
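A quick sketch of the reporting case (made-up schema, SQLite): the lookup value is copied into each row of a reporting table, so the report query runs with no join at all, at the cost of repeating the category name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: sales reference a category lookup table.
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sale (
        sale_id     INTEGER PRIMARY KEY,
        category_id INTEGER REFERENCES category(category_id),
        amount      REAL
    );
    INSERT INTO category VALUES (1, 'Hardware'), (2, 'Software');
    INSERT INTO sale VALUES (1, 1, 100.0), (2, 2, 250.0), (3, 1, 75.0);

    -- Denormalized reporting copy: the category name is repeated
    -- on every row, so reporting queries need no join.
    CREATE TABLE sale_report AS
        SELECT s.sale_id, c.name AS category_name, s.amount
        FROM sale s JOIN category c USING (category_id);
""")

totals = conn.execute("""
    SELECT category_name, SUM(amount)
    FROM sale_report
    GROUP BY category_name
    ORDER BY category_name
""").fetchall()
print(totals)
```

The trade-off: the report reads are simpler and join-free, but if a category is renamed you now have to update every copied row, which is exactly why this should be a deliberate choice backed by testing.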

Yeah, and I've heard CQRS takes this idea further: you separate the read model from the write model, so each can be shaped for its own access pattern. You just have to make those decisions carefully! What do you think?