Seeking Feedback on My High-Performance Storage Engine in Python

Asked By CuriousCoder33

Hey everyone! I'm diving into a project called PyLensDBLv1, a storage engine focused on minimizing read and update latencies. I've managed to stabilize the MVP, and now I need some insights on two main architectural decisions. The core idea is to use memory-mapped files so that disk storage acts like an extension of the process's virtual address space. With a fixed-width binary schema enforced by dataclass decorators, my engine avoids a lot of overhead typical in traditional databases. I'm currently evaluating its performance against SQLite and wanted to tap into the community's expertise, especially regarding handling relational data and commit-time memory management as I plan for potential scaling. Here are two specific questions I have:
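For context, here is a minimal sketch of the "calculation over search" idea: with a fixed-width binary record layout over a memory-mapped file, a record's offset is computed from its index rather than found by searching. The record layout below (id, name, balance) is purely illustrative and not taken from PyLensDBLv1.

```python
import mmap
import struct

# Hypothetical fixed-width record: 8-byte int id, 32-byte name, 8-byte float.
RECORD = struct.Struct("<q32sd")  # 48 bytes per record

def write_initial(path, rows):
    # Lay records out back-to-back at fixed width.
    with open(path, "wb") as f:
        for rid, name, balance in rows:
            f.write(RECORD.pack(rid, name.encode().ljust(32, b"\x00"), balance))

def read_record(path, index):
    # O(1) lookup: the offset is calculated, never searched for.
    with open(path, "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:
            rid, name, balance = RECORD.unpack_from(mm, index * RECORD.size)
            return rid, name.rstrip(b"\x00").decode(), balance
```

Because the file is mapped into virtual memory, reads touch the page cache directly and incur no explicit I/O calls on the hot path.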

1. **Relationships**: Should I implement native foreign key support despite the challenges of maintaining pointers during data relocation? Would it be better to keep things flat and handle joins at the application level?

2. **Relocation Strategy**: As my database grows, I'm concerned about the bottleneck caused by my current atomic shadow-swap method for file rewriting. Are there better practices for maintaining contiguity without needing a full rewrite?

I'd love to hear your thoughts, especially regarding the feasibility of this 'calculation over search' approach for production use!

2 Answers

Answered By TechSavvyNerd

Sounds like a fascinating project you've got going on! I think the idea of not implementing foreign keys and letting the application handle relationships might be a better route, especially for performance. You could avoid overhead by maintaining a flat structure. Plus, it gives you flexibility on how data is queried and managed without the complexity of enforcing constraints at the database level. If future performance issues arise, there might be ways to optimize joins at the application layer instead.
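To make the application-level route concrete, a flat store can join related records with an in-memory hash join: build an index on one side, probe it with the other. This is a generic sketch (the `users`/`orders` data is made up for illustration), not code from the engine itself.

```python
from collections import defaultdict

def hash_join(left, right, left_key, right_key):
    """Inner-join two lists of dict records at the application level.
    Builds a hash index on the right side, then probes it row by row."""
    index = defaultdict(list)
    for row in right:
        index[row[right_key]].append(row)
    joined = []
    for row in left:
        for match in index.get(row[left_key], []):
            joined.append({**row, **match})
    return joined

users = [{"user_id": 1, "name": "ann"}, {"user_id": 2, "name": "ben"}]
orders = [{"order_id": 10, "user_id": 1}, {"order_id": 11, "user_id": 1}]
result = hash_join(orders, users, "user_id", "user_id")
```

The trade-off DataDude99 raises below still applies: the join is O(n + m), but referential integrity (e.g. an order pointing at a deleted user) now has to be enforced by the application.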

DataDude99 -

But wouldn't that mean more manual work for developers? Having foreign keys could simplify data integrity checks, even if it adds some overhead. It really depends on the specific use cases you're targeting.

Answered By DevChallenger45

I get where you're coming from with the atomic shadow-swap, but you might want to look into techniques like copy-on-write or incremental updates to keep that contiguity without rewriting entire files each time you make changes. It could save you a lot of headaches as your database scales up. Either way, your focus on high performance is definitely the right direction; just make sure to choose the method that balances ease of use and efficiency!
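To illustrate the copy-on-write direction: instead of shadow-swapping the whole file, a page-level scheme appends a fresh copy of only the modified page and repoints a page table, so commit cost is proportional to the change, not the file size. This is a deliberately tiny toy (in-memory page table, no crash recovery), just to show the mechanism.

```python
PAGE_SIZE = 64  # tiny pages to keep the demo readable

class ShadowPagedFile:
    """Toy page-level copy-on-write store: an update appends a new
    version of the page and updates the page table, so no existing
    bytes are overwritten and no full-file rewrite is ever needed."""

    def __init__(self, path, n_pages):
        self.path = path
        with open(path, "wb") as f:
            f.write(b"\x00" * PAGE_SIZE * n_pages)
        self.table = list(range(n_pages))  # logical page -> physical slot
        self.next_slot = n_pages           # next free slot at end of file

    def read(self, page_no):
        with open(self.path, "rb") as f:
            f.seek(self.table[page_no] * PAGE_SIZE)
            return f.read(PAGE_SIZE)

    def write(self, page_no, data):
        assert len(data) <= PAGE_SIZE
        with open(self.path, "r+b") as f:
            f.seek(self.next_slot * PAGE_SIZE)
            f.write(data.ljust(PAGE_SIZE, b"\x00"))
        self.table[page_no] = self.next_slot  # commit = one pointer update
        self.next_slot += 1
```

A real engine would persist the page table atomically and reclaim stale slots with a background compaction pass, which is where the contiguity question reappears at a much smaller scale.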

CuriousCoder33 -

Thanks for the advice! I'll definitely research copy-on-write techniques. I'm just eager to find a balance between performance and maintainability as the project grows.
