I'm working on a Python library aimed at improving I/O performance, but I've noticed that my gains are primarily in read operations, with no significant improvement in writes. Is focusing solely on read performance worthwhile enough to share with the community? What do you think?
4 Answers
There are definitely scenarios where improving file reading is super important, like in machine learning projects that deal with large datasets. The real key is figuring out how significant your improvements are and where they can be applied. It might be worth sharing if it helps specific use cases!
Could you share your benchmarking methodology and hardware specs? That would help us judge whether the results hold up: either you've made a genuine breakthrough, or something is off in the measurements. Looking forward to the project, though!
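For reference, here's a minimal sketch of how one might measure raw read throughput with the standard library, as a baseline to compare against. The file path, block size, and cache-warming step are assumptions; for cold-cache numbers you'd need to drop the OS page cache between runs instead.

```python
import os
import time

PATH = "testdata.bin"      # placeholder: any sufficiently large local file
BLOCK = 1024 * 1024        # 1 MiB per read() call

def warm_cache(path: str) -> None:
    # Read the file once so every timed run hits the page cache equally.
    with open(path, "rb") as f:
        while f.read(BLOCK):
            pass

def bench_read(path: str, repeats: int = 5) -> None:
    size = os.path.getsize(path)
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        with open(path, "rb") as f:
            while f.read(BLOCK):
                pass
        times.append(time.perf_counter() - start)
    best = min(times)
    print(f"{size / best / 1e6:.1f} MB/s (best of {repeats} runs)")

if __name__ == "__main__":
    warm_cache(PATH)
    bench_read(PATH)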
If you really can boost file I/O speeds over the standard library, go for it! Consider submitting a PR to the standard library too; it could benefit a lot of users.
I actually shared some benchmark results on Twitter. Here's the link if you're interested: [benchmark results](https://x.com/hafitoalimania/status/1923118761961210154?s=46&t=aumw9aIpe-j7gZNLT3DJfw)
The CPython io module is a thin layer over the low-level OS read/write calls (buffering is implemented in C in `_io`), so for synchronous operations you're already close to the limit of what's possible. If you're targeting asynchronous I/O there's more headroom, but it means diving into OS-level async facilities.
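To illustrate why async is where the headroom is: asyncio has no native asynchronous file I/O, so the usual workaround today is to offload blocking reads to a thread pool. A minimal sketch (the file names are placeholders); a library that hooked into io_uring on Linux or IOCP on Windows directly could in principle do better than this.

```python
import asyncio

def read_file_blocking(path: str) -> bytes:
    # Ordinary blocking read; executed in a worker thread below.
    with open(path, "rb") as f:
        return f.read()

async def read_many(paths: list[str]) -> list[bytes]:
    # asyncio.to_thread offloads each blocking read to the default
    # thread pool so the event loop stays responsive.
    return await asyncio.gather(
        *(asyncio.to_thread(read_file_blocking, p) for p in paths)
    )

if __name__ == "__main__":
    # Placeholder paths; replace with real files.
    data = asyncio.run(read_many(["a.bin", "b.bin"]))
    print([len(d) for d in data])
```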
Thanks for the insight! That totally makes sense.