Hey everyone! I'm currently running a deep learning pipeline with AlphaFold3 on a p4de.24xlarge instance, processing a lot of protein data. The instance has eight local NVMe SSDs, and I'm planning to store a large sequence database (about 700 GB) on them. Since I'm running eight inference jobs simultaneously, each on a single GPU, I'm concerned that having them all read from one SSD could slow things down, so I'm considering copying the database to each SSD. Is that a good approach, or is there a better AWS storage option that is available right from boot and offers fast enough reads to handle the load?
3 Answers
Make sure your code can reliably point each job at the right SSD path. Also, consider Amazon FSx for Lustre if you're looking for shared, high-speed storage that can be mounted as soon as your instance comes up; that could improve performance significantly.
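As a minimal sketch of the path-targeting part, assuming the eight instance-store volumes are formatted and mounted at /data/nvme0 through /data/nvme7 and each holds its own copy of the database (adjust the paths to your actual layout):

```python
import os

NUM_SSDS = 8  # p4de.24xlarge exposes eight local NVMe instance-store volumes

def database_dir_for_job(gpu_index: int) -> str:
    """Return the local-SSD database copy matching this job's GPU index."""
    ssd_index = gpu_index % NUM_SSDS
    path = f"/data/nvme{ssd_index}/alphafold3_db"  # hypothetical mount layout
    if not os.path.isdir(path):
        raise FileNotFoundError(f"No database copy found on SSD {ssd_index}: {path}")
    return path

if __name__ == "__main__":
    # Example: a job pinned to GPU 3 via CUDA_VISIBLE_DEVICES reads /data/nvme3/...
    gpu = int(os.environ.get("CUDA_VISIBLE_DEVICES", "0").split(",")[0])
    print(database_dir_for_job(gpu))
```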
Your instance offers 4x 100 Gbps of network bandwidth, so a straightforward solution is to use the AWS CLI with the CRT transfer client enabled and download the database from an S3 bucket in the same Region at the start of each job. It keeps things simple!
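For illustration, here's a rough Python sketch of that per-job download step; the bucket name and destination path are placeholders, and it assumes AWS CLI v2 is installed (the CRT client is switched on through its S3 configuration):

```python
import subprocess

BUCKET_PREFIX = "s3://my-alphafold3-db-bucket/sequence_db/"  # hypothetical bucket
LOCAL_DB_DIR = "/data/nvme0/alphafold3_db"                   # this job's local SSD

def fetch_database() -> None:
    # Switch the AWS CLI v2 S3 commands over to the CRT-based transfer client.
    subprocess.run(
        ["aws", "configure", "set", "default.s3.preferred_transfer_client", "crt"],
        check=True,
    )
    # Pull the database from the in-Region bucket onto this job's local SSD.
    subprocess.run(["aws", "s3", "sync", BUCKET_PREFIX, LOCAL_DB_DIR], check=True)

if __name__ == "__main__":
    fetch_database()
```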
Here are a few things to think about:
1. What's your required read speed?
2. You might want to consider:
- AWS EFS or FSx for sharing storage across EC2 instances. EFS One Zone storage is fairly inexpensive, roughly $30/month for 700 GB, and multiple clients can read from it at the same time.
- Alternatively, just use S3 and copy the database directly onto each local SSD attached to your instance (see the sketch after this list).
Keep in mind that costs add up: EFS/FSx can get pricey compared to the local SSDs the instance already includes.
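If you go the direct S3-to-SSD route from item 2, here's a minimal Python sketch that fans the database out to all eight local SSDs in parallel; the bucket name and mount points are assumptions, so adjust them to your setup:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

BUCKET_PREFIX = "s3://my-alphafold3-db-bucket/sequence_db/"  # hypothetical bucket
MOUNTS = [f"/data/nvme{i}/alphafold3_db" for i in range(8)]  # hypothetical mounts

def sync_one(dest: str) -> None:
    # Each call shells out to the AWS CLI, so the eight copies run in parallel
    # and each SSD is written independently.
    subprocess.run(["aws", "s3", "sync", BUCKET_PREFIX, dest], check=True)

if __name__ == "__main__":
    with ThreadPoolExecutor(max_workers=len(MOUNTS)) as pool:
        # Force the iterator so any failed sync raises instead of being ignored.
        list(pool.map(sync_one, MOUNTS))
```

Pulling eight copies from an in-Region bucket avoids data-transfer charges, though downloading once and then copying SSD-to-SSD locally would cut the number of S3 requests.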