Is My EC2 Vertical Scaling Approach for a Data Ingestion Task on AWS Sound?

Asked By CloudySky34 On

Hey everyone! I'm a newbie cloud architect and I just got my certification last month. I recently gained access to my company's AWS console and I'm hoping to get some feedback on my approach to scaling an EC2 instance for a data ingestion task.

We currently have a single EC2 instance that serves a low-traffic client website, but we also run a scheduled data ingestion job on the first of every month. This job tends to crash the server because it consumes a lot of resources during the first few days of the month. The developer has asked for a temporary increase in the server's specs for that period. A few minutes of downtime during the resize is acceptable, as long as clients notice only a brief blip.

My proposed solution involves using **SSM Automation** to:
1. Stop the EC2 instance
2. Change the `InstanceType`
3. Start the instance again
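
To make the three steps concrete, here is a rough sketch of what a custom Automation runbook for them might look like, built as a Python dict so it can be dumped to JSON. It is a minimal sketch rather than a tested document: the step names and the `InstanceId` / `InstanceType` parameter names are my own choices, and it assumes the `aws:changeInstanceState` and `aws:executeAwsApi` actions with schema version 0.3.

```python
import json


def build_resize_runbook() -> dict:
    """Return an SSM Automation document (schema 0.3) as a plain dict.

    Parameter and step names are illustrative assumptions.
    """
    return {
        "schemaVersion": "0.3",
        "description": "Stop an instance, change its type, start it again.",
        "parameters": {
            "InstanceId": {"type": "String"},
            "InstanceType": {"type": "String"},
        },
        "mainSteps": [
            {
                # Step 1: stop the instance and wait until it is stopped
                "name": "stopInstance",
                "action": "aws:changeInstanceState",
                "inputs": {
                    "InstanceIds": ["{{ InstanceId }}"],
                    "DesiredState": "stopped",
                },
            },
            {
                # Step 2: change the instance type via the EC2 API
                "name": "changeInstanceType",
                "action": "aws:executeAwsApi",
                "inputs": {
                    "Service": "ec2",
                    "Api": "ModifyInstanceAttribute",
                    "InstanceId": "{{ InstanceId }}",
                    "InstanceType": {"Value": "{{ InstanceType }}"},
                },
            },
            {
                # Step 3: start the instance and wait until it is running
                "name": "startInstance",
                "action": "aws:changeInstanceState",
                "inputs": {
                    "InstanceIds": ["{{ InstanceId }}"],
                    "DesiredState": "running",
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_resize_runbook(), indent=2))
```

The same document could be used for both the scale-up and the scale-down, since only the `InstanceType` parameter differs.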

I plan to trigger it with **EventBridge Scheduler** rules to scale up on the 1st at 00:05 JST and scale down on the 8th at the same time. I'm also considering wrapping this whole operation in a **CloudFormation template** to make it reusable for other instances in the future.
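
For the schedule itself, the sketch below builds the arguments I would expect to pass to EventBridge Scheduler's `create_schedule` call. Several things here are assumptions rather than part of my actual setup: the schedule names, the runbook name `ResizeInstance`, the instance ID, the instance types, and the role ARN are all placeholders, and the target uses the universal `aws-sdk` ARN form for `ssm:startAutomationExecution`. The small helper at the end shows why the time zone has to be set explicitly: 00:05 JST on the 1st is still the last day of the previous month in UTC.

```python
import json
from datetime import datetime, timedelta, timezone

# Japan does not observe DST, so a fixed +09:00 offset is enough here.
JST = timezone(timedelta(hours=9))


def monthly_cron(day: int, hour: int, minute: int) -> str:
    """Cron expression: minutes hours day-of-month month day-of-week year."""
    return f"cron({minute} {hour} {day} * ? *)"


def schedule_params(name, day, instance_id, instance_type, runbook_name, role_arn):
    """Build keyword arguments for a Scheduler create_schedule call (sketch)."""
    return {
        "Name": name,
        "ScheduleExpression": monthly_cron(day, 0, 5),
        "ScheduleExpressionTimezone": "Asia/Tokyo",
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": "arn:aws:scheduler:::aws-sdk:ssm:startAutomationExecution",
            "RoleArn": role_arn,  # placeholder role that may start the runbook
            "Input": json.dumps({
                "DocumentName": runbook_name,
                "Parameters": {
                    "InstanceId": [instance_id],
                    "InstanceType": [instance_type],
                },
            }),
        },
    }


def jst_to_utc(year, month, day, hour, minute):
    """Convert a JST wall-clock time to the equivalent UTC datetime."""
    return datetime(year, month, day, hour, minute, tzinfo=JST).astimezone(timezone.utc)


if __name__ == "__main__":
    role = "arn:aws:iam::123456789012:role/placeholder-scheduler-role"
    up = schedule_params("scale-up", 1, "i-0123456789abcdef0", "t3.xlarge", "ResizeInstance", role)
    down = schedule_params("scale-down", 8, "i-0123456789abcdef0", "t3.small", "ResizeInstance", role)
    print(up["ScheduleExpression"], down["ScheduleExpression"])
    print(jst_to_utc(2025, 3, 1, 0, 5))
```

Each dict would be passed as `**params` to the Scheduler client, one call for the scale-up and one for the scale-down.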

Does this approach align with industry best practices, or am I potentially overengineering? I'm the only one here with a cloud architect perspective, so any insights would be super helpful! P.S. This is my first post on Reddit!

4 Answers

Answered By DataNerd77 On

Just a heads up, I think more context is needed for your situation. What exactly does the ingestion job do? Where is it pulling data from? And what's causing the crashes? If it’s mostly due to CPU or RAM issues, it might help to run the ingestion on a separate instance or even use services like AWS Glue or Batch. This way, you can keep your web server performing optimally while running heavy jobs elsewhere. If we knew more about the data dependencies and bottlenecks, we could give you even better advice!

CloudySky34 -

Thanks for your feedback! I've added more details to the post regarding my situation. I'm really just trying to learn as much as I can here.

Answered By TechWhiz_91 On

Your plan is technically feasible, but there are some architectural concerns you should address. Mixing your live site with a resource-intensive ingestion job is risky, and it's usually best to separate these workloads. Consider running your ingestion jobs on a service designed for batch processing like AWS Batch, ECS on Fargate, or even Lambda if possible. This way, you can manage and scale resources without impacting your website's performance.

Also, from a resilience standpoint, a single EC2 instance is a single point of failure. You might want to have at least two instances behind your Elastic Load Balancer in an Auto Scaling Group. This way, if one instance fails or is being resized, your website stays up. While your automation approach is clever, make sure you're prepared for potential issues such as a resize that fails partway through.
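
As one way to think about resize failures, here is a minimal Python sketch of the stop, modify, start sequence with a fallback to the original instance type if the start fails (for example, when there is no capacity for the larger type). It is not your SSM runbook, just the same logic written against a boto3-style EC2 client that is passed in as `ec2`. The function name is hypothetical, and catching a bare `Exception` is a simplification; with real boto3 you would likely catch `botocore.exceptions.ClientError` instead.

```python
def resize_instance(ec2, instance_id: str, new_type: str) -> str:
    """Resize an instance; revert to the old type if it will not start.

    `ec2` is expected to behave like a boto3 EC2 client.
    Returns the instance type that the instance ends up running as.
    """
    # Remember the current type so there is something to roll back to.
    current = ec2.describe_instance_attribute(
        InstanceId=instance_id, Attribute="instanceType"
    )["InstanceType"]["Value"]

    # The type can only be changed while the instance is stopped.
    ec2.stop_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id, InstanceType={"Value": new_type}
    )

    try:
        ec2.start_instances(InstanceIds=[instance_id])
        final_type = new_type
    except Exception:
        # Simplified error handling: put the old type back and start again
        # so the website comes back up even if the resize did not work.
        ec2.modify_instance_attribute(
            InstanceId=instance_id, InstanceType={"Value": current}
        )
        ec2.start_instances(InstanceIds=[instance_id])
        final_type = current

    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])
    return final_type
```

Whatever tool runs the steps, the point is the same: decide up front what should happen if the instance cannot start on the new type, and alert someone when the fallback path is taken.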

SysAdminGuru -

Great suggestion! Using serverless options can save a lot of hassle. If there's any way to shift workloads away from that EC2 instance, it would definitely reduce complexity.

NewbieArchitect123 -

Thanks for the feedback! I'm doing this for learning purposes, and I can't refactor the existing codebase. But I'll be sure to keep your suggestions in mind.

Answered By CuriousTechie On

I'm not sure about your choice to have an Elastic IP attached to an instance behind an ALB. Isn't that unnecessary? If you’re just running background jobs, have you thought about why that EIP is needed?

CloudySky34 -

Honestly, I’m not entirely sure. That was just how the instance was set up when I got access. I didn't want to release it until I was sure it wasn't being used anywhere.

Answered By PragmaticDev On

If you want a minimal adjustment, consider separating the web server from the ingestion job without extensive refactoring. You could introduce a new variable that controls whether the job runs on the same instance or on a different one. This way, the web server remains stable during the monthly loads. Would that work for you?
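
A minimal sketch of what I mean, assuming the job is started from some entry point you can wrap: the variable name `RUN_INGESTION` and both function names are made up for illustration. You would set the variable on a dedicated ingestion instance and leave it unset on the web server.

```python
import os


def should_run_ingestion(env=os.environ) -> bool:
    """True only when RUN_INGESTION is explicitly switched on."""
    value = env.get("RUN_INGESTION", "false")
    return value.strip().lower() in {"1", "true", "yes"}


def scheduled_entry_point(env=os.environ) -> str:
    """Wrapper the monthly scheduler would call instead of the job itself."""
    if not should_run_ingestion(env):
        # Web server instance: skip the heavy job entirely.
        return "skipped"
    # Ingestion instance: call the existing, unmodified job here.
    return "ingestion"
```

The existing ingestion code stays untouched; only the place where it gets invoked learns about the switch.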
