Welcome to the Cloudshalla Engineering Blog! We break down the real, unfiltered truths of DevOps, Cloud, and Platform Engineering fresh from the production trenches. If you are serious about stepping up your career, you are in exactly the right place.

The Audit That Saved ₹3.1 Lakhs/Month

In 2023, I was brought in to audit a fintech startup's AWS bill. They were spending ₹4.2 lakhs/month and couldn't figure out why. After 2 weeks of analysis and 6 weeks of changes, we brought it to ₹1.1 lakhs/month. Same workload. 74% reduction. Here's exactly what we did.

Finding 1: Zombie Resources (₹60K/month saved)

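The account was full of resources nobody used but everybody paid for: unattached EBS volumes, unassociated Elastic IPs, snapshots of servers deleted 8 months earlier, and stopped EC2 instances whose EBS volumes were still being billed. These two commands find the first two categories: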

# Find unattached EBS volumes
aws ec2 describe-volumes \
  --filters Name=status,Values=available \
  --query 'Volumes[*].[VolumeId,Size,CreateTime]' \
  --output table

# Find unassociated Elastic IPs
aws ec2 describe-addresses \
  --query 'Addresses[?AssociationId==`null`].[PublicIp,AllocationId]' \
  --output table
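
Once the volume listing is saved as JSON, the monthly waste can be estimated in a few lines. The following is a minimal sketch in Python, assuming the output of `aws ec2 describe-volumes --output json` has been loaded into a dict. The per-GB price is a placeholder rather than an AWS list price, and the sample data and volume IDs are made up for illustration.

```python
import json

# Placeholder price in rupees per GB-month; NOT an AWS list price.
# Replace it with the rate for your region and volume type.
ASSUMED_RUPEES_PER_GB_MONTH = 8.0


def unattached_volumes(describe_volumes_output):
    """Return volumes in the 'available' state, i.e. attached to nothing."""
    return [
        vol for vol in describe_volumes_output.get("Volumes", [])
        if vol.get("State") == "available"
    ]


def monthly_waste(volumes, price_per_gb=ASSUMED_RUPEES_PER_GB_MONTH):
    """Estimate the monthly cost of the given volumes from their size in GiB."""
    return sum(vol["Size"] for vol in volumes) * price_per_gb


# Made-up sample in the shape of `aws ec2 describe-volumes --output json`.
sample = json.loads("""
{
  "Volumes": [
    {"VolumeId": "vol-aaa", "Size": 100, "State": "available"},
    {"VolumeId": "vol-bbb", "Size": 50,  "State": "in-use"},
    {"VolumeId": "vol-ccc", "Size": 200, "State": "available"}
  ]
}
""")

zombies = unattached_volumes(sample)
print([vol["VolumeId"] for vol in zombies])
print(f"Estimated waste: ₹{monthly_waste(zombies):,.0f}/month")
```

Sorting the result by size before deleting anything makes it easy to review the most expensive volumes first.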

Finding 2: Wrong Instance Types (₹80K/month saved)

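They were running m5.xlarge instances (4 vCPUs, 16 GiB) that averaged 8% CPU utilization. Classic over-provisioning. AWS Compute Optimizer gave us the recommendations, and we moved to burstable t3.large instances (2 vCPUs, 8 GiB). At that load a t3.large stays under its 30% per-vCPU baseline, so it does not drain its CPU credits. Same workload, roughly 60% cheaper compute.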

  • Enable AWS Compute Optimizer (free service)
  • Review recommendations weekly for 4 weeks
  • Rightsizing rule: if average CPU < 20% for 2 weeks → downsize
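
The rightsizing rule above can be sketched as a small check. This is an illustration rather than what Compute Optimizer does internally: it assumes you already have one average-CPU figure per day (for example, exported from CloudWatch), and the function names are hypothetical. The second helper checks whether the same load would fit under a burstable instance's per-vCPU baseline.

```python
from statistics import mean


def should_downsize(daily_avg_cpu, threshold_pct=20.0, window_days=14):
    """Apply the rule: average CPU below the threshold over the last 2 weeks."""
    if len(daily_avg_cpu) < window_days:
        return False  # not enough history to decide
    return mean(daily_avg_cpu[-window_days:]) < threshold_pct


def fits_burstable_baseline(avg_cpu_pct, current_vcpus, target_vcpus, baseline_pct):
    """Check that the same load stays under the target's per-vCPU baseline."""
    load_on_target = avg_cpu_pct * current_vcpus / target_vcpus
    return load_on_target < baseline_pct


two_weeks_at_8_pct = [8.0] * 14
print(should_downsize(two_weeks_at_8_pct))

# m5.xlarge has 4 vCPUs; t3.large has 2 vCPUs with a 30% baseline per vCPU.
print(fits_burstable_baseline(8.0, current_vcpus=4, target_vcpus=2, baseline_pct=30.0))
```

Averages hide spikes, so it is worth checking peak CPU and memory for the same window before acting on the result.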

Finding 3: No Reserved Instances (₹1.2L/month saved)

They were running every instance on-demand. A 1-year Reserved Instance commitment saves roughly 40% over on-demand pricing, and a 3-year commitment roughly 60%; the exact discount depends on instance family, region, and payment option. We converted all stable, always-on workloads to 1-year Reserved Instances. This alone saved ₹1.2 lakhs/month.

Rule: Any EC2 instance running 24/7 for more than 3 months should be Reserved. No exceptions.
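
A quick break-even calculation shows why the rule only applies to always-on instances. The sketch below assumes the simplified discounts quoted above (40% and 60%) and that the commitment costs the same in total whether or not the instance runs; real discounts vary, and the function name is hypothetical.

```python
def break_even_months(discount, term_months=12):
    """Months of 24/7 on-demand usage that cost as much as the whole RI term."""
    return term_months * (1.0 - discount)


# (label, assumed discount, term in months)
options = [("1-year", 0.40, 12), ("3-year", 0.60, 36)]

for label, discount, term in options:
    months = break_even_months(discount, term)
    print(f"{label}: break-even after {months:.1f} months of 24/7 use")
```

Under these assumptions a 1-year commitment pays off only if the instance would otherwise have run for more than about 7 months of the year, which is why a stable 3-month track record is a sensible signal before committing.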

Finding 4: S3 Storage Classes (₹35K/month saved)

All data — including 2-year-old logs — was sitting in S3 Standard at ₹2.3/GB/month. We set up lifecycle policies to move data automatically:

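# S3 lifecycle policy (Terraform, AWS provider v4 or later)
# "aws_s3_bucket.logs" is a placeholder name; point it at your own bucket.
resource "aws_s3_bucket_lifecycle_configuration" "logs" {
  bucket = aws_s3_bucket.logs.id

  rule {
    id     = "tier-then-expire"
    status = "Enabled"

    filter {}  # empty filter: apply to every object in the bucket

    transition {
      days          = 30
      storage_class = "STANDARD_IA"  # roughly 45% cheaper per GB than Standard
    }
    transition {
      days          = 90
      storage_class = "GLACIER_IR"   # up to 68% cheaper per GB than Standard-IA
    }
    expiration {
      days = 365  # Delete after 1 year (adjust per compliance needs)
    }
  }
}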
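
To see what the policy does to a given object, it helps to map age to storage class and price. The following is a rough sketch, not a billing calculator: the thresholds mirror the lifecycle policy, the USD prices are assumed us-east-1 list prices that may be out of date for your region, and retrieval fees and minimum storage durations are ignored.

```python
from typing import Optional

# Lifecycle thresholds from the policy: (minimum age in days, storage class).
TRANSITIONS = [(90, "GLACIER_IR"), (30, "STANDARD_IA")]
EXPIRE_AFTER_DAYS = 365

# Assumed storage prices in USD per GB-month; check the S3 pricing page
# for your region before relying on them.
ASSUMED_USD_PER_GB_MONTH = {
    "STANDARD": 0.023,
    "STANDARD_IA": 0.0125,
    "GLACIER_IR": 0.004,
}


def storage_class_for_age(age_days: int) -> Optional[str]:
    """Return the class an object of this age sits in, or None once expired."""
    if age_days >= EXPIRE_AFTER_DAYS:
        return None
    for min_age, storage_class in TRANSITIONS:
        if age_days >= min_age:
            return storage_class
    return "STANDARD"


def monthly_cost(size_gb: float, age_days: int) -> float:
    """Monthly storage cost in USD for data of the given size and age."""
    storage_class = storage_class_for_age(age_days)
    if storage_class is None:
        return 0.0  # already deleted by the expiration rule
    return size_gb * ASSUMED_USD_PER_GB_MONTH[storage_class]


for age in (10, 45, 200, 400):
    print(age, storage_class_for_age(age), f"${monthly_cost(1000, age):.2f}")
```

Feeding the function a histogram of object ages from S3 Storage Lens or an inventory report gives a first estimate of the savings before the policy goes live.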

Finding 5: NAT Gateway Overuse (₹45K/month saved)

Every byte that passes through a NAT Gateway is billed a per-GB data processing charge, on top of the gateway's hourly fee. They had Lambda functions in a VPC making thousands of API calls per hour, all routed through NAT. We moved those specific Lambdas outside the VPC and used VPC endpoints for S3 and DynamoDB (the gateway type has no hourly or per-GB charge). ₹45K/month gone.
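
The effect of taking traffic off the NAT Gateway is easy to estimate. This is a back-of-the-envelope sketch: the hourly and per-GB rates are assumed us-east-1 list prices, and the traffic volume and offloaded fraction are hypothetical numbers, not figures from the audit.

```python
HOURS_PER_MONTH = 730

# Assumed NAT Gateway rates in USD; check the VPC pricing page for your region.
ASSUMED_USD_PER_HOUR = 0.045
ASSUMED_USD_PER_GB = 0.045


def nat_monthly_cost(gb_processed, gateways=1):
    """Hourly fee for each gateway plus the per-GB data processing charge."""
    hourly = gateways * HOURS_PER_MONTH * ASSUMED_USD_PER_HOUR
    data = gb_processed * ASSUMED_USD_PER_GB
    return hourly + data


def cost_after_offload(gb_processed, offloaded_fraction, gateways=1):
    """NAT cost once a fraction of traffic goes via endpoints or leaves the VPC."""
    remaining_gb = gb_processed * (1.0 - offloaded_fraction)
    return nat_monthly_cost(remaining_gb, gateways)


# Hypothetical workload: 10 TB/month through NAT, 80% of it movable.
before = nat_monthly_cost(10_000)
after = cost_after_offload(10_000, offloaded_fraction=0.8)
print(f"before: ${before:,.2f}/month, after: ${after:,.2f}/month")
```

The hourly fee stays as long as the gateway exists, so the saving comes almost entirely from the data processing line.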

💡 Quick Answer: To reduce AWS costs by 70%: delete zombie resources (unattached EBS, unused EIPs), rightsize instances using Compute Optimizer, convert always-on instances to Reserved Instances (40–60% savings), set S3 lifecycle policies to move old data to cheaper tiers, and reduce NAT Gateway usage with VPC endpoints.
