Petabyte-Scale Cost Optimization: How a Video Hosting Platform Saved 70% on S3

FinOps Article

Data is to a business what air is to the human body: it keeps things running, growing, and evolving. As data volumes grow into petabytes and beyond, the cost of storing that data becomes a pressing concern. Video hosting platforms are among the most data-intensive businesses and must continually balance operational efficiency against cost. This is the story of a video hosting platform that used AWS-native tools to cut its Amazon S3 costs by 70%.

Understanding the Scale of the Challenge

Operating in a storage-intensive industry, the platform had amassed over one million full HD 1080p videos, equivalent to approximately 10 petabytes (PB) of data. Amazon Simple Storage Service (S3) had been its go-to choice for scalability and cost-efficiency. However, as the business expanded, so did its Amazon S3 costs, which came to account for 40% of total infrastructure expenses.

Using the AWS Cost and Usage Report, the company performed a granular analysis of its S3 expenses and found that the bulk of the cost (88%) came from S3 Glacier Instant Retrieval (GIR). GET request and data retrieval charges were unexpectedly high, signaling a mismatch between the architecture’s design assumptions and real-world usage.

Socratic Journey: Evaluating Architectural Dynamics

At the heart of this platform was a Just-In-Time Processing (JITP) architecture. This approach was intended to minimize costs by transcoding videos into specific formats only upon request, eliminating the need to store multiple renditions. The design was effective at first, but the combination of S3 GIR’s low storage cost (approximately $4 per TB per month) with unexpectedly high retrieval and GET charges prompted a reassessment.
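To see how retrieval and GET charges can outweigh cheap storage, here is a minimal cost sketch. The per-unit prices are illustrative assumptions (roughly in line with public us-east-1 list prices, which change over time), and the workload figures other than the 10 PB stored are hypothetical; none of them are taken from the platform’s actual bill.

```python
# Illustrative prices (assumptions; check current AWS pricing before relying on them).
GIR_STORAGE_PER_GB_MONTH = 0.004  # about $4 per TB per month, matching the figure cited above
GIR_RETRIEVAL_PER_GB = 0.03       # assumed per-GB retrieval fee
GIR_GET_PER_1000 = 0.01           # assumed price per 1,000 GET requests

GB_PER_TB = 1000


def gir_monthly_cost(stored_tb, retrieved_tb, get_requests):
    """Return a breakdown of one month's GIR cost in dollars."""
    storage = stored_tb * GB_PER_TB * GIR_STORAGE_PER_GB_MONTH
    retrieval = retrieved_tb * GB_PER_TB * GIR_RETRIEVAL_PER_GB
    requests = get_requests / 1000 * GIR_GET_PER_1000
    return {
        "storage": storage,
        "retrieval": retrieval,
        "requests": requests,
        "total": storage + retrieval + requests,
    }


# Hypothetical month: 10 PB stored, 1 PB read back, 2 billion GET requests.
breakdown = gir_monthly_cost(stored_tb=10_000, retrieved_tb=1_000,
                             get_requests=2_000_000_000)
for item, dollars in breakdown.items():
    print(f"{item:>10}: ${dollars:,.0f}")
```

Under these assumed numbers, storage comes to about $40,000 for the month while retrieval and request charges together add about $50,000, which is the kind of imbalance the cost report surfaced.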

S3 Access Logging: Unmasking the Cost Culprits

Turning to Amazon S3 server access logging, the team quickly found the ‘needle in the haystack’. Analyzing the access logs with Amazon Athena revealed that a tiny fraction of video files accounted for a disproportionate share of GET and retrieval activity. Identifying this subset was pivotal in the decision to move it to S3 Intelligent-Tiering, which charges no retrieval fees and has significantly lower GET costs than GIR.
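The team ran this analysis in Athena; the sketch below shows the same idea in plain Python as a stand-in, not their actual query. It counts GET operations per object key from S3 server access log lines and then picks the smallest set of keys that covers a chosen share of all GETs. The function names, the 80% threshold, and the sample log lines are hypothetical.

```python
import re
from collections import Counter

# Leading fields of an S3 server access log record:
# owner, bucket, [time], remote IP, requester, request ID, operation, key
LOG_PREFIX = re.compile(
    r'^\S+ \S+ \[[^\]]+\] \S+ \S+ \S+ (?P<operation>\S+) (?P<key>\S+) '
)


def count_gets(lines):
    """Count object GET requests per key."""
    counts = Counter()
    for line in lines:
        match = LOG_PREFIX.match(line)
        if match and match.group("operation") == "REST.GET.OBJECT":
            counts[match.group("key")] += 1
    return counts


def hot_keys(counts, share=0.8):
    """Most-requested keys that together cover at least `share` of all GETs."""
    total = sum(counts.values())
    selected, covered = [], 0
    for key, n in counts.most_common():
        if covered >= share * total:
            break
        selected.append(key)
        covered += n
    return selected


def sample_line(operation, key):
    # Made-up log record, shaped like the documented format.
    return (f'ownerid example-bucket [06/Feb/2019:00:00:38 +0000] 192.0.2.3 '
            f'requester REQ123 {operation} {key} "GET /{key} HTTP/1.1" '
            f'200 - 1024 1024 10 9 "-" "-" -')


sample_logs = (
    [sample_line("REST.GET.OBJECT", "videos/a.mp4")] * 8
    + [sample_line("REST.GET.OBJECT", "videos/b.mp4")]
    + [sample_line("REST.GET.OBJECT", "videos/c.mp4")]
    + [sample_line("REST.PUT.OBJECT", "videos/d.mp4")]
)
counts = count_gets(sample_logs)
print(hot_keys(counts, share=0.8))  # ['videos/a.mp4']
```

At petabyte scale the logs would not fit in a single process, which is presumably why a query engine such as Athena was used; the aggregation logic is the same.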

Architecting Cost Effectiveness Through Tiered Storage

Reclassifying the most frequently accessed objects to S3 Intelligent-Tiering produced immediate, substantial savings. The team’s analysis went further: because roughly 10% of the content was responsible for the majority of GET requests, that content should reside in Intelligent-Tiering from the outset rather than being moved there later. Using multiple S3 storage classes in this way kept costs from escalating further.
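The placement decision can be sketched as a per-object cost comparison. The prices below are illustrative assumptions, not figures from the article, and the sketch assumes a hot object stays in the Intelligent-Tiering Frequent Access tier, which is the most expensive case for that class. `GLACIER_IR` and `INTELLIGENT_TIERING` are the S3 storage class identifiers; the function names are hypothetical.

```python
# Assumed per-unit prices in dollars; verify against current AWS pricing.
PRICES = {
    "GLACIER_IR": {
        "storage_gb": 0.004,
        "retrieval_gb": 0.03,
        "get_per_1000": 0.01,
        "monitor_per_object": 0.0,
    },
    "INTELLIGENT_TIERING": {
        "storage_gb": 0.023,                   # Frequent Access tier (assumed)
        "retrieval_gb": 0.0,                   # no retrieval fee
        "get_per_1000": 0.0004,
        "monitor_per_object": 0.0025 / 1000,   # monitoring fee per object
    },
}


def monthly_cost(storage_class, size_gb, gets, retrieved_gb):
    """Estimated monthly cost of one object in the given storage class."""
    p = PRICES[storage_class]
    return (size_gb * p["storage_gb"]
            + retrieved_gb * p["retrieval_gb"]
            + gets / 1000 * p["get_per_1000"]
            + p["monitor_per_object"])


def cheaper_class(size_gb, gets, retrieved_gb):
    """Pick the storage class with the lower estimated monthly cost."""
    return min(PRICES, key=lambda c: monthly_cost(c, size_gb, gets, retrieved_gb))


# A 10 GB video that nobody watches this month.
print(cheaper_class(size_gb=10, gets=0, retrieved_gb=0))
# The same video read in full five times through 5,000 GET requests.
print(cheaper_class(size_gb=10, gets=5000, retrieved_gb=50))
```

With these assumptions the idle video is cheaper in GIR, while the popular one is cheaper in Intelligent-Tiering, since retrieval fees quickly exceed the difference in storage price.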

Optimizing Further: Beyond Storage Gains

Cost optimization’s final frontier lay beyond storage. The team improved the delivery pipeline itself, focusing on content delivery through Amazon CloudFront and on the data requests issued by the Nginx-based packaging layer. By raising CloudFront cache hit rates and reducing repetitive data fetches through optimized byte-range retrievals, they cut the number of GET requests by 90%.
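The article does not describe exactly how the byte-range retrievals were optimized. One plausible technique, shown here purely as an assumption, is to coalesce overlapping or adjacent byte ranges before fetching, so that several small reads become a single GET. The function names and the `max_gap` parameter are hypothetical.

```python
def coalesce_ranges(ranges, max_gap=0):
    """Merge inclusive (start, end) byte ranges that overlap, touch,
    or sit within `max_gap` bytes of each other."""
    merged = []
    for start, end in sorted(ranges):
        if merged and start <= merged[-1][1] + 1 + max_gap:
            # Extend the previous range instead of issuing another request.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def range_header(start, end):
    """Format an HTTP Range header value (both bounds are inclusive)."""
    return f"bytes={start}-{end}"


# Four requested ranges collapse into two GET requests.
requested = [(0, 999), (1000, 1999), (500, 1499), (5000, 5999)]
for start, end in coalesce_ranges(requested):
    print(range_header(start, end))
```

A larger `max_gap` trades a few extra transferred bytes for fewer requests, which matters when each GET is billed individually.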

A Retrospective and Forward Path

Through meticulous analysis and strategic architectural changes, the platform achieved a 70% reduction in its annual S3 bill. The case highlights the double-edged nature of cloud technology: it is highly scalable, but it demands vigilant cost management to realize its full potential.

This journey underscores the importance of not just selecting the right tools but also understanding how to use them well. As businesses increasingly adopt cloud-native strategies, this kind of architectural tuning can serve as a model for others in the field.

The combination of AWS’s robust toolkit and keen insights derived from data paves the way for efficient, responsive, and financially prudent operations. The journey of this video hosting platform is a testament to the power of in-depth analysis and effective cost-management strategies in the digital age.