Soccer fans around the world have just embraced a full month of excitement, brought about by the 2014 FIFA World Cup. At Conviva, we had our own share of excitement building up to a record traffic crescendo by the time the USA v. Germany game came around.
Conviva’s real-time Intelligent Control Platform met the challenge of processing unprecedented levels of global traffic while maintaining liveliness of data and high availability. In doing so, our platform reached several milestones over the course of the event, highlighting its scale and power:
- 28+ billion viewer minutes
- 700+ million unique viewers
- 3.2 million peak concurrent plays
As video continues its push to the online medium, we have proved that Quality of Experience has a significant impact on user viewing behavior, as documented in our 2014 Viewer Experience Report. Conviva addresses the challenge of maintaining viewer engagement by providing visibility and optimization, ensuring a high-quality user experience for our video streaming customers’ end users.
In this blog posting, we focus on one aspect of the scaling exercise that was undertaken to support the World Cup: How we leveraged a geo-diverse, hybrid-cloud deployment to meet the availability, throughput and liveness requirements for what turned out to be a breakthrough event for live sports streaming.
(For a look at engagement trends highlighted by World Cup data, take a look here.)
The Conviva platform can be abstracted into three major component groups:
- Gateway and messaging layer, which handles the communication with the video players and logs QoS analytics, as well as responds back with Optimization Decisions. This is the entry point to the platform and is both availability- and latency-sensitive.
- Aggregation and compute layer, which consists of a low-latency streaming map-reduce, built in-house for real-time analytics, as well as a historical data computation workflow built on top ofApache Hadoop.
- Storage, query and presentation layer, which ensures that we are storing data based on optimizing querying patterns for use in Pulse, our web portal.
Conviva’s real-time platform is unique in the industry and is a key differentiator when it comes to live-event streaming analytics. Our platform has been designed to meet the following, often conflicting, requirements:
- High Availability – Any critical-path services should stay up in the face of individual server failures, a datacenter/availability zone failure or network partition. The gateway and messaging layer thereby needs to have high resiliency.
- Horizontal Scalability – System capacity should linearly scale with traffic by adding more hardware.
- High Throughput and Low latency – Important for inter-datacenter transfers.
- Traffic Bursts – Need to absorb unusual spikes in load generated by viral video events.
- Disaster Recovery – Multi-tier data persistence is required at multiple geographical locations so we can recover from catastrophic failures.
Optimizing the Gateway and Messaging Layer
The gateway and messaging layer happens to be most sensitive to latency and reliability requirements. Any service-impacting issue at this layer may lead to irrecoverable data loss as the data hasn’t yet hit persistence layers.
Because of how we manage session state across the platform, this layer fortunately happens to also be easier to scale up. In essence, the gateway layer’s main functionality can be expressed as a ‘Map’ function without a ‘Reduce’ step, to draw a parallel with the MapReduce concept; this makes it horizontally scalable as long as the data can still be chunked.
Conviva’s multi-tenant platform is capable of being configured to partition traffic by various dimensions (e.g. traffic from different customers may be sent to different geographical datacenters). This capability allowed us to leverage our multi-datacenter hybrid cloud deployment tuned specifically for World Cup traffic.
So, for example, bringing the gateway layers closer to the end users in multiple major geographic regions allowed us to reduce the critical path round trip time between the video players and Conviva.
The low-latency requirement, however, required more work than simply choosing the appropriate type and location of cloud deployment. Data transmission over TCP between two physically far apart locations is afflicted by the well-known LFN problem, where the high-bandwidth latency product may lead to potentially limited throughput, unless carefully tuned.
After evaluating some options, we decided to use Apache Kafka as our primary inter-datacenter messaging layer. This decision was informed by Kafka’s design choice of strong consistency and fault-tolerance guarantees, also important requirements for us. Kafka comes with a tool called MirrorMaker for mirroring the data between two Kafka clusters, potentially across a Long Fat Network. To maximize the throughput over the LFN link, after the appropriate TCP tuning, we decided to parallelize the data streams at the application layer. We filled up the fat pipe by using multiple concurrent MirrorMaker streams, pulling from different Kafka partitions at the same time over separate TCP channels. After scale assessments, we settled down on a configuration of 20 Kafka brokers, 240 partitions, replication factor of 2 and about 120 MirrorMaker threads for each regional data center to support the barrage of control plane traffic.
Because of the requirement to support bursty traffic – a highly popular live streaming event can have several times more traffic than the baseline – a dedicated datacenter was not really a choice, so we implemented our regional datacenters on several availability zones in the AWS public cloud. This allowed us to dynamically scale the gateway and messaging layer based on expected traffic forecasts for upcoming games.
High-availability was implemented across several dimensions. Latency-based global DNS load balancing, as well as per-customer configuration static mappings, were used to partition traffic across various datacenters. Multiple, redundant transit connections were set up between each pair of regional and aggregation datacenters with enough bandwidth to meet the peak traffic watermarks designated for each datacenter. To support aggregation of data from multiple regional datacenters we ended up rewriting the MirrorMaker codebase from scratch. In case of a regional failure of reachability to a datacenter, video player traffic would be distributed over the remaining available regions. Additionally, in the case of a catastrophic failure, the Conviva client libraries had built-in failover logic to avoid video playback failures and thereby protect the viewer experience.
One of the ground realities we had to be ready for was the occasional network hiccups in the AWS fabric, as well as those on the long-haul connection over the WAN. Having a loosely coupled architecture is very important in distributed systems, because the loss of some component due to the network issues should not leave another component hanging, unless it has no alternative. Our gateway layers have been designed with this principle in mind, so the impact of network issues was limited. As for the messaging layers, local buffering of topic data in Kafka/MirrorMakers helped, but periodically the MirrorMakers got into a hung state as a result of network hiccups and had to be restarted. By doing this at the messaging layer, we were able to abstract out the effect of a flaky network for the core applications running on top of this fabric.
Overall, the World Cup was a clear success for Conviva in general and the Engineering team in particular. As with all projects, this was the result of a significant team effort by a lot of passionate folks at Conviva. If you would like to work on such high-profile projects that literally impact everyone who can stream a video, anywhere in the world, please check out our Careers page. We are hiring and always looking for talented and passionate individuals to take on the next big problem in distributed systems, cloud computing, machine learning and video delivery.
~ Haijie Wu, Aaditya Ramesh and Faisal Zakaria Siddiqi