Remove Availability Remove Latency Remove Presentation Remove Traffic
article thumbnail

Migrating Critical Traffic At Scale with No Downtime?—?Part 2

The Netflix TechBlog

Migrating Critical Traffic At Scale with No Downtime — Part 2 Shyam Gala , Javier Fernandez-Ivern , Anup Rokkam Pratap , Devang Shah Picture yourself enthralled by the latest episode of your beloved Netflix series, delighting in an uninterrupted, high-definition streaming experience. This is where large-scale system migrations come into play.

Traffic 279
article thumbnail

Consistent caching mechanism in Titus Gateway

The Netflix TechBlog

In the time since it was first presented as an advanced Mesos framework, Titus has transparently evolved from being built on top of Mesos to Kubernetes, handling an ever-increasing volume of containers. This blog post presents how our current iteration of Titus deals with high API call volumes by scaling out horizontally.

Cache 224
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Zero Configuration Service Mesh with On-Demand Cluster Discovery

The Netflix TechBlog

Since there were no existing solutions available, we needed to build them ourselves. To improve availability, we designed systems where components could fail separately and avoid single points of failure. Eureka and Ribbon presented a simple but powerful interface, which made adopting them easy.

Traffic 220
article thumbnail

Crucial Redis Monitoring Metrics You Must Watch

Scalegrid

Key Takeaways Critical performance indicators such as latency, CPU usage, memory utilization, hit rate, and number of connected clients/slaves/evictions must be monitored to maintain Redis’s high throughput and low latency capabilities. Similarly, an increased throughput signifies an intensive workload on a server and a larger latency.

Metrics 130
article thumbnail

The Best Way to Host MongoDB on DigitalOcean

Scalegrid

Azure and found that DigitalOcean performance was in line with, if not better, on both high throughput and low latency in the deployment. While adequate for low-traffic applications, small databases, and dev/test environments, we recommend against leveraging shared clusters for your MongoDB production deployments.

Azure 187
article thumbnail

What is a Distributed Storage System

Scalegrid

Key Takeaways Distributed storage systems benefit organizations by enhancing data availability, fault tolerance, and system scalability, leading to cost savings from reduced hardware needs, energy consumption, and personnel. Variations within these storage systems are called distributed file systems.

Storage 130
article thumbnail

Towards a Reliable Device Management Platform

The Netflix TechBlog

For example, when running tests, the state of the device will change from “available for testing” to “in test.” As such, we can see that the traffic load on the Device Management Platform’s control plane is very dynamic over time. Over the lifecycle of a device connected to the RAE, the device can change attributes at any time.

Latency 213