article thumbnail

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

Migrating Critical Traffic At Scale with No Downtime — Part 1 Shyam Gala , Javier Fernandez-Ivern , Anup Rokkam Pratap , Devang Shah Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. The batch job creates a high-level summary that captures some key comparison metrics.

Traffic 339
article thumbnail

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

The data platform is built on top of several distributed systems, and due to the inherent nature of these systems, it is inevitable that these workloads run into failures periodically. This blog will explore these two systems and how they perform auto-diagnosis and remediation across our Big Data Platform and Real-time infrastructure.

Big Data 238
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

How Netflix uses eBPF flow logs at scale for network insight

The Netflix TechBlog

The Flow Exporter also publishes various operational metrics to Atlas. These metrics are visualized using Lumen , a self-service dashboarding infrastructure. After several iterations of the architecture and some tuning, the solution has proven to be able to scale. So how do we ingest and enrich these flows at scale ?

Network 325
article thumbnail

A guide to Autonomous Performance Optimization

Dynatrace

If you want to see a more hands-on approach, I encourage you to watch the recording as Stefano did a live demo of Akamas’s integration with Dynatrace, showing how to minimize the footprint of a Java application with automated JVM tuning. Akamas also enables you to automate the analysis of the experiment metrics in powerful ways.

article thumbnail

Delta: A Data Synchronization and Enrichment Platform

The Netflix TechBlog

Operating Delta applications is made simple for users as the framework provides resilience and failure tolerance out of the box and collects many granular metrics that can be used for alerts. Please stay tuned. Below is a view of the high level architecture of the Delta platform.

article thumbnail

Web Performance Bookshelf

Rigor

Take, for example, The Web Almanac , the golden collection of Big Data combined with the collective intelligence from most of the authors listed below, brilliantly spearheaded by Google’s @rick_viscomi. How to pioneer new metrics and create a culture of performance. Web Performance Tuning. Time is Money.

article thumbnail

Data lakehouse innovations advance the three pillars of observability for more collaborative analytics

Dynatrace

How do you get more value from petabytes of exponentially exploding, increasingly heterogeneous data? The short answer: The three pillars of observability—logs, metrics, and traces—converging on a data lakehouse. To solve this problem, Dynatrace launched Grail, its causational data lakehouse , in 2022.

Analytics 179