Cache, Code, Latency and Traffic - Technology Performance Pulse

Migrating Critical Traffic At Scale with No Downtime - Part 1

The Netflix TechBlog

MAY 4, 2023

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, and Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.

Traffic

Traffic Latency Tuning Systems

Seeing through hardware counters: a journey to threefold performance increase

The Netflix TechBlog

NOVEMBER 9, 2022

A quick canary test was free of errors and showed lower latency, which is expected given that our standard canary setup routes an equal amount of traffic to both the baseline running on 4xl and the canary on 12xl. What’s worse, average latency degraded by more than 50%, with both CPU and latency patterns becoming more “choppy.”

Hardware

Hardware Cache Performance Latency

Crucial Redis Monitoring Metrics You Must Watch

Scalegrid

JANUARY 25, 2024

Key Takeaways Critical performance indicators such as latency, CPU usage, memory utilization, hit rate, and number of connected clients/slaves/evictions must be monitored to maintain Redis’s high throughput and low latency capabilities. It can achieve impressive performance, handling up to 50 million operations per second.

Metrics

Metrics Monitoring Latency Cache

Redis vs Memcached in 2024

Scalegrid

MARCH 28, 2024

Key Takeaways Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis is better suited for complex data models, and Memcached is better suited for high-throughput, string-based caching scenarios.

Cache

Cache Storage Scalability Architecture

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

On the Android team, while most of our time is spent working on the app, we are also responsible for maintaining this backend that our app communicates with, and its orchestration code. As you can see, our code was just a part (#2 in the diagram) of this monolithic service.

Latency

Latency Cache Java Traffic

How We Optimized Performance To Serve A Global Audience

Smashing Magazine

AUGUST 3, 2023

It increases our visibility and enables us to draw a steady stream of organic (or “free”) traffic to our site. While paid marketing strategies like Google Ads play a part in our approach as well, enhancing our organic traffic remains a major priority. The higher our organic traffic, the more profitable we become as a company.

Performance

Performance Cache Traffic Metrics

Percentiles don’t work: Analyzing the distribution of response times for web services

Adrian Cockcroft

JANUARY 29, 2023

There is no way to model how much more traffic you can send to that system before it exceeds its SLA. Every opportunity for delay due to more work than the best case or more time waiting than the best case increases the latency and they all add up and create a long tail. Mu is the mean of each component, the latency.

Lambda

Lambda Latency Cache C++

Stuff The Internet Says On Scalability For July 20th, 2018

High Scalability

JULY 20, 2018

Cliff Click : The JVM is very good at eliminating the cost of code abstraction, but not the cost of data abstraction. That means multiple data indirections mean multiple cache misses. Mark LaPedus : MRAM, a next-generation memory type, is being touted as a replacement for embedded flash and cache applications.

Internet

Internet Scalability Automotive

Proof of Concept: Horizontal Write Scaling for MySQL With Kubernetes Operator

Percona

MAY 15, 2023

Normally this solution requires a full code redesign and could be quite difficult to achieve when it is injected after the initial code architecture definition. As illustrated above, ProxySQL allows us to set up a common entry point for the application and then redirect the traffic on the basis of identified sharding keys.

Traffic

Traffic Scalability Database Servers

A one size fits all database doesn't fit anyone

All Things Distributed

JUNE 21, 2018

Developers rely on the functionality of the relational database (not the application code) to enforce the schema and preserve the referential integrity of the data within the database. The purpose of DynamoDB is to provide consistent single-digit millisecond latency for any scale of workloads.

Database

Database AWS Games Latency

Front-End Performance Checklist 2021

Smashing Magazine

JANUARY 11, 2021

Have we optimized enough with tree-shaking, scope hoisting, code-splitting, and all the fancy loading patterns with intersection observer, progressive hydration, client hints, HTTP/3, service workers and, oh my, edge workers? It’s much easier to reach performance goals when the code base is fresh or is just being refactored.

Performance

Performance Cache Media Metrics

DBLog: A Generic Change-Data-Capture Framework

The Netflix TechBlog

DECEMBER 17, 2019

Nonetheless, we found a number of limitations that could not satisfy our requirements e.g. stalling the processing of log events until a dump is complete, missing ability to trigger dumps on demand, or implementations that block write traffic by using table locks. Blocking write traffic by locking tables. Writing events to any output.

Database

Database Traffic Transportation Open Source

The Surprising Effectiveness of Non-Overlapping, Sensitivity-Based Performance Models

John McCalpin

APRIL 2, 2020

Here I assumed a particular analytical function for the amount of memory traffic as a function of cache size to scale the bandwidth time. Only talking about CPU2006 results today – the CPU2000 results look similar (see the 2007 presentation linked above), but the CPU2000 benchmark codes are less closely related to real applications.

Benchmarking

Benchmarking Performance Latency Architecture

Service Workers can save the environment!

Dean Hume

APRIL 24, 2018

While this may not seem significant for websites with low traffic, as traffic to the site begins to increase, so does the amount of energy consumed. Without effective caching on the client, the server will see an increase in workload, more CPU usage and ultimately increased latency for the end user. Show me the money!

Energy

Energy Cache Traffic Website

How Google PageSpeed Works: Improve Your Score and Search Engine Ranking

CSS - Tricks

JULY 25, 2019

Cache-Headers missing? Estimated Input Latency. Where possible, remove unused JavaScript code or focus on only delivering a script that will be run by the current page. This approach is known as code splitting and is extremely effective in improving TTI. What changed in PageSpeed 5.0?

Google

Google Engineering Speed Mobile

Front-End Performance Checklist 2020 [PDF, Apple Pages, MS Word]

Smashing Magazine

JANUARY 6, 2020

Is it worth exploring tree-shaking, scope hoisting, code-splitting, and all the fancy loading patterns with intersection observer, server push, client hints, HTTP/2, service workers and, oh my, edge workers? It’s much easier to reach performance goals when the code base is fresh or is just being refactored.

Performance

Performance Cache Network Metrics

Content Management Systems of the Future: Headless, JAMstack, ADN and Functions at the Edge

Abhishek Tiwari

NOVEMBER 3, 2018

You should expect a one-time implementation cost (depending on the CMS and business requirements, it can cost 200,000 USD to 3M USD) and a yearly hosting infrastructure cost (proportional to load and traffic, but typically 30,000 USD to 300,000 USD per year). Due to strong templating support, a website managed by SSG can be truly modular.

Systems

Systems Cache Website Network

Front-End Performance Checklist 2019 [PDF, Apple Pages, MS Word]

Smashing Magazine

JANUARY 7, 2019

Is it worth exploring tree-shaking, scope hoisting, code-splitting, and all the fancy loading patterns with intersection observer, server push, client hints, HTTP/2, service workers and, oh my, edge workers? Estimated Input Latency tells us if we are hitting that threshold, and ideally, it should be below 50ms.

Performance

Performance Cache Metrics Network

How To Avoid Landing Page Redirects (10 min read)

Rigor

JULY 2, 2019

As an example, we see that the preferred address to access the online edition of the New York Times is [link] (we received an HTTP response code of 200 when requesting the site). Rigor’s waterfall chart indicates that this address has been permanently moved (given that we received an HTTP response code of 301).

Mobile

Mobile Traffic Google Latency

Why I hate MPI (from a performance analysis perspective)

John McCalpin

AUGUST 1, 2018

Bandwidth, performance analysis has two recurring themes: How fast should this code (or “simple” variations on this code) run on this hardware? Interacting components in the execution of an MPI job — a brief outline (from memory): The user source code, which contains an ordered set of calls to MPI routines.

Hardware

Hardware Transportation Performance Latency

Synthetic Monitoring vs. RUM

Rigor

DECEMBER 19, 2019

The measured traffic is not of your actual users; it is synthetically generated to collect data on page performance. Synthetic monitoring actively allows users to monitor the performance of their website or application with a set of controlled variables (geography, network, device, browser, cached vs. uncached) over time.

Monitoring

Monitoring Benchmarking Website Traffic

Can You Afford It?: Real-world Web Performance Budgets

Alex Russell

OCTOBER 22, 2017

For this page to be done loading it needs to be responsive to user input: the “interactive” in “Time to Interactive.” Browsers process user input by generating DOM events that application code listens to. Simulated packet loss and variable latency, however, can make benchmarking extremely difficult and slow.

Performance

Performance Benchmarking Network Mobile

Revisiting “Serverless Architectures”

The Symphonia

MAY 22, 2018

I was a little restricted in my thinking the first time around and I’ve come to see FaaS as something not quite stateless, since caching state in a Lambda instance that might stick around for 5 hours is a perfectly reasonable idea. I also rewrote the section on Startup Latency since Cold Starts are one of the big “FUD” areas of Serverless.

Serverless

Serverless Architecture Lambda Azure

Lessons Learned Rebuilding A Large E-Commerce Website With Next.js (Case Study)

Smashing Magazine

SEPTEMBER 24, 2021

That was until we went to production with our highest traffic customer. To mitigate the performance issues, we had to add a lot of (unbudgeted) extra servers and had to aggressively cache pages on a reverse proxy. It can be hosted on a CDN like Vercel or Netlify, which results in lower latency. Lint And Format Your Code.

Website

Website Code Servers Analytics

HTTP/3: Performance Improvements (Part 2)

Smashing Magazine

AUGUST 22, 2021

Because we are dealing with network protocols here, we will mainly look at network aspects, of which two are most important: latency and bandwidth. Latency can be roughly defined as the time it takes to send a packet from point A (say, the client) to point B (the server). Two-way latency is often called round-trip time (RTT).

Performance

Performance Network Latency Servers

HTTP/3 From A To Z: Core Concepts (Part 1)

Smashing Magazine

AUGUST 9, 2021

You’ve probably heard things like: “HTTP/3 is much faster than HTTP/2 when there is packet loss”, or “HTTP/3 connections have less latency and take less time to set up”, and probably “HTTP/3 can send data more quickly and can send more resources in parallel”.

Transportation

Transportation Internet Network
