Architecture, Design, Latency and Systems - Technology Performance Pulse

Architectural Insights: Designing Efficient Multi-Layered Caching With Instagram Example

DZone

FEBRUARY 27, 2024

Leveraging this hierarchical structure can significantly reduce latency and improve overall performance.

Cache

Cache Efficiency Architecture Design

Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support…

The Netflix TechBlog

SEPTEMBER 29, 2022

Timestone: Netflix’s High-Throughput, Low-Latency Priority Queueing System with Built-in Support for Non-Parallelizable Workloads, by Kostas Christidis. Timestone is a high-throughput, low-latency priority queueing system we built in-house to support the needs of Cosmos, our media encoding platform.

Latency

Latency Systems Media Serverless

Designing Instagram

High Scalability

JANUARY 11, 2022

Design a photo-sharing platform similar to Instagram where users can upload their photos and share them with their followers. High Level Design. Architecture. The streaming data store makes the system extensible to support other use-cases (e.g. System Components. Component Design. API Design.

Design

Design Media Storage Logistics

Rapid Event Notification System at Netflix

The Netflix TechBlog

FEBRUARY 18, 2022

To this end, we developed a Rapid Event Notification System (RENO) to support use cases that require server initiated communication with devices in a scalable and extensible manner. In this blog post, we will give an overview of the Rapid Event Notification System at Netflix and share some of the learnings we gained along the way.

Systems

Systems Traffic Architecture Mobile

What is a Distributed Storage System

Scalegrid

FEBRUARY 8, 2024

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. This guide delves into how these systems work, the challenges they solve, and their essential role in businesses and technology.

Storage

Storage Systems Big Data Azure

Rebuilding Netflix Video Processing Pipeline with Microservices

The Netflix TechBlog

JANUARY 10, 2024

This architecture shift greatly reduced the processing latency and increased system resiliency. By integrating with studio content systems, we enabled the pipeline to leverage rich metadata from the creative side and create more engaging member experiences like interactive storytelling.

Processing

Processing Media Latency Innovation

Consistent caching mechanism in Titus Gateway

The Netflix TechBlog

NOVEMBER 3, 2022

As the number of Titus users increased over the years, the load and pressure on the system increased substantially. The original assumptions and architectural choices were no longer viable. Overview The figure below depicts a simplified high-level architecture of a single Titus cluster (a.k.a

Cache

Cache Latency Traffic Systems

Dynatrace accelerates business transformation with new AI observability solution

Dynatrace

JANUARY 31, 2024

GenAI is prone to erratic behavior due to unforeseen data scenarios or underlying system issues. Figure 1: Sample RAG architecture While this approach significantly improves the response quality of GenAI applications, it also introduces new challenges.

Cache

Cache Azure Infrastructure Monitoring

Supercomputing Predictions: Custom CPUs, CXL3.0, and Petalith Architectures

Adrian Cockcroft

JANUARY 20, 2023

on Myths and Legends of High Performance Computing; it’s a somewhat light-hearted look at some of the same issues by the leader of the team that built the Fugaku system I mention below. Next generation architectures will use CXL3.0. HPCG is led by Japan’s RIKEN Fugaku system at 16 petaflops, which is 3% of its peak capacity.

Architecture

Architecture Latency Benchmarking AWS

Netflix Cloud Packaging in the Terabyte Era

The Netflix TechBlog

SEPTEMBER 24, 2021

Table 1: Movie and File Size Examples Initial Architecture A simplified view of our initial cloud video processing pipeline is illustrated in the following diagram. Lastly, the packager kicks in, adding a system layer to the asset, making it ready to be consumed by the clients.

Cloud

Cloud Media Storage Cache

Redis vs Memcached in 2024

Scalegrid

MARCH 28, 2024

Key Takeaways Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a fast, multi-threaded architecture for basic caching needs. Redis Data Types and Structures The design of Redis’s data structures emphasizes versatility.

Cache

Cache Storage Scalability Architecture

Engineering dependability and fault tolerance in a distributed system

High Scalability

FEBRUARY 19, 2021

In this article, we discuss the concepts of dependability and fault tolerance in detail and explain how the Ably platform is designed with fault tolerant approaches to uphold its dependability guarantees. Fault tolerant design approaches address these shortfalls to provide continuity both to business and to the user experience.

Engineering

Engineering Systems Scalability Availability

Orbital edge computing: nano satellite constellations as a new class of computer system

The Morning Paper

OCTOBER 11, 2020

Orbital edge computing: nanosatellite constellations as a new class of computer system, Denby & Lucia, ASPLOS’20. Only space system architects don’t call it request-response, they call it a ‘bent-pipe architecture.’ Nanosatellite systems have a GSD of around 3.0m/px. Satellites are changing! Physical constraints.

Systems

Systems Latency Architecture Energy

Zero Configuration Service Mesh with On-Demand Cluster Discovery

The Netflix TechBlog

AUGUST 29, 2023

Today we have a wealth of tools, both OSS and commercial, all designed for cloud-native environments. To improve availability, we designed systems where components could fail separately and avoid single points of failure. In 2010, however, nearly none of it existed: the CNCF wasn’t formed until 2015!

Traffic

Traffic Latency Cloud C++

SLOG: serializable, low-latency, geo-replicated transactions

The Morning Paper

SEPTEMBER 3, 2019

SLOG: serializable, low-latency, geo-replicated transactions, Ren et al., VLDB’19. SLOG is another research system motivated by the needs of the application developer (aka, user!). Building correct applications is much easier when the system provides strict serializability guarantees. Is my data at home?

Latency

Latency Processing Benchmarking Systems

Mastering Hybrid Cloud Strategy

Scalegrid

MARCH 14, 2024

Understanding Hybrid Cloud Strategy A hybrid cloud merges the capabilities of public and private clouds into a singular, coherent system. The architecture usually integrates several private, public, and on-premises infrastructures. We will examine each of these elements in more detail.

Strategy

Strategy Cloud Artificial Intelligence Infrastructure

Migrating Critical Traffic At Scale with No Downtime - Part 2

The Netflix TechBlog

JUNE 13, 2023

This is where large-scale system migrations come into play. By collecting and analyzing key performance metrics of the service over time, we can assess the impact of the new changes and determine if they meet the availability, latency, and performance requirements. But what happens when this machinery needs a transformation?

Traffic

Traffic Metrics Systems Strategy

Data ingestion pipeline with Operation Management

The Netflix TechBlog

MARCH 7, 2023

We designed a unique concept called Annotation Operations which allows teams to create data pipelines and easily write annotations without worrying about access patterns of their data from different applications. But we cannot search or present low latency retrievals from files, etc.

Media

Media Latency Architecture Database

The Netflix Cosmos Platform

The Netflix TechBlog

MARCH 1, 2021

It supports both high throughput services that consume hundreds of thousands of CPUs at a time, and latency-sensitive workloads where humans are waiting for the results of a computation. The first generation of this system went live with the streaming launch in 2007. Delivery - A

Serverless

Serverless Media Latency Social Media

Netflix Video Quality at Scale with Cosmos Microservices

The Netflix TechBlog

NOVEMBER 2, 2021

For example, when we design a new version of VMAF, we need to effectively roll it out throughout the entire Netflix catalog of movies and TV shows. This article explains how we designed microservices and workflows on top of the Cosmos platform to bolster such video quality innovations. We call this system Cosmos.

Media

Media Innovation Metrics Latency

How To Scale a Single-Host PostgreSQL Database With Citus

Percona

NOVEMBER 3, 2023

Rather than listing the concepts, function calls, etc, available in Citus, which frankly is a bit boring, I’m going to explore scaling out a database system starting with a single host. I won’t cover all the features but show just enough that you’ll want to see more of what you can learn to accomplish for yourself.

Database

Database Benchmarking Latency C++

Implementing AWS well-architected pillars with automated workflows

Dynatrace

SEPTEMBER 13, 2023

This is a set of best practices and guidelines that help you design and operate reliable, secure, efficient, cost-effective, and sustainable systems in the cloud. Storing frequently accessed data in faster storage, usually in-memory caching, improves data retrieval speed and overall system performance. Beyond

AWS

AWS Efficiency Azure Cloud

What is observability? Not just logs, metrics and traces

Dynatrace

OCTOBER 1, 2021

As dynamic systems architectures increase in complexity and scale, IT teams face mounting pressure to track and respond to conditions and issues across their multi-cloud environments. But what is observability? Why is it important, and what can it actually help organizations achieve?

Metrics

Metrics Open Source Monitoring Infrastructure

MongoDB Best Practices: Security, Data Modeling, & Schema Design

Percona

APRIL 17, 2023

In this blog post, we will discuss the best practices on the MongoDB ecosystem applied at the Operating System (OS) and MongoDB levels. Operating System (OS) settings Swappiness Swappiness is a Linux kernel setting that influences the behavior of the Virtual Memory manager when it needs to allocate a swap, ranging from 0-100.

Best Practices

Best Practices Design Tuning Database

Artificial Intelligence in Cloud Computing

Scalegrid

JANUARY 8, 2024

AI algorithms embedded in cloud architecture automate repetitive processes, streamlining workloads and reducing the chance of human error. AI models integrated into cloud systems offer flexibility, enable agile methodologies, and ensure secure systems. These services are tailored to meet various business requirements.

Artificial Intelligence

Artificial Intelligence Cloud Scalability Analytics

What is ITOps? Why IT operations is more crucial than ever in a multicloud world

Dynatrace

DECEMBER 15, 2022

This transition to public, private, and hybrid cloud is driving organizations to automate and virtualize IT operations to lower costs and optimize cloud processes and systems. Besides the traditional system hardware, storage, routers, and software, ITOps also includes virtual components of the network and cloud infrastructure.

Artificial Intelligence

Artificial Intelligence DevOps Hardware Virtualization

Current status, needs, and challenges in Heterogeneous and Composable Memory from the HCM workshop (HPCA’23)

ACM Sigarch

MAY 31, 2023

Introduction Memory systems are evolving into heterogeneous and composable architectures. Heterogeneous and Composable Memory (HCM) offers a feasible solution for terabyte- or petabyte-scale systems, addressing the performance and efficiency demands of emerging big-data applications. The recently announced CXL3.0

Latency

Latency Hardware Cache Architecture

Edge Authentication and Token-Agnostic Identity Propagation

The Netflix TechBlog

FEBRUARY 9, 2021

The whole system was quite complex, and starting to become brittle. Plus, the architecture of the Edge tier was evolving to a PaaS (platform as a service) model, and we had some tough decisions to make about how, and where, to handle identity token handling. The API server orchestrates backend systems to authenticate the user.

Architecture

Architecture Latency Servers Website

Towards a Unified Theory of Web Performance

Alex Russell

FEBRUARY 28, 2022

Here are two renderings of the same Gmail inbox in different architectural styles: one based on Ajax, and the other on "basic" HTML. The Ajax version of Gmail loads 4.8MiB of resources, including 3.8MiB of JavaScript to load an inbox containing two messages. The system is ready to receive input. A Battle Between Two Teams.

Performance

Performance Latency Architecture Network

Choosing a cloud DBMS: architectures and tradeoffs

The Morning Paper

AUGUST 29, 2019

Choosing a cloud DBMS: architectures and tradeoffs, Tan et al. We focused on OLAP-oriented parallel data warehouse products available for AWS and restricted our attention to commercially available systems. The design space. Each system begins from a cold start unless explicitly stated otherwise in the results.

Architecture

Architecture Cloud Storage Serverless

Optimizing Video Streaming CDN Architecture for Cost Reduction and Enhanced Streaming Performance

IO River

NOVEMBER 2, 2023

They need to deliver impeccable performance without breaking the bank. According to recent industry statistics, global streaming has seen an uptick of 30% in the past year, underscoring the importance of efficient CDN architecture strategies. Login & Authentication: Systems that verify user credentials and maintain session-specific data.

Architecture

Architecture Performance Internet

Towards a Reliable Device Management Platform

The Netflix TechBlog

AUGUST 30, 2021

System Setup Architecture The following diagram summarizes the architecture description: Figure 1: Event-sourcing architecture of the Device Management Platform. Fault Tolerance If the underlying KafkaConsumer crashes due to ephemeral system or network events, it should be automatically restarted.

Latency

Latency Traffic Transportation Hardware

Plan Your Multi Cloud Strategy

Scalegrid

MARCH 22, 2024

They can also bolster uptime and limit latency issues or potential downtimes. Register now for free and experience the seamless operation of your databases across multi-cloud and hybrid-cloud systems. By spreading your data and apps around, you can get your systems to work together more smoothly and make the most out of your budget.

Strategy

Strategy Cloud Government Innovation

Netflix Drive

The Netflix TechBlog

MAY 5, 2021

In future posts, we will do an architectural deep dive into the several components of Netflix Drive. Netflix Drive relies on a data store that will be the persistent storage layer for assets, and a metadata store which will provide a relevant mapping from the file system hierarchy to the data store entities.

Media

Media Storage Architecture Cloud

Seamlessly Swapping the API backend of the Netflix Android app

The Netflix TechBlog

SEPTEMBER 8, 2020

We tried a few iterations of what this new service should look like, and eventually settled on a modern architecture that aimed to give more control of the API experience to the client teams. For us, it means that we now need to have ~15 MDN tabs open when writing routes :) Let’s briefly discuss the architecture of this microservice.

Latency

Latency Cache Java Traffic

Three Other Models of Computer System Performance: Part 2

ACM Sigarch

MARCH 25, 2019

How many buffers are needed to track pending requests as a function of needed bandwidth and expected latency? Can one both minimize latency and maximize throughput for unscheduled work? The M/M/1 queue will show us a required trade-off among (a) allowing unscheduled task arrivals, (b) minimizing latency, and (c) maximizing throughput.

Systems

Systems Latency Performance C++

Predictive CPU isolation of containers at Netflix

The Netflix TechBlog

JUNE 4, 2019

Because microprocessors are so fast, computer architecture design has evolved towards adding various levels of caching between compute units and the main memory, in order to hide the latency of bringing the bits to the brains. This avoids thrashing caches too much for B and evens out the pressure on the L3 caches of the machine.

Cache

Cache Latency Airlines Logistics

USENIX SREcon APAC 2022: Computing Performance: What's on the Horizon

Brendan Gregg

FEBRUARY 28, 2023

This talk originated from my updates to [Systems Performance 2nd Edition], and this was the first time I've given this talk in person! CXL in a way allows a custom memory controller to be added to a system, to increase memory capacity, bandwidth, and overall performance. Ford, et al., “TCP

Performance

Performance Latency Cache Virtualization

Scalable MicroService Architecture

VoltDB

JULY 10, 2018

As the complexity of applications and systems increases, the size of the teams that work on them also increases. In these scenarios, a monolithic system inhibits the development team from moving forward at speed. In these use cases, data processing usually has a latency budget of less than 5 milliseconds.

Architecture

Architecture Scalability Ecommerce Latency

What is AWS Lambda?

Dynatrace

APRIL 5, 2021

AWS Lambda enables organizations to access many types of functions from AWS’ cloud-based services, such as: Data processing, to execute code based on triggers, system states, or user actions. You will likely need to write code to integrate systems and handle complex tasks or incoming network requests.

Lambda

Lambda AWS Serverless Hardware

Helios: hyperscale indexing for the cloud & edge – part 1

The Morning Paper

OCTOBER 26, 2020

As a production system within Microsoft capturing around a quadrillion events and indexing 16 trillion search keys per day it would be interesting in its own right, but there’s a lot more to it than that. These two narratives of reference architecture and ingestion/indexing system are interwoven throughout the paper.

Cloud

Cloud Big Data Latency Architecture

Growth Engineering at Netflix - Creating a Scalable Offers Platform

The Netflix TechBlog

FEBRUARY 9, 2021

In particular, it’s our job to design and build the systems and protocols that enable customers from all over the world to sign up for Netflix with the plan features and incentives that best suit their needs. Let’s take a deeper look at the architecture, protocols, and systems involved.

Engineering

Engineering Scalability Architecture Innovation
