article thumbnail

Chaos Data Engineering Manifesto: 5 Laws for Successful Failures

DZone

A powerful surge of traffic is inevitable. It's midnight in the dim and cluttered office of The New York Times, currently serving as the "situation room." During every major election, the wave would crest and crash against our overwhelmed systems before receding, allowing us to assess the damage.

article thumbnail

How HubSpot Uses Apache Kafka Swimlanes for Timely Processing of Workflow Actions

InfoQ

HubSpot adopted routing messages over multiple Kafka topics (called swimlanes) for the same producer to avoid the build-up in the consumer group lag and prioritize the processing of real-time traffic. By Rafal Gancarz

Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

How LinkedIn Serves Over 4.8 Million Member Profiles per Second

InfoQ

LinkedIn introduced Couchbase as a centralized caching tier for scaling member profile reads to handle increasing traffic that has outgrown their existing database cluster. The new solution achieved over 99% hit rate, helped reduce tail latencies by more than 60% and costs by 10% annually. By Rafal Gancarz

Cache 85
article thumbnail

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

While our engineering teams have and continue to build solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing and abstracting our infrastructure. What will be the cost of rolling out the winning cell of an AB test to all users?

article thumbnail

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

Since then, open-source Metaflow has gained support for Argo Workflows , a Kubernetes-native orchestrator, as well as support for Airflow which is still widely used by data engineering teams. The back-end auto-scales the number of instances used to back your service based on traffic.

Systems 226
article thumbnail

Netflix at AWS re:Invent 2019

The Netflix TechBlog

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. Wednesday?—?December The talk also includes examples of using these tools in the Amazon Elastic Compute Cloud (Amazon EC2) cloud.

AWS 100
article thumbnail

Netflix at AWS re:Invent 2019

The Netflix TechBlog

Netflix shares how Amazon EC2 Auto Scaling allows its infrastructure to automatically adapt to changing traffic patterns in order to keep its audience entertained and its costs on target. Wednesday?—?December The talk also includes examples of using these tools in the Amazon Elastic Compute Cloud (Amazon EC2) cloud.

AWS 100