Netflix at AWS re:Invent 2019

The Netflix TechBlog

4:45pm-5:45pm NFX 202: A day in the life of a Netflix Engineer. Dave Hahn, SRE Engineering Manager. Abstract: Netflix is a large, ever-changing ecosystem serving millions of customers across the globe through cloud-based systems and a globally distributed CDN. We explore all the systems necessary to make and stream content from Netflix.

AWS 100

Trending Sources

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

Uptime Institute’s 2022 Outage Analysis report found that over 60% of system outages resulted in at least $100,000 in total losses, up from 39% in 2019. At the lowest level, SLIs provide a view of service availability, latency, performance, and capacity across systems. Make SLOs realistic.

USENIX LISA2021 Computing Performance: On the Horizon

Brendan Gregg

I also wrote about these topics in detail for my recent [Systems Performance 2nd Edition] book. TCP Extensions for Multipath Operation with Multiple Addresses,” [link] Mar 2020 - [Gregg 20] Brendan Gregg, “Systems Performance: Enterprise and the Cloud, Second Edition,” Addison-Wesley, 2020 - [Hruska 20] Joel Hruska, “Intel Demos PCIe 5.0

Stuff The Internet Says On Scalability For March 22nd, 2019

High Scalability

"We achieve 5.5 µs of replication latency on lossy Ethernet, which is faster than or comparable to specialized replication systems that use programmable switches, FPGAs, or RDMA." matthewstoller: I just looked at Netflix’s 10K. The company is burning through cash. $3B this year, $4B next year.

Internet 134

USENIX SREcon APAC 2022: Computing Performance: What's on the Horizon

Brendan Gregg

This talk originated from my updates to [Systems Performance 2nd Edition], and this was the first time I've given this talk in person! CXL in a way allows a custom memory controller to be added to a system, to increase memory capacity, bandwidth, and overall performance. Ford, et al., “TCP

Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

which is difficult when troubleshooting distributed systems. If we had an ID for each streaming session then distributed tracing could easily reconstruct session failure by providing service topology, retry and error tags, and latency measurements for all service calls.
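
The idea in this excerpt, keying every service call to a streaming session ID so a failed session can be reconstructed, can be illustrated with a minimal Python sketch. It is not Netflix's implementation: the `Span` fields, the `reconstruct_session` helper, and the service names are hypothetical, chosen only to show how spans that share a session ID yield service topology, retry and error tags, and latency measurements.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Span:
    """One service call recorded by a tracer (hypothetical schema)."""
    session_id: str           # ID shared by every call in a streaming session
    span_id: str
    parent_id: Optional[str]  # span that triggered this call, None for the root
    service: str
    latency_ms: float
    error: bool = False       # error tag
    retries: int = 0          # retry tag


def reconstruct_session(spans, session_id):
    """Collect the spans of one session and summarize what happened."""
    session_spans = [s for s in spans if s.session_id == session_id]
    by_id = {s.span_id: s for s in session_spans}

    # Service topology: caller -> callee edges derived from parent links.
    topology = []
    for s in session_spans:
        parent = by_id.get(s.parent_id) if s.parent_id else None
        if parent is not None:
            topology.append((parent.service, s.service))

    return {
        "topology": topology,
        "failed": [s.service for s in session_spans if s.error],
        "retries": {s.service: s.retries for s in session_spans if s.retries},
        "latency_ms": {s.service: s.latency_ms for s in session_spans},
    }


# Hypothetical spans: session "s1" fails in a downstream service.
spans = [
    Span("s1", "a", None, "playback-api", 120.0),
    Span("s1", "b", "a", "license", 80.0, error=True, retries=2),
    Span("s2", "c", None, "playback-api", 30.0),
]
result = reconstruct_session(spans, "s1")
print(result)
```

Filtering on the session ID first keeps the lookup simple; a production tracer would index spans by session rather than scan them all.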