Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

This insight led us to build Edgar: a distributed tracing infrastructure and user experience. Now let’s look at how we designed the tracing infrastructure that powers Edgar. Our distributed tracing infrastructure is grouped into three sections: tracer library instrumentation, stream processing, and storage.

DevOps engineer tools: Deploy, test, evaluate, repeat

Dynatrace

DevOps platform engineers are responsible for cloud platform availability and performance, as well as the efficiency of virtual bandwidth, routers, switches, virtual private networks, firewalls, and network management. Open source automated browser and testing tool. Infrastructure as code (IaC) configuration management tool.

Managing risk for financial services: The secret to visibility and control during times of volatility

Dynatrace

This blog explores how vertically integrated risk management solutions that use AI and automation enable unparalleled visibility, control, and efficiency for risk management in banking. Optimize the IT infrastructure supporting risk management processes and controls for maximum performance and resilience.

How observability, application security, and AI enhance DevOps and platform engineering maturity

Dynatrace

Rather, they must be bolstered by additional technological investments to ensure reliability, security, and efficiency. Observability of applications and infrastructure serves as a critical foundation for DevOps and platform engineering, offering a comprehensive view into system performance and behavior.

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

Central engineering teams enable this operational model by reducing the cognitive burden on innovation teams through solutions related to securing, scaling and strengthening (resilience) the infrastructure. All these micro-services are currently operated in AWS cloud infrastructure.

Key Elements of Site Reliability Engineering (SRE)

DZone

Site Reliability Engineering (SRE) is a systematic and data-driven approach to improving the reliability, scalability, and efficiency of systems. This article discusses the key elements of SRE, including reliability goals and objectives, reliability testing, workload modeling, chaos engineering, and infrastructure readiness testing.

What is an A/B Test?

The Netflix TechBlog

Martin Tingley with Wenjing Zheng, Simon Ejdemyr, Stephanie Lane, and Colin McFarland. This is the second post in a multi-part series on how Netflix uses A/B tests to inform decisions and continuously innovate on our products. An A/B test is a simple controlled experiment. Figure 2: A simple A/B test. Let’s say, this
