Latency, Metrics and Software Engineering - Technology Performance Pulse

Why applying chaos engineering to data-intensive applications matters

Dynatrace

MAY 23, 2024

Stream processing: One approach to such a challenging scenario is stream processing, a computing paradigm and software architectural style for data-intensive software systems that emerged to cope with requirements for near real-time processing of massive amounts of data. This significantly increases event latency.

Engineering

Engineering Tuning Latency Open Source

Site reliability done right: 5 SRE best practices that deliver on business objectives

Dynatrace

MAY 31, 2023

As a result, site reliability has emerged as a critical success metric for many organizations. Site reliability engineering (SRE) has become a critical discipline in recent years as the world has shifted in favor of web-based interactions. But the transition to SRE maturity is not always easy. Service-level objectives (SLOs).

Best Practices

Best Practices DevOps Latency Metrics

Software engineering for machine learning: a case study

The Morning Paper

JULY 7, 2019

Software engineering for machine learning: a case study, Amershi et al., ICSE’19. More specifically, we’ll be looking at the results of an internal study with over 500 participants designed to figure out how product development and software engineering are changing at Microsoft with the rise of AI and ML.

Software Engineering

Software Engineering Engineering Software Software

Application observability meets developer observability: Unlock a 360º view of your environment

Dynatrace

NOVEMBER 6, 2023

In a recent webinar, Dynatrace DevOps activist Andi Grabner and senior software engineer Yarden Laifenfeld explored developer observability. When an incident occurs, developers need to know what data to look at, where the incident occurred, and other relevant metrics. How do you know if this problem has business impact?

Development

Development DevOps Programming Cloud

DevOps observability: A guide for DevOps and DevSecOps teams

Dynatrace

JANUARY 18, 2023

Site reliability engineering (SRE) is a software operations methodology that enables organizations to create highly reliable and scalable applications. SRE applies software engineering principles to operations and infrastructure processes. Site reliability engineers, or SREs, lead these efforts.

DevOps

DevOps Best Practices Innovation Strategy

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

In this session, we discuss the technologies used to run a global streaming company, growing at scale, billions of metrics, benefits of chaos in production, and how culture affects your velocity and uptime. Netflix runs dozens of stateful services on AWS under strict sub-millisecond tail-latency requirements, which brings unique challenges.

AWS

AWS Entertainment Open Source Benchmarking

Automating chaos experiments in production

The Morning Paper

JULY 4, 2019

Moreover, just like an A/B test, we’ll be collecting metrics while the experiment is underway and performing statistical analysis at the end to interpret the results. Two failure modes we focus on are a service becoming slower (increase in response latency) or a service failing outright (returning errors).

Latency

Latency Engineering Metrics Traffic

Edge Authentication and Token-Agnostic Identity Propagation

The Netflix TechBlog

FEBRUARY 9, 2021

By offloading token processing from these systems to the central Edge Authentication Services, downstream systems saw significant gains in CPU, request latency, and garbage collection metrics, all of which help reduce cluster footprint and cloud costs. And, we’re hiring Senior Software Engineers!

Architecture

Architecture Latency Servers Website

Curbing Connection Churn in Zuul

The Netflix TechBlog

AUGUST 16, 2023

It seems like a minor change, but it had to be seamlessly integrated into our existing metrics and connection bookkeeping. We saw improvements across all key metrics on Zuul, but most importantly, there was a significant reduction in total connection counts and churn. Subsetting Success: The results were outstanding.

Traffic

Traffic Servers Google Metrics

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

NOVEMBER 20, 2023

As our business scales globally, the demand for data is growing and the need for scalable, low-latency incremental processing is beginning to emerge. It serves thousands of users, including data scientists, data engineers, machine learning engineers, software engineers, content producers, and business analysts, in various use cases.

Processing

Processing Big Data Efficiency Engineering
