Building Netflix’s Distributed Tracing Infrastructure

The Netflix TechBlog

If we had an ID for each streaming session then distributed tracing could easily reconstruct session failure by providing service topology, retry and error tags, and latency measurements for all service calls. We chose Open-Zipkin because it had better integrations with our Spring Boot based Java runtime environment.

Expanding the Cloud - The Amazon Simple Workflow Service - All.

All Things Distributed

Today AWS launched an exciting new service for developers: the Amazon Simple Workflow Service. They must deal with the increased latency and unreliability inherent in remote communication. Tasks can be long-running, may fail, may time out, and may complete with varying throughputs and latencies.

Cloud 130

Elastic Beanstalk a la Node - All Things Distributed

All Things Distributed

I spent a lot of time talking to AWS developers, many working in the gaming and mobile space, and most of them have been finding that Node.js allows them to handle a large number of concurrent connections with low latencies. Today, AWS Elastic Beanstalk just added support for Node.js. Who is using Elastic Beanstalk?

AWS 102

Analyzing a High Rate of Paging

Brendan Gregg

[Truncated iostat output (avg-cpu lines) from a 16-CPU x86_64 host, dated 12/18/2018 and 12/19/2018.] biolatency: From [bcc], this eBPF tool shows a latency histogram of disk I/O.

Cache 105

Seeing through hardware counters: a journey to threefold performance increase

The Netflix TechBlog

The problem: It started off as a routine migration. We decided to move one of our Java microservices (let’s call it GS2) to a larger AWS instance size, from m5.4xl (16 vCPUs) to m5.12xl (48 vCPUs). What’s worse, average latency degraded by more than 50%, with both CPU and latency patterns becoming more “choppy.”

Hardware 363

Improving the Cloud - More Efficient Queuing with SQS - All Things.

All Things Distributed

For example, AWS customers use SQS for asynchronous communication pipelines, buffer queues for databases, asynchronous work queues, and moving latency out of highly responsive request paths. In addition to Long Polling, we are also launching richer client functionality in the Java SDK.

Design Patterns: Queue-Based Load Leveling Pattern

cdemi

Apache Kafka - High-Throughput, Low-Latency, Uses Apache ZooKeeper for Distribution, Written in Scala and Java. Amazon Simple Queue Service - The Go-To choice if you're already on AWS, Reliable, Simple, Flexible, Scalable, Secure, Inexpensive.

Design 47