
Migrating Critical Traffic At Scale with No Downtime — Part 1

The Netflix TechBlog

By Shyam Gala, Javier Fernandez-Ivern, Anup Rokkam Pratap, Devang Shah. Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This approach has a handful of benefits.
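One common zero-downtime technique for a migration like this is to shadow (replay) a copy of live traffic against the new path while users are still served entirely by the existing one, and compare the responses offline. Below is a minimal sketch of that idea; the endpoints and the byte-for-byte comparison are hypothetical, not Netflix's actual implementation.

```python
# Minimal sketch of request shadowing; endpoints and comparison are assumptions,
# not Netflix's implementation.
import urllib.request

CURRENT_BACKEND = "https://current.example.com"      # serves real users (assumed URL)
CANDIDATE_BACKEND = "https://candidate.example.com"  # migration target (assumed URL)

def fetch(base_url, path):
    with urllib.request.urlopen(base_url + path, timeout=2) as resp:
        return resp.status, resp.read()

def handle_request(path):
    # Users are always served from the current backend.
    status, body = fetch(CURRENT_BACKEND, path)
    # A copy of the request is shadowed to the candidate; failures and
    # mismatches are only logged, so they can never affect the user.
    try:
        cand_status, cand_body = fetch(CANDIDATE_BACKEND, path)
        if (cand_status, cand_body) != (status, body):
            print(f"mismatch on {path}: {status} vs {cand_status}")
    except Exception as exc:
        print(f"shadow call failed for {path}: {exc}")
    return status, body
```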

Uber’s Big Data Platform: 100+ Petabytes with Minute Latency

Uber Engineering

To accomplish this, Uber relies heavily on making data-driven decisions at every level, from forecasting rider demand during high traffic events to identifying and addressing bottlenecks …

Trending Sources

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

VPC Flow Logs VPC Flow Logs is an AWS feature that captures information about the IP traffic going to and from network interfaces in a VPC. At Netflix we publish the Flow Log data to Amazon S3. It is easier to tune a large Spark job for a consistent volume of data. These events represent a specific cut of data from the table.
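As a rough illustration of what batch processing of Flow Log data can look like, here is a minimal PySpark sketch that reads the default space-delimited Flow Log format from S3 and aggregates bytes per network interface; the bucket path and schema handling are assumptions, not Netflix's actual pipeline.

```python
# Minimal PySpark sketch: read VPC Flow Logs (default version-2, space-delimited)
# from S3 and sum bytes per network interface. Paths and schema handling are
# illustrative assumptions only.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("flow-log-aggregation").getOrCreate()

# Default version-2 Flow Log fields; a custom log format would need a different list.
columns = ["version", "account_id", "interface_id", "srcaddr", "dstaddr",
           "srcport", "dstport", "protocol", "packets", "bytes",
           "start", "end", "action", "log_status"]

logs = (spark.read
        .option("delimiter", " ")
        .csv("s3://example-bucket/vpc-flow-logs/")   # hypothetical location
        .toDF(*columns))

# Bytes per interface, which can later be joined with application metadata
# keyed by network interface to enrich the raw flow data.
bytes_per_eni = (logs
    .filter(F.col("log_status") == "OK")
    .groupBy("interface_id")
    .agg(F.sum(F.col("bytes").cast("long")).alias("total_bytes")))

bytes_per_eni.show(truncate=False)
```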

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

By Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central to the business, representing diverse use cases that go beyond recommendations, predictions, and data transformations.
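The core abstraction behind an orchestrator like this is a workflow expressed as a DAG of steps that run in dependency order. The sketch below illustrates that idea with a tiny topological scheduler; the step names and API are hypothetical and are not Maestro's actual DSL.

```python
# Hypothetical sketch of a workflow as a DAG of steps executed in dependency order.
# This is NOT Maestro's API; names and steps are illustrative only.
from collections import defaultdict, deque

def run_pipeline(steps, deps):
    """steps: {name: callable}; deps: {name: [upstream step names]}."""
    indegree = {name: len(deps.get(name, [])) for name in steps}
    downstream = defaultdict(list)
    for name, ups in deps.items():
        for up in ups:
            downstream[up].append(name)

    ready = deque(n for n, d in indegree.items() if d == 0)
    while ready:
        name = ready.popleft()
        steps[name]()                      # run the step
        for child in downstream[name]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

# Hypothetical data/ML pipeline: extract -> transform -> (train, report).
run_pipeline(
    steps={
        "extract":   lambda: print("pull raw events"),
        "transform": lambda: print("build features"),
        "train":     lambda: print("fit model"),
        "report":    lambda: print("publish metrics"),
    },
    deps={"transform": ["extract"], "train": ["transform"], "report": ["transform"]},
)
```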

Data Movement in Netflix Studio via Data Mesh

The Netflix TechBlog

With this batch-style approach, several issues have surfaced: data movement is tightly coupled to database tables, the database schema is not an exact mapping of the business data model, and data is stale because it is not delivered in real time. As of now, CDC sources have been implemented for data stores at Netflix (MySQL, Postgres).
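For context on what a CDC source produces, the sketch below applies a stream of change events to a downstream materialized view; the Debezium-style event shape ("op", "before", "after") is an assumption for illustration, not the exact format used by Data Mesh.

```python
# Minimal sketch of applying change-data-capture (CDC) events to a downstream view.
# The event shape is a Debezium-style assumption, not Netflix's actual format.
materialized = {}  # primary key -> latest row

def apply_cdc_event(event):
    op = event["op"]              # "c" = create, "u" = update, "d" = delete
    if op in ("c", "u"):
        row = event["after"]
        materialized[row["id"]] = row
    elif op == "d":
        materialized.pop(event["before"]["id"], None)

# Example stream of change events from a source table.
for e in [
    {"op": "c", "after": {"id": 1, "title": "Movie A"}},
    {"op": "u", "after": {"id": 1, "title": "Movie A (remastered)"}},
    {"op": "d", "before": {"id": 1}},
]:
    apply_cdc_event(e)

print(materialized)  # {} after the delete
```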

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices

The Morning Paper

Seer: leveraging big data to navigate the complexity of performance debugging in cloud microservices, Gan et al., ASPLOS'19. For the services under study, Seer has a sweet spot when trained with around 100GB of data and a 100ms sampling interval for measuring queue depths, collected through distributed tracing and instrumentation.
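To make the sampling knob concrete, the sketch below collects per-service queue-depth samples at a fixed 100ms interval, the kind of signal Seer is trained on; the service names and the random depth source are stand-ins, since the real measurements come from tracing and instrumentation.

```python
# Sketch: sample per-service queue depths every 100ms (the paper's sweet spot).
# Services and the depth source are stand-ins for real instrumentation.
import random
import time

SAMPLING_INTERVAL_S = 0.1          # 100ms sampling interval
SERVICES = ["frontend", "auth", "catalog", "checkout"]

def current_queue_depth(service):
    # Stand-in for reading an instrumented queue length.
    return random.randint(0, 50)

samples = []
for _ in range(10):                # ~1 second of data
    samples.append({svc: current_queue_depth(svc) for svc in SERVICES})
    time.sleep(SAMPLING_INTERVAL_S)

# Each sample (plus a QoS-violation label) becomes one training example.
print(samples[0])
```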

Data lakehouse innovations advance the three pillars of observability for more collaborative analytics

Dynatrace

As teams try to gain insight into this data deluge, they have to balance the need for speed, data fidelity, and scale with capacity constraints and cost. To solve this problem, Dynatrace launched Grail, its causational data lakehouse, in 2022.
