Big Data, Design, Efficiency and Tuning - Technology Performance Pulse

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

Operational automation, including but not limited to auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing, is key to the success of modern data platforms. … the retry success probability) and compute cost efficiency (i.e., … Multi-objective optimizations.

Tuning

Tuning Efficiency Big Data Engineering

Why MySQL Could Be Slow With Large Tables

Percona

JANUARY 19, 2023

For instance, in Percona Managed Services, we have many clients with TBs worth of data that perform well. In this blog post, we will review key topics to consider for managing large datasets more efficiently in MySQL. InnoDB will sort the data in primary key order, and that will serve to reference actual data pages on disk.

Open Source

Open Source Storage Database Big Data

Trending Sources

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

NOVEMBER 20, 2023

We will show how we are building a clean and efficient incremental processing solution (IPS) by using Netflix Maestro and Apache Iceberg. IPS provides incremental processing support with data accuracy, data freshness, and backfill for users, and addresses many of the challenges in workflows. … past 3 hours or 10 days).

Processing

Processing Big Data Efficiency Engineering

Optimizing data warehouse storage

The Netflix TechBlog

DECEMBER 21, 2020

We built AutoOptimize to efficiently and transparently optimize the data and metadata storage layout while maximizing their cost and performance benefits. This article will list some of the use cases of AutoOptimize, discuss the design principles that help enhance efficiency, and present the high-level architecture.

Storage

Storage Latency Efficiency Data Engineering

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

MARCH 25, 2019

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and Efficiency. By: Di Lin, Girish Lingappa, Jitender Aswani. Imagine yourself in the role of a data-inspired decision maker staring at a metric on a dashboard, about to make a critical business decision but pausing to ask a question: “Can

Infrastructure

Infrastructure Big Data Transportation Architecture

Bulldozer: Batch Data Moving from Data Warehouse to Online Key-Value Stores

The Netflix TechBlog

OCTOBER 27, 2020

Data scientists and engineers collect this data from our subscribers and videos, and implement data analytics models to discover customer behaviour with the goal of maximizing user joy. The processed data is typically stored as data warehouse tables in AWS S3. Moving data with Bulldozer at Netflix.

Latency

Latency Storage Big Data Tuning

Should You Use ClickHouse as a Main Operational Database?

Percona

JANUARY 14, 2019

However, ClickHouse is super efficient for timeseries and provides “sharding” out of the box (scalability beyond one node). Although such databases can be very efficient with counts and averages, some queries will be slow or simply nonexistent. Inserts are efficient for bulk inserts only.

Database

Database Analytics Blockchain Healthcare

Python at Netflix

The Netflix TechBlog

APRIL 29, 2019

Various software systems are needed to design, build, and operate this CDN infrastructure, and a significant number of them are written in Python. Demand Engineering: Demand Engineering is responsible for Regional Failovers, Traffic Distribution, Capacity Operations, and Fleet Efficiency of the Netflix cloud.

Open Source

Open Source Network Infrastructure Big Data

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

OCTOBER 18, 2022

By Jun He, Akash Dwivedi, Natallia Dzenisenka, Snehal Chennuru, Praneeth Yenugutala, Pawan Dixit. At Netflix, Data and Machine Learning (ML) pipelines are widely used and have become central for the business, representing diverse use cases that go beyond recommendations, predictions, and data transformations.

Java

Java Scalability Traffic Architecture

Structural Evolutions in Data

O'Reilly

SEPTEMBER 19, 2023

Each time, the underlying implementation changed a bit while still staying true to the larger phenomenon of “Analyzing Data for Fun and Profit.” They weren’t quite sure what this “data” substance was, but they’d convinced themselves that they had tons of it that they could monetize.

Hardware

Hardware Storage Big Data Blockchain

Streaming SQL in Data Mesh

The Netflix TechBlog

NOVEMBER 3, 2023

However, this design decision led to a different set of challenges. For more efficient schema management and evolution, the platform will automatically infer the output schema based on the fields selected by the SQL query. Stay tuned for more updates! Some teams found the provided building blocks were not expressive enough.

Processing

Processing Engineering Infrastructure Latency
