What is IT automation?

Dynatrace

At its most basic, automating IT processes works by executing scripts or procedures either on a schedule or in response to particular events, such as checking a file into a code repository. When monitoring tools release a stream of alerts, teams can easily identify which ones are false and assess whether an event requires human intervention.

3. Psyberg: Automated end to end catch up

The Netflix TechBlog

Input: List of source tables and required processing mode.
Output: Psyberg identifies new events that have occurred since the last high watermark (HWM) and records them in the session metadata table.
Data Load Type: The ETL can either load the missed/new data specifically or reload the entire specified range.
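The high-watermark idea in this excerpt can be illustrated with a minimal sketch. This is not Psyberg's implementation: the `Event` record, its `landed_at` field, and both helper functions are hypothetical names chosen for illustration, and the sketch assumes each event carries a monotonically increasing landing timestamp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    # Hypothetical record shape; field names are assumptions, not Psyberg's schema.
    table: str
    partition: str
    landed_at: int  # assumed monotonically increasing landing timestamp


def new_events_since(events, hwm):
    """Return events that landed strictly after the high watermark, oldest first."""
    return sorted(
        (e for e in events if e.landed_at > hwm),
        key=lambda e: e.landed_at,
    )


def advance_hwm(hwm, new_events):
    """Move the watermark to the latest processed event; keep it if nothing is new."""
    return max((e.landed_at for e in new_events), default=hwm)


events = [
    Event("playback", "2024-01-01", 100),
    Event("playback", "2024-01-02", 205),
    Event("playback", "2024-01-01", 310),  # late-arriving data for an older partition
]
pending = new_events_since(events, hwm=200)
next_hwm = advance_hwm(200, pending)
```

Here `pending` holds the two events that landed after the watermark, including the late arrival for an older partition, which is the case a purely partition-date-based filter would miss.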

Tuning 244

Hyper Scale VPC Flow Logs enrichment to provide Network Insight

The Netflix TechBlog

It is easier to tune a large Spark job for a consistent volume of data. As you may know, S3 can emit messages when events (such as file creation events) occur, which can be directed into an AWS SQS queue. In other words, we are able to ensure that our Spark app does not “eat” more data than it was tuned to handle.
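One way to picture the "does not eat more than it was tuned to handle" idea is a batch builder that drains file notifications until a size budget is reached. The sketch below is an assumption-laden illustration, not the approach from the article: an in-memory `deque` of `(key, size_in_bytes)` tuples stands in for the SQS queue, and `take_bounded_batch` and `max_bytes` are hypothetical names.

```python
from collections import deque


def take_bounded_batch(queue, max_bytes):
    """Pop (key, size) notifications from the left until the budget would be exceeded.

    The first notification is always taken, even if it alone exceeds max_bytes,
    so that a single oversized file cannot stall the queue.
    """
    batch, total = [], 0
    while queue:
        key, size = queue[0]  # peek without removing
        if batch and total + size > max_bytes:
            break
        queue.popleft()
        batch.append(key)
        total += size
    return batch, total


# Stand-in for file-creation notifications: (object key, size in bytes).
notifications = deque([("logs/a", 40), ("logs/b", 50), ("logs/c", 30)])
batch, total = take_bounded_batch(notifications, max_bytes=100)
```

With a budget of 100 bytes, the first two files are taken and the third stays queued for the next run, so each job run sees a roughly consistent volume.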

Network 150

Optimizing data warehouse storage

The Netflix TechBlog

Some of the optimizations are prerequisites for a high-performance data warehouse. Sometimes Data Engineers write downstream ETLs on ingested data to optimize the data/metadata layouts to make other ETL processes cheaper and faster. Optimization can be both automatic (event-driven) and manual (ad hoc).

Storage 203

Orchestrating Data/ML Workflows at Scale With Netflix Maestro

The Netflix TechBlog

It is a general-purpose workflow orchestrator that provides a fully managed workflow-as-a-service (WAAS) to the data platform at Netflix. It serves thousands of users, including data scientists, data engineers, machine learning engineers, software engineers, content producers, and business analysts, for various use cases.

Java 202

Incremental Processing using Netflix Maestro and Apache Iceberg

The Netflix TechBlog

These challenges are currently addressed in suboptimal and less cost-efficient ways by individual local teams to fulfill their needs, such as lookback. Lookback is a generic and simple approach that data engineers use to solve the data accuracy problem. Users configure the workflow to read the data in a window (e.g.

Building and Scaling Data Lineage at Netflix to Improve Data Infrastructure Reliability, and…

The Netflix TechBlog

We adopted the following mission statement to guide our investments: “Provide a complete and accurate data lineage system enabling decision-makers to win moments of truth.” Data Enrichment: The lineage data, when enriched with entity metadata and associated relationships, becomes more valuable for delivering on a rich set of business cases.