Data Engineering - Technology Performance Pulse

Financial Data Engineering in SAS

DZone

JANUARY 8, 2024

Financial data engineering in SAS involves the management, processing, and analysis of financial data using the various tools and techniques provided by the SAS software suite. Here are some key aspects of financial data engineering in SAS: 1.

Data Engineering

Data Engineering Engineering Database Software

Our First Netflix Data Engineering Summit

The Netflix TechBlog

DECEMBER 14, 2023

Engineers from across the company came together to share best practices on everything from Data Processing Patterns to Building Reliable Data Pipelines. The result was a series of talks which we are now sharing with the rest of the Data Engineering community! In this video, Sr.

Data Engineering

Data Engineering Engineering Software Engineering Best Practices

A Recap of the Data Engineering Open Forum at Netflix

The Netflix TechBlog

JUNE 20, 2024

A summary of sessions at the first Data Engineering Open Forum at Netflix on April 18th, 2024. At Netflix, we aspire to entertain the world, and our data engineering teams play a crucial role in this mission by enabling data-driven decision-making at scale.

Data Engineering

Data Engineering Engineering Entertainment Software Engineering

1. Streamlining Membership Data Engineering at Netflix with Psyberg

The Netflix TechBlog

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. At Netflix, our Membership and Finance Data Engineering team harnesses diverse data related to plans, pricing, membership life cycle, and revenue to fuel analytics, power various dashboards, and make data-informed decisions.

Data Engineering

Data Engineering Engineering Processing Games

Data Engineers of Netflix: Interview with Kevin Wylie

The Netflix TechBlog

JULY 15, 2021

Data Engineers of Netflix: Interview with Kevin Wylie. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Kevin, what drew you to data engineering?

Data Engineering

Data Engineering Engineering Entertainment Big Data

Data Engineers of Netflix: Interview with Pallavi Phadnis

The Netflix TechBlog

OCTOBER 28, 2021

Data Engineers of Netflix: Interview with Pallavi Phadnis. This post is part of our “Data Engineers of Netflix” series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. Pallavi Phadnis is a Senior Software Engineer at Netflix.

Data Engineering

Data Engineering Engineering Big Data Software Engineering

Automated Testing in Data Engineering: An Imperative for Quality and Efficiency

DZone

JANUARY 9, 2024

This holds true for the critical field of data engineering as well. As organizations gather and process astronomical volumes of data, manual testing is no longer feasible or reliable. This comprehensive guide takes an in-depth look at automated testing in the data engineering domain.

Data Engineering

Data Engineering Efficiency Engineering Testing

Leveraging Infrastructure as Code for Data Engineering Projects: A Comprehensive Guide

DZone

JULY 3, 2023

Data engineering projects often require the setup and management of complex infrastructures that support data processing, storage, and analysis. In this article, we will explore the benefits of leveraging IaC for data engineering projects and provide detailed implementation steps to get started.

Data Engineering

Data Engineering Infrastructure Engineering Code

Data Engineers of Netflix: Interview with Dhevi Rajendran

The Netflix TechBlog

JUNE 1, 2021

Data Engineers of Netflix: Interview with Dhevi Rajendran. This post is part of our “Data Engineers of Netflix” interview series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix.

Data Engineering

Data Engineering Engineering Software Engineering Big Data

Data Engineers of Netflix: Interview with Samuel Setegne

The Netflix TechBlog

JUNE 1, 2021

Data Engineers of Netflix: Interview with Samuel Setegne. This post is part of our “Data Engineers of Netflix” interview series, where our very own data engineers talk about their journeys to Data Engineering @ Netflix. What drew you to Netflix?

Data Engineering

Data Engineering Engineering Big Data Healthcare

Chaos Data Engineering Manifesto: 5 Laws for Successful Failures

DZone

FEBRUARY 27, 2023

It's midnight in the dim and cluttered office of The New York Times, currently serving as the "situation room." A powerful surge of traffic is inevitable. During every major election, the wave would crest and crash against our overwhelmed systems before receding, allowing us to assess the damage.

Data Engineering

Data Engineering Engineering Traffic Systems

Overcoming Challenges and Best Practices for Data Migration From On-Premise to Cloud

DZone

MARCH 29, 2023

This article discusses the challenges and best practices of data migration when transferring on-premise data to the cloud. The article will also explore the role of data engineering in ensuring successful data transfer and integration and different approaches to data migration.

Best Practices

Best Practices Cloud Storage Data Engineering

Offline Data Pipeline Best Practices Part 1: Optimizing Airflow Job Parameters for Apache Hive

DZone

DECEMBER 27, 2023

Welcome to the first post in our exciting series on mastering offline data pipeline best practices, focusing on the potent combination of Apache Airflow and data processing engines like Hive and Spark. Working together, they form the backbone of many modern data engineering solutions.

Best Practices

Best Practices Data Engineering Big Data Games

Bringing Software Engineering Rigor to Data

DZone

FEBRUARY 20, 2023

The data community is striving to incorporate the core concepts of engineering rigor found in software communities but still has further to go. This is achieved through practices like Infrastructure as Code for deployments, automated testing, application observability, and end-to-end application lifecycle ownership.

Software Engineering

Software Engineering Engineering Software Software

Choosing an OLAP Engine for Financial Risk Management: What To Consider?

DZone

AUGUST 16, 2023

From a data engineer's point of view, financial risk management is a series of data analysis activities on financial data. The financial sector imposes its unique requirements on data engineering.

FinTech

FinTech Engineering Data Engineering Latency

The 31 Flavors of Data Lineage and Why Vanilla Doesn’t Cut It

DZone

JANUARY 27, 2023

Data lineage, an automated visualization of the relationships for how data flows across tables and other data assets, is a must-have in the data engineering toolbox.

Government

Government Data Engineering Engineering

SIEM Volume Spike Alerts Using ML

DZone

JANUARY 31, 2024

Problem Statement: In data engineering, data/log collection is a challenging task for high-volume sources. Compliance Reporting: SIEM solutions help organizations meet regulatory compliance requirements by providing reporting and audit trail capabilities.

Storage

Storage Data Engineering Network Infrastructure

2. Diving Deeper into Psyberg: Stateless vs Stateful Data Processing

The Netflix TechBlog

NOVEMBER 14, 2023

By Abhinaya Shetty, Bharath Mummadisetty. In the inaugural blog post of this series, we introduced you to the state of our pipelines before Psyberg and the challenges with incremental processing that led us to create the Psyberg framework within Netflix’s Membership and Finance data engineering team.

Processing

Processing Data Engineering Efficiency Analytics

How TripleLift Built an Adtech Data Pipeline Processing Billions of Events Per Day

High Scalability

JUNE 15, 2020

This is a guest post by Eunice Do, Data Engineer at TripleLift, a technology company leading the next generation of programmatic advertising. The system is the data pipeline at TripleLift. TripleLift is an adtech company, and like most companies in this industry, we deal with high volumes of data on a daily basis.

Processing

Processing Data Engineering Engineering Efficiency

Data Ingestion: The First Step Towards a Flawless Data Pipeline

Simform

JANUARY 8, 2023

Data ingestion is the foremost layer in a data engineering pipeline, acting as a vital pillar in the overall analytics architecture. Thus, it is essential to implement data ingestion just right. Here is everything you need to know to take the first step toward a flawless data pipeline.

Data Engineering

Data Engineering Analytics Architecture Engineering

3. Psyberg: Automated end to end catch up

The Netflix TechBlog

NOVEMBER 14, 2023

This helps overwrite data only when required and minimizes unnecessary reprocessing. As seen above, by chaining these Psyberg workflows, we could automate the catchup for late-arriving data from hours 2 and 6. The Data Engineer does not need to perform any manual intervention in this case and can thus focus on more important things!

Tuning

Tuning Processing C++ Efficiency

How Data Inspires Building a Scalable, Resilient and Secure Cloud Infrastructure At Netflix

The Netflix TechBlog

MARCH 5, 2019

While our engineering teams have and continue to build solutions to lighten this cognitive load (better guardrails, improved tooling, …), data and its derived products are critical elements to understanding, optimizing and abstracting our infrastructure. Give us a holler if you are interested in a thought exchange.

Infrastructure

Infrastructure Cloud Scalability AWS

What is IT automation?

Dynatrace

JULY 6, 2022

This requires significant data engineering efforts, as well as work to build machine-learning models. While automating IT processes without integrated AIOps can create challenges, the approach to artificial intelligence itself can also introduce potential issues. AI that is based on machine learning needs to be trained.

Artificial Intelligence

Artificial Intelligence Tuning Strategy Big Data

A Day in the Life of an Experimentation and Causal Inference Scientist @ Netflix

The Netflix TechBlog

MARCH 2, 2021

At Netflix, our data scientists span many areas of technical specialization, including experimentation, causal inference, machine learning, NLP, modeling, and optimization. Together with data analytics and data engineering, we comprise the larger, centralized Data Science and Engineering group.

Analytics

Analytics C++ Innovation Engineering

Engineering Data Reliably Using SLO Theory – Percona Live ONLINE Talk Preview

Percona Community

OCTOBER 14, 2020

Abstract: Not so long ago, operations specialists worked much like today’s data engineers do: with specialized skills, they were the people who kept sites running, who responded to emergencies, and who, unfortunately, spent much of their time dealing with incidents and other “fires.”

Engineering

Engineering DevOps Data Engineering

Percona Live Europe Tutorial: Elasticsearch 101

Percona Community

OCTOBER 3, 2018

For Percona Live Europe, I’ll be presenting the tutorial Elasticsearch 101 alongside my colleagues and fellow presenters from ObjectRocket Alex Cercel, DBA, and Mihai Aldoiu, Data Engineer. Here’s a brief overview of our tutorial.

Data Engineering

Data Engineering Scalability Engineering

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

MARCH 7, 2024

Since then, open-source Metaflow has gained support for Argo Workflows, a Kubernetes-native orchestrator, as well as support for Airflow, which is still widely used by data engineering teams. Internally, we use a production workflow orchestrator called Maestro.

Systems

Systems Media Cache Open Source

Post: Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 17, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Engineering Java Software Engineering

Post: Essilen Research, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 3, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Engineering Games Java

Post: Essilen Research, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

FEBRUARY 18, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Engineering Games Java

Post: InterviewCamp.io, Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

APRIL 14, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Software Engineering Scalability Engineering

Shadows

The Agile Manager

JULY 31, 2022

There are shadow IT teams of developers or data engineers that spring up in areas like operations or marketing because the captive IT function is slow, if not outright incapable, of responding to internal customer demand. IT shadows appear in a lot of different forms.

Programming

Programming Engineering Data Engineering Innovation

Post: Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 24, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Software Engineering Engineering Big Data

Post: Essilen Research, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

FEBRUARY 9, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Engineering Games Java

Post: InterviewCamp.io, Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

APRIL 28, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Software Engineering Scalability Engineering

Post: InterviewCamp.io, Scrapinghub, Fauna, Sisu, Educative, PA File Sight, Etleap, Triplebyte, Stream

High Scalability

MARCH 30, 2020

Etleap is analyst-friendly, enterprise-grade ETL-as-a-service, built for Redshift and Snowflake data warehouses and S3/Glue data lakes. Our intuitive software allows data engineers to maintain pipelines without writing code, and lets analysts gain access to data in minutes instead of months.

Education

Education Software Engineering Engineering Big Data

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

MARCH 4, 2024

We would like to appreciate all folks, including Spark experts, data scientists, ML engineers, the scheduler and job orchestrator engineers, data engineers, and support engineers, for sharing the context and providing constructive suggestions and valuable feedback (e.g.,

Tuning

Tuning Efficiency Big Data Engineering

Netflix at AWS re:Invent 2019

The Netflix TechBlog

NOVEMBER 22, 2019

1:45pm-2:45pm NFX 201 More Data Science with less engineering: ML Infrastructure. Ville Tuulos, Machine Learning Infrastructure Engineering Manager. Abstract: Netflix is known for its unique culture that gives an extraordinary amount of freedom to individual engineers and data scientists.

AWS

AWS Entertainment Open Source Benchmarking

Back-to-Basics Weekend Reading - The 5 Minute Rule - All Things.

All Things Distributed

AUGUST 24, 2012

Which makes this week a good moment to read up on some of the historical work around the costs of data engineering. For this purpose I have picked work based on two papers by Jim Gray, the brilliant IBM / Tandem / Microsoft researcher, who won a Turing award for his contributions to data and transaction processing.

Storage

Storage Hardware AWS Data Engineering

AI meets operations

O'Reilly

FEBRUARY 2, 2020

Collaboration between AI developers and operations teams will lead to growing pains on both sides, especially since many data scientists and AI researchers have had limited exposure to, or knowledge of, software engineering. It’s going to be an interesting few years as operations assimilates AI.

Software Architecture

Software Architecture Monitoring Software Engineering Architecture

Kubernetes for Big Data Workloads

Abhishek Tiwari

DECEMBER 27, 2017

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2] and data workloads are up next.

Big Data

Big Data Storage Benchmarking Hardware
