
Offline Data Pipeline Best Practices Part 1: Optimizing Airflow Job Parameters for Apache Hive

DZone

Welcome to the first post in our series on best practices for offline data pipelines, focusing on the combination of Apache Airflow and data processing engines like Hive and Spark. Working together, they form the backbone of many modern data engineering solutions.


Around the World in 28 Days - All Things Distributed

All Things Distributed

With existing customers I get a chance to dive deep into their AWS usage and understand what works well and where we can do better. There is a huge variety in existing architectures, and I am often impressed by the ingenuity of the engineers in how best to transform the application if "Lift & Shift" is not an option.


Reduce RPO, Encrypt Backups, and More in 1.15.0 Release of Percona Operator for MongoDB

Percona

release, we added support for physical backups and restores to significantly reduce the Recovery Time Objective (RTO), especially for big data sets. However, the problem of losing data between backups, in other words the Recovery Point Objective (RPO), was not solved for physical backups. serverSideEncryption section.