Medallion Architecture: Efficient Batch and Stream Processing Data Pipelines With Azure Databricks and Delta Lake

DZone

In today's data-driven world, organizations need efficient and scalable data pipelines to process and analyze large volumes of data. This article explores the concepts of Medallion Architecture and demonstrates how to implement batch and stream processing pipelines using Azure Databricks and Delta Lake.

Azure 246

Building an Optimized Data Pipeline on Azure Using Spark, Data Factory, Databricks, and Synapse Analytics

DZone

Modern tech stacks such as Apache Spark, Azure Data Factory, Azure Databricks, and Azure Synapse Analytics offer powerful tools for building optimized data pipelines that can efficiently ingest and process data on the cloud.

Azure 246

Trending Sources


New Azure Cosmos DB Features to Boost Performance and Optimize Cost

InfoQ

Microsoft has recently unveiled several new features for Azure Cosmos DB to enhance cost efficiency, boost performance, and increase elasticity. These features are burst capacity, hierarchical partition keys, serverless container storage of 1 TB, and priority-based execution. By Steef-Jan Wiggers

Azure 85

No need to compromise visibility in public clouds with new Azure services supported by Dynatrace (Part 2)

Dynatrace

This is the second part of our blog series announcing the massive expansion of our Azure services support. Part 1 of this blog series looks at some of the key benefits of Azure DB for PostgreSQL, Azure SQL Managed Instance, and Azure HDInsight. Fully automated observability into your Azure multi-cloud environment.

Azure 155

Building an elastic query engine on disaggregated storage

The Morning Paper

Building an elastic query engine on disaggregated storage, Vuppalapati et al., NSDI’20. Snowflake is a data warehouse designed to overcome these limitations, and the fundamental mechanism by which it achieves this is the decoupling (disaggregation) of compute and storage. … joins) during query processing. Disaggregation (or not).

Storage 112

Implementing AWS well-architected pillars with automated workflows

Dynatrace

The AWS Well-Architected Framework is a set of best practices and guidelines that help you design and operate reliable, secure, efficient, cost-effective, and sustainable systems in the cloud. The framework comprises six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability.

AWS 247

What is a Distributed Storage System

Scalegrid

A distributed storage system is foundational in today’s data-driven landscape, ensuring data spread over multiple servers is reliable, accessible, and manageable. Understanding distributed storage is imperative as data volumes and the need for robust storage solutions rise.

Storage 130