
Back-to-Basics Weekend Reading - The 5 Minute Rule - All Things.

All Things Distributed

Which makes this week a good moment to read up on some of the historical work around the costs of data engineering. For this purpose I have picked work based on two papers by Jim Gray, the brilliant IBM / Tandem / Microsoft researcher, who won a Turing Award for his contributions to data and transaction processing.

Storage 108

Kubernetes for Big Data Workloads

Abhishek Tiwari

Kubernetes has emerged as the go-to container orchestration platform for data engineering teams. In 2018, widespread adoption of Kubernetes for big data processing is anticipated. Organisations are already using Kubernetes for a variety of workloads [1] [2], and data workloads are up next.

Trending Sources


2021 Data/AI Salary Survey

O'Reilly

The results are biased by the survey’s recipients (subscribers to O’Reilly’s Data & AI Newsletter). Our audience is particularly strong in the software (20% of respondents), computer hardware (4%), and computer security (2%) industries—over 25% of the total. Average salary change vs. type of training. The Last Word.

Azure 145

Friends don't let friends build data pipelines

Abhishek Tiwari

Unfortunately, building data pipelines remains a daunting, time-consuming, and costly activity. Not everyone runs a data engineering function at Netflix or Spotify scale. Companies often underestimate the effort and cost involved in building and maintaining data pipelines.

Latency 63

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

While BI solutions have existed for decades, customers have told us that it takes an enormous amount of time, engineering effort, and money to bridge this gap. These solutions lack interactive data exploration and visualization capabilities, limiting most business users to canned reports and pre-selected queries.

Cloud 137

5 data integration trends that will define the future of ETL in 2018

Abhishek Tiwari

Because supported big data frameworks and applications can use the same in-memory format, they can avoid the serialization and deserialization otherwise needed to convert data between formats. In contrast, Alluxio is a middleware for data access: think of the Alluxio storage layer as a fast cache.


Microservices Adoption in 2020

O'Reilly

Technical roles represented in the “Other” category include IT managers, data engineers, DevOps practitioners, data scientists, systems engineers, and systems administrators. Combined, technology verticals—software, computers/hardware, and telecommunications—account for about 35% of the audience (Figure 2).

Database 135