Data / ML, Engineering

Less is More: Engineering Data Warehouse Efficiency with Minimalist Design

August 14, 2019 / Global
Figure 1: An inverse sigmoid function models the effective weight of a table. A table with very high utility should have a low effective maintenance cost, so we multiply the table's cost by its weight, which is given by the inverse sigmoid function shown in the graph above.
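To make the weighting in Figure 1 concrete, here is a minimal sketch of how a table's cost might be scaled by a decreasing sigmoid of its utility. The exact curve is not given in the caption, so the function names, the `midpoint` and `steepness` parameters, and the assumption that utility is normalized to the range 0 to 1 are all illustrative rather than the authors' actual model.

```python
import math


def inverse_sigmoid_weight(utility, midpoint=0.5, steepness=10.0):
    """Return a weight in (0, 1) that falls as utility rises.

    Assumptions (not from the article): utility is normalized to [0, 1],
    and the curve is a mirrored logistic centered at `midpoint`.
    """
    return 1.0 / (1.0 + math.exp(steepness * (utility - midpoint)))


def effective_cost(cost, utility, midpoint=0.5, steepness=10.0):
    """Scale a table's raw maintenance cost by its utility-based weight."""
    return cost * inverse_sigmoid_weight(utility, midpoint, steepness)


if __name__ == "__main__":
    # Hypothetical tables: (name, raw cost, normalized utility).
    tables = [("rarely_used", 100.0, 0.05), ("heavily_used", 100.0, 0.95)]
    for name, cost, utility in tables:
        print(name, round(effective_cost(cost, utility), 2))
```

Under these assumptions, two tables with the same raw cost end up with very different effective costs: the heavily used table is weighted close to zero, while the rarely used one keeps almost its full cost.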
Constraint 1:
Constraint 2:
Constraint 3:
Constraint 4:
Figure 2: Even when the retention rate of our database is set to 100 percent, the operational cost drops to 92 percent due to the cost of maintaining stale data. The operational cost stabilizes at around a 95 percent retention rate, indicating that reducing the retention rate further does little to lower the operational cost.
Ritesh Agrawal

Ritesh Agrawal is a senior data scientist on Uber's Data Science team, leading the intelligent infrastructure and developer platform teams. His work is focused on finding innovative ways to use data science and AI to make Uber’s infrastructure more adaptive and scalable and enhance developer productivity.

Harsha Venkat Annapa Reddy

Harsha Venkat Annapa Reddy is a senior software engineer on Uber's Interactive SQL team.

Girish Baliga

Girish manages the Pinot, Flink, and Presto teams at Uber. He is helping the team build a comprehensive self-service real-time analytics platform based on Pinot to power business-critical external-facing dashboards and metrics. Girish is the Chairman of the Presto Linux Foundation Governing Board.

Posted by Ritesh Agrawal, Harsha Venkat Annapa Reddy, Girish Baliga