Checksums in Storage Systems and Why the Enterprise Should Care


It’s really scary knowing that such corruptions are happening in the memory of our computers and servers – that is before they even reach the network and storage portions of the stack. Let’s assume for a moment that your data survives its many passes through a system’s DRAM and emerges intact. That data must then be safely transported over a network to the storage system where it is written to disk.
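
To make the idea of an end-to-end checksum concrete, here is a minimal Python sketch. It assumes a CRC-32 is appended when a block is first produced and checked again after the block has passed through memory, the network, and disk. The function names and the block layout (payload followed by a 4-byte checksum) are hypothetical and chosen for illustration; they are not the format of any particular storage system.

```python
import struct
import zlib


def seal(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload (hypothetical block layout)."""
    return payload + struct.pack(">I", zlib.crc32(payload))


def verify(block: bytes) -> bytes:
    """Return the payload if the stored CRC-32 matches, else raise ValueError."""
    if len(block) < 4:
        raise ValueError("block too short to contain a checksum")
    payload = block[:-4]
    (stored,) = struct.unpack(">I", block[-4:])
    if zlib.crc32(payload) != stored:
        raise ValueError("checksum mismatch: block was corrupted")
    return payload


def flip_bit(block: bytes, bit_index: int) -> bytes:
    """Simulate a single bit flip, such as one caused by faulty DRAM."""
    corrupted = bytearray(block)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


if __name__ == "__main__":
    block = seal(b"customer record 42")
    print(verify(block))  # the intact block passes verification
    try:
        verify(flip_bit(block, 10))
    except ValueError as err:
        print("detected:", err)
```

A CRC-32 catches every single-bit error, so it suits accidental corruption; it is not a defence against deliberate tampering, which would need a cryptographic hash or MAC.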

Partitioned Hive Table Across Storage Systems Using Alluxio


However, Hive cannot use a single query to directly access a table whose data is spread across different storage media and different clusters. This becomes a need when the data volume grows too large to fit a single storage medium or cluster, and also when users need to take into account the following considerations: Storage cost, where some partitions are less important than others and can be stored on cheaper storage tiers.

File systems unfit as distributed storage backends: lessons from ten years of Ceph evolution

The Morning Paper

File systems unfit as distributed storage backends: lessons from 10 years of Ceph evolution Aghayev et al., In this case, the assumption that a distributed storage backend should clearly be layered on top of a local file system. Breaking that assumption allowed Ceph to introduce a new storage backend called BlueStore with much better performance and predictability, and the ability to support the changing storage hardware landscape.

Building an elastic query engine on disaggregated storage

The Morning Paper

Building an elastic query engine on disaggregated storage, Vuppalapati, NSDI’20. have altered the many assumptions that guided the design and optimization of the Snowflake system. Traditional data warehouse systems are largely based on shared-nothing designs: persistent data is partitioned across a set of nodes, each responsible for its local data. But the ephemeral storage service for intermediate data is not based on S3.

Intro to Redis Cluster Sharding – Advantages, Limitations, Deploying & Client Connections

High Scalability

Redis Cluster is the native sharding implementation available within Redis that allows you to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. At ScaleGrid, we recently added support for Redis Clusters on our platform through our fully managed Redis hosting plans.
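
As a rough illustration of how this automatic distribution works: the Redis Cluster specification maps every key to one of 16384 hash slots using CRC16 of the key modulo 16384, and hashes only the text inside the first non-empty `{...}` "hash tag" when one is present. The Python sketch below reproduces that mapping with the standard library's `binascii.crc_hqx`. The function name `key_slot` is our own, and the excerpt above does not describe any of these details.

```python
import binascii

HASH_SLOTS = 16384  # fixed number of slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Compute the Redis Cluster hash slot for a key (illustrative helper)."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        # Only a non-empty tag between the first '{' and the next '}' counts.
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # crc_hqx with an initial value of 0 is the CRC16 variant (polynomial
    # 0x1021) that the cluster specification calls for.
    return binascii.crc_hqx(key, 0) % HASH_SLOTS


if __name__ == "__main__":
    for k in (b"foo", b"{user1000}.following", b"{user1000}.followers"):
        print(k.decode(), "->", key_slot(k))
```

Keys that share a hash tag land in the same slot, which is what lets multi-key operations work on them within a cluster.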

Follower Clusters – 3 Major Use Cases for Syncing SQL & NoSQL Deployments


Follower clusters are a ScaleGrid feature that allows you to keep two independent database systems (of the same type) in sync. Here are a few critical ways in which it differs from replication: You can control how frequently the destination system syncs from source – once a week, once a day, or even less frequently. This helps reduce the load on the source system. Since they are two independent systems, you have much more flexibility over the data that is synced.

2019 PostgreSQL Trends Report: Private vs. Public Cloud, Migrations, Database Combinations & Top Reasons Used

High Scalability

PostgreSQL is an open source object-relational database system that has soared in popularity over the past 30 years from its active, loyal, and growing community. For the 2nd year in a row, PostgreSQL has kept the title of #1 fastest growing database in the world according to the DBMS of the Year report by the experts at DB-Engines. So what makes PostgreSQL so special, and how is it being used today?

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber Engineering

Three years ago, Uber Engineering adopted Hadoop as the storage (HDFS) and compute (YARN) infrastructure for our organization’s big data analysis. This analysis powers our services and enables the delivery of more seamless and reliable user …

Teaching rigorous distributed systems with efficient model checking

The Morning Paper

Teaching rigorous distributed systems with efficient model checking Michael et al., It describes the labs environment, DSLabs, developed at the University of Washington to accompany a course in distributed systems. During the ten-week course, students implement four different assignments: an exactly-once RPC protocol; a primary-backup system; Paxos; and a scalable, transactional key-value storage system. A visual debugger/system explorer.

MezzFS - Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

Mounting object storage in Netflix’s media processing platform By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team) MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE. That file is stored in our object storage service, which splits and encrypts the file into separate chunks, storing the chunks in Amazon S3.

What is Greenplum Database? Intro to the Big Data Database


High performance, query optimization, open source and polymorphic data storage are the major Greenplum advantages. The MPP system leverages a shared-nothing architecture to handle multiple operations in parallel. Typically an MPP system has one leader node and one or many compute nodes. This allows Greenplum to distribute the load between its different segments and use all of the system’s resources in parallel to process a query. Polymorphic Data Storage.

MySQL High Availability Framework Explained – Part III: Failover Scenarios

High Scalability

Thus, whenever a master MySQL goes down (whether due to a MySQL crash, OS crash, system reboot, etc.), This ensures that the system continues to be available to the applications. This is a classical problem in any distributed system where each node thinks the other nodes are down, while in reality, only the network communication between the nodes is broken.

Expanding the Cloud - Amazon S3 Reduced Redundancy Storage.

All Things Distributed

Werner Vogels' weblog on building scalable and robust distributed systems. Expanding the Cloud - Amazon S3 Reduced Redundancy Storage. Today a new storage option for Amazon S3 has been launched: Amazon S3 Reduced Redundancy Storage (RRS). This new storage option enables customers to reduce their costs by storing non-critical, reproducible data at lower levels of redundancy. Under the covers Amazon S3 is a marvel of distributed systems technologies.

Fine-grained, secure and efficient data provenance on blockchain systems

The Morning Paper

Fine-grained, secure and efficient data provenance on blockchain systems Ruan et al., That’s hard to do in today’s blockchain systems for two reasons: Provenance can only be determined by querying and replaying all on-chain transactions, which is inefficient and an offline activity. The deployment fee should reflect this extra storage cost for DASL… Implementation and evaluation. VLDB’19.

SQL Server: OS Error 665 (File System Limitation) and Linux

SQL Server According to Bob

I have previously tested and blogged about the NTFS, sparse, file system limitation error 665: [link] when running DBCC or using Snapshot databases with SQL Server. This paper details XFS storage format and structures: [link] – Specifically the theoretical limits table section. The filefrag utility is another handy utility for viewing the storage layout.

Approaches to System Security: Using Cryptographic Techniques to Minimize Trust

ACM Sigarch

This is the first post in a series of posts on different approaches to systems security especially as they apply to hardware and architectural security. In this post, we will consider the use of mathematics/cryptography as an approach to improving systems security. The class of techniques described in this blog post, which we broadly refer to as applied hardware and architecture cryptography, apply proven cryptographic techniques to strengthen systems.

Using JSONB in PostgreSQL: How to Effectively Store & Index JSON Data in PostgreSQL


Oftentimes an external system is providing data as JSON, so it might be a temporary store before data is ingested into other parts of the system. JSONB storage has some drawbacks vs. traditional columns: PostgreSQL does not store column statistics for JSONB columns. JSONB storage results in a larger storage footprint. JSONB storage does not deduplicate the key names in the JSON. If that doesn’t work, the data is moved to out-of-line storage.

Expanding the Cloud - Managing Cold Storage with Amazon Glacier

All Things Distributed

Managing Cold Storage with Amazon Glacier. With the introduction of Amazon Glacier, IT organizations now have a solution that removes the headaches of digital archiving and provides extremely low-cost storage. Building and managing archive storage that needs to remain operational for decades if not centuries is a major challenge for most organizations. A Complete Storage Solution.

Who monitors the monitoring systems?

Adrian Cockcroft

In reality, in any non-trivial installation, there are multiple tools collecting, storing and displaying overlapping sets of metrics from many types of systems and different levels of abstraction. This is true for CPU, network, memory and storage, and also for bare metal, virtual machines, containers, processes, functions and threads. What if your monitoring systems fail? How do you even know when a monitoring system has failed? “Quis custodiet ipsos custodes?” - Juvenal

Back-to-Basics Weekend Reading - A Decomposition Storage Model

All Things Distributed

Not everybody agreed that the "N-ary Storage Model" (NSM) was the best approach for all workloads but it stayed dominant until hardware constraints, especially on caches, forced the community to revisit some of the alternatives. Combined with the rise of data warehouse workloads, where there is often significant redundancy in the values stored in columns, database models based on column-oriented storage took off. A Decomposition Storage Model, George P.

Azure Storage Persistence now faster in NServiceBus 6

Particular Software

If you're using Azure Storage Persistence and haven't upgraded to NServiceBus 6 yet, get ready for a tremendous performance boost for your application when you do, especially if you make use of sagas. In the previous version of Azure Storage Persistence, looking up a saga by a correlation property was not as fast as looking it up by SagaId. Azure Table Storage, where saga data is stored, is limited to indexing on two columns: the Partition Key and the Row Key.

Driving Storage Costs Down for AWS Customers

All Things Distributed

Driving Storage Costs Down for AWS Customers. As we showed last week, one of the services that is growing rapidly is the Amazon Simple Storage Service (S3). AWS today announced a substantial price drop effective February 1, 2012 for Amazon S3 standard storage to help customers drive their storage cost down. Other storage tiers may see even greater cost savings.

Ginseng: keeping secrets in registers when you distrust the operating system

The Morning Paper

Ginseng: keeping secrets in registers when you distrust the operating system Yun & Zhong et al., Suppose you did go to the extreme length of establishing an unconditional root of trust for your system, even then, unless every subsequent piece of code you load is also fully trusted (e.g., Such secrets are often protected by encryption in the storage. In doing so, the app assumes that the operating system (OS) is trustworthy.

Virtual consensus in Delos

The Morning Paper

While ultimately this new system should be able to take advantage of the latest advances in consensus for improved performance, that’s not realistic given a 6-9 month in-production target. replacing Paxos with Raft), or they could be shims over external storage systems.

Corporate Middle Management as an Autopoietic System

The Agile Manager

[T]he aim of such systems is ultimately to produce themselves: their own organization and identity is their most important product. -- Gareth Morgan, Images of Organization , p. This is in contrast to allopoietic systems, which use components (raw materials such as silicon and plastic) to generate something (mobile phones and computers) which are distinct from the thing that created it (the factory where they are made). The system thus organizes its environment as part of itself.

Maximizing fun (and profit) in your distributed systems

Particular Software

Based on our experience running business systems in production, we know we need to monitor our theme park to make sure it's working properly. We can use this data to extrapolate when we need to upgrade the electrical system, add a new water pipe, add more bays to our carpark, or commission more trucks to haul away our trash. How many CPU cycles is a system using? Infrastructure monitoring tools generally treat systems as "black boxes" that consume resources.

Content Management Systems of the Future: Headless, JAMstack, ADN and Functions at the Edge

Abhishek Tiwari

Recently I was asked about content management systems (CMS) of the future - more specifically how they are evolving in the era of microservices, APIs, and serverless computing. Raw content data along with templates are version controlled using Git or similar versioning systems. Alternatively, you can upload output directory to cloud object/blob storage such as Amazon S3 or Azure Blob Storage and serve your site from there.

Best Practices for Efficient Log Management and Monitoring


With so much flux and complexity across a cloud-native system, it's important to have robust monitoring and logging in place to control and manage the inevitable chaos. When managing cloud-native applications, it's essential to have end-to-end visibility into what's happening at any given time.

Delta: A Data Synchronization and Enrichment Platform

The Netflix TechBlog

Another issue exists for the capture of schema changes, where some systems, like MySQL, don’t support transactional schema changes [1][2]. By their nature, they can only rely on the lowest common denominator of participating systems. Also, certain systems, like Elasticsearch, do not support XA or any other heterogeneous transaction model. Thus, ensuring the atomicity of writes across different storage technologies remains a challenging problem for applications [3].

Evolution of Netflix Conductor:

The Netflix TechBlog

External Payload Storage: External payload storage was implemented to prevent the usage of Conductor as a data persistence system and to reduce the pressure on its backend datastore. Workflow Status Listener: Conductor can be configured to publish notifications to external systems or queues upon completion/termination of workflows. The workflow status listener provides hooks to connect to any notification system of your choice.

Local-first software: you own your data, in spite of the cloud

The Morning Paper

On the other hand we have good old-fashioned native apps that you install on your operating system (a dying breed? With a traditional OS app you have much more control over the data (the files on your file system at least, which if you’re lucky might even be in an open format). Operations can be handled by reading and writing to the local file system, with data synchronisation happening in the background.

Seamless offloading of web app computations from mobile device to edge clouds via HTML5 Web Worker migration

The Morning Paper

The Mobile Web Worker (MWW) System introduces a new client-side Mobile Web Worker Manager component, which is responsible for managing web workers, including their migration when this is estimated to be beneficial. The current system assumes an application-specific regression model is available on the servers which can predict processing time given the current parameters of the job (e.g.

Cloudburst: stateful functions-as-a-service

The Morning Paper

The canonical cloud platform architecture decouples storage and compute services so that each can be scaled and operated independently, i.e., they are disaggregated. It’s disaggregated in the sense that storage and compute can be provisioned and billed independently, but physically colocated in the sense that hot data can be cached locally next to the function runtimes that access it.

Mergeable replicated data types – Part II

The Morning Paper

An OCaml compiler extension for generating merge functions, and also for serializing and deserializing data structures for replication, using the third component of Quark… A content-addressable distributed storage abstraction, called the Quark store. Mergeable replicated data types – part II, Kaki et al., OOPSLA ’19.

Distributed consensus revised – Part I

The Morning Paper

As the title suggests, the topic in hand is distributed consensus: Single-valued agreement is often overlooked in the literature as already solved or trivial and is seldom considered at length, despite being a vital component in distributed systems which is infamously poorly understood… we undertake an extensive examination of how to achieve consensus over a single value. Distributed consensus revised, Howard, PhD thesis.

tempdb Enhancements in SQL Server 2019

SQL Performance

The latest adaptation by the SQL Server team is moving the system tables (metadata) for tempdb to In-Memory OLTP (aka memory-optimized). ALTER DATABASE [WideWorldImporters] SET QUERY_STORE (OPERATION_MODE = READ_WRITE, CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30), DATA_FLUSH_INTERVAL_SECONDS = 600, INTERVAL_LENGTH_MINUTES = 10, MAX_STORAGE_SIZE_MB = 1024, QUERY_CAPTURE_MODE = AUTO, SIZE_BASED_CLEANUP_MODE = AUTO); GO

CheriABI: enforcing valid pointer provenance and minimizing pointer privilege in the POSIX C run-time environment

The Morning Paper

Last week we saw the benefits of rethinking memory and pointer models at the hardware level when it came to object storage and compression (Zippads). And this all has to work for whole-system executions, not just the C-language portion of user processes. The MIPS rows show the test suite results on a standard mips64 system. faster for system calls).

A persistent problem: managing pointers in NVM

The Morning Paper

At the start of November I was privileged to attend the HPTS (High Performance Transaction Systems) conference in Asilomar. Byte-addressable non-volatile memory (NVM) will fundamentally change the way hardware interacts, the way operating systems are designed, and the way applications operate on data. This means that the overheads of system calls become much more noticeable.

"0 to 60" : Switching to indirect checkpoints

SQL Performance

I was a bit perplexed by this issue, since the system was certainly no slouch: plenty of cores, 3TB of memory, and XtremIO storage. In a recent tip, I described a scenario where a SQL Server 2016 instance seemed to be struggling with checkpoint times.

PostgreSQL vs. Oracle: Difference in Costs, Ease of Use & Functionality


Oracle Database is a commercial, proprietary multi-model database management system produced by Oracle Corporation, and the largest relational database management system (RDBMS) in the world. Compare ease of use across compatibility, extensions, tuning, operating systems, languages and support providers. PostgreSQL is an open source object-relational database system with over 30 years of active development. Supported Operating Systems.

AMD EPYC 7002 Series Processors and SQL Server

SQL Performance

This system has one AMD EPYC 7502P 32-core processor and 512GB of RAM. The price per QphH for this system is 0.34. That is a higher score, but this system used two Intel Xeon Platinum 8180 28-core processors (that had a total of 56C/112T) and 512GB of RAM. The price per QphH for this system is 0.47. This system has one AMD EPYC 7742 64-core processor and 1TB of RAM. This system has two Intel Xeon Platinum 8280 28-core processors and 1.5TB of RAM.

Monitoring Self-Destructing Apps Using Prometheus


Prometheus is an open-source system monitoring and alerting toolkit. Data related to monitoring is stored in RAM and LevelDB; nevertheless, data can also be stored in other storage systems such as Elasticsearch, InfluxDB, and others [link]. Watch out for your self-destructing apps!

Distributed consensus revised – Part II

The Morning Paper

During the steady state, we can reach each decision in one round trip to the majority of acceptors and one synchronous write to persistent storage. In either case, proposals must begin with a synchronous write to storage (strictly, the write must be completed before phase two starts). The majority requirement can be generalised to use any quorum system which guarantees all quorums intersect. Quorum systems other than strict majority are rarely utilised in practice.
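
The intersection requirement mentioned above can be checked mechanically. The following Python sketch assumes a quorum system is given as an explicit list of acceptor sets; the helper names are hypothetical and are not taken from the thesis. It shows that strict majorities always intersect, that some smaller non-majority quorum systems do too, and that two disjoint quorums violate the requirement.

```python
from itertools import combinations


def majority_quorums(acceptors):
    """Return every strict-majority subset of the given acceptors."""
    size = len(acceptors) // 2 + 1
    return [frozenset(c) for c in combinations(acceptors, size)]


def all_quorums_intersect(quorums):
    """Return True if every pair of quorums shares at least one acceptor."""
    return all(q1 & q2 for q1, q2 in combinations(quorums, 2))


if __name__ == "__main__":
    acceptors = ["a", "b", "c", "d", "e"]
    print(all_quorums_intersect(majority_quorums(acceptors)))

    # A non-majority quorum system over four acceptors: each quorum holds
    # only two of them, yet every pair of quorums overlaps.
    small = [frozenset("ab"), frozenset("bc"), frozenset("ac")]
    print(all_quorums_intersect(small))

    # Two disjoint quorums could each accept a different value.
    print(all_quorums_intersect([frozenset("ab"), frozenset("cd")]))
```

The brute-force pairwise check is quadratic in the number of quorums, which is fine for illustrating the property on small acceptor sets.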