Deployment challenges with large enterprise systems

Dynatrace

For small deployments, this isn’t a problem; however, when scaling up to hundreds or even thousands of systems, things can become complicated. Even when all the systems are mapped correctly by Dynatrace, identifying these systems is a real challenge. S _ for the system.

Wireless attacks on aircraft instrument landing systems

The Morning Paper

Wireless attacks on aircraft instrument landing systems Sathaye et al., The first fully operational Instrument Landing System (ILS) for planes was deployed in 1932. Due to the low decision height, attacks on CAT III systems can have severe consequences.

Build automated self-healing systems with xMatters and Dynatrace (Part 2 of 3)

Dynatrace

In this alert, xMatters includes all the important incident information from Dynatrace, so there’s no need for you to visit additional system dashboards. Depending on the type of the issue, xMatters launches workflows across your systems to start the automated self-healing process.

Build automated self-healing systems with xMatters and Dynatrace (Part 1 of 3)

Dynatrace

In this three-part blog series, we’ll share the following three common problem scenarios that you can easily solve by building an automated self-healing system with Dynatrace and xMatters Flow Designer: Process crash. Depending on the type of Dynatrace issue, xMatters prompts on-call resources with response option buttons that launch workflows across your systems to start the automated self-healing process, and to keep stakeholders and customers updated.

Towards federated learning at scale: system design

The Morning Paper

Towards federated learning at scale: system design Bonawitz et al., This is a high level paper describing Google’s production system for federated learning. The FL system overall comprises a set of devices (e.g., SysML 2019.

MySQL Memory Management, Memory Allocators, and Operating System

DZone

When users experience memory usage issues with any software, including MySQL, their first response is to think that it’s a symptom of a memory leak. As this story will show, this is not always the case. This story is about a bug.

Teaching rigorous distributed systems with efficient model checking

The Morning Paper

Teaching rigorous distributed systems with efficient model checking Michael et al., It describes the labs environment, DSLabs, developed at the University of Washington to accompany a course in distributed systems. A visual debugger/system explorer.

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems

The Morning Paper

An open-source benchmark suite for microservices and their hardware-software implications for cloud & edge systems Gan et al., In this paper we explore the implications microservices have across the cloud system stack. Operating system and network implications.

Checksums in Storage Systems and Why the Enterprise Should Care

DZone

Let’s assume for a moment that your data survives its many passes through a system’s DRAM and emerges intact. That data must then be safely transported over a network to the storage system where it is written to disk. Random bit flips are far more common than most people, even IT professionals, think. Surprisingly, the problem isn’t widely discussed, even though it is silently causing data corruption that can directly impact our jobs, our businesses, and our security.
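To make the point about silent bit flips concrete, here is a minimal sketch of how a checksum stored alongside a block lets a reader detect a single flipped bit. The function names and the sample payload are hypothetical, and CRC-32 from Python's standard `zlib` module is used only as a convenient stand-in; the excerpt does not say which checksum any particular storage system uses.

```python
import zlib


def store(payload: bytes) -> tuple[bytes, int]:
    """Return the payload together with its CRC-32, as a writer might persist it."""
    return payload, zlib.crc32(payload)


def verify(payload: bytes, checksum: int) -> bool:
    """Recompute the CRC-32 on read and compare it with the stored value."""
    return zlib.crc32(payload) == checksum


def flip_bit(payload: bytes, bit_index: int) -> bytes:
    """Simulate a random bit flip in DRAM, on the wire, or on disk."""
    corrupted = bytearray(payload)
    corrupted[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(corrupted)


# Hypothetical sample data, for illustration only.
block, crc = store(b"quarterly-ledger-row-0042")
damaged = flip_bit(block, 13)

print("intact block verifies:", verify(block, crc))
print("damaged block verifies:", verify(damaged, crc))
```

Without the stored checksum, the damaged block would be returned to the application as if nothing had happened, which is the silent corruption the excerpt warns about.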

50 ways to leak your data: an exploration of apps’ circumvention of the Android permissions system

The Morning Paper

50 ways to leak your data: an exploration of apps’ circumvention of the Android permissions system Reardon et al., Side-channels are typically an unintentional consequence of a complicated system.

BPF Performance Tools: Linux System and Application Observability (book)

Brendan Gregg

BPF (eBPF) tracing is a superpower that can analyze everything, and I'll show you how in my upcoming book BPF Performance Tools: Linux System and Application Observability, coming soon from Addison Wesley. A time where you can pose arbitrary questions of the system, and it can answer them.

Fine-grained, secure and efficient data provenance on blockchain systems

The Morning Paper

Fine-grained, secure and efficient data provenance on blockchain systems Ruan et al., That’s hard to do in today’s blockchain systems for two reasons: Provenance can only be determined by querying and replaying all on-chain transactions, which is inefficient and an offline activity.

Software Systems Will Fail

Professor Beekums

GitLab had a very public outage last month. Most companies provide some kind of explanation when their services are interrupted. Those are usually sanitized (or seem sanitized) to make things seem better than they actually are.

Three Other Models of Computer System Performance: Part 1

ACM Sigarch

Computer systems, from the Internet-of-Things devices to datacenters, are complex and optimizing them can enhance capability and save money. Developing simulators, however, is time-consuming and requires a great deal of infrastructure development regarding a prospective system.

Partitioned Hive Table Across Storage Systems Using Alluxio

DZone

This is where Alluxio comes in: it interfaces with applications like Hive as a distributed virtual file system, making it possible to create tables whose partitions live in different storage systems. Data always resides in the under-storage system as the source of truth and can reside temporarily in the Alluxio file system.

PyTorch-BigGraph: a large-scale graph embedding system

The Morning Paper

PyTorch-BigGraph: a large-scale graph embedding system Lerer et al., SysML’19. We looked at graph neural networks earlier this year, which operate directly over a graph structure.

Migrating Functionality Between Large-scale Production Systems Seamlessly

Uber Engineering

As we scaled up to our present level of support for 14 million trips per day, the car in that …

In Defense of Humanity—How Complex Systems Failed in Westworld **spoilers**

High Scalability

The reason is in How Complex Systems Fail. The Westworld season finale made an interesting claim: humans are so simple and predictable they can be encoded by a 10,247-line algorithm. Small enough to fit in the pages of a thin virtual book.

Scaling Uber’s Apache Hadoop Distributed File System for Growth

Uber Engineering

This analysis powers our services and enables the delivery of more seamless and reliable user … Three years ago, Uber Engineering adopted Hadoop as the storage (HDFS) and compute (YARN) infrastructure for our organization’s big data analysis.

Three Other Models of Computer System Performance: Part 2

ACM Sigarch

The M/M/1 queue also assumes that the arrival rate is not affected by the unbounded number of tasks in the queue (called an “open system”). With two blog posts, we argue for more use of simple models beyond Amdahl’s Law.
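As a small worked example of the kind of simple model the posts argue for, the sketch below evaluates the standard closed-form M/M/1 results: utilization ρ = λ/μ, mean number in system L = ρ/(1 − ρ), and mean response time W = 1/(μ − λ). The function name and the sample rates are hypothetical; the formulas are the textbook ones for an open M/M/1 queue and are not taken from the excerpt itself.

```python
def mm1_metrics(arrival_rate: float, service_rate: float) -> dict[str, float]:
    """Steady-state metrics for an open M/M/1 queue.

    arrival_rate (lambda) and service_rate (mu) are in tasks per unit time.
    The queue is only stable when lambda < mu.
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    if arrival_rate >= service_rate:
        raise ValueError("unstable queue: arrival rate must be below service rate")

    utilization = arrival_rate / service_rate             # rho
    mean_in_system = utilization / (1.0 - utilization)    # L = rho / (1 - rho)
    mean_response = 1.0 / (service_rate - arrival_rate)   # W = 1 / (mu - lambda)
    return {
        "utilization": utilization,
        "mean_in_system": mean_in_system,
        "mean_response_time": mean_response,
    }


# Hypothetical rates: 8 tasks/s arriving at a server that completes 10 tasks/s.
metrics = mm1_metrics(arrival_rate=8.0, service_rate=10.0)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The two outputs are tied together by Little's Law (L = λW), which gives a quick sanity check on any numbers the model produces.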

Unlocking Enterprise systems using voice

All Things Distributed

The interfaces to our digital systems have been dictated by the capabilities of our computer systems—keyboards, mice, graphical interfaces, remotes, and touch screens. As a result, they fail to deliver a truly seamless and customer-centric experience that integrates our digital systems into our analog lives. All of these benefits make voice a game changer for interacting with all kinds of digital systems.

Approaches to System Security: Using Cryptographic Techniques to Minimize Trust

ACM Sigarch

This is the first post in a series of posts on different approaches to systems security especially as they apply to hardware and architectural security. In this post, we will consider the use of mathematics/cryptography as an approach to improving systems security.

Who monitors the monitoring systems?

Adrian Cockcroft

In reality, in any non-trivial installation, there are multiple tools collecting, storing and displaying overlapping sets of metrics from many types of systems and different levels of abstraction. What if your monitoring systems fail? “Quis custodiet ipsos custodes?” - Juvenal

2019 Database Trends – SQL vs. NoSQL, Top Databases, Single vs. Multiple Database Use

Scalegrid

Get the latest insights on MySQL, MongoDB, PostgreSQL, Redis, and many others to see which database management systems are most favored this year. Based on our findings, SQL still holds 60% with rising demand for systems such as PostgreSQL: SQL Database Use: 60.48%.

EuroBSDcon: System Performance Analysis Methodologies

Brendan Gregg

In the past I've shared similar methodologies applied to other operating systems, and finished porting them to BSD for this talk. For my first trip to Paris I gave the closing keynote at EuroBSDcon 2017 on performance methodologies, using FreeBSD 11.1 as an analysis target.

Ginseng: keeping secrets in registers when you distrust the operating system

The Morning Paper

Ginseng: keeping secrets in registers when you distrust the operating system Yun & Zhong et al., Suppose you did go to the extreme length of establishing an unconditional root of trust for your system, even then, unless every subsequent piece of code you load is also fully trusted (e.g.,

Amazon Aurora development team wins the 2019 ACM SIGMOD Systems Award

All Things Distributed

This week, the developers of Amazon Aurora have won the 2019 Association for Computing Machinery's (ACM) Special Interest Group on Management of Data (SIGMOD) Systems Award.

Third-order effects and software systems

Particular Software

At the height of the Cold War, the United States passed the Federal Aid Highway Act of 1956, giving birth to the Interstate Highway System. They can be observed in our software systems as well. What if the system was built to allow junior developers to actively participate?

Monitoring SQL Server deadlocks using the system_health extended event

SQL Shack

Performance monitoring is a must-do task for a DBA. You should ensure that the database performance is optimal all the time without any impact on the databases. Performance issues act like an open stage, and you need to look at every aspect such as CPU, RAM, server performance, database performance, indexes, blocking, […]. Deadlocks

Build automated self-healing systems with xMatters and Dynatrace (Part 3 of 3)

Dynatrace

The alert comes with the full context of the issue, including errors caused, impacted systems, and level of severity. Depending on the type of the issue, xMatters launches workflows across your systems to start the automated self-healing process.

PostgreSQL Connection Pooling: Part 1 – Pros & Cons

Scalegrid

On modern Linux systems, the difference in overhead between forking a process and creating a thread is much smaller than it used to be. A long time ago, in a galaxy far far away, ‘threads’ were a programming novelty rarely used and seldom trusted.

Delta: A Data Synchronization and Enrichment Platform

The Netflix TechBlog

Another issue exists for the capture of schema changes, where some systems, like MySQL, don’t support transactional schema changes [1][2]. By their nature, they can only rely on the lowest common denominator of participating systems.

Efficient Enterprise Testing — Integration Tests (Part Three)

DZone

This part of the series will show how to verify our applications with code-level as well as system-level integration tests. Efficiency is everything!

The challenges of monitoring a distributed system

Particular Software

I remember the first time I deployed a system into production. Once the system was deployed, I wanted to see if everything was working properly, so I ran through a simple checklist: Is my database up? (Yes/No) If the answers to these questions were all yes, then the system was working correctly. If the answer to any of those questions was no, then the system wasn't working correctly and I needed to take action to correct it.
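The yes/no checklist described above can be sketched in a few lines: the system counts as healthy only if every check answers yes. Only the "Is my database up?" question comes from the excerpt; the other check names and the stubbed results below are hypothetical placeholders, and a real check would probe the actual service.

```python
from typing import Callable


def evaluate(checks: dict[str, Callable[[], bool]]) -> tuple[bool, list[str]]:
    """Run every yes/no check and return (all_passed, names_of_failed_checks).

    A check that raises is treated as a "no", since an unreachable
    component is not working correctly either.
    """
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False
        if not ok:
            failed.append(name)
    return (not failed, failed)


# Stubbed answers for illustration; replace with real probes.
checklist = {
    "Is my database up?": lambda: True,
    "Is my web server up?": lambda: True,         # hypothetical check
    "Is my queue being drained?": lambda: False,  # hypothetical check
}

healthy, failures = evaluate(checklist)
print("system working correctly:", healthy)
for question in failures:
    print("needs action:", question)
```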

Intro to Redis Cluster Sharding – Advantages, Limitations, Deploying & Client Connections

High Scalability

Redis Cluster is the native sharding implementation available within Redis that allows you to automatically distribute your data across multiple nodes without having to rely on external tools and utilities. At ScaleGrid, we recently added support for Redis Clusters on our platform through our fully managed Redis hosting plans.
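For readers wondering how the automatic distribution works, the sketch below follows the key-to-slot rule from the Redis Cluster specification: the key (or the part inside the first non-empty `{...}` hash tag) is hashed with CRC16 and taken modulo 16384 hash slots, and each node owns a range of slots. The excerpt itself does not describe this mechanism, and the function names and sample keys here are illustrative only. Python's `binascii.crc_hqx` with an initial value of 0 is used as the CRC16 (XMODEM variant).

```python
import binascii

HASH_SLOTS = 16384  # number of hash slots in a Redis Cluster


def hash_tag(key: bytes) -> bytes:
    """Return the part of the key that gets hashed.

    If the key contains '{' followed later by '}' with at least one
    character in between, only that substring is hashed, so related
    keys can be forced into the same slot.
    """
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            return key[start + 1:end]
    return key


def key_slot(key: bytes) -> int:
    """Map a key to its hash slot: CRC16(key) mod 16384."""
    return binascii.crc_hqx(hash_tag(key), 0) % HASH_SLOTS


# Illustrative keys: the two tagged keys land in the same slot.
for sample in (b"user:1000", b"{user1000}.following", b"{user1000}.followers"):
    print(sample.decode(), "->", key_slot(sample))
```

Because multi-key operations in a cluster require all keys to live in the same slot, hash tags are the usual way to keep related keys together.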

The Challenges and Traps of Architecting Sociotechnical Systems

Strategic Tech

There is a high cost associated with work that leaves your team… team boundaries and software boundaries should be isomorphic” — James Lewis, Thoughtworks I’ve written and spoken a lot about architecting sociotechnical systems and how to find boundaries.

Updated Lampson's Hints for Computer Systems Design

All Things Distributed

Instead I have a video of a wonderful presentation by Butler Lampson where he talks about the learnings of the past decades that helped him to update his excellent 1983 "Hints for computer system design".

Maximizing fun (and profit) in your distributed systems

Particular Software

Based on our experience running business systems in production, we know we need to monitor our theme park to make sure it's working properly. How many CPU cycles is a system using? Infrastructure monitoring tools generally treat systems as "black boxes" that consume resources.

Software-defined far memory in warehouse scale computers

The Morning Paper

This paper describes a “far memory” system that has been in production deployment at Google since 2016. The objective is to find the lowest cold age threshold that still allows the system to satisfy its performance constraints.

Evolution of Netflix Conductor:

The Netflix TechBlog

External Payload Storage: External payload storage was implemented to prevent the usage of Conductor as a data persistence system and to reduce the pressure on its backend datastore. The workflow status listener provides hooks to connect to any notification system of your choice.

MezzFS - Mounting object storage in Netflix’s media processing platform

The Netflix TechBlog

Mounting object storage in Netflix’s media processing platform By Barak Alon (on behalf of Netflix’s Media Cloud Engineering team) MezzFS (short for “Mezzanine File System”) is a tool we’ve developed at Netflix that mounts cloud objects as local files via FUSE.

Back-to-Basics Weekend Reading: An Implementation of a Log-Structured File System

All Things Distributed

One topic that always gets me excited is how to take computer science research and implement it in production systems. real systems do not fail by stopping in a nice and clean way). This weekend I am travelling to Australia for the first AWS Summit of 2017.

Back-to-Basics Weekend Reading - Virtualizing Operating Systems.

All Things Distributed

This weekend's back-to-basics reading is on operating system virtualization. There are two papers that deserve the "classic" tag as they both form the basis for operating system virtualization that is in production today. By Werner Vogels on 20 July 2012 12:00 PM.