March, 2024

The State of Observability 2024: Navigating Complexity With AI-Driven Insights


In today's fast-paced digital landscape, organizations are increasingly embracing multi-cloud environments and cloud-native architectures to drive innovation and deliver seamless customer experiences. However, the 2024 State of Observability report from Dynatrace reveals that the explosion of data generated by these complex ecosystems is pushing traditional monitoring and analytics approaches to their limits.

Analytics 331
Enhance data collection with Dynatrace OpenTelemetry Collector distribution


As organizations strive for observability and data democratization, OpenTelemetry emerges as a key technology to create and transfer observability data. OpenTelemetry is gaining popularity because it’s considered a standard, which makes it a common choice for building future-proof solutions. To answer the growing demand for OpenTelemetry, Dynatrace is proud to announce the release of the Dynatrace OpenTelemetry Collector distribution (Dynatrace OTel Collector).


Bending pause times to your will with Generational ZGC

The Netflix TechBlog

The surprising and not so surprising benefits of generations in the Z Garbage Collector. By Danny Thomas, JVM Ecosystem Team. The latest long-term support release of the JDK delivers generational support for the Z Garbage Collector. More than half of our critical streaming video services are now running on JDK 21 with Generational ZGC, so it’s a good time to talk about our experience and the benefits we’ve seen.

Latency 234
Master MySQL Point in Time Recovery


Data loss or corruption can be daunting. With MySQL point-in-time recovery, you can restore your database to the moment before the problem occurred. This article delivers a practical roadmap for using backups and binary logs to achieve accurate MySQL recovery, detailed steps for setting up your server, and tips for managing recovery and backups effectively without overwhelming you with complexity.

Database 162
Linux Crisis Tools

Brendan Gregg

When you have an outage caused by a performance issue, you don't want to lose precious time just to install the tools needed to diagnose it. Here is a list of "crisis tools" I recommend installing on your Linux servers by default (if they aren't already), along with the (Ubuntu) package names that they come from:

Package | Provides | Notes
procps | ps(1), vmstat(8), uptime(1), top(1) | basic stats
util-linux | dmesg(1), lsblk(1), lscpu(1) | system log, device info
sysstat | iostat(1), mpstat

Servers 145
District heating: Using data centers to heat communities

All Things Distributed

An inside look at the Tallaght District Heating Scheme, where Heat Works is using recycled heat from an AWS data center to warm a community in Dublin, Ireland.

AWS 134
Getting Started With NCache Java Edition (Using Docker)


NCache Java Edition is a distributed caching tool that helps Java applications run faster, handle more users, and be more reliable. In today's world, where people expect apps to work quickly and without any problems, knowing how to use NCache Java Edition is very important. It's a key piece of technology for both developers and businesses who want to make sure their apps can give users fast access to data and a smooth experience.

Java 319

More Trending

Supporting Diverse ML Systems at Netflix

The Netflix TechBlog

David J. Berg, Romain Cledat, Kayla Seeley, Shashank Srikanth, Chaoying Wang, Darin Yu. Netflix uses data science and machine learning across all facets of the company, powering a wide range of business applications from our internal infrastructure and content demand modeling to media understanding. The Machine Learning Platform (MLP) team at Netflix provides an entire ecosystem of tools around Metaflow, an open source machine learning infrastructure framework we started, to empower data sc

Systems 233
Redis vs Memcached in 2024


Choosing between Redis and Memcached hinges on specific application requirements. In this comparison of Redis vs Memcached, we strip away the complexity, focusing on each in-memory data store’s performance, scalability, and unique features. Discover which aligns better with your project’s needs without getting bogged down in technical jargon. Key Takeaways: Redis offers complex data structures and additional features for versatile data handling, while Memcached excels in simplicity with a f

Cache 130
The Return of the Frame Pointers

Brendan Gregg

Sometimes debuggers and profilers are obviously broken; sometimes it's subtle and hard to spot. From my flame graphs page: CPU flame graph (partly broken). This is pretty common and usually goes unnoticed as the flame graph looks ok at first glance. But there are 15% of samples on the left, above "[unknown]", that are in the wrong place and missing frames.

Java 144
Uber Builds Scalable Chat Using Microservices with GraphQL Subscriptions and Kafka


Uber replaced a legacy architecture built using the WAMP protocol with a new solution that takes advantage of GraphQL subscriptions. The main drivers for creating a new architecture were challenges around reliability, scalability, and observability/debuggability, as well as technical debt impeding the team’s ability to maintain the existing solution.

Essential Techniques for Performance Tuning in Snowflake


Performance tuning in Snowflake is the process of optimizing configuration and SQL queries to improve the efficiency and speed of data operations. It involves adjusting various settings and writing queries to reduce execution time and resource consumption, ultimately leading to cost savings and enhanced user satisfaction.

Tuning 301
Google Cloud Next 2024: AI innovation for Google Cloud


In today’s rapidly evolving landscape, incorporating AI innovation into business strategies is vital, enabling organizations to optimize operations, enhance decision-making processes, and stay competitive. The annual Google Cloud Next conference explores the latest innovations for cloud technology and Google Cloud. This year, Google’s event will take place from April 9 to 11 in Las Vegas.

Google 263
Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data Platform, by Binbing Hou, Stephanie Vezich Tamayo, Xiao Chen, Liang Tian, Troy Ristow, Haoyuan Wang, Snehal Chennuru, Pawan Dixit. This is the first in a series on our work at Netflix on leveraging data insights and Machine Learning (ML) to improve the operational automation around the performance and cost efficiency of big data jobs.

Tuning 215
Plan Your Multi Cloud Strategy


Thinking about going multi-cloud? A well-planned multi-cloud strategy can seriously upgrade your business’s tech game, making you more agile. Instead of being stuck with one cloud provider (and all its limitations), spreading out across several can make your operations more flexible, keep your services up and running smoothly, and even help you manage your costs better.

Strategy 130
eBPF Documentary

Brendan Gregg

eBPF is a crazy technology – like putting JavaScript into the Linux kernel – and getting it accepted had so far been an untold story of strategy and ingenuity. The eBPF documentary, published late last year, tells this story by interviewing key players from 2014 including myself, and touches on new developments including Windows. (If you are new to eBPF, it is the name of a kernel execution engine that runs a variety of new programs in a performant and safe sandbox in the kernel, lik

ChatGPT, Author of The Quixote


TL;DR LLMs and other GenAI models can reproduce significant chunks of training data. Specific prompts seem to “unlock” training data. We have many current and future copyright challenges: training may not infringe copyright, but legal doesn’t mean legitimate—we consider the analogy of MegaFace where surveillance models have been trained on photos of minors, for example, without informed consent.

Machine Learning: A Revolutionizing Force in Cybersecurity


The cybersecurity landscape necessitates continual adaptation and exploration of novel defensive strategies to counter the evolving threats posed by malicious actors. Machine learning (ML) has emerged as a powerful tool for bolstering cybersecurity, offering innovative approaches to anomaly detection, intrusion prevention, and threat identification.

Dynatrace OTel Collector distribution amplifies OpenTelemetry integration for scalable, production-ready observability


OpenTelemetry standardizes how organizations instrument, generate, and collect telemetry data for analysis and provides community-based support. Because of its flexibility, this open source approach to instrumenting and collecting telemetry data is becoming increasingly important in large organizations. But rigorous requirements for security, production readiness, scalability, and reliability can make OpenTelemetry challenging for teams to adopt and maintain at enterprise scale.

How Can I Take a Backup of Configuration Files in PostgreSQL?


PostgreSQL configuration file parameters are very important when managing a PostgreSQL database, and this blog post will discuss the importance of backing those files up. The following are the primary configuration files of the PostgreSQL database: postgresql.conf: One of the most important configuration files for the PostgreSQL database is the postgresql.conf file.

Database 101
Mastering Hybrid Cloud Strategy


Are you looking to leverage the best of the private and public cloud worlds to propel your business forward? A hybrid cloud strategy could be your answer. This approach allows companies to combine the security and control of private clouds with public clouds’ scalability and innovation potential. This article will explore hybrid cloud benefits and steps to craft a plan that aligns with your unique business challenges.

Strategy 130
What is System Testing? – Getting Started, Tips, and Tools


System testing involves analyzing the behavior and functionality of a fully integrated application. It is the third of the four levels of testing, performed after unit and integration testing but before user acceptance testing. A QA team member will usually do the assessing, or occasionally the task will fall to other team members such as product or project managers.

Systems 90
Expedia Speeds up Flights Search with Micro Frontends and GraphQL Optimizations


Expedia made flight search faster by up to 52% (page usable time) by applying a range of optimizations to web and mobile applications. To support these improvements, the company improved the observability of its applications. The Expedia Flights web application was migrated to a Micro Frontend Architecture (MFA) to allow flexibility, reusability, and better optimization.

Speed 85
Organizing Knowledge With Knowledge Graphs: Industry Trends


Knowledge graphs are a giant web of information where elements and ideas are linked to show how they are related in the real world. This goes beyond databases that just store information: knowledge graphs also store the connections between pieces of information. This makes knowledge graphs very useful in various fields.

Database 290
Business Flow: Why IT operations teams should monitor business processes


The business process observability challenge: Increasingly dynamic business conditions demand business agility; reacting to a supply chain disruption and optimizing order fulfillment are simple but illustrative examples. Business agility requires real-time visibility into process health and performance, measured by business Key Performance Indicators (KPIs) that are shared between business stakeholders and the supporting IT operations teams.

Managing Time Series Data Using TimeScaleDB-Powered PostgreSQL


PostgreSQL extensions are great! Simply by adding an extension, one transforms what is an otherwise vanilla general-purpose database management system into one capable of processing data requirements in a highly optimized fashion. Some extensions, like pg_repack, simplify and enhance already existing features, while other extensions, such as PostGIS and pgvector, add completely new capabilities.

Database 101
What’s New at ScaleGrid – March 2024


ScaleGrid is thrilled to announce the latest updates across our platform, reflecting our commitment to performance, security, and usability. Our recent updates span several versions, introducing key improvements and bug fixes to ensure our clients’ databases run smoother, faster, and more securely. Updates Across the Board: Improved Database Resilience and Security. Our most recent updates have focused on improving database resilience and security across various platforms.

AWS 130
Hello INP! Here's everything you need to know about the newest Core Web Vital

Speed Curve

After years of development and testing, Google has added Interaction to Next Paint (INP) to its trifecta of Core Web Vitals – the performance metrics that are a key ingredient in its search ranking algorithm. INP replaces First Input Delay (FID) as the Vitals responsiveness metric. Not sure what INP means or why it matters? No worries – that's what this post is for. :) What is INP?

Mobile 81
Setting Up Your Environment for Kubernetes Operators Using Docker, kubectl, and k3d

Percona Community

If you are just starting out in the world of Kubernetes operators, like me, preparing the environment for their installation should not be difficult. This blog will quickly guide you in setting up a minimal environment. Kubernetes operators are invaluable for automating complex database operations, tasks that Kubernetes does not handle directly.

Effective Communication Strategies Between Microservices: Techniques and Real-World Examples


Building scalable systems using microservices architecture is a strategic approach to developing complex applications. Microservices allow teams to deploy and scale parts of their application independently, improving agility and reducing the complexity of updates and scaling. This step-by-step guide outlines the process of creating a microservices-based system, complete with detailed examples. 1.

Strategy 291
Reinventing our Dynatrace Core Values


Values are the fabric of who we are, what we stand for as a company, and how we deliver what we promise to ourselves and our stakeholders. Studies show that purpose matters now more than ever to individuals, and core values serve as the key to creating a thriving business environment. Since taking on the role of Chief People Officer at Dynatrace in the summer of 2022, my key focus has been developing and continually evolving an outstanding, globally cohesive employee experience to support and str

Help Us Improve MySQL Usability and Double Win!


What makes a great user experience? There are probably as many answers to this question as there are users because we are talking about very subjective and personal feelings and observations.

Hashnode Creates Scalable Feed Architecture on AWS with Step Functions, EventBridge and Redis


Hashnode created a scalable event-driven architecture (EDA) for composing feed data for thousands of users. The company used serverless services on AWS, including Lambda, Step Functions, EventBridge, and Redis Cache. The solution leverages Step Functions' distributed maps feature that enables high-concurrency processing.

Navigate your way to better performance with prerendering and the bfcache

Speed Curve

I was inspired by Tim Vereecke's excellent talk on noise-cancelling RUM at PerfNow this past November. In this talk, he highlighted a lot of the 'noise' that comes along with capturing RUM data. Tim's approach was to filter out the noise introduced by really fast response times that can be caused by leveraging the browser cache, prerendering, and other performance optimization techniques.

DataCentral: Uber’s Big Data Observability and Chargeback Platform

Uber Engineering

Discover real-time query analytics and governance with DataCentral: Uber’s big data observability powerhouse, tackling millions of queries in petabyte-scale environments.

Time Data Series: Working With PHP Zmanim


This post continues my exploration of concepts and techniques related to both the way so-called “Jewish times” (zmanim) are calculated and the use of the PHP Zmanim library, a library of functions that let you easily calculate Jewish times. Once again I owe a huge debt of gratitude to several folks, including Eliyahu Hershfeld, creator of the Kosher Java library, Zachary Weixelbaum (owner of the PHP Zmanim library, a port of Kosher Java), Elyahu Jacobi (who built Roy

Java 285
Overseeing SaaS security with AWS AppFabric and Dynatrace


Modern enterprises use a myriad of Software-as-a-Service (SaaS) applications and productivity suites to run business operations, such as Microsoft 365, Google Workspace, Salesforce, Slack, Zendesk, Zoom, GitHub, and many more. Overseeing SaaS security and monitoring audit logs across multiple SaaS applications is complex and often involves building and maintaining dedicated integrations for each application that can retrieve audit logs.

AWS 246