article thumbnail

Write Optimized Spark Code for Big Data Applications

DZone

Apache Spark is a powerful open-source distributed computing framework that provides a variety of APIs to support big data processing. In addition, pySpark applications can be tuned to optimize performance and achieve better execution time, scalability, and resource utilization.

Big Data 161
article thumbnail

3 Performance Tricks for Dealing With Big Data Sets

DZone

This article describes 3 different tricks that I used in dealing with big data sets (order of 10 million records) and that proved to enhance performance dramatically. Trick 1: CLOB Instead of Result Set.

Big Data 246
Insiders

Sign Up for our Newsletter

This site is protected by reCAPTCHA and the Google Privacy Policy and Terms of Service apply.

Trending Sources

article thumbnail

Turbocharge Your Apache Spark Jobs for Unmatched Performance

DZone

Apache Spark is a leading platform in the field of big data processing, known for its speed, versatility, and ease of use. However, getting the most out of Spark often involves fine-tuning and optimization. Understanding Apache Spark Apache Spark is a unified computing engine designed for large-scale data processing.

Big Data 130
article thumbnail

Evolving from Rule-based Classifier: Machine Learning Powered Auto Remediation in Netflix Data…

The Netflix TechBlog

Operational automation–including but not limited to, auto diagnosis, auto remediation, auto configuration, auto tuning, auto scaling, auto debugging, and auto testing–is key to the success of modern data platforms. We have also noted a great potential for further improvement by model tuning (see the section of Rollout in Production).

Tuning 210
article thumbnail

What is IT automation?

Dynatrace

Expect to spend time fine-tuning automation scripts as you find the right balance between automated and manual processing. This kind of automation can support key IT operations, such as infrastructure, digital processes, business processes, and big-data automation. Big data automation tools.

article thumbnail

Auto-Diagnosis and Remediation in Netflix Data Platform

The Netflix TechBlog

This blog will explore these two systems and how they perform auto-diagnosis and remediation across our Big Data Platform and Real-time infrastructure. One example where it can dramatically help is Spark jobs, where memory tuning is a significant challenge. Expand Pensive with Machine Learning classifiers.

Big Data 238
article thumbnail

Migrating Critical Traffic At Scale with No Downtime?—?Part 1

The Netflix TechBlog

Migrating Critical Traffic At Scale with No Downtime — Part 1 Shyam Gala , Javier Fernandez-Ivern , Anup Rokkam Pratap , Devang Shah Hundreds of millions of customers tune into Netflix every day, expecting an uninterrupted and immersive streaming experience. This technique facilitates validation on multiple fronts.

Traffic 339