Data Mining Problems in Retail

Highly Scalable

Retail is one of the most important business domains for data science and data mining applications because of its prolific data and numerous optimization problems such as optimal prices, discounts, recommendations, and stock levels that can be solved using data analysis methods.

Taking Let's Encrypt for a Spin

Tim Kadlec

A lot of folks have been very vocally pushing for “HTTPS Everywhere”, and for good reason. The fact that the lack of HTTPS makes you miss out on shiny new things like HTTP/2 and Service Workers adds even more incentive for those a little less inspired by the security arguments.

Mark+Steve, Performance+Design

Speed Curve

I'm excited to announce that I've joined SpeedCurve! When SpeedCurve was just a twinkle in Mark's eye, he contacted me about the concept and I encouraged him that a commercial version of WebPageTest was needed. When I saw the early versions of SpeedCurve, I was blown away.

Interpreter, Compiler, JIT

Nick Desaulniers

Interpreters and compilers are interesting programs, themselves used to run or translate other programs, respectively. Those other programs that might be interpreted might be languages like JavaScript, Ruby, Python, PHP, and Perl.

London Calling! An AWS Region is coming to the UK!

All Things Distributed

Yesterday, AWS evangelist Jeff Barr wrote that AWS will be opening a region in South Korea in early 2016 that will be our 5th region in Asia Pacific.

Progressive Web Apps: Escaping Tabs Without Losing Our Soul

Alex Russell

It happens on the web from time to time that powerful technologies come to exist without the benefit of marketing departments or slick packaging. They linger and grow at the peripheries, becoming old-hat to a tiny group while remaining nearly invisible to everyone else. Until someone names them.

Agile Software Development

Professor Beekums

It’d be hard to be a software developer these days without hearing about “being agile”. Agile is a popular software development process. It is intentionally loosely defined, though that naturally leads to many, many different opinions about what it is.

More Trending

Scaling Redis and Memcached at Wayfair

Wayfair Tech

I wrote a post last year on consistent hashing for Redis and Memcached with ketama: [link] We've evolved our system a lot since then, and I gave a talk about the latest developments at Facebook's excellent Data@Scale Boston conference in November: [link] We have some updates to both design and.

Service-Oriented Architecture: Scaling the Uber Engineering Codebase As We Grow

Uber Engineering

Like many startups, Uber began its journey with a monolithic architecture, built for a single offering in a single city. At the time, all of Uber was our UberBLACK option and our “world” was San Francisco.

Corporate Middle Management as an Autopoietic System

The Agile Manager

[T]he aim of such systems is ultimately to produce themselves: their own organization and identity is their most important product. -- Gareth Morgan, Images of Organization, p. In the early 1970s, biologists Humberto Maturana and Francisco Varela coined the term autopoiesis to define the self-maintaining nature of living cells: biological cells produce the components that maintain the structure that creates more components (in this case, more cells).

Posts from Dr. Dobb’s Journal

Allen Holub

I wrote for DDJ (may it rest in peace) for many, many years. Towards the end, I wrote a blog on agile-related topics. I haven’t gotten around to moving the actual articles over here, but here are links to them in the DDJ archives: Agile Certifications Are Actively Destructive; Endless Flexibility, The Enemy of Agile; The Anti… Agility


Holiday Web Reading

Tim Kadlec

I enjoy reading and one of the rules of all well-behaved reading enthusiasts—much like vegans, cross fitters and people who eat gluten free—is to never stop telling everyone we know (and even some people we don’t know) about it.

Critical Blocking Resources

Speed Curve

At SpeedCurve, we focus on metrics that capture the user experience. A big part of the user experience is when content actually appears in front of the user.

Hidden in Plain Sight - Public Key Crypto

Nick Desaulniers

How is it possible for us to communicate securely when there’s the possibility of a third party eavesdropping on us? How can we communicate private secrets through public channels?

Under the Hood of Amazon EC2 Container Service

All Things Distributed

In my last post about Amazon EC2 Container Service (Amazon ECS), I discussed the two key components of running modern distributed applications on a cluster: reliable state management and flexible scheduling.

Doing Science On The Web

Alex Russell

Cross-posted at Medium. This post is about vendor prefixes, why they didn’t work, and why it’s toxic not to be able to launch experimental features. Also, what to do about it. Vendor prefixes are a very sore topic, and one where I’ve disagreed with the overwhelming consensus.

Advice For Becoming a Front End Developer

Professor Beekums

Someone recently asked me for advice for switching careers to be a front end developer. I knew very little about the person other than their college degree was unrelated to the field, they were trying out Free Code Camp, and that they wanted to be a front end developer.

EveryStep Scripting Tool: Advanced Features


The EveryStep Scripting Tool by Dotcom-Monitor is a powerful macro that records scripts to perform automated monitoring of your websites' performance.

Tungsten in the news

Wayfair Tech

There's a great interview with our own Matt DeGennaro by Paul Krill of Infoworld that came out a few days ago. The topic is Tungsten.js, our awesome framework that 'lights up' the DOM with fast, virtual-DOM-based updates, React-style, and can be integrated with Backbone.js and pretty much whatever other framework.

Faster Mobile Websites - Slides

Dean Hume

Earlier this year I was lucky enough to get the chance to present at UpFront Conference in Manchester. This was the inaugural year for the conference, and it was great to be a part of this event. A few people have asked about the slide deck and wanted to know more.


The Agile Manager

Earlier this year, my house should have burned to the ground. A CR2032 battery exploded and caught fire in a confined place dense with flammable objects. But my house didn't burn down: at the moment the battery exploded, I was sitting a few feet away from it. I heard a loud bang, investigated, and stamped out the fire within a few seconds. I wasn't planning to be there at the time.

Visual diffs on every deploy

Speed Curve

SpeedCurve now provides a visual diff of every deploy. A full resolution PNG is captured for each URL and each pixel is diffed with the previous deploy allowing you to easily spot any visual changes you may or may not have expected.

Additional C/C++ Tooling

Nick Desaulniers

21st Century C by Ben Klemens was a great read. It had a section with an intro to autotools, git, and gdb. There are a few other useful tools that came to mind that I’ve used when working with C and C++ codebases. These tools are a great way to start contributing to Open Source C & C++ codebases: running these tools on the code or adding them to the codebases. A lot of these favor command line, open source utilities. See how many you are familiar with! Build Tools. CMake.

AMP and Incentives

Tim Kadlec

Incentives are fascinating. Dangle the right carrot in front of people and you can subtly influence their behavior. But it has to be the right carrot. It has to matter to the people you’re trying to influence. Just as importantly, it has to influence the correct changes. A few years ago there was a story of incentives gone wrong that was making the rounds. The story was about a fast food chain that determined customer service was an important metric that they needed to track in some way.

Understanding Proxy Browsers: Architecture

Tim Kadlec

I did a bunch of research on proxy-browsers for a few projects I worked on. Rather than sitting on it all, I figured I’d write a series of posts sharing what I learned in case it’s helpful to anyone else. This first post looks at the general architecture of proxy browsers with a performance focus. In the original story of the Wizard of Oz, the Emerald City isn’t actually green nor made entirely of emeralds. All of that came later.

Embrace event-driven computing: Amazon expands DynamoDB with streams, cross-region replication, and database triggers

All Things Distributed

In just three short years, Amazon DynamoDB has emerged as the backbone for many powerful Internet applications such as AdRoll, Druva, DeviceScape, and Battlecamp. Many happy developers are using DynamoDB to handle trillions of requests every day.

Titan Graph Database Integration with DynamoDB: World-class Performance, Availability, and Scale for New Workloads

All Things Distributed

Today, we are releasing a plugin that allows customers to use the Titan graph engine with Amazon DynamoDB as the backend storage layer. It opens up the possibility to enjoy the value that graph databases bring to relationship-centric use cases, without worrying about managing the underlying storage.

User Timing and Custom Metrics

Speed Curve

If you want to improve performance, you must start by measuring performance. But what should you measure? Across the performance industry, the metric that's used the most is "page load time" (i.e., "window.onload" or "document complete"). Page load time was pretty good at approximating the user experience in the days of Web 1.0 when pages were simpler and each user action loaded a new web page (multi-page websites). In the days of Web 2.0

Amazon announces the Alexa Skills Kit, Enabling Developers to Create New Voice Capabilities

All Things Distributed

Today, Amazon announced the Alexa Skills Kit (ASK), a collection of self-service APIs and tools that make it fast and easy for developers to create new voice-driven capabilities for Alexa.

Thriving in Unpredictability

Tim Kadlec

Getting a website successfully delivered to a visitor depends on a series of actions. My server must spit something out. That something must be passed over some network. That something must then be consumed by another something: some client (often a browser) on some device. Finally, the visitor views that something in whatever context they happen to be in. There are a lot of unpredictable layers here. I have no control over the network.

Expanding the Cloud: Introducing Amazon QuickSight

All Things Distributed

We live in a world where massive volumes of data are generated from websites, connected devices and mobile apps.

Access Optional

Tim Kadlec

I remember going as a kid with my parents when they would pick out a new car. My parents didn’t want to spend a ton so we usually looked for something basic that would work. The car, of course, had to have certain features. A way to steer. Brakes. An engine. Doors. These were things all cars had and all cars had to have if anyone was going to ever consider purchasing them. From there you decided on the bells and whistles. Did you want power windows and power locks?

Joining Akamai

Tim Kadlec

On May 11th, I’ll be joining Akamai. I would be lying if I said it was an easy decision. I waffled a lot (For the sports enthusiasts out there, it’s not entirely unlike Favre and retirement. For the rest of you, insert some clever Waffle House pun here.). The past few years of working for myself have been amazing! I’ve gotten to work on some great projects with some great people and have had a ton of fun doing it.

Back-to-Basics Weekend Reading - Machine Learning

All Things Distributed

Machine learning is a scientific discipline that explores the construction and study of algorithms that can learn from data. Such algorithms operate by building a model from example inputs and using that to make predictions or decisions, rather than following strictly static program instructions.

Back-to-Basics Weekend Reading - Survey of Local Algorithms

All Things Distributed

As we know, the run time of most algorithms increases when the input set increases in size. There is one noticeable exception: there is a class of distributed algorithms, dubbed local algorithms, that run in constant time, independently of the size of the network.

Observations on the Importance of Cloud-based Analytics

All Things Distributed

Cloud computing is enabling amazing new innovations both in consumer and enterprise products, as it became the new normal for organizations of all sizes. So many exciting new areas are being empowered by cloud that it is fascinating to watch.

Back-to-Basics Weekend Reading - Distributed Snapshots: Determining Global States of a Distributed System

All Things Distributed

Several problems in Distributed Systems can be seen as the challenge to determine a global state.