Sat. Aug 17, 2013 - Fri. Aug 23, 2013

In-Stream Big Data Processing

Highly Scalable

The shortcomings of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. It became clear that real-time query processing and in-stream processing are immediate needs in many practical applications. In recent years, this idea has gained a lot of traction, and a whole range of solutions such as Twitter’s Storm, Yahoo’s S4, Cloudera’s Impala, Apache Spark, and Apache Tez have appeared and joined the army of Big Data and NoSQL systems.

Big Data 154

Back-to-the-Future Weekend Reading - Distributed GraphLab: A Framework for Machine Learning and Data Mining in the Cloud

All Things Distributed

The intense travels around the world in the spring have kept me from keeping up on the historical reading that I would like to do; as such, there have not been that many suggestions for the back-to-basics reading list. The fall is not going to be much different, but I will make an effort to get back into a reading habit. I want to kick off the fall readings not with a historical paper but with two that detail GraphLab, an excellent framework for high performance machine learning that originall

Cloud 83