Here’s our roundup of articles relevant to Big Data Analytics published last week. Hope you find it useful – please don’t forget to subscribe!
Realtime Event Processing in Hadoop with NiFi, Kafka and Storm – Three Part Tutorial Series (Hortonworks)
Shows how geolocation information from trucks can be combined with sensor data from trucks and roads. Events generated by the sensors are ingested and routed by Apache NiFi, then captured by Apache Kafka, a distributed publish-subscribe messaging system. Apache Storm processes the data from Kafka and eventually persists it to HDFS and HBase.
Apache Spark: Analyzing Fantasy Sports (Part 2: Data Exploration) (Cloudera Blog)
Learn how analyzing stats from professional sports leagues is an instructive use case for data analytics using Apache Spark with SQL. Covers data exploration with Apache Impala (incubating) and Hue.
Machine Learning & Data Science Skills You Need To Get Hired In Fortune 500 Companies (Big Data Analytics Guide)
Must-have and nice-to-have skills that Fortune 500 companies look for when hiring engineers to work on solutions requiring expertise in Machine Learning, Data Science, Big Data, etc.
Distributed, Real-time Joins and Aggregations on User Activity Events using Kafka Streams (Confluent)
Shows how to enrich an incoming stream of events with side data, and then compute aggregations based on the enriched stream. Built on top of an end-to-end Hello World streaming application that analyzes Wikipedia real-time updates through a combination of Kafka Streams and Kafka Connect.
Apache Kafka on Heroku (Heroku Dev Center)
Apache Kafka on Heroku is an add-on that provides Kafka as a service with full integration into the Heroku platform.
Programming with R: Best Practices (Software Carpentry)
Define best formatting practices when writing code in R scripts; synthesize a consistent personal coding style to increase code readability, consistency, and repeatability; apply this style to one’s own code.
Machine Learning Advice for Developers (The Practical Developer)
Crash Course in Convolutional Neural Networks for Machine Learning (Machine Learning Mastery)
Convolutional Neural Networks are a powerful artificial neural network technique. These networks preserve the spatial structure of the problem and were developed for object recognition tasks such as handwritten digit recognition.
Picking SQL or NoSQL? (Compose)
It’s a question that gets asked because often underlying it is another question – What’s broken in SQL databases that NoSQL databases fix?
The Five Best Libraries For Building Data Visualizations (Fast Company)
D3, Vega, Processing, Gephi, Dygraphs.