Using Kudu with Apache Spark and Apache Flume [Online Code]
By O'Reilly Media
Number of Videos: 0.5 hours - 6 lessons
Author: Ryan Bosshart
User Level: Intermediate
Apache Kudu, the breakthrough storage technology, is often used in conjunction with other Hadoop ecosystem frameworks for data ingest, processing, and analysis. This is a practical, hands-on course that shows you how Kudu works with four of those frameworks: Apache Spark, Spark SQL, MLlib, and Apache Flume.
You'll use the Kudu-Spark module with Spark and Spark SQL to create, move, and update data between Kudu and Spark, then use Apache Flume to stream events into a Kudu table, and finally query that table using Apache Impala. The course is designed for learners with some experience using Hadoop ecosystem components such as HDFS, Hive, Spark, or Impala.
- Get hands-on experience with Kudu and add more tools to your Big Data toolbox
- Learn how to move data between Kudu tables and Spark apps using the Kudu-Spark module
- Understand how to stream and analyze data in real-time with Flume and Kudu
- Create a movie ratings predictor using MLlib and save the predicted values into Kudu
- See how these open source tools combine to create simple and fast data engineering pipelines
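To give a flavor of what moving data between Kudu tables and Spark apps can look like, here is a minimal PySpark sketch. It is not taken from the course materials. It assumes a SparkSession launched with the kudu-spark package on its classpath, and the helper names (`kudu_options`, `read_kudu_table`, `append_to_kudu_table`) as well as any master address or table name you pass in are hypothetical placeholders.

```python
# Sketch only: assumes the kudu-spark package is available to the Spark session.
# Helper names are illustrative and do not come from the course.

# Data source name used in the Apache Kudu documentation for Spark integration.
KUDU_FORMAT = "org.apache.kudu.spark.kudu"


def kudu_options(master, table):
    """Build the two options the Kudu data source needs."""
    return {"kudu.master": master, "kudu.table": table}


def read_kudu_table(spark, master, table):
    """Load a Kudu table as a Spark DataFrame."""
    return (
        spark.read.format(KUDU_FORMAT)
        .options(**kudu_options(master, table))
        .load()
    )


def append_to_kudu_table(df, master, table):
    """Write the rows of a DataFrame to an existing Kudu table in append mode."""
    (
        df.write.format(KUDU_FORMAT)
        .options(**kudu_options(master, table))
        .mode("append")
        .save()
    )
```

With a live cluster, a call such as `read_kudu_table(spark, "kudu-master.example.com:7051", "ratings")` would return a DataFrame that can be registered as a temporary view and queried with Spark SQL; both the host and the table name here are made up.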
Features & Highlights
- Learn how to use Kudu with Apache Spark and Apache Flume from a professional trainer, on your own time and at your own desk.
- This visual training method offers users increased retention and accelerated learning.
- Breaks even the most complex applications down into simple steps.
- Comes with extensive working files.