Master more than 25 technologies that make up the big data ecosystem with Ultimate Big Data Application Development.
About This Book
Understand the fundamentals of Big Data application development
Covers a suite of over 25 core technologies and shows you how they all fit together
Comprehensive tutorial packed with practical examples to help you develop real-world Big Data applications
Who This Book Is For
Software engineers and programmers who want to understand the larger Hadoop ecosystem and use it to store, analyze, and vend "big data" at scale
Project, program, or product managers who want to understand the lingo and high-level architecture of Hadoop
Data analysts and database administrators who are curious about Hadoop and how it relates to their work
System architects who need to understand the components available in the Hadoop ecosystem and how they fit together
What You Will Learn
Design distributed systems that manage "big data" using Hadoop and related technologies
Use HDFS and MapReduce for storing and analyzing data at scale
Use Pig and Spark to create scripts that process data on a Hadoop cluster in more complex ways
Analyze relational data using Hive and MySQL
Analyze non-relational data using HBase, Cassandra, and MongoDB
Query data interactively with Drill, Phoenix, and Presto
Understand how Hadoop clusters are managed by YARN, Tez, Mesos, ZooKeeper, Zeppelin, Hue, and Oozie
Publish data to your Hadoop cluster using Kafka, Sqoop, and Flume
Consume streaming data using Spark Streaming, Flink, and Storm
In Detail
Learn and master the most popular big data technologies in this comprehensive book, written by a former engineer and senior manager from Amazon and IMDb. Ultimate Big Data Application Development covers over 25 technologies that make up the big data ecosystem. It focuses on Hadoop as the core technology and also dives into surrounding technologies such as Spark, Cassandra, MapReduce, MongoDB, YARN, Kafka, and many more.
Understanding Hadoop is a highly valuable skill for anyone working at a company with large amounts of data. This book goes well beyond Hadoop itself and covers the distributed systems you may need to integrate with it. Many large companies use Hadoop in some way, including Amazon, eBay, Facebook, Google, LinkedIn, IBM, Spotify, Twitter, and Yahoo! And it's not just technology companies that need Hadoop; even the New York Times uses Hadoop for processing images.
Covering over 25 different technologies, this book is filled with hands-on activities and exercises, so you get real experience in processing big data with Hadoop; it's not just theory. You'll walk away from this book with a deep understanding of Hadoop and its associated distributed systems, and you'll be able to apply Hadoop to real-world problems.