Big Data Analytics with Hadoop and Spark online training
An online course, designed by subject matter experts, that teaches you how to handle big data and perform analytics using Hadoop and Spark.
This course demonstrates the limitations of traditional database systems and shows how Hadoop and Spark overcome them to unlock greater storage and processing capabilities.
Concepts and Tools covered
The Apache Hadoop and Cloudera Hadoop distributions will be used. Topics covered include HDFS architecture, YARN, MapReduce, Pig, Hive, Hue, Impala, Scala, Spark RDDs, Spark Streaming and Spark MLlib.
The course content is handcrafted by experts and comprises presentation slides, plus quizzes and assignments for each module. Class recordings can be accessed in the LMS. We use many industry-specific use cases to make the learner job-ready.
This course is recommended for professionals with a data management or engineering background, and for professionals who work on BI, reporting, data warehousing and ETL.
Talk with us for customized content.
Note: We also conduct Big Data Analytics with Hadoop and Spark classroom Training in Bangalore.
Introduction to Big Data and Big Data Technologies
Here we discuss Big Data and its hidden treasures.
Understanding Hadoop and its components and why Hadoop
Here you will learn the three layers of Hadoop and how it differs from other Big Data technologies.
Here you will learn the Hadoop 2.x cluster architecture and the different cluster modes, and we will also look at the configuration files.
MapReduce Classic and YARN
Here you will learn the difference between the components of Classic MapReduce and YARN, and how MapReduce works through its stages and features: Mappers, Combiners, Partitioners, Counters, map-side joins, reduce-side joins and Reducers.
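As a rough illustration of how the stages named above fit together, here is a minimal word-count sketch in plain Python. It does not use Hadoop or its API: the function names (mapper, combiner, partitioner, reducer, run_job) and the sample input are hypothetical, chosen only to mirror the stage names, and Counters and joins are left out.

```python
from collections import defaultdict
import zlib


def mapper(line):
    """Emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def combiner(pairs):
    """Pre-aggregate one map task's output locally to reduce shuffle traffic."""
    local = defaultdict(int)
    for key, value in pairs:
        local[key] += value
    return list(local.items())


def partitioner(key, num_reducers):
    """Choose the reducer for a key; crc32 keeps the choice stable across runs."""
    return zlib.crc32(key.encode("utf-8")) % num_reducers


def reducer(key, values):
    """Sum all partial counts that arrived for one key."""
    return key, sum(values)


def run_job(splits, num_reducers=2):
    # One bucket per reducer; each bucket groups values by key (the shuffle).
    buckets = [defaultdict(list) for _ in range(num_reducers)]
    for split in splits:  # each split stands in for one map task
        mapped = [pair for line in split for pair in mapper(line)]
        for key, value in combiner(mapped):
            buckets[partitioner(key, num_reducers)][key].append(value)
    result = {}
    for bucket in buckets:  # each bucket stands in for one reduce task
        for key in sorted(bucket):  # keys are handed to the reducer in sorted order
            word, total = reducer(key, bucket[key])
            result[word] = total
    return result


# Hypothetical input: two splits, each a list of text lines.
splits = [["big data big ideas"], ["big clusters", "data data"]]
counts = run_job(splits)
print(counts)
```

The combiner here is optional in the same sense as in MapReduce: removing it changes how many pairs are shuffled, not the final counts, because addition is associative and commutative.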
Analytics in Pig and Hue
Here you will get a good understanding of Pig Latin’s constructs. You will learn to perform common data operations with different data models. We will also look at complex data flows and the practical matters of developing and testing your scripts.
Analytics in Hive and Hue
Here you will learn how to use Hive as a data warehousing platform on Hadoop data. People who know SQL can learn Hive easily. This section provides a comprehensive, example-driven introduction to HiveQL for all users, from developers, database administrators and architects to less technical users such as business analysts.
HBase, Sqoop and Flume
Here you will see data logistics, that is, ways to manage moving large quantities of data into and out of Hadoop. You will also see the limitations of the MapReduce stack and the problems with RDBMSs, the advantages of NoSQL databases, and how to choose among the huge pool of NoSQL databases based on their data models. We cover the HBase architecture and its distributed components, and work through a large number of examples, from CRUD operations to advanced programming using Filters.
Introduction to Spark and Scala
This module helps you understand what Spark is, why it is written in Scala, and how Spark is becoming the most popular framework for Big Data.
Programming in Scala
Learn to code in Scala: creating variables, classes and objects, private variables, and collection data types such as Arrays, Lists, Tuples and Maps (dictionaries).
Spark SQL and RDDs
Learn about Resilient Distributed Datasets (RDDs) and learn to query data using Spark SQL.
Spark Streaming and MLlib
Learn to stream data using Spark Streaming and master the machine learning libraries in Spark MLlib. Demonstrate your skills by working on a proof of concept project at the end of the program.
Proof of concept project
Work on a sample data set to demonstrate the techniques learned throughout the workshop.