HDP Developer: Java

Live Online & Classroom Certification Training

Master the core concepts of Hadoop application development. Design and develop MapReduce applications for Hadoop using Hortonworks Data Platform. Become proficient at implementing combiners, partitioners, secondary sorts, and custom input and output formats, joining large datasets, unit testing, and developing UDFs for Pig and Hive.

(4.7) 57 Learners
Instructed by SPRINGPEOPLE
INDIA
  • 04 Mar, 4 Days: Bangalore, 04-Mar to 08-Mar (Sunday - Thursday), Classroom (10:30 PM Start), $1,012.38; Early Bird Offer: $934.44
  • 25 Mar, 4 Days: Bangalore, 25-Mar to 29-Mar (Sunday - Thursday), Classroom (10:30 PM Start), $1,012.38; Early Bird Offer: $934.44

Course Description

Overview

Gain end-to-end knowledge of optimizing MapReduce jobs and learn advanced MapReduce features. Understand HDFS and map aggregation as you learn from our certified instructors.

Learn how to write a custom partitioner and a custom input format, perform a map-side join, import data into HBase, and work with Pig and Hive.

Gain practical knowledge through our cloudlabs on configuring a Hadoop development environment, combining input files, using data compression, writing a Pig UDF, writing a Pig accumulator, and defining an Oozie workflow.

Objective

At the end of HDP Developer: Java training, you will be able to:

  • Understand Hadoop, the Hadoop Distributed File System (HDFS) and MapReduce
  • Practise Common HDFS Commands
  • Work on Open-Source YARN Use Cases
  • Learn Map Aggregation in Depth
  • Write Custom Partitioner
  • Create and Distribute a Partition File
  • Write a Group Comparator
  • List the Built-In Input Formats
  • Handle Records that Span Splits
  • List the Built-In Output Formats
  • Write a Custom Output Format
  • Optimize the Map and Reduce Phases
  • Configure Data Compression
  • Perform Joins in MapReduce
  • Set Up a Test
  • Test a Mapper
  • Test a Reducer
  • Learn the use of the Grunt Shell
  • Perform Queries
  • Write a Hive UDF
Duration:
  • Classroom Training: 4 Days
  • Live Online Training:  24 Hours

 Suggested Audience:

  •  Experienced Java software engineers who need to develop Java MapReduce applications for Hadoop

 

Prerequisites

  • Experience in developing Java applications and using a Java IDE
  • No prior Hadoop knowledge is required.

Course Curriculum

  • Describe Hadoop 2.X and the Hadoop Distributed File System
  • Describe the YARN framework
  • Describe the Purpose of NameNodes and DataNodes
  • Describe the Purpose of HDFS High Availability (HA)
  • Describe the Purpose of the Quorum Journal Manager
  • List Common HDFS Commands
  • Describe the Purpose of YARN
  • List Open-Source YARN Use Cases
  • List the Components of YARN
  • Describe the Life Cycle of a YARN Application
  • Define Map Aggregation
  • Describe the Purpose of Combiners
  • Describe the Purpose of In-Map Aggregation
  • Describe the Purpose of Counters
  • Describe the Purpose of User-Defined Counters
  • Understanding Block Storage
  • Configuring a Hadoop Development Environment
  • Putting Files in HDFS with Java
  • Understanding MapReduce (Lab)
  • Word Count (Lab)
  • Distributed Grep (Lab)
  • Inverted Index (Lab)
  • Using a Combiner (Lab)
  • Computing an Average (Lab)
  • Describe the Purpose of a Partitioner
  • List the Steps for Writing a Custom Partitioner
  • Describe How to Create and Distribute a Partition File
  • Describe the Purpose of Sorting
  • Describe the Purpose of Custom Keys
  • Describe How to Write a Group Comparator
  • List the Built-In Input Formats
  • Describe the Purpose of Input Formats
  • Define a Record Reader
  • Describe How to Handle Records that Span Splits
  • List the Built-In Output Formats
  • Describe How to Write a Custom Output Format
  • Describe the Purpose of the MultipleOutputs Class
  • Writing a Custom Partitioner (Lab)
  • Using TotalOrderPartitioner (Lab)
  • Custom Sorting (Lab)
  • Demonstration: Combining Input Files (Lab)
  • Processing Multiple Inputs (Lab)
  • Writing a Custom Input Format (Lab)
  • Customizing Output (Lab)
  • Working with a Simple Moving Average (Lab)
  • List Optimization Best Practices
  • Describe How to Optimize the Map and Reduce Phases
  • Describe the Benefits of Data Compression
  • Describe the Limits of Data Compression
  • Describe the Configuration of Data Compression
  • Describe the Purpose of a RawComparator
  • Describe the Purpose of Localization
  • List Scenarios for Performing Joins in MapReduce
  • Describe the Purpose of the Bloom Filter
  • Describe the Purpose of MRUnit and the MRUnit API
  • Describe How to Set Up a Test
  • Describe How to Test a Mapper
  • Describe How to Test a Reducer
  • Describe the Purpose of HBase
  • Define the Differences Between a Relational Database and HBase
  • Describe the HBase Architecture
  • Demonstrate the Basics of HBase Programming
  • Describe an HBase MapReduce Application
  • Using Data Compression (Lab)
  • Defining a RawComparator (Lab)
  • Performing a Map-Side Join (Lab)
  • Using a Bloom Filter (Lab)
  • Unit Testing a MapReduce Job (Lab)
  • Importing Data to HBase (Lab)
  • Creating an HBase MapReduce Job (Lab)
  • Describe the Purpose of Apache Pig and Pig Latin
  • Demonstrate the Use of the Grunt Shell
  • List the Common Pig Data Types
  • Describe the Purpose of the FOREACH GENERATE Operator
  • Describe the Purpose of Pig User Defined Functions (UDFs)
  • Describe the Purpose of Filter Functions
  • Describe the Purpose of Accumulator UDFs
  • Describe the Purpose of Algebraic Functions
  • Describe the Purpose of Apache Hive
  • Describe the Differences Between Apache Hive and SQL
  • Describe Apache Hive Architecture
  • Describe How to Load Data Into Hive
  • Demonstrate How to Perform Queries
  • Describe the Purpose of Hive User Defined Functions (UDFs)
  • Write a Hive UDF
  • Describe the Purpose of HCatalog
  • Describe the Purpose of Apache Oozie
  • Describe How to Define an Oozie Workflow
  • Describe Pig and Hive Actions
  • Describe How to Define an Oozie Coordinator Job
  • Understanding Pig (Lab)
  • Writing a Pig UDF (Lab)
  • Writing a Pig Accumulator (Lab)
  • Writing an Apache Hive UDF (Lab)
  • Defining an Oozie Workflow (Lab)
  • Working with TF-IDF and the JobControl Class (Lab)

Certification

SpringPeople works with top industry experts to identify the leading certification bodies for different technologies: those that are well respected in the industry and globally accepted as clear evidence of a professional’s “proven” expertise in the technology. As such, these certifications are a high value-add to CVs and can give a massive boost to professionals' career growth.

Our certification courses are fully aligned to these high-profile certification exams; at the end of the course, participants will have detailed knowledge of the subject and will be eligible for, and fully ready to take, these certification exams and pass with flying colours.

 

SpringPeople is the official training partner of Hortonworks. More details of this official training can be found here

Resources

SpringPeople Corporate Learning Center

About the Instructor

Founded in 2009, SpringPeople is a global premier eLearning marketplace for Online Live, Instructor-led classes in the region. It is a certified training delivery partner of leading technology creators, namely Pivotal, Elastic, Lightbend, EMC, VMware, MuleSoft, RSA, and...


Course Rating and Reviews

Average Rating: 4.7

5 Stars: 28
4 Stars: 12
3 Stars: 1
2 Stars: 0
1 Star: 0

SPRINGPEOPLE SpringPeople Trainer

Ashok Reddy

It would be good if we got real-time scenarios for automation.

SPRINGPEOPLE SpringPeople Trainer

Goutham

Maybe you could set Scala as a prerequisite for this course and discuss more technical details of Spark, like what happens under the hood.

SPRINGPEOPLE SpringPeople Trainer

Vamshi Suram

Software Engineer 2
Intuit
Content is good. Could have added additional links to good resources to proceed further.

This class is intended for participants who have some prior exposure to the technology and are now looking to build up their expertise on the topic.

On successful completion of the course, participants will be eligible to sit for the related certification exam (see course overview). All participants receive a course completion certificate, demonstrating their expertise on the subject.

Total duration of the online, live, instructor-led sessions. Sessions are typically delivered as short lectures (2-hrs weekdays/3-hrs weekends) with detailed hands-on guidance.

Expected offline lab work hours that participants will need to complete and submit to the trainer, during and after the instructor-led online sessions.

  1. We are happy to refund the full fee paid - no questions asked - should you feel that the training is not up to your expectations.
  2. Our dedicated team of expert training enablement advisors is available by email, phone and chat to assist you with your queries.
  3. All courseware, including session recordings, will always be available for you to access for future reference and rework.

Contact Us

+91-80-6567-9700 (BLR)

training@springpeople.com
