HDP Developer: Java Training

Live Online & Classroom Enterprise Training

Master the core concepts of Hadoop application development. Design and develop MapReduce applications for Hadoop using the Hortonworks Data Platform. Learn to implement combiners, partitioners, secondary sorts, and custom input and output formats, and to join large datasets, write unit tests, and develop UDFs for Pig and Hive.

Key Features
  • Lifetime Access

  • CloudLabs

  • 24x7 Support

  • Real-time code analysis and feedback

  • 100% Money Back Guarantee


What is HDP Developer: Java about?

Gain end-to-end knowledge of how to optimize MapReduce jobs and learn advanced MapReduce features. Understand HDFS and map aggregation as you learn with our certified instructors.

Learn how to write a custom partitioner and a custom input format, perform a map-side join, import data to HBase, and work with Pig and Hive.

Gain practical knowledge in our CloudLabs by configuring a Hadoop development environment, combining input files, using data compression, writing a Pig UDF, writing a Pig accumulator, and defining an Oozie workflow.

What are the objectives of HDP Developer: Java?

At the end of HDP Developer: Java training, you will be able to:

  • Understand Hadoop, the Hadoop Distributed File System (HDFS) and MapReduce
  • Practice Common HDFS Commands
  • Work on Open-Source YARN Use Cases
  • Learn Map Aggregation in Depth
  • Write Custom Partitioner
  • Create and Distribute a Partition File
  • Write a Group Comparator
  • List the Built-In Input Formats
  • Handle Records that Span Splits
  • List the Built-In Output Formats
  • Write a Custom Output Format
  • Optimize the Map and Reduce Phases
  • Configure Data Compression
  • Perform Joins in MapReduce
  • Set Up a Test
  • Test a Mapper
  • Test a Reducer
  • Learn the use of the Grunt Shell
  • Perform Queries
  • Write a Hive UDF
Available Training Modes

Live Online Training

Classroom Training

 


Who is HDP Developer: Java for?

  • Anyone who wants to add Hadoop application development skills in Java to their profile
  • Teams getting started on Hadoop development projects using Hortonworks Data Platform

What are the prerequisites for HDP Developer: Java?

  • Experience in developing Java applications and using a Java IDE
  • No prior Hadoop knowledge is required.

Course Outline

    • Understanding Hadoop, HDFS, and MapReduce
      • Describe Hadoop 2.X and the Hadoop Distributed File System
      • Describe the YARN framework
      • Describe the Purpose of NameNodes and DataNodes
      • Describe the Purpose of HDFS High Availability (HA)
      • Describe the Purpose of the Quorum Journal Manager
      • List Common HDFS Commands
      • Describe the Purpose of YARN
      • List Open-Source YARN Use Cases
      • List the Components of YARN
      • Describe the Life Cycle of a YARN Application
      • Define Map Aggregation
      • Describe the Purpose of Combiners
      • Describe the Purpose of In-Map Aggregation
      • Describe the Purpose of Counters
      • Describe the Purpose of User-Defined Counters
      • Understanding Block Storage
      • Configuring a Hadoop Development Environment
      • Putting Files in HDFS with Java
      • Understanding MapReduce (Lab)
      • Word Count (Lab)
      • Distributed Grep (Lab)
      • Inverted Index (Lab)
      • Using a Combiner (Lab)
      • Computing an Average (Lab)
    • Partitioning, Sorting and Input/Output Formats
      • Describe the Purpose of a Partitioner
      • List the Steps for Writing a Custom Partitioner
      • Describe How to Create and Distribute a Partition File
      • Describe the Purpose of Sorting
      • Describe the Purpose of Custom Keys
      • Describe How to Write a Group Comparator
      • List the Built-In Input Formats
      • Describe the Purpose of Input Formats
      • Define a Record Reader
      • Describe How to Handle Records that Span Splits
      • List the Built-In Output Formats
      • Describe How to Write a Custom Output Format
      • Describe the Purpose of the MultipleOutputs Class
      • Writing a Custom Partitioner (Lab)
      • Using TotalOrderPartitioner (Lab)
      • Custom Sorting (Lab)
      • Demonstration: Combining Input Files (Lab)
      • Processing Multiple Inputs (Lab)
      • Writing a Custom Input Format (Lab)
      • Customizing Output (Lab)
      • Working with a Simple Moving Average (Lab)
    • Optimizing MapReduce, Advanced MapReduce and HBase
      • List Optimization Best Practices
      • Describe How to Optimize the Map and Reduce Phases
      • Describe the Benefits of Data Compression
      • Describe the Limits of Data Compression
      • Describe the Configuration of Data Compression
      • Describe the Purpose of a RawComparator
      • Describe the Purpose of Localization
      • List Scenarios for Performing Joins in MapReduce
      • Describe the Purpose of the Bloom Filter
      • Describe the Purpose of MRUnit and the MRUnit API
      • Describe How to Set Up a Test
      • Describe How to Test a Mapper
      • Describe How to Test a Reducer
      • Describe the Purpose of HBase
      • Define the Differences Between a Relational Database and HBase
      • Describe the HBase Architecture
      • Demonstrate the Basics of HBase Programming
      • Describe an HBase MapReduce Application
      • Using Data Compression (Lab)
      • Defining a RawComparator (Lab)
      • Performing a Map-Side Join (Lab)
      • Using a Bloom Filter (Lab)
      • Unit Testing a MapReduce Job (Lab)
      • Importing Data to HBase (Lab)
      • Creating an HBase MapReduce Job (Lab)
    • Pig and Hive Programming, Defining Workflows
      • Describe the Purpose of Apache Pig and Pig Latin
      • Demonstrate the Use of the Grunt Shell
      • List the Common Pig Data Types
      • Describe the Purpose of the FOREACH GENERATE Operator
      • Describe the Purpose of Pig User Defined Functions (UDFs)
      • Describe the Purpose of Filter Functions
      • Describe the Purpose of Accumulator UDFs
      • Describe the Purpose of Algebraic Functions
      • Describe the Purpose of Apache Hive
      • Describe the Differences Between Apache Hive and SQL
      • Describe Apache Hive Architecture
      • Describe How to Load Data Into Hive
      • Demonstrate How to Perform Queries
      • Describe the Purpose of Hive User Defined Functions (UDFs)
      • Write a Hive UDF
      • Describe the Purpose of HCatalog
      • Describe the Purpose of Apache Oozie
      • Describe How to Define an Oozie Workflow
      • Describe Pig and Hive Actions
      • Describe How to Define an Oozie Coordinator Job
      • Understanding Pig (Lab)
      • Writing a Pig UDF (Lab)
      • Writing a Pig Accumulator (Lab)
      • Writing an Apache Hive UDF (Lab)
      • Defining an Oozie Workflow (Lab)
      • Working with TF-IDF and the JobControl Class (Lab)

    Who is the instructor for this training?

    The trainer for this HDP Developer: Java training has extensive experience in this domain, including years of experience training and mentoring professionals.
