Cloudera - Spark and Hadoop Developer Certification Training

Live Online & Classroom Certification Training

Learn to import data into your Apache Hadoop cluster and process it with Spark, Hive, Flume, Sqoop, Impala, and other Hadoop ecosystem tools in this Spark and Hadoop training and certification course.

Instructed by SPRINGPEOPLE
INDIA

No Public/Open-house class on the topic scheduled at the moment!

Course Description

Overview

This four-day hands-on training course covers the key concepts and skills needed to ingest and process data on a Hadoop cluster using the most up-to-date tools and techniques. Working with Hadoop ecosystem projects such as Spark, Hive, Flume, Sqoop, and Impala, participants prepare for the real-world challenges faced by Hadoop developers. They also learn to identify the right tool for a given situation and gain hands-on experience developing with those tools.

Audience 
This course is designed for developers and engineers who have programming experience. 

Objective

After the completion of this course, you will be able to:

  • Understand Hadoop, HDFS, and the Hadoop architecture
  • Work with Hive, Sqoop, and Impala
  • Work with Spark
  • Explore RDDs, Spark SQL, and related Spark features

Prerequisites

Knowledge of Java and Unix is a prerequisite for this course.

Course Curriculum

  • Challenge with Traditional Large-Scale Systems
  • Hadoop!
  • Data Storage and Ingest
  • Data Processing
  • Data Analysis and Exploration
  • Other Ecosystem Tools
  • Introduction to the Hands-On Exercises
  • Distributed Processing on a Cluster
  • Storage: HDFS Architecture
  • Storage: Using HDFS
  • Resource Management: YARN Architecture
  • Resource Management: Working with YARN
  • Sqoop Overview
  • Basic Imports and Exports
  • Limiting Results
  • Improving Sqoop’s Performance
  • Sqoop 2
  • Introduction to Impala and Hive
  • Why Use Impala and Hive?
  • Querying Data With Impala and Hive
  • Comparing Hive and Impala to Traditional Databases
  • Data Storage Overview
  • Creating Databases and Tables
  • Loading Data into Tables
  • HCatalog
  • Impala Metadata Caching
  • Selecting a File Format
  • Hadoop Tool Support for File Formats
  • Avro Schemas
  • Using Avro with Impala, Hive, and Sqoop
  • Avro Schema Evolution
  • Compression
  • Partitioning Overview
  • Partitioning in Impala and Hive
  • What is Apache Flume?
  • Basic Flume Architecture
  • Flume Sources
  • Flume Sinks
  • Flume Channels
  • Flume Configuration
  • What is Apache Spark?
  • Using the Spark Shell
  • RDDs (Resilient Distributed Datasets)
  • Functional Programming in Spark
  • Creating RDDs
  • Other General RDD Operations
  • Spark Applications vs. Spark Shell
  • Creating the SparkContext
  • Building a Spark Application (Scala and Java)
  • Running a Spark Application
  • The Spark Application Web UI
  • Configuring Spark Properties
  • Logging
  • Review: Spark on a Cluster
  • RDD Partitions
  • Partitioning of File-Based RDDs
  • HDFS and Data Locality
  • Executing Parallel Operations
  • Stages and Tasks
  • RDD Lineage
  • RDD Persistence Overview
  • Distributed Persistence
  • Common Spark Use Cases
  • Iterative Algorithms in Spark
  • Graph Processing and Analysis
  • Machine Learning
  • Example: k-means
  • Spark SQL and the SQL Context
  • Creating DataFrames
  • Transforming and Querying DataFrames
  • Saving DataFrames
  • DataFrames and RDDs
  • Comparing Spark SQL, Impala, and Hive-on-Spark

Certification

SpringPeople works with top industry experts to identify the leading certification bodies for different technologies: those that are well respected in the industry and globally accepted as clear evidence of a professional's proven expertise in the technology. As such, these certifications add significant value to a CV and can give professionals a strong boost in their career growth.

Our certification courses are fully aligned with these high-profile certification exams. By the end of the course, participants will have detailed knowledge of the subject, be eligible for the exam, and be fully prepared to take it and pass with flying colours.

 

CCA Spark and Hadoop Developer Exam (CCA175)
http://www.cloudera.com/training/certification/cca-spark.html

 

Exam Registration link:

http://www.cloudera.com/training/course-listing.html?course=developer-training-for-spark-and-hadoop&loc=all

 


About the exam:

  • Number of Questions: 10–12 performance-based (hands-on) tasks on a CDH5 cluster
  • Time Limit: 120 minutes
  • Passing Score: 70%
  • Language: English, Japanese (forthcoming)
  • Price: USD $295
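
To put these numbers in perspective, the short Python sketch below works out roughly how many tasks must be solved and how much time is available per task. It assumes every task carries equal weight and is graded pass/fail; that weighting is only an illustrative assumption, not a published scoring rule, and the function names are hypothetical.

```python
def min_tasks_to_pass(total_tasks: int, passing_percent: int = 70) -> int:
    """Smallest number of solved tasks that reaches the passing percentage.

    Assumes equal weight per task (an illustrative assumption only).
    """
    # Integer ceiling of total_tasks * passing_percent / 100,
    # done in whole numbers to avoid floating-point rounding.
    return (total_tasks * passing_percent + 99) // 100


def minutes_per_task(total_tasks: int, time_limit_minutes: int = 120) -> float:
    """Average time budget per task if the time limit is split evenly."""
    return time_limit_minutes / total_tasks


if __name__ == "__main__":
    for total in (10, 11, 12):
        needed = min_tasks_to_pass(total)
        budget = minutes_per_task(total)
        print(f"{total} tasks: solve at least {needed}, about {budget:.1f} minutes each")
```

Under these assumptions, a candidate has roughly 10 to 12 minutes per task, which is why working from the provided templates rather than coding from scratch can matter.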


Exam Question Format
Each CCA question requires you to solve a particular scenario. In some cases, a tool such as Impala or Hive may be used. In other cases, coding is required. In order to speed up development time of Spark questions, a template is often provided that contains a skeleton of the solution, asking the candidate to fill in the missing lines with functional code. This template is written in either Scala or Python.

You are not required to use the template and may solve the scenario using a language you prefer. Be aware, however, that coding every problem from scratch may take more time than is allocated for the exam.

Evaluation, Score Reporting, and Certificate
Your exam is graded immediately upon submission and you are e-mailed a score report the same day as your exam. Your score report displays the problem number for each problem you attempted and a grade on that problem. If you fail a problem, the score report includes the criteria you failed (e.g., “Records contain incorrect data” or “Incorrect file format”). We do not report more information in order to protect the exam content. Read more about reviewing exam content on the FAQ.

If you pass the exam, you receive a second e-mail within a few days of your exam with your digital certificate as a PDF, your license number, a LinkedIn profile update, and a link to download your CCA logos for use in your personal business collateral and social media profiles.

Audience and Prerequisites
There are no prerequisites for taking any Cloudera certification exam. The CCA Spark and Hadoop Developer exam (CCA175) follows the same objectives as Cloudera Developer Training for Spark and Hadoop, and the training course is excellent preparation for the exam.

Course FAQ

  • Our faculty members have more than 15 years of industry experience, the last 4-5 years of which have been in Big Data and related technologies. This gives them the background and skills required to teach the batches.
  • The course is offered in both forms: LVC as well as instructor-led classroom training at our premises.
  • We cover the latest version, which is Hadoop 2.x at the moment.
  • Our course covers basic Hadoop, MapReduce, Hive, HBase, Sqoop, and Impala.
  • Our instructors come with a complete instruction manual, and they guide and support you in doing these installations.
  • You need at least 8 GB of RAM, an Intel i5 processor, and at least 500 GB of hard disk space to do this installation.
  • We have at least 2 projects with huge data sets, which will give you comprehensive real-life project experience.
  • We don't have a cloud lab right now, but we are in the process of setting one up. Watch this space; you will hear from us soon.
  • Our teaching methodology places equal stress on fundamental concepts and practical exposure. Coupled with project experience, it prepares you to enter the job market and land a job in big data.

About the Instructor

Founded in 2009, SpringPeople is a global premier eLearning marketplace for Online Live, Instructor-led classes in the region. It is a certified training delivery partner of leading technology creators, namely Pivotal, Elastic, Lightbend, EMC, VMware, MuleSoft, RSA, and...


Course Rating and Reviews

Bharath Vivekanandan

Adding one more day to the course duration might help

Ravi Ranjan Kumar

Engineer
Mindtree Ltd.
not required

Archana M

Team Lead
Mindtree
Overall Good Training experience, interactive sessions, got to know many insights of Powershell, Trainer knowledge and explanation was excellent.

This class is intended for participants who have some prior exposure to the technology and are now looking to build up their expertise on the topic.

On successful completion of the course, participants will be eligible to sit for the related certification exam (see course overview). All participants receive a course completion certificate, demonstrating their expertise on the subject.

Total duration of the online, live, instructor-led sessions. Sessions are typically delivered as short lectures (2 hours on weekdays, 3 hours on weekends) with detailed hands-on guidance.

Expected offline lab work hours that participants will need to complete and submit to the trainer, during and after the instructor-led online sessions.

  1. We are happy to refund the full fee paid, no questions asked, should you feel that the training is not up to your expectations.
  2. Our dedicated team of expert training enablement advisors is available by email, phone, and chat to assist you with your queries.
  3. All courseware, including session recordings, will always be available to you for future reference and rework.

Contact Us

+91-80-6567-9700 (BLR)

training@springpeople.com
