Pivotal Certified Data Science in Practice Training

Live Online & Classroom Certification Training

The Pivotal Certified Data Science in Practice course is designed to give participants hands-on experience with the Pivotal products used in Pivotal Data Science projects. Participants also practice applying Data Science problem-solving techniques to their own endeavors.

(4.7) 194 Learners
Instructed by SPRINGPEOPLE

No Public/Open-house class on the topic scheduled at the moment!

Course Description


This course is designed to give participants hands-on experience with the Pivotal products used in Pivotal Data Science projects. Given the diverse and varying nature of customer implementations, this course focuses on the main aspects of a Data Science project within Pivotal: Pivotal Greenplum DB, pSQL, MADlib, GPText, Pivotal HD, HAWQ, PivotalR, and pyMADlib, with extra units covering Alpine Chorus and Visualization. Participants are introduced to the need for big, fast data and its role in modern business applications. The course provides hands-on experience using Pivotal Greenplum DB, pSQL, MADlib, GPText, Apache Hadoop, Pivotal HD, HAWQ, Alpine Chorus, PivotalR, PL/R, pyMADlib, PL/Python, and several visualization tools such as Gephi, D3, and Tableau. This course introduces and uses pSQL, R, and Python but does not include extensive training on them. Further, this course gives attendees an opportunity to explore several intense Data Science projects that have been converted into extensive Data Science exercises. This course does not teach installation, configuration, or management of any of the products.


At the end of the Data Science in Practice training course, participants will be able to:

  • Summarize the distinguishing characteristics of each Pivotal product and tool, and describe the most beneficial aspects from a Data Science perspective;
  • Evaluate and demonstrate hands-on practical skills with each product and tool;
  • Investigate, assess, and apply their knowledge to practical data science problems;
  • Apply Data Science problem-solving techniques to their respective endeavors.

As a result of attending the course, the Data Scientist will be able to confidently use the Pivotal product set and related technologies to analyze large data sets.

Suggested Audience -

  • Experienced data analysts and data engineers willing to work hard to achieve superior Pivotal Data Science skills.
  • Anyone else who wants to learn about data science using the Pivotal product stack.

Duration - 5 Days


Prerequisites -

  • Willingness to participate in a demanding, high-intensity training experience.
  • Comfort with data analytics technologies is a plus (statistics, mathematics, machine learning, SQL, R, Python).
  • A basic understanding of virtualization and massively parallel processing concepts.

Course Curriculum

  • Data Science: The Big Picture
  • Driving Forces
  • What Does a Data Scientist Do
  • The Process of Data Science
  • What Does Pivotal Bring to the Story
  • Pivotal Corporate Overview
  • The Pivotal Big Data Suite
  • - Pivotal Greenplum DB
  • - Pivotal GPText
  • - MADlib
  • - Pivotal HD
  • - Pivotal on Virtualized Hardware
  • - Pivotal HAWQ
  • - Pivotal eXtension Framework (PXF)
  • - Pivotal Analytics Workbench
  • - Pivotal GemFire
  • - Pivotal GemFireXD
  • - Spring by Pivotal
  • - Spring XD
  • - Pivotal Labs and Pivotal Data Labs
  • Essentials
  • Getting Started and Inline Lab Exercise
  • Intro to pSQL and Inline Lab Exercises
  • - Creating Tables
  • - Distributions and Partitioning
  • - Indexes
  • - External Tables and Loading Data
  • Unloading Data
  • Analyze
  • Explain and Analyze
  • Vacuum
  • Monitoring
  • Explore and Inline Lab Exercise
  • Joins and Inline Lab Exercise
  • Arrays and Array Aggregates and Inline Lab Exercise
  • Window Functions and Inline Lab Exercise
  • Other Functions and Inline Lab Exercise
  • User-Defined Functions (UDFs)
  • User-Defined Aggregates (UDAs)
  • Data Science Exercise
  • MADlib Basics
  • Advanced MADlib
  • Data Science Exercise
  • NLP: Practical Examples
  • NLP: Practical Examples with NLTK
  • Putting it all together
  • Data Science Exercise
  • Apache Hadoop Overview
  • - Core Component: HDFS
  • - Core Component: MapReduce
  • - MapReduce: Writing a Job
  • Hadoop Ecosystem
  • - Hadoop Streaming
  • - Pig
  • Intro to Pivotal HD and HAWQ
  • Getting Started with HAWQ
  • Working with HAWQ
  • External Tables: file, gpfdist, web
  • External Tables: PXF
  • Loading and Unloading Data and Inline Lab Exercises
  • - Loading and Unloading using Copy
  • - Loading and Unloading using Insert
  • - Loading and Unloading using gpfdist / gpload / external tables
  • Data Science Exercise
  • PivotalR
  • PL/R
  • pyMADlib
  • PL/Python
  • Data Science Exercise
  • Tableau
  • R
  • Python
  • Exercises
  • HAWQ Text Analytics Exercise
  • Airline Price Optimization Exercise
  • Gene Sequencing Exercise


SpringPeople works with top industry experts to identify the leading certification bodies for different technologies, which are well respected in the industry and globally accepted as clear evidence of a professional’s “proven” expertise in the technology. As such, these certifications are a high value-add to CVs and can give a massive boost to professionals' career growth.

Our certification courses are fully aligned to these high-profile certification exams; at the end of the course, participants will have detailed knowledge, be eligible for these certification exams, and be fully ready to take them and pass with flying colours.




About the Instructor

Founded in 2009, SpringPeople is a premier global eLearning marketplace for online live, instructor-led classes in the region. It is a certified training delivery partner of leading technology creators, namely Pivotal, Elastic, Lightbend, EMC, VMware, MuleSoft, RSA, and...

Course Rating and Reviews



SPRINGPEOPLE SpringPeople Trainer


Good training.

SPRINGPEOPLE SpringPeople Trainer


Excellent training with a good walkthrough of many examples. The trainer is very helpful in clearing our doubts.

SPRINGPEOPLE SpringPeople Trainer

Mahender Pandiri


This class is intended for participants who have some prior exposure to the technology and are now looking to build up their expertise on the topic.

On successful completion of the course, participants will be eligible to sit for the related certification exam (see course overview). All participants receive a course completion certificate, demonstrating their expertise on the subject.

Total duration of the online, live instructor-led sessions. Sessions are typically delivered as short lectures (2 hrs on weekdays, 3 hrs on weekends) with detailed hands-on guidance.

Expected hours of offline lab work that participants will need to complete and submit to the trainer during and after the instructor-led online sessions.

  1. We are happy to refund the full fee paid, no questions asked, should you feel that the training is not up to your expectations.
  2. Our dedicated team of expert training enablement advisors is available by email, phone, and chat to assist you with your queries.
  3. All courseware, including session recordings, will always remain accessible to you for future reference and rework.

Contact Us

+91-80-6567-9700 (BLR)

