Lightbend Apache Spark for Scala - Professional Training

Live Online & Classroom Enterprise Training

Powered by Lightbend

This two-day course, created by Dean Wampler, Ph.D., is designed to teach developers how to implement data processing pipelines and analytics using Apache Spark.

Key Features
  • Lifetime Access
  • CloudLabs
  • 24x7 Support
  • Real-time code analysis and feedback
  • 100% Money Back Guarantee

What is Lightbend Apache Spark for Scala - Professional about?

This two-day course, created by Dean Wampler, Ph.D., is designed to teach developers how to implement data processing pipelines and analytics using Apache Spark. Developers will use hands-on exercises to learn the Spark Core, SQL/DataFrame, Streaming, and MLlib (machine learning) APIs. Developers will also learn about Spark internals and tips for improving application performance. Additional coverage includes integration with Mesos, Hadoop, and Reactive frameworks like Akka.

What are the objectives of Lightbend Apache Spark for Scala - Professional?

After completing this course, you should:

  • Understand how to use the Spark Scala APIs to implement various data analytics algorithms for offline (batch-mode) and event-streaming applications
  • Understand Spark internals
  • Understand Spark performance considerations
  • Understand how to test and deploy Spark applications
  • Understand the basics of integrating Spark with Mesos, Hadoop, and Akka

Available Training Modes

  • Live Online Training: 12 hours
  • Classroom Training: 2 days

Who is Lightbend Apache Spark for Scala - Professional for?

  • Anyone who wants to add Apache Spark with Scala skills to their profile
  • Teams getting started on Apache Spark with Scala projects

What are the prerequisites for Lightbend Apache Spark for Scala - Professional?

  • Experience with Scala, such as completion of the Fast Track to Scala course
  • Experience with SQL, machine learning, and other Big Data tools is helpful but not required

Course Outline

  • Introduction - Why Spark
    • How Spark improves on Hadoop MapReduce
    • The core abstractions in Spark
    • What happens during a Spark job?
    • The Spark ecosystem
    • Deployment options
    • References for more information
  • Spark's Core API
    • Resilient Distributed Datasets (RDDs) and how they implement your job
    • Using the Spark Shell (interpreter) vs. submitting Spark batch jobs
    • Using the Spark web console
    • Reading and writing data files
    • Working with structured and unstructured data
    • Building data transformation pipelines
    • Spark under the hood: caching, checkpointing, partitioning, shuffling, etc.
    • Mastering the RDD API
    • Broadcast variables, accumulators
  • Spark SQL and DataFrames
    • Working with the DataFrame API for structured data
    • Working with SQL
    • Performance optimizations
    • Support for JSON and Parquet formats
    • Integration with Hadoop Hive
  • Processing events with Spark Streaming
    • Working with time slices ("mini-batches") of events
    • Working with moving windows of mini-batches
    • Reuse of code in batch-mode and streaming: the Lambda Architecture
    • Working with different streaming sources: sockets, file systems, Kafka, etc.
    • Resiliency and fault tolerance considerations
    • Stateful transformations (e.g., running statistics)
  • Other Spark-based libraries
    • MLlib for machine learning
    • Discussion of GraphX for graph algorithms, Tachyon for distributed caching, and BlinkDB for approximate queries
  • Deploying to clusters
    • Spark's clustering abstractions: cluster vs. client deployments, coarse-grained and fine-grained process management
    • Standalone mode
    • Mesos
    • Hadoop YARN
    • EC2
    • Cassandra rings
  • Using Spark with the Lightbend Reactive Platform
    • Akka Streams and Spark Streaming

Who is the instructor for this training?

The trainer for this course has extensive experience in this domain, including years spent training and mentoring professionals.
