Course Outline
Introduction
- Apache Spark versus Hadoop MapReduce
Overview of Apache Spark Features and Architecture
Selecting a Programming Language
Setting up Apache Spark
Developing a Sample Application
Choosing the Data Set
Executing Data Analysis
Working with Structured Data via Spark SQL
Handling Streaming Data with Spark Streaming
Integrating Apache Spark with Third-Party Machine Learning Tools
Utilizing Apache Spark for Graph Processing
Performance Optimization
Troubleshooting
Summary and Conclusion
Requirements
- Proficiency with the Linux command line
- Fundamental understanding of data processing concepts
- Programming experience in Java, Scala, Python, or R
Target Audience
- Software Developers
Testimonials (2)
I liked that it was practical. I loved applying the theoretical knowledge to practical examples.
Aurelia-Adriana - Allianz Services Romania
Course - Python and Spark for Big Data (PySpark)
The fact that we were able to take with us most of the information/course/presentation/exercises done, so that we can look over them and perhaps redo what we didn't understand the first time or improve what we already did.