Taming Big Data with Apache Spark and Python - Hands On!

PySpark tutorial with 40+ hands-on examples of analyzing large data sets on your desktop or on Hadoop with Python!
4.52 (16,975 reviews)
Udemy
platform
English
language
Data Science
category
107,404
students
9 hours
content
Mar 2025
last update
$119.99
regular price

What you will learn

Use DataFrames and Structured Streaming in Spark 3

Use the MLlib machine learning library to answer common data mining questions

Understand how Spark Streaming lets you process continuous streams of data in real time

Frame big data analysis problems as Spark problems

Use Amazon's Elastic MapReduce service to run your job on a cluster with Hadoop YARN

Install and run Apache Spark on a desktop computer or on a cluster

Use Spark's Resilient Distributed Datasets to process and analyze large data sets across many CPUs

Implement iterative algorithms such as breadth-first search using Spark

Understand how Spark SQL lets you work with structured data

Tune and troubleshoot large jobs running on a cluster

Share information between nodes on a Spark cluster using broadcast variables and accumulators

Understand how the GraphX library helps with network analysis problems

622414
udemy ID
9/25/2015
course created date
8/7/2019
course indexed date
Bot
course submitted by