PySpark Essentials for Data Scientists (Big Data + Python)
Learn how to wrangle Big Data for Machine Learning with Python and PySpark, taught by an industry expert!
4.42 (820 reviews)

Students: 5,639
Content: 17.5 hours
Last update: May 2022
Regular price: $54.99
What you will learn
Use Python with Big Data on a distributed framework (Apache Spark)
Work with REAL datasets on realistic consulting projects
How to stream LIVE data from Twitter using Spark Structured Streaming
Learn how to create a "Pandora-like" app that classifies songs into genres using machine learning
Flag suspicious job postings using Natural Language Processing
Use machine learning to predict optimal cement strength and the factors that affect it
Classify Christmas cooking recipes using Topic Modeling (LDA)
Customer Segmentation using Gaussian Mixture Modeling (Clustering)
Use cluster analysis to develop a strategy designed to increase college graduation rates for underprivileged populations
How to use the k-means clustering algorithm to define a marketing outreach strategy
Integrate a UI to monitor your model training and development process with MLflow
Theory and application of cutting-edge data science algorithms
Manipulate, Join and Aggregate DataFrames in Spark with Python
Learn how to apply Spark's machine learning techniques on distributed DataFrames
Cross Validation & Hyperparameter Tuning
Frequent Pattern Mining Techniques
Classification & Regression Techniques
Data Wrangling for Natural Language Processing
How to write SQL Queries in Spark
Udemy ID: 2839728
Course created date: 2/27/2020
Course indexed date: 10/4/2020
Course submitted by: Bot