Data Engineering for Beginners: Learn SQL, Python & Spark
Master SQL, Python, and Apache Spark (PySpark) with Hands-On Projects using Databricks on Google Cloud
4.40 (6158 reviews)

90,459
students
56 hours
content
Mar 2025
last update
$109.99
regular price
What you will learn
Set up an environment to learn SQL and Python essentials for Data Engineering
Database Essentials for Data Engineering using Postgres, such as creating tables and indexes, running SQL queries, using important predefined functions, etc.
Data Engineering Programming Essentials using Python such as basic programming constructs, collections, Pandas, Database Programming, etc.
Data Engineering using Spark DataFrame APIs (PySpark) on Databricks. Learn all important Spark DataFrame APIs such as select, filter, groupBy, orderBy, etc.
Data Engineering using Spark SQL (PySpark and Spark SQL). Learn how to write high-quality Spark SQL queries using SELECT, WHERE, GROUP BY, ORDER BY, etc.
Relevance of Spark Metastore and integration of DataFrames and Spark SQL
Ability to build Data Engineering Pipelines using Spark leveraging Python as Programming Language
Use of different file formats such as Parquet, JSON, CSV, etc. in building Data Engineering Pipelines
Set up a Hadoop and Spark cluster on GCP using Dataproc
Understanding the complete Spark application development life cycle to build Spark applications using PySpark. Review the applications using the Spark UI.
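As a rough illustration of the SELECT, WHERE, GROUP BY, ORDER BY pattern named in the list above, here is a minimal sketch. It is not taken from the course material: it uses Python's built-in sqlite3 module as a stand-in for Postgres or Spark SQL so that it runs without a cluster, and the orders table, its columns, the sample rows, and the revenue_by_status helper are all hypothetical.

```python
import sqlite3

# Hypothetical sample data; table and column names are illustrative only.
ORDERS = [
    (1, "COMPLETE", 100.0),
    (2, "PENDING", 50.0),
    (3, "COMPLETE", 25.0),
    (4, "CANCELED", 75.0),
    (5, "PENDING", 10.0),
]


def revenue_by_status(rows, min_amount=20.0):
    """Return (status, order_count, revenue) tuples, highest revenue first."""
    # sqlite3 in-memory database used as a stand-in for Postgres / Spark SQL.
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE orders ("
            "order_id INTEGER PRIMARY KEY, status TEXT, amount REAL)"
        )
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
        query = """
            SELECT status, COUNT(*) AS order_count, SUM(amount) AS revenue
            FROM orders
            WHERE amount >= ?
            GROUP BY status
            ORDER BY revenue DESC
        """
        return conn.execute(query, (min_amount,)).fetchall()
    finally:
        conn.close()


result = revenue_by_status(ORDERS)
for status, order_count, revenue in result:
    print(status, order_count, revenue)
```

The same shape of query would typically be expressed with the PySpark DataFrame methods filter, groupBy, agg, and orderBy; that translation is left out here because it needs a running Spark session.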
3848374
udemy ID
2/14/2021
course created date
4/3/2021
course indexed date
Bot
course submitted by