Become A Data Engineer - Udacity

Data Engineering with AWS

Learn to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets.

  • Nanodegree Program
  • Intermediate
  • 39 hours
  • 4.6 (1264)
  • Updated: Mar 8, 2026

Subscription · Monthly

  • Cancel Anytime
  • Unlimited access to hundreds of top-rated courses
  • Hands-on projects with expert feedback
  • Personalized career coaching and interview prep
  • Program Certificates

Skills you'll learn

37 skills

  • CassandraDB
  • PostgreSQL
  • Database normalization
  • Denormalized data schemas
  • Data modeling basics
  • +32 More

Prerequisites

9 prerequisites

Prior to enrolling, you should have the following knowledge:

  • Relational data models
  • Command line interface basics
  • Intermediate Python
  • Relational database basics
  • Basic GitHub
  • +4 More

You will also need to be able to communicate fluently and professionally in written and spoken English.

Program Outline

  • 4 courses
  • 22 lessons
  • 4 projects
Course 1: Data Modeling

Learn to create relational and NoSQL data models to fit the diverse needs of data consumers. Use ETL to build databases in PostgreSQL and Apache Cassandra.

9 hours
  1. Introduction to Data Modeling

    In this lesson, students will learn the basic difference between relational and non-relational databases, and how each type of database fits the diverse needs of data consumers.

  2. Relational Data Models

    In this lesson, students will learn the purpose of data modeling and the strengths and weaknesses of relational databases, and will create schemas and tables in Postgres.

  3. NoSQL Data Models

    Students will understand when to use non-relational databases based on the business's data needs, their strengths and weaknesses, and how to create tables in Apache Cassandra.

  4. Data Modeling with Apache Cassandra

    Students will model event data to create a non-relational database and ETL pipeline for a music streaming app. They will define queries and tables for a database built using Apache Cassandra.
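Cassandra tables are usually designed around the queries they must answer: the partition key decides which rows are stored together, and clustering columns decide their order within a partition. The sketch below imitates that idea in plain Python rather than with a real Cassandra driver. The event fields and the helper names (`build_table`, `songs_in_session`) are assumptions made for illustration and do not come from the course.

```python
from collections import defaultdict

# Hypothetical event rows from a music streaming app (field names are assumptions).
events = [
    {"session_id": 1, "item_in_session": 2, "artist": "B", "song": "s2"},
    {"session_id": 1, "item_in_session": 1, "artist": "A", "song": "s1"},
    {"session_id": 2, "item_in_session": 1, "artist": "C", "song": "s3"},
]


def build_table(rows, partition_key, clustering_key):
    """Group rows by partition key and sort each group by clustering key,
    mimicking how a Cassandra table is laid out to serve one query."""
    table = defaultdict(list)
    for row in rows:
        table[row[partition_key]].append(row)
    for partition in table.values():
        partition.sort(key=lambda r: r[clustering_key])
    return dict(table)


# Query to serve: "which songs were played in a given session, in order?"
# So the table is partitioned by session_id and clustered by item_in_session.
songs_by_session = build_table(events, "session_id", "item_in_session")


def songs_in_session(session_id):
    """Read a single partition; no scan over other sessions is needed."""
    return [row["song"] for row in songs_by_session.get(session_id, [])]
```

A different query, such as "which users listened to a given song", would typically get its own table with a different partition key, which is why query definitions come before table definitions in this style of modeling.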

Course 2: Cloud Data Warehouses

In this course, you’ll learn to create cloud-based data warehouses. You’ll sharpen your data warehousing skills, deepen your understanding of data infrastructure, and be introduced to data engineering on the cloud using Amazon Web Services (AWS).

12 hours
  1. Introduction to Cloud Data Warehouses

    Welcome to Cloud Data Warehouse with Amazon Web Services. In this lesson, you'll learn more about the course and set yourself up for success!

  2. Introduction to Data Warehouses

    In this lesson, you'll be introduced to the business case for data warehouses as well as architecture, extracting, transforming, and loading data, data modeling, and data warehouse technologies.

  3. ELT and Data Warehouse Technology in the Cloud

    In this lesson, you'll learn about ELT, the differences between ETL and ELT, and general cloud data warehouse technologies.

  4. AWS Data Warehouse Technologies

    In this lesson, you'll learn about AWS Services and how to set up Amazon S3, IAM, VPC, EC2, and RDS. You'll build a Redshift data warehouse cluster and learn how to interact with it.

  5. Implementing a Data Warehouse on AWS

    In this lesson, you'll learn to implement a data warehouse on AWS.

  6. Data Warehouse

    In this project, you'll build an ETL pipeline that extracts data from S3, stages data in Redshift, and transforms data into a set of dimensional tables for an analytics team.
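The staging-then-transform pattern described for the project can be sketched with Python's built-in `sqlite3` module standing in for Redshift. This is only an illustration under stated assumptions: the table names (`staging_events`, `dim_users`, `fact_songplays`), the columns, and the sample rows are hypothetical, and the real project would load its staging data from S3.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for a warehouse
cur = conn.cursor()

# Staging table: raw events loaded as-is, before any modeling.
cur.execute(
    "CREATE TABLE staging_events "
    "(user_id INTEGER, user_name TEXT, song TEXT, played_at TEXT)"
)
cur.executemany(
    "INSERT INTO staging_events VALUES (?, ?, ?, ?)",
    [
        (1, "Ana", "Song A", "2024-01-01T10:00:00"),
        (1, "Ana", "Song B", "2024-01-01T10:05:00"),
        (2, "Raj", "Song A", "2024-01-01T11:00:00"),
    ],
)

# Dimension table: one row per user, deduplicated from staging.
cur.execute("CREATE TABLE dim_users (user_id INTEGER PRIMARY KEY, user_name TEXT)")
cur.execute("INSERT INTO dim_users SELECT DISTINCT user_id, user_name FROM staging_events")

# Fact table: one row per play event, pointing at the dimension by key.
cur.execute(
    "CREATE TABLE fact_songplays ("
    "songplay_id INTEGER PRIMARY KEY AUTOINCREMENT, "
    "user_id INTEGER REFERENCES dim_users(user_id), "
    "song TEXT, played_at TEXT)"
)
cur.execute(
    "INSERT INTO fact_songplays (user_id, song, played_at) "
    "SELECT user_id, song, played_at FROM staging_events"
)
conn.commit()

# An analytics-style query that joins the fact table to a dimension.
plays_per_user = cur.execute(
    "SELECT u.user_name, COUNT(*) FROM fact_songplays f "
    "JOIN dim_users u ON u.user_id = f.user_id "
    "GROUP BY u.user_name ORDER BY u.user_name"
).fetchall()
```

Keeping the raw data in a staging table means the dimensional tables can be rebuilt with SQL alone if the model changes, without re-extracting from the source.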

Course 3: Spark and Data Lakes

In this course, you will learn about the big data ecosystem and how to use Spark to work with massive datasets. You’ll also learn about how to store big data in a data lake and query it with Spark.

10 hours
  1. Introduction to Spark and Data Lakes

    In this course, you'll learn how Spark evaluates code and uses distributed computing to process and transform data. You'll work in the big data ecosystem to build data lakes and data lakehouses.

  2. Big Data Ecosystem, Data Lakes, and Spark

    In this lesson, you will learn about the problems that Apache Spark is designed to solve. You'll also learn about the greater Big Data ecosystem and how Spark fits into it.

  3. Spark Essentials

    In this lesson, we'll dive into how to use Spark for wrangling, filtering, and transforming distributed data with PySpark and Spark SQL.

  4. Using Spark in AWS

    In this lesson, you will learn to use Spark and work with data lakes with Amazon Web Services using S3, AWS Glue, and AWS Glue Studio.

  5. Ingesting and Organizing Data in a Lakehouse

    In this lesson you'll work with Lakehouse zones. You will build and configure these zones in AWS.

  6. STEDI Human Balance Analytics

    In this project, you'll work with sensor data that trains a machine learning model. You'll load S3 JSON data from a data lake into Athena tables using Spark and AWS Glue.

Course 4: Automate Data Pipelines

Schedule, monitor, and manage data workflows efficiently using tools like Apache Airflow. Build data pipelines by leveraging Airflow DAGs to organize tasks and utilize AWS resources such as S3 and Redshift to process and move data effectively between systems. Engage in hands-on projects to automate and maintain complex data pipelines, streamlining operations and improving data reliability. Gain expertise in workflow automation, data integration, and error handling, enabling you to construct efficient and scalable data pipelines in production environments. Ideal for data engineers and professionals aiming to advance their skills in managing and automating data workflows.

9 hours
  1. Introduction to Automating Data Pipelines

    Welcome to Automating Data Pipelines. In this lesson, you'll be introduced to the topic, prerequisites for the course, and the environment and tools you'll be using to build data pipelines.

  2. Data Pipelines

    In this lesson, you'll learn about the components of a data pipeline, including Directed Acyclic Graphs (DAGs). You'll practice creating data pipelines with DAGs and Apache Airflow.

  3. Airflow and AWS

    In this lesson, you'll connect Airflow to AWS: first by creating credentials, then by copying S3 data, using connections and hooks, and building a DAG that moves S3 data to Redshift.

  4. Data Quality

    Students will learn how to track data lineage, set up data pipeline schedules, partition data to optimize pipelines, investigate data quality issues, and write tests to ensure data quality.

  5. Production Data Pipelines

    In this last lesson, students will learn how to build pipelines with maintainability and reusability in mind. They will also learn about pipeline monitoring.

  6. Data Pipelines

    Students work on a music streaming company’s data infrastructure by creating and automating a set of data pipelines with Airflow, and by monitoring and debugging production pipelines.
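The lessons above describe pipelines as DAGs of tasks with data quality tests at the end. A minimal sketch of that idea follows, using the standard library's `graphlib` instead of Airflow itself, so none of this is Airflow's API. The task names, the `warehouse` dictionary (a stand-in for a target such as Redshift), and the sample rows are all assumptions for illustration.

```python
from graphlib import TopologicalSorter

warehouse = {}  # hypothetical stand-in for a target system such as Redshift
run_log = []    # records the order in which tasks actually ran


def stage_events():
    # In a real pipeline this step might copy data from S3.
    warehouse["staging"] = [
        {"user_id": 1, "song": "Song A"},
        {"user_id": 2, "song": "Song B"},
    ]


def load_fact_table():
    warehouse["songplays"] = list(warehouse["staging"])


def check_quality():
    # Data quality tests: fail loudly instead of passing bad data downstream.
    rows = warehouse["songplays"]
    if not rows:
        raise ValueError("data quality check failed: songplays is empty")
    if any(row["user_id"] is None for row in rows):
        raise ValueError("data quality check failed: NULL user_id")


tasks = {
    "stage_events": stage_events,
    "load_fact_table": load_fact_table,
    "check_quality": check_quality,
}

# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "stage_events": set(),
    "load_fact_table": {"stage_events"},
    "check_quality": {"load_fact_table"},
}


def run_pipeline():
    """Run every task once, in an order that respects the dependencies."""
    for name in TopologicalSorter(dependencies).static_order():
        tasks[name]()
        run_log.append(name)


run_pipeline()
```

Because the graph is acyclic, a valid execution order always exists; `TopologicalSorter` raises `CycleError` if a circular dependency is introduced. A scheduler such as Airflow adds scheduling, retries, and monitoring on top of this ordering.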

Program Instructors

5 instructors

Unlike typical professors, our instructors come from Fortune 500 and Global 2000 companies and have demonstrated leadership and expertise in their professions:

Sean Murdock

Professor at Brigham Young University Idaho

Matt Swaffer

General Manager, MBS

Ben Goldberg

Staff Engineer at SpotHero

Amanda Moran

Developer Advocate at DataStax

Valerie Scarlata

Senior Technical Content Developer at Udacity

Reviews

Average Rating: 4.6 (1264 Reviews)
It's a very intense and advance program is highly recommended for the professionals who want to build a career as a perception engineer. It contains all the aspects of self-driving car specialization: computer vision, sensor fusion, localization, planning and control. Additional chapters for career services and interview preparation will help you to start applying for the autonomous driving vacancies really fast after the graduation.

Oleksandr

Oct 22, 2024

Exceptional program that equips you with the skills needed for the future of autonomous vehicles. Challenging projects, great instructors, and a supportive community make it a top choice for anyone passionate about self-driving technology

vivek chavan

Sep 9, 2023

carla simulator failed all the time !!!! wait for several weeks, no way to solve it

Tianhui Y.

Apr 10, 2023

This program provide the overall picture of Self Driving Car System. It allowed me to systematically learn the system.

Hiroki S.

Apr 4, 2023

Need more content for control part of this course(very basic).

viswanadh c.

Feb 20, 2023

About this program

Our Data Engineering Nanodegree program is a comprehensive data engineering course designed to teach you how to design data models, build data warehouses and data lakes, automate data pipelines, and work with massive datasets. Skills covered include Database fundamentals, CassandraDB, PostgreSQL, and database normalization. This program is ideal for those with a basic understanding of Python, SQL, and command-line interfaces. You'll learn from industry experts like Sean Murdock, Matt Swaffer, Ben Goldberg, Amanda Moran, and Valerie Scarlata, gaining hands-on experience with real-world projects. At Udacity, we offer an empowering learning environment where you gain practical skills through our data engineering training, reinforced with top-tier support and expert feedback. This course will equip you with the knowledge and tools to excel in the field of data engineering.
