Introduction to Data Modeling and Data Warehousing (2-day)

In this course, students will learn the basic concepts, principles, and techniques of data modeling and data warehousing as well as how to apply them in real-world scenarios. They will also get hands-on experience with some popular tools and platforms for data modeling and data warehousing, such as SQL, ERD, ETL, and AWS.

Overview

Data modeling and data warehousing are two essential concepts in the field of data engineering and analytics. A data model is a way of organizing and representing data in a logical and consistent manner. A data warehouse is a centralized repository of integrated and historical data from various sources. In this course, you will learn the fundamentals of data modeling and data warehousing as well as how they can help you design, build, and manage data systems.

Objectives:

  • Describe data analytics and its role in modern organizations
  • Explain how organizations use data warehousing and data modeling to perform analytics
  • Define data modeling and explain its importance in enabling data analytics at scale
  • Identify common analytical data modeling terms and techniques
  • Describe the “modern data stack” and how analytics and warehousing fit in enterprise data architectures
  • Identify, compare, and contrast major data warehousing frameworks, including relational data warehouses, data lakes and data lakehouses
  • Define ETL and Reverse ETL and their roles in data warehousing operations.
  • Evaluate the quality and performance of data models and data warehouses.

Audience

This course is designed for beginners who want to get started with data engineering and analytics.

Length

2 days

Outline

Day 1: Overview and Data Modeling

Overview

    • What is data analytics?
      • “The New Science of Winning”
      • Process execution vs process evaluation
      • Types of analytics
    • Analytics use cases
    • Why is data analytics important?
    • The “modern data stack”
      • The Old Way: Auditing and Reporting
      • The New Way: Analytics Driving Operations
    • What is a data model?
    • Why is it important in data warehousing and analytics?

Data Modeling for Data Warehouses

    • Levels of Data Models
      • Conceptual
      • Logical
      • Physical
    • Components of a Data Model
      • Entities
      • Attributes
      • Relationships
      • Keys
    • Diagramming Data Models
      • Entity Relationship Diagrams
      • Common Symbols
    • Types of Data Models
      • Relational
      • Dimensional
      • Other types (Document, Graph)

Relational (Normalized) Modeling

    • Normal Forms
    • Common Use Cases
    • Advantages and Disadvantages
    • Hands-On: Explore a 3NF Model

Dimensional Modeling

    • Facts and Dimensions
    • Exercise: Decomposing a 3NF Model into Facts and Dimensions
    • Common Dimensional Models
      • Star Schema
      • Snowflake Schema
      • Data Vault
      • One Big Table
    • Hands-On: Exploring and Building Dimensional Models

Q&A and Discussion

 

Day 2: Review and Data Warehousing

Review of Day 1

Data Sources

    • Databases
    • Devices
    • Files
    • Free text / documents
    • Images
    • Video
    • Audio

Data Formats

    • Structured
    • Unstructured
    • Semi-Structured

ETL / ELT / Reverse ETL/ELT

    • In database / outside of database
    • Data quality

Data Warehousing

    • OLTP vs OLAP
    • Data warehousing frameworks
    • Transactional database
      • Relational Data Warehouse
      • Data Lake
      • Data Lakehouse
    • Comparison of framework use cases
    • Hands-On:
      • Perform ETL using structured and semi-structured data
        • From a transactional database to data lake / lakehouse storage
        • From data storage to warehouse
      • Query data in a data lake, data lakehouse and warehouse
      • Reverse ETL from a data warehouse to transactional database

Q&A and Discussion