Overview
Data modeling and data warehousing are two essential concepts in the field of data engineering and analytics. A data model is a way of organizing and representing data in a logical and consistent manner. A data warehouse is a centralized repository of integrated and historical data from various sources. In this course, you will learn the fundamentals of data modeling and data warehousing as well as how they can help you design, build, and manage data systems.
Objectives:
- Describe data analytics and its role in modern organizations
- Explain how organizations use data warehousing and data modeling to perform analytics
- Define data modeling and explain its importance in enabling data analytics at scale
- Identify common analytical data modeling terms and techniques
- Describe the “modern data stack” and how analytics and warehousing fit in enterprise data architectures
- Identify, compare, and contrast major data warehousing frameworks, including relational data warehouses, data lakes and data lakehouses
- Define ETL and Reverse ETL and their roles in data warehousing operations.
- Evaluate the quality and performance of data models and data warehouses.
Audience
This course is designed for beginners who want to get started with data engineering and analytics.
Length
2 days
Outline
Day 1: Overview and Data Modeling
Overview
-
- What is data analytics?
- “The New Science of Winning”
- Process execution vs process evaluation
- Types of analytics
- Analytics use cases
- Why is data analytics important?
- The “modern data stack”
- The Old Way: Auditing and Reporting
- The New Way: Analytics Driving Operations
- What is a data model?
- Why is it important in data warehousing and analytics?
- What is data analytics?
Data Modeling for Data Warehouses
-
- Levels of Data Models
- Conceptual
- Logical
- Physical
- Components of a Data Model
- Entities
- Attributes
- Relationships
- Keys
- Diagramming Data Models
- Entity Relationship Diagrams
- Common Symbols
- Types of Data Models
- Relational
- Dimensional
- Other types (Document, Graph)
- Levels of Data Models
Relational (Normalized) Modeling
-
- Normal Forms
- Common Use Cases
- Advantages and Disadvantages
- Hands-On: Explore a 3NF Model
Dimensional Modeling
-
- Facts and Dimensions
- Exercise: Decomposing a 3NF Model into Facts and Dimensions
- Common Dimensional Models
- Star Schema
- Snowflake Schema
- Data Vault
- One Big Table
- Hands-On: Exploring and Building Dimensional Models
Q&A and Discussion
Day 2: Review and Data Warehousing
Review of Day 1
Data Sources
-
- Databases
- Devices
- Files
- Free text / documents
- Images
- Video
- Audio
Data Formats
-
- Structured
- Unstructured
- Semi-Structured
ETL / ELT / Reverse ETL/ELT
-
- In database / outside of database
- Data quality
Data Warehousing
-
- OLTP vs OLAP
- Data warehousing frameworks
- Transactional database
- Relational Data Warehouse
- Data Lake
- Data Lakehouse
- Comparison of framework use cases
- Hands-On:
- Perform ETL using structured and semi-structured data
- From a transactional database to data lake / lakehouse storage
- From data storage to warehouse
- Query data in a data lake, data lakehouse and warehouse
- Reverse ETL from a data warehouse to transactional database
- Perform ETL using structured and semi-structured data
Q&A and Discussion