Skip to content
Home » Interview Question and Answers » Data Warehouse Fundamentals: Architecture, Modeling, and Interview Questions

Data Warehouse Fundamentals: Architecture, Modeling, and Interview Questions

5/5 - (1 vote)

1. What is a Data Warehouse?

A Data Warehouse (DW) is a centralized repository that stores historical, integrated, and structured data from multiple sources to support reporting, analysis, and decision-making.

Simple definition for interview: A data warehouse is a subject-oriented, integrated, time-variant, and non-volatile system used for analytical reporting and business intelligence.

2. Key Characteristics of Data Warehouse

Interviewers often ask this exactly.

  1. Subject-Oriented
    Data is organized by business subjects like Sales, Finance, Customers.

  2. Integrated
    Data from different sources is standardized and cleaned.

  3. Time-Variant
    Historical data is stored with a time dimension.

  4. Non-Volatile
    Data is not updated or deleted frequently. Mostly append-only.

3. Data Warehouse vs Database (OLAP vs OLTP)

Feature OLTP (Database) OLAP (Data Warehouse)
Purpose Day-to-day operations Analytics & reporting
Data Current Historical
Design Normalized Denormalized
Queries Simple, fast Complex, heavy
Examples Order entry, billing Sales analysis, trends



4. Data Warehouse Architecture (Very Important)

3-Tier Architecture

  1. Bottom Tier (Data Sources)

    • ERP, CRM, Files, APIs, Logs

  2. Middle Tier (ETL / ELT Layer)

    • Extract

    • Transform

    • Load

  3. Top Tier (Presentation Layer)

    • Reports, Dashboards

    • Power BI, Tableau, Excel

5. ETL vs ELT

ETL ELT
Transform before load Transform after load
Used in traditional DW Used in cloud DW
Slower for big data Faster & scalable
Example: SSIS Example: Azure Synapse, Fabric

6. Fact Table and Dimension Table

Fact Table

  • Stores measurable data

  • Numeric values

  • Foreign keys to dimensions

Examples:

  • SalesAmount

  • Quantity

  • Profit

Dimension Table

  • Stores descriptive attributes

  • Used for slicing and filtering

Examples:

  • Customer

  • Product

  • Date

  • Region



7. Star Schema vs Snowflake Schema

Star Schema

  • Fact table in center

  • Directly connected dimensions

  • Simple, fast queries

  • Preferred in Power BI

Snowflake Schema

  • Normalized dimensions

  • More joins

  • Complex but saves space

8. Slowly Changing Dimensions (SCD)

Very common interview question.

Types of SCD

Type Description
Type 0 No changes allowed
Type 1 Overwrite old data
Type 2 Maintain history (StartDate, EndDate, IsCurrent)
Type 3 Limited history (Old value column)

9. Data Warehouse Layers

  1. Staging Layer

    • Raw data

    • No business logic

  2. Data Warehouse Layer

    • Cleaned and transformed data

  3. Data Mart Layer

    • Subject-specific

    • Optimized for reporting

10. Data Mart

A Data Mart is a subset of a data warehouse focused on a specific business area.

Examples:

  • Sales Mart

  • Finance Mart

  • HR Mart

11. Data Warehouse Modeling Approaches

  1. Kimball Approach

    • Bottom-up

    • Dimensional modeling

    • Star schema

  2. Inmon Approach

    • Top-down

    • Normalized (3NF)

    • Enterprise DW first

12. Data Warehouse Indexing & Performance

  • Indexing (Clustered / Non-Clustered)

  • Partitioning (Date-based)

  • Aggregation tables

  • Columnar storage

  • Materialized views

13. Data Quality in Data Warehouse

  • Duplicate handling

  • Null value handling

  • Data validation rules

  • Data profiling

  • Reconciliation reports



14. Data Warehouse Security

  • Row Level Security (RLS)

  • Column Level Security

  • Data masking

  • Encryption (at rest & in transit)

  • Role-based access

15. Common Data Warehouse Tools

On-Prem

  • SQL Server

  • Oracle

  • Teradata

Cloud

  • Azure Synapse

  • Snowflake

  • BigQuery

  • Amazon Redshift

  • Microsoft Fabric Warehouse

16. Data Warehouse vs Data Lake vs Lakehouse

Feature Data Warehouse Data Lake Lakehouse
Data Type Structured All types All types
Schema Schema-on-write Schema-on-read Hybrid
Cost High Low Medium
Use Case BI & reporting Big data & ML BI + ML

Thank you for reading this post, and I hope you enjoyed it!

Loading

Leave a Reply

Discover more from Learn BI

Subscribe now to keep reading and get access to the full archive.

Continue reading