Skip to content
Types of dimensions in DWH
Home » Business Intelligence » Exploring Major Types of Dimensions in Data Warehousing

Exploring Major Types of Dimensions in Data Warehousing

Rate this post

In this blog, you will learn about all types of dimensions in data warehousing.

Post Contents

1-Slowly Changing Dimension (SCD)

Type 1 SCD
Type 2 SCD
Type 3 SCD

2-Conformed Dimension
3-Degenerate Dimension
4-Junk Dimension
5-Role-playing Dimension
6-Static Dimension
7-Shrunken Dimension

Let’s explore each type of dimension in a data warehouse, along with examples:

1. Slowly Changing Dimension (SCD):

Slowly Changing Dimensions are dimensions that capture changes over time. There are different types of SCDs based on how they handle changes:

 Type 1 SCD: When a change occurs to a dimension attribute (such as customer name or address), the existing record in the dimension table is immediately updated with the new information.

This update directly replaces the old data with the new data, resulting in the loss of historical changes within the dimension table.

let’s use an example involving a "Customer" dimension table:

Customer ID Customer Name Address City
1 Dineshh 123, street 3 Haridwar
2 Pradeep Phase 2, street 5 Delhi
3 Sandeep Raturi 789 street 7 Bangalore

Now, let’s consider a scenario where there is a change in the customer’s information. Suppose we want to update the "Customer Name"  and "Address" for Customer ID = 1 from "Dineshh" to "Dinesh" , and "123, street 3" to "H No 123, street 3".

In a Type 1 Slowly Changing Dimension (SCD), this update will directly overwrite the existing record in the dimension table without preserving historical information.

After applying this update, the "Customer" dimension table would be modified as follows:

Customer ID Customer Name Address City
1 Dinesh H No 123, street 3 Haridwar
2 Pradeep Phase 2, street 5 Delhi
3 Sandeep Raturi 789 street 7 Bangalore

Type 2 SCD: Maintains historical versions of the dimension by creating new records for each change, along with effective date ranges. This allows for historical analysis and reporting on changes to dimension attributes.

Consider the same Customer table to demonstrate Type 2 SCD.

Customer ID Customer Name Address City
1 Dineshh 123, street 3 Haridwar
2 Pradeep Phase 2, street 5 Delhi
3 Sandeep Raturi 789 street 7 Bangalore

Now, suppose there is a change in the address for Customer ID = 1. In Type 2 SCD, a new record would be created to capture this change while preserving the original record with an effective date range.

Let’s say the Customer Name and address for Customer ID = 1 changes from "Dineshh" to "Dinesh" and  "123, street 3" to "456, street 3". After applying Type 2 SCD, the "Customer" table would be updated as follows:

Customer ID Customer Name Address City Start Date End Date
1 Dineshh 123, street 3 Haridwar 01-Jan-2024 15-Feb-2024
1 Dinesh 456, street 3 Haridwar 16-Feb-2024 (null)
2 Pradeep Phase 2, street 5 Delhi 01-Jan-2024 (null)
3 Sandeep Raturi 789 street 7 Bangalore 01-Jan-2024 (null)

Type 3 SCD: Tracks changes by adding new columns to the dimension table to capture historical information.

Customer ID Customer Name Current Address Current City Previous Address Previous City
1 Dinesh 123, street 3 Haridwar (null) (null)
2 Pradeep Phase 2, street 5 Delhi (null) (null)
3 Sandeep Raturi 789 street 7 Bangalore (null) (null)

Suppose there’s a change in the Address and City for Customer ID = 1 in the future:

Updated Dataset after Change:

Customer ID Customer Name Current Address Current City Previous Address Previous City
1 Dinesh 456, Rock Street 9 NY 123, street 3 Haridwar
2 Pradeep Phase 2, street 5 Delhi (null) (null)
3 Sandeep Raturi 789 street 7 Bangalore (null) (null)



2. Conformed Dimension:

In a data warehouse (DWH), a conformed dimension is a special type of dimension table that gets reused across multiple fact tables. These dimensions help ensure consistency and improve data analysis.

This ensures that dimension attributes, such as dates, are the same everywhere, making it easier to analyze and report data accurately across different departments.

Example rows in the Conformed Time dimension table:

DateKey Date DayOfWeek Month Quarter Year
20240101 2024-01-01 Saturday January Q1 2024
20240102 2024-01-02 Sunday January Q1 2024
20240103 2024-01-03 Monday January Q1 2024

Sales Fact Table:

DateKey ProductKey CustomerKey SalesAmount QuantitySold
20240101 101 201 500.00 3
20240102 102 202 750.00 2
20240103 103 203 300.00 1

Inventory Fact Table:

DateKey ProductKey QuantityInStock UnitsSold UnitsReceived
20240101 101 100 5 20
20240102 102 150 3 10
20240103 103 80 2 15

Customer Transactions Fact table:

DateKey CustomerKey TransactionID TotalAmount PaymentMethod
20240101 201 001 150.00 Credit Card
20240102 202 002 200.00 Cash
20240103 203 003 100.00 Debit Card

In this example:

  • The Conformed Time dimension (DateKey) is referenced in each fact table to provide a consistent and standardized view of time-related attributes (e.g., date, day of week, month) across sales, inventory, and customer transaction data.
  • Each fact table contains specific measures related to its domain (e.g., sales amount, inventory quantities, transaction amounts) and links to other relevant dimensions such as Product and Customer.
  • By using a Conformed Time dimension, organizations can perform integrated analysis and reporting across different areas of their business, leveraging consistent time-related attributes and measures.

3. Degenerate Dimensions:

Degenerate Dimensions are attributes that are part of the fact table instead of being stored in a separate dimension table. They represent dimensions that are “degenerate” in the sense that they do not have their own dimension table.

Example: An invoice number in a Sales fact table is a degenerate dimension because it doesn’t have its own dimension table.

Sales fact table:

TransactionID DateKey ProductKey CustomerKey InvoiceNumber SalesAmount
101 20240101 101 201 INV-1001 500.00
102 20240102 102 202 INV-1002 750.00
103 20240103 103 203 INV-1003 300.00

In this example:

InvoiceNumber is a degenerate dimension because it represents a unique identifier (invoice number) associated with each sales transaction, but it does not have its own dimension table.



4. Junk Dimension:

A junk dimension in data warehousing refers to a single dimension table that combines multiple low-cardinality flags or attributes that do not fit well into existing dimensions.

This approach helps to simplify the data model and reduce the number of dimension tables by grouping related flags or attributes together.

Let’s simplify our data model by organizing promotion-related information into a separate table called a junk dimension. This helps streamline our fact table and makes our data warehouse more efficient for analysis and reporting.

Original Fact Table: “SalesFact”

Consider the following original fact table containing various columns, including promotion-related attributes:

TransactionID DateKey ProductKey CustomerKey SalesAmount IsHoliday IsSeasonal IsDiscount IsFreeShipping
1 20240101 101 201 1500.00 Yes No Yes No
2 20240102 102 202 800.00 No Yes Yes Yes
3 20240103 103 203 2500.00 No No No No
4 20240104 104 204 1200.00 Yes Yes No Yes
5 20240105 105 205 1800.00 Yes No Yes Yes

Step-1:  Steps to Move Promotion Columns to a Junk Dimension: Identify Promotion Attributes: IsHoliday, IsSeasonal, IsDiscount, IsFreeShipping

Step-2: Create a Junk Dimension Table: This table will contain promotion-related attributes.
Junk Dimension Table: “PromotionJunkDim”

PromotionJunkDimKey IsHoliday IsSeasonal IsDiscount IsFreeShipping
1 Yes No Yes No
2 No Yes Yes Yes

Step-3:  Modify the SalesFact table to reference the PromotionJunkDimKey from the PromotionJunkDim table.

Refactored Fact Table: “SalesFact”

TransactionID DateKey ProductKey CustomerKey SalesAmount PromotionJunkDimKey
1 20230101 101 201 1500.00 1
2 20230102 102 202 800.00 2
3 20230103 103 203 2500.00 2
4 20230104 104 204 1200.00 1
5 20230105 105 205 1800.00 2

5. Role-playing Dimension:

A role-playing dimension in data warehousing refers to the practice of using a single dimension table multiple times within a fact table, each time representing a different aspect or role of the dimension.

Fact Table: “SalesFact”

TransactionID OrderDateKey ShipDateKey DeliveryDateKey ProductKey CustomerKey Quantity Amount
1 20240101 20240102 20240103 101 201 2 1500.00
2 20240102 20240103 20240104 102 202 1 800.00

Now, let’s define a Date dimension that is role-played three times in the fact table to represent different dates.

Date Dimension: “DateDim”

DateKey Date DayOfWeek Month Quarter Year
20240101 2024-01-01 Saturday January Q1 2024
20240102 2024-01-02 Sunday January Q1 2024

The DateDim table serves as a role-playing dimension, providing date-related attributes (Date, DayOfWeek, Month, Quarter, Year) that are used multiple times within the fact table to represent different date contexts (Order Date, Ship Date, Delivery Date).



6. Static Dimensions:

Static Dimensions are dimensions that do not change over time and remain constant.

Example: A dimension table containing country names and codes, which rarely change over time.

CountryId CountryName CountryCode
1 United States US
2 India IN
3 United Kingdom UK

7.  Shrunken Dimensions:

Shrunken Dimensions involve combining multiple related dimensions into a single dimension table to simplify the data model and improve performance.

Example: Combining product attributes and supplier attributes into a single dimension table for certain reporting needs.

In this example, we will create a simplified dataset for a shrunken dimension combining product and supplier attributes.

Shrunken Dimension Table: ProductSupplier

productSupplierId productId productName supplierId supplierName supplierCountry
1 101 Laptop 501 Supplier A India
2 102 Smartphone 502 Supplier B Malaysia
3 103 Tablet 503 Supplier C Japan
4 104 Printer 501 Supplier A USA

Thank you for reading this post, and I hope you enjoyed it!

Loading

Leave a Reply

Discover more from Learn BI

Subscribe now to keep reading and get access to the full archive.

Continue reading