In this blog, you will learn about all types of dimensions in data warehousing.
Post Contents
1-Slowly Changing Dimension (SCD)
Type 1 SCD
Type 2 SCD
Type 3 SCD
2-Conformed Dimension
3-Degenerate Dimension
4-Junk Dimension
5-Role-playing Dimension
6-Static Dimension
7-Shrunken Dimension
Let’s explore each type of dimension in a data warehouse, along with examples:
1. Slowly Changing Dimension (SCD):
Slowly Changing Dimensions are dimensions that capture changes over time. There are different types of SCDs based on how they handle changes:
 Type 1 SCD: When a change occurs to a dimension attribute (such as customer name or address), the existing record in the dimension table is immediately updated with the new information.
This update directly replaces the old data with the new data, resulting in the loss of historical changes within the dimension table.
let’s use an example involving a "Customer" dimension table:
| Customer ID | Customer Name | Address | City |
|---|---|---|---|
| 1 | Dineshh | 123, street 3 | Haridwar |
| 2 | Pradeep | Phase 2, street 5 | Delhi |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore |
Now, let’s consider a scenario where there is a change in the customer’s information. Suppose we want to update the "Customer Name"Â and "Address" for Customer ID = 1 from "Dineshh" to "Dinesh" , and "123, street 3" to "H No 123, street 3".
In a Type 1 Slowly Changing Dimension (SCD), this update will directly overwrite the existing record in the dimension table without preserving historical information.
After applying this update, the "Customer" dimension table would be modified as follows:
| Customer ID | Customer Name | Address | City |
|---|---|---|---|
| 1 | Dinesh | H No 123, street 3 | Haridwar |
| 2 | Pradeep | Phase 2, street 5 | Delhi |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore |
Type 2 SCD: Maintains historical versions of the dimension by creating new records for each change, along with effective date ranges. This allows for historical analysis and reporting on changes to dimension attributes.
Consider the same Customer table to demonstrate Type 2 SCD.
| Customer ID | Customer Name | Address | City |
|---|---|---|---|
| 1 | Dineshh | 123, street 3 | Haridwar |
| 2 | Pradeep | Phase 2, street 5 | Delhi |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore |
Now, suppose there is a change in the address for Customer ID = 1. In Type 2 SCD, a new record would be created to capture this change while preserving the original record with an effective date range.
Let’s say the Customer Name and address for Customer ID = 1 changes from "Dineshh" to "Dinesh" and "123, street 3" to "456, street 3". After applying Type 2 SCD, the "Customer" table would be updated as follows:
| Customer ID | Customer Name | Address | City | Start Date | End Date |
|---|---|---|---|---|---|
| 1 | Dineshh | 123, street 3 | Haridwar | 01-Jan-2024 | 15-Feb-2024 |
| 1 | Dinesh | 456, street 3 | Haridwar | 16-Feb-2024 | (null) |
| 2 | Pradeep | Phase 2, street 5 | Delhi | 01-Jan-2024 | (null) |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore | 01-Jan-2024 | (null) |
Type 3 SCD: Tracks changes by adding new columns to the dimension table to capture historical information.
| Customer ID | Customer Name | Current Address | Current City | Previous Address | Previous City |
|---|---|---|---|---|---|
| 1 | Dinesh | 123, street 3 | Haridwar | (null) | (null) |
| 2 | Pradeep | Phase 2, street 5 | Delhi | (null) | (null) |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore | (null) | (null) |
Suppose there’s a change in the Address and City for Customer ID = 1 in the future:
Updated Dataset after Change:
| Customer ID | Customer Name | Current Address | Current City | Previous Address | Previous City |
|---|---|---|---|---|---|
| 1 | Dinesh | 456, Rock Street 9 | NY | 123, street 3 | Haridwar |
| 2 | Pradeep | Phase 2, street 5 | Delhi | (null) | (null) |
| 3 | Sandeep Raturi | 789 street 7 | Bangalore | (null) | (null) |
2. Conformed Dimension:
In a data warehouse (DWH), a conformed dimension is a special type of dimension table that gets reused across multiple fact tables. These dimensions help ensure consistency and improve data analysis.
This ensures that dimension attributes, such as dates, are the same everywhere, making it easier to analyze and report data accurately across different departments.
Example rows in the Conformed Time dimension table:
| DateKey | Date | DayOfWeek | Month | Quarter | Year |
|---|---|---|---|---|---|
| 20240101 | 2024-01-01 | Saturday | January | Q1 | 2024 |
| 20240102 | 2024-01-02 | Sunday | January | Q1 | 2024 |
| 20240103 | 2024-01-03 | Monday | January | Q1 | 2024 |
| … | … | … | … | … | … |
Sales Fact Table:
| DateKey | ProductKey | CustomerKey | SalesAmount | QuantitySold |
|---|---|---|---|---|
| 20240101 | 101 | 201 | 500.00 | 3 |
| 20240102 | 102 | 202 | 750.00 | 2 |
| 20240103 | 103 | 203 | 300.00 | 1 |
| … | … | … | … | … |
Inventory Fact Table:
| DateKey | ProductKey | QuantityInStock | UnitsSold | UnitsReceived |
|---|---|---|---|---|
| 20240101 | 101 | 100 | 5 | 20 |
| 20240102 | 102 | 150 | 3 | 10 |
| 20240103 | 103 | 80 | 2 | 15 |
| … | … | … | … | … |
Customer Transactions Fact table:
| DateKey | CustomerKey | TransactionID | TotalAmount | PaymentMethod |
|---|---|---|---|---|
| 20240101 | 201 | 001 | 150.00 | Credit Card |
| 20240102 | 202 | 002 | 200.00 | Cash |
| 20240103 | 203 | 003 | 100.00 | Debit Card |
| … | … | … | … | … |
In this example:
- The Conformed Time dimension
(DateKey)is referenced in each fact table to provide a consistent and standardized view of time-related attributes (e.g., date, day of week, month) across sales, inventory, and customer transaction data. - Each fact table contains specific measures related to its domain (e.g., sales amount, inventory quantities, transaction amounts) and links to other relevant dimensions such as Product and Customer.
- By using a Conformed Time dimension, organizations can perform integrated analysis and reporting across different areas of their business, leveraging consistent time-related attributes and measures.
3. Degenerate Dimensions:
Degenerate Dimensions are attributes that are part of the fact table instead of being stored in a separate dimension table. They represent dimensions that are “degenerate” in the sense that they do not have their own dimension table.
Example: An invoice number in a Sales fact table is a degenerate dimension because it doesn’t have its own dimension table.
Sales fact table:
| TransactionID | DateKey | ProductKey | CustomerKey | InvoiceNumber | SalesAmount |
|---|---|---|---|---|---|
| 101 | 20240101 | 101 | 201 | INV-1001 | 500.00 |
| 102 | 20240102 | 102 | 202 | INV-1002 | 750.00 |
| 103 | 20240103 | 103 | 203 | INV-1003 | 300.00 |
In this example:
InvoiceNumber is a degenerate dimension because it represents a unique identifier (invoice number) associated with each sales transaction, but it does not have its own dimension table.
4. Junk Dimension:
A junk dimension in data warehousing refers to a single dimension table that combines multiple low-cardinality flags or attributes that do not fit well into existing dimensions.
This approach helps to simplify the data model and reduce the number of dimension tables by grouping related flags or attributes together.
Let’s simplify our data model by organizing promotion-related information into a separate table called a junk dimension. This helps streamline our fact table and makes our data warehouse more efficient for analysis and reporting.
Original Fact Table: “SalesFact”
Consider the following original fact table containing various columns, including promotion-related attributes:
| TransactionID | DateKey | ProductKey | CustomerKey | SalesAmount | IsHoliday | IsSeasonal | IsDiscount | IsFreeShipping |
|---|---|---|---|---|---|---|---|---|
| 1 | 20240101 | 101 | 201 | 1500.00 | Yes | No | Yes | No |
| 2 | 20240102 | 102 | 202 | 800.00 | No | Yes | Yes | Yes |
| 3 | 20240103 | 103 | 203 | 2500.00 | No | No | No | No |
| 4 | 20240104 | 104 | 204 | 1200.00 | Yes | Yes | No | Yes |
| 5 | 20240105 | 105 | 205 | 1800.00 | Yes | No | Yes | Yes |
Step-1:Â Steps to Move Promotion Columns to a Junk Dimension: Identify Promotion Attributes: IsHoliday, IsSeasonal, IsDiscount, IsFreeShipping
Step-2: Create a Junk Dimension Table: This table will contain promotion-related attributes.
Junk Dimension Table: “PromotionJunkDim”
| PromotionJunkDimKey | IsHoliday | IsSeasonal | IsDiscount | IsFreeShipping |
|---|---|---|---|---|
| 1 | Yes | No | Yes | No |
| 2 | No | Yes | Yes | Yes |
Step-3:Â Modify the SalesFact table to reference the PromotionJunkDimKey from the PromotionJunkDim table.
Refactored Fact Table: “SalesFact”
| TransactionID | DateKey | ProductKey | CustomerKey | SalesAmount | PromotionJunkDimKey |
|---|---|---|---|---|---|
| 1 | 20230101 | 101 | 201 | 1500.00 | 1 |
| 2 | 20230102 | 102 | 202 | 800.00 | 2 |
| 3 | 20230103 | 103 | 203 | 2500.00 | 2 |
| 4 | 20230104 | 104 | 204 | 1200.00 | 1 |
| 5 | 20230105 | 105 | 205 | 1800.00 | 2 |
5. Role-playing Dimension:
A role-playing dimension in data warehousing refers to the practice of using a single dimension table multiple times within a fact table, each time representing a different aspect or role of the dimension.
Fact Table: “SalesFact”
| TransactionID | OrderDateKey | ShipDateKey | DeliveryDateKey | ProductKey | CustomerKey | Quantity | Amount |
|---|---|---|---|---|---|---|---|
| 1 | 20240101 | 20240102 | 20240103 | 101 | 201 | 2 | 1500.00 |
| 2 | 20240102 | 20240103 | 20240104 | 102 | 202 | 1 | 800.00 |
Now, let’s define a Date dimension that is role-played three times in the fact table to represent different dates.
Date Dimension: “DateDim”
| DateKey | Date | DayOfWeek | Month | Quarter | Year |
|---|---|---|---|---|---|
| 20240101 | 2024-01-01 | Saturday | January | Q1 | 2024 |
| 20240102 | 2024-01-02 | Sunday | January | Q1 | 2024 |
The DateDim table serves as a role-playing dimension, providing date-related attributes (Date, DayOfWeek, Month, Quarter, Year) that are used multiple times within the fact table to represent different date contexts (Order Date, Ship Date, Delivery Date).
6. Static Dimensions:
Static Dimensions are dimensions that do not change over time and remain constant.
Example: A dimension table containing country names and codes, which rarely change over time.
| CountryId | CountryName | CountryCode |
|---|---|---|
| 1 | United States | US |
| 2 | India | IN |
| 3 | United Kingdom | UK |
7. Shrunken Dimensions:
Shrunken Dimensions involve combining multiple related dimensions into a single dimension table to simplify the data model and improve performance.
Example: Combining product attributes and supplier attributes into a single dimension table for certain reporting needs.
In this example, we will create a simplified dataset for a shrunken dimension combining product and supplier attributes.
Shrunken Dimension Table: ProductSupplier
| productSupplierId | productId | productName | supplierId | supplierName | supplierCountry |
|---|---|---|---|---|---|
| 1 | 101 | Laptop | 501 | Supplier A | India |
| 2 | 102 | Smartphone | 502 | Supplier B | Malaysia |
| 3 | 103 | Tablet | 503 | Supplier C | Japan |
| 4 | 104 | Printer | 501 | Supplier A | USA |
Thank you for reading this post, and I hope you enjoyed it!
![]()
