cial for tracking business objects both in time (answering questions like “Is this client still active?”) and in space across the entire data pipeline, such as “Is this lead defined by marketing the same as the lead presented in this report?” Hubs play a critical role in consolidating diverse data sources, ensuring a unified and consistent granularity for each business concept.

The separation of business keys into Hubs emphasizes their central importance in identifying and managing business objects. A Hub entity is designed to store these keys alongside essential metadata (Section 2.1.2), including the source system (Record Source) and the timestamp when the business key was first introduced to the data platform (Load Date). This structure not only simplifies the integration of new data sources, where new business keys can easily be added to existing Hubs, but also provides the necessary granularity for downstream objects, such as dimension tables. This granularity is vital for efficient querying and minimizes computational overhead, making Hubs indispensable for maintaining the integrity and performance of the Enterprise Data Warehouse.
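To make the integration behavior concrete, the following is a minimal sketch of how business keys from a second source might be added to an existing Hub. The in-memory dictionary, the function name `load_hub`, and the sample keys and source names (`crm`, `webshop`) are all hypothetical and chosen only for illustration; a real platform would perform this step against a database table.

```python
from datetime import datetime, timezone


def load_hub(hub, business_keys, record_source, load_date):
    """Insert only the business keys the hub has not seen yet.

    Existing entries are left untouched, so Record Source and Load Date
    always describe the FIRST time a key reached the platform.
    """
    for key in business_keys:
        if key not in hub:
            hub[key] = {"record_source": record_source, "load_date": load_date}
    return hub


# Hypothetical example: two source systems feeding one customer hub.
customer_hub = {}
load_hub(customer_hub, ["C-1001", "C-1002"], "crm",
         datetime(2025, 1, 1, tzinfo=timezone.utc))
load_hub(customer_hub, ["C-1002", "C-1003"], "webshop",
         datetime(2025, 1, 2, tzinfo=timezone.utc))
```

In this sketch the key `C-1002` arrives from both systems but is stored once, attributed to the source that delivered it first, which is one way to obtain a single consistent grain per business concept.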
HUB STRUCTURE
— Hash Key: The primary key for the Hub, generated by hashing the business key, ensuring uniqueness and consistency across all data sources
— Load Date TS: A timestamp marking when the business key was loaded into the data platform, crucial for tracking and batch management
— Record Source: Identifies the originating system or application, providing traceability and ensuring data lineage
— Business Key: The original value from the source system representing a core business concept
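
The four attributes above can be sketched as a small record type. This is an illustration under stated assumptions, not the handbook's implementation: the names `HubRow`, `hash_business_key`, and `make_hub_row` are hypothetical, SHA-256 is an assumed hash function (the text does not name one), and the trim-and-uppercase normalization is an assumed convention for making keys from different sources hash identically.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class HubRow:
    hash_key: str           # primary key derived from the business key
    load_date_ts: datetime  # when the key was loaded into the platform
    record_source: str      # originating system or application
    business_key: str       # original value from the source system


def hash_business_key(business_key: str) -> str:
    # Assumed normalization: trim whitespace and uppercase before hashing,
    # so formatting differences between sources do not change the hash.
    normalized = business_key.strip().upper()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def make_hub_row(business_key: str, record_source: str,
                 load_date_ts: datetime) -> HubRow:
    return HubRow(
        hash_key=hash_business_key(business_key),
        load_date_ts=load_date_ts,
        record_source=record_source,
        business_key=business_key,
    )


# Hypothetical usage: the same customer delivered by two systems.
now = datetime(2025, 1, 1, tzinfo=timezone.utc)
row_crm = make_hub_row(" c-1001 ", "crm", now)
row_shop = make_hub_row("C-1001", "webshop", now)
```

Because both rows normalize to the same value before hashing, they share one Hash Key, while each row keeps the Business Key exactly as its source delivered it.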
THE DATA VAULT HANDBOOK © SCALEFREE INTERNATIONAL GMBH 2025