Module 8 - Reference and Master Data - DMBOK2
Bob Bobson | Mars bar | Morrisons, Bath | 16:00 Monday 3rd January 2011 | Cash | 1 | £0.60
Terminology
Field (or attribute) = column in a database table
Record = row in a database table
About Event Data
› AKA Transaction data
› Describes an action (a verb):
    » E.g. “buy”
› May include measurements about the action:
    » Quantity bought
    » Amount paid
› Includes information identifying the nouns that were involved in the event (the Who / What / Where / When / How and maybe even the Why):
    » Bob Bobson
    » Mars bar
    » Morrisons, Bath
    » 16:00 Monday 3rd Jan 2011
    » Cash
› Does not include information describing the nouns:
    » Bob Bobson is male, aged 25 and works for British Airways
    » Monday 3rd Jan 2011 is a bank holiday
    » The address of Morrisons Bath is: York Place, London Road, Bath, BA1 6AE.
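The split between identifying and describing the nouns can be sketched in Python. This is a minimal illustration, not a prescribed model: the class name PurchaseEvent, its field names, and the CUSTOMER_DETAILS lookup are assumptions made for the example; only the values come from the slide.

```python
from dataclasses import dataclass
from datetime import datetime
from decimal import Decimal


@dataclass(frozen=True)
class PurchaseEvent:
    """One event record: it identifies the nouns and measures the action."""
    customer: str          # Who
    product: str           # What
    store: str             # Where
    occurred_at: datetime  # When
    payment_method: str    # How
    quantity: int          # measurement: quantity bought
    amount_paid: Decimal   # measurement: amount paid, in GBP


# Descriptions of the nouns live outside the event (hypothetical lookup table).
CUSTOMER_DETAILS = {
    "Bob Bobson": {"gender": "male", "age": 25, "employer": "British Airways"},
}

event = PurchaseEvent(
    customer="Bob Bobson",
    product="Mars bar",
    store="Morrisons, Bath",
    occurred_at=datetime(2011, 1, 3, 16, 0),
    payment_method="Cash",
    quantity=1,
    amount_paid=Decimal("0.60"),
)

# The event only names the customer; the description is looked up when needed.
details = CUSTOMER_DETAILS[event.customer]
```

Keeping the description out of the event means that a change to it (for example a new employer) does not require rewriting past events.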
What is Reference Data…
REFERENCE DATA?
» Data that defines the set of permissible values to be used by other data fields.
» Reference data is often defined by standards organizations (such as country codes as defined in ISO 3166-1).
» Example: country code, province code, etc.
REFERENCE DATA MANAGEMENT?
» Control over defined domain values (also known as vocabularies), including control over standardized terms, code values and other unique identifiers, business definitions for each value, business relationships within and across domain value lists, and the consistent, shared use of accurate, timely and relevant reference data values to classify and categorize data. [DAMA, the Data Management Association]
What is Reference Data…
» Reference data is data used to classify or categorize other data
» Reference data values should conform to a set of allowable data values called a value domain
» Value domain:
Internal standard:
Order Status: New, In Progress, Closed, Cancelled, and so on
External standard (government or industry standard)
Two-letter United States Postal Service standard postal code abbreviations for U.S. states,
such as CA for California
» More than one set of reference data value domains may refer to the same conceptual domain. For example, the state of California has:
An official name ("California").
A legal name ("State of California").
A standard postal code abbreviation ("CA").
An International Organization for Standardization (ISO) standard code ("US-CA").
A United States Federal Information Processing Standards (FIPS) code ("06").
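As a rough sketch of how value domains are used, the Python below checks a value against a domain and translates between two value domains of the same conceptual domain. The names ORDER_STATUS_DOMAIN, US_STATES, is_valid and translate are hypothetical, and only California is listed because it is the only state given above.

```python
# Internal standard: the value domain for order status (values from the text).
ORDER_STATUS_DOMAIN = frozenset({"New", "In Progress", "Closed", "Cancelled"})

# One conceptual domain (U.S. states) expressed in several value domains.
US_STATES = [
    {
        "official_name": "California",
        "legal_name": "State of California",
        "usps": "CA",
        "iso_3166_2": "US-CA",
        "fips": "06",
    },
]


def is_valid(value, domain):
    """Return True when the value belongs to the given value domain."""
    return value in domain


def translate(value, source, target):
    """Map a value from one value domain to another for the same state."""
    for state in US_STATES:
        if state[source] == value:
            return state[target]
    raise ValueError(f"{value!r} is not in the {source!r} value domain")
```

For example, translate("CA", "usps", "fips") looks up the FIPS code for the postal abbreviation, and is_valid("Shipped", ORDER_STATUS_DOMAIN) rejects a status that is not in the list.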
What is Reference Data…
[Diagram labels: Data owners; Data stewards; Data quality user interface; Systems; System of record; Meta-data catalogue; Active push (web services); Batch schedule (ETL job); EII/DV cache refresh; Pull on demand (web services, SQL queries)]
Three standard “Hub” architectures
1. Repository
2. Registry
3. Hybrid
* A key difference is the number of fields that are stored centrally.
Example: Customer
Customer code | First name | Last name | Date of birth | Preferred delivery address line 1 | Preferred delivery address post code | Credit rating | Occupation | Car
BB005 | Bob | Bobson | 1985-12-25 | Royal Crescent | BA1 7LA | A | Information Architect | Audi R8
Fields stored centrally:
REPOSITORY: all fields
HYBRID: identifiers and core fields
REGISTRY: identifiers
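The difference between the three hub styles can be sketched as a filter over the example customer record. This is an illustration under assumptions: the slide does not say which columns are identifiers or core fields, so IDENTIFIER_FIELDS and CORE_FIELDS below are hypothetical choices, and the registry is read in the usual way as holding identifiers only.

```python
CUSTOMER = {
    "customer_code": "BB005",
    "first_name": "Bob",
    "last_name": "Bobson",
    "date_of_birth": "1985-12-25",
    "delivery_address_line_1": "Royal Crescent",
    "delivery_post_code": "BA1 7LA",
    "credit_rating": "A",
    "occupation": "Information Architect",
    "car": "Audi R8",
}

# Assumption: this split is chosen for illustration only.
IDENTIFIER_FIELDS = {"customer_code"}
CORE_FIELDS = {"first_name", "last_name", "date_of_birth"}


def centrally_stored(record, style):
    """Return the part of a record that the hub would hold for a given style."""
    if style == "repository":
        keep = set(record)                      # all fields
    elif style == "hybrid":
        keep = IDENTIFIER_FIELDS | CORE_FIELDS  # identifiers plus core fields
    elif style == "registry":
        keep = set(IDENTIFIER_FIELDS)           # identifiers only
    else:
        raise ValueError(f"unknown hub style: {style!r}")
    return {field: value for field, value in record.items() if field in keep}
```

Calling centrally_stored(CUSTOMER, "registry") leaves only the customer code in the hub, while "repository" returns the whole record.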
Master Data Environment: Repository Architecture
[Diagram labels: Data owners; Data stewards; Data quality user interface; Systems; Repository]
› Repository serves as the single source of the master data.
› Repository contains the only version of the master data.
› All applications use the data in the repository via services.
› No latency or synchronisation.
[Diagram labels: Data owners; Data stewards; Data quality user interface; Systems; Meta-data catalogue; Repository / system of record; Distribution; Active push (web services); Batch schedule (ETL job); EII/DV; Pull on demand (web services, SQL queries)]
Hybrid Architecture
› … information.
› Application-specific data is retained only in the application database.
› Applications still manage the full set of data.
› Core information is published back to source systems.
› 2-way synchronisation – normally some latency.
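The two-way flow in a hybrid hub can be sketched as follows. This is a toy model under stated assumptions, not a description of any MDM product: the classes SourceSystem and HybridHub, their methods, and the choice of CORE_FIELDS are all hypothetical.

```python
CORE_FIELDS = {"first_name", "last_name"}  # assumption: fields held in the hub


class SourceSystem:
    """An application that still manages its full set of data."""

    def __init__(self, name, record):
        self.name = name
        self.record = dict(record)


class HybridHub:
    """Holds core fields only and publishes them back to source systems."""

    def __init__(self):
        self.core = {}
        self.systems = []

    def register(self, system):
        self.systems.append(system)

    def receive(self, system):
        # Inbound: copy only the core fields from a source system.
        for field, value in system.record.items():
            if field in CORE_FIELDS:
                self.core[field] = value

    def publish(self):
        # Outbound: push the core fields back to every registered system.
        for system in self.systems:
            system.record.update(self.core)


crm = SourceSystem("crm", {"first_name": "Bob", "last_name": "Bobson", "car": "Audi R8"})
billing = SourceSystem("billing", {"first_name": "Bob", "last_name": "Bobsen", "credit_rating": "A"})

hub = HybridHub()
hub.register(crm)
hub.register(billing)

hub.receive(crm)                               # the hub takes the core fields from CRM
stale_last_name = billing.record["last_name"]  # billing is out of date until publish()
hub.publish()                                  # core information goes back to the sources
```

Between receive() and publish() the billing system still holds its old last name, which is the latency the slide mentions. Application-specific fields such as car and credit_rating never reach the hub.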
Select your MDM Architecture and Toolset carefully
[Diagram labels: Extract-transform-load; Synchronisation layer; Messaging; DV (federated); Master data; systems: SAP, SAP 2, SAP 3, Non SAP, Oracle, DW, CRM, …]