Data Design
Data Architecture is intended to provide a mechanism for stakeholders at various levels of Government
to identify, discover, describe, manage, protect, and share the data they hold, and to reuse information consistently within
and across ministries, divisions and directorates, or across the entire Government of Bangladesh.
Data Architecture provides standards for accessing data for online analytical processing (OLAP), including executive
information systems (EIS) and decision support systems (DSS).
To realize the business strategy as defined, and in the long term the Vision for Digital Bangladesh by 2021,
one of the key drivers for ICT is the Data domain, which is the most critical and complex. The data architecture needs to be
defined in a manner that addresses all the challenges of the Government and remains flexible enough to adapt to a rapidly
changing business environment.
Enable architecture review: Any new system development would require an architecture review; data
architecture principles would provide the necessary review parameters as far as database design is
concerned.
Provide a guidance mechanism: Guide the database design team or data architect on the criteria that
define the best database design.
Discover gaps in data security and plan for a secured and adaptive data architecture: Reviewing compliance
with data architecture principles exposes loopholes in data security, data protection and overall data design.
Description: Data access is to be based on business rules only. All data access is to follow the defined
and approved CRUD (create, read, update, delete) permissions for all roles accessing the system.
Implementation Steps:
1. For new system implementations, define the CRUD and have it approved by the data architect
2. Include a CRUD review in the architecture review checklist
3. Prohibit data access through ad-hoc queries by means of DBA-defined rules
4. Review and enhance the CRUD of existing systems
Benefit:
Security breaches frequently occur when data is accessed using ad-hoc queries; use of CRUD would
help ensure security
Data governance becomes organized, which eases data management
Data rights become streamlined
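As an illustration of the approved CRUD described above, the permissions could be held as a lookup from role and data entity to the permitted operations, with everything else denied by default. The following Python sketch is a minimal example under that assumption; the role names, the entity name and the `is_allowed` helper are hypothetical and are not part of any existing system.

```python
from enum import Enum


class Op(Enum):
    """The four CRUD operations."""
    CREATE = "C"
    READ = "R"
    UPDATE = "U"
    DELETE = "D"


# Hypothetical approved CRUD matrix: (role, entity) -> permitted operations.
# In practice this would be defined per system and approved by the data architect.
APPROVED_CRUD = {
    ("data_entry_operator", "citizen_record"): frozenset({Op.CREATE, Op.READ}),
    ("supervisor", "citizen_record"): frozenset({Op.READ, Op.UPDATE}),
    ("auditor", "citizen_record"): frozenset({Op.READ}),
}


def is_allowed(role: str, entity: str, op: Op) -> bool:
    """Return True only if the approved matrix grants the operation.

    Any (role, entity) pair missing from the matrix gets an empty set,
    so access outside the approved CRUD is denied by default.
    """
    return op in APPROVED_CRUD.get((role, entity), frozenset())
```

A data access layer could call `is_allowed` before executing any statement, so that a request from a role or for an entity that is not in the approved matrix is rejected rather than passed to the database.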
Name: DP2: Data is an asset, shared and governed
Description:
Data is an asset: Data to be cleaned, synchronized and preserved using central data
management tools. Data to be archived as per data archival policy.
Data is shared: Data to follow data sharing rules as per data classification
Data is governed: Data stewards to maintain data throughout its life cycle
Description: Data type, length and uniqueness for key and common data entities are aligned to the published
National Meta Data Standards
Scope: Core and Common data entity
Implementation Steps:
1. Draft and publish Meta Data Standards
2. Architecture review of Meta Data for new system implementation
Benefit:
Sharing of data becomes easy as there would not be any compatibility issues
Eases system development and API development effort
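The alignment of data type, length and uniqueness described above lends itself to an automated check during architecture review. The Python sketch below is one possible form of such a check for text fields; the `FieldStandard` structure, the field names and the length limits are illustrative assumptions and do not reproduce the published standard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldStandard:
    """One entry of a metadata standard (hypothetical structure)."""
    data_type: type
    max_length: int
    unique: bool


# Illustrative entries only; real values would come from the published standard.
STANDARD = {
    "national_id": FieldStandard(str, 20, True),
    "district_name": FieldStandard(str, 50, False),
}


def check_records(records):
    """Return a list of human-readable violations found in the records."""
    violations = []
    # Track values already seen for every field that must be unique.
    seen = {name: set() for name, std in STANDARD.items() if std.unique}
    for index, record in enumerate(records):
        for name, std in STANDARD.items():
            value = record.get(name)
            if not isinstance(value, std.data_type):
                violations.append(f"row {index}: {name} has wrong type")
                continue
            if len(value) > std.max_length:
                violations.append(
                    f"row {index}: {name} exceeds {std.max_length} characters"
                )
            if std.unique:
                if value in seen[name]:
                    violations.append(f"row {index}: {name} is duplicated")
                seen[name].add(value)
    return violations
```

An empty result means the sample records comply with the entries listed in `STANDARD`; each violation names the row and the field so that it can be traced back to the source system.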
Description: Core data entities must have relationships as per data standards, and an established
mechanism for incorporation into the National Master Data Management System
Scope: Core and Common data entity
Implementation Steps:
1. Implement MDM platform
2. Establish mechanism for data extraction and load to master data management system
3. Review data cardinality
Benefit:
Eases data integration for effective consumption in reports and analytical tools
Checks data cardinality as per national standard
Description: Core data entities must have an established identifier by which they are accessed, stored and preserved.
Benefit: Eases data integration for effective consumption in reports and analytical tools
Description: Data to be made available to citizens, businesses or other entities who require the information
as part of their role.
For secured data, proper encryption and security measures are to be applied for data protection
Content Repository
The content repository would comprise easy-to-retrieve, indexed documents, media files, web graphics and templates.
Data Models
A data model ensures that data is defined accurately so it is used in the manner intended by both end users and
remote applications.
There are three types of data models:
Conceptual Model
A conceptual data model identifies the highest-level relationships between the different entities. Features
of conceptual data model include the important entities and the relationships among them.
Logical Model
A logical data model describes the data in as much detail as possible, without regard to how it will be
physically implemented in the database.
Physical Model
A physical data model shows all table structures, including column names, column data types, column
constraints, primary key, foreign key, and relationships between tables.
Data modeling tools can evaluate an existing database structure and reverse engineer a data model. The reverse
engineered data model can be used to capture valuable information about the existing database.
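As a small illustration of reverse engineering, the Python sketch below reads table, column and foreign key definitions back out of an existing database. It assumes a SQLite database and uses the standard `sqlite3` module with SQLite's `table_info` and `foreign_key_list` pragmas; the `reverse_engineer` function and the two sample tables are hypothetical, and a dedicated data modeling tool would recover far more detail.

```python
import sqlite3


def reverse_engineer(conn):
    """Return {table: {"columns": [...], "foreign_keys": [...]}} for a SQLite database."""
    model = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        # table_info rows: cid, name, type, notnull, dflt_value, pk
        columns = [
            {
                "name": row[1],
                "type": row[2],
                "not_null": bool(row[3]),
                "primary_key": bool(row[5]),
            }
            for row in conn.execute(f'PRAGMA table_info("{table}")')
        ]
        # foreign_key_list rows: id, seq, table, from, to, on_update, on_delete, match
        foreign_keys = [
            {"column": row[3], "references": f"{row[2]}.{row[4]}"}
            for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')
        ]
        model[table] = {"columns": columns, "foreign_keys": foreign_keys}
    return model


# Hypothetical existing database used only to demonstrate the function.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE district (
    district_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE office (
    office_id INTEGER PRIMARY KEY,
    district_id INTEGER NOT NULL REFERENCES district(district_id)
);
""")
model = reverse_engineer(conn)
```

The resulting dictionary corresponds to the physical model: it lists each table's columns, data types and keys, and the foreign keys give the relationships between tables.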
Data Governance
Data governance encompasses the strategies and technologies used to make sure the Government of Bangladesh's data
stays in compliance with regulations and policies. It is proposed to be a collection of processes, roles, policies,
standards, and metrics that ensure the effective and efficient use of information in enabling the Government of
Bangladesh as a whole to achieve its goals.
Master Data refers to commonly required data that is agreed upon and shared across the Government. It
may be reference data, such as a list of values to be used for a data element, for example sectors in Government. Gartner
defines it as follows: “Master data is the consistent and uniform set of identifiers and extended attributes that
describes the core entities of the enterprise including customers, prospects, citizens, suppliers, sites, hierarchies and
chart of accounts.”
It is proposed that Government of Bangladesh follow a Virtual Master Data Management Architecture with the
following capabilities:
Master Data Virtualization Service: The service would enable the same data to be acquired from multiple data sources
while maintaining a single master, following a federated architecture. As an example, each ministry follows its own codes
and identifiers for similar entities such as districts; the virtualization service would map all those
different codes to the same district. Code 4 in the Finance Ministry might represent Khulna, while Khulna
is represented as Code 6 in the Social Welfare ministry. The virtualization service would map Khulna
across all the ministries.
MDM Repository: The repository would store and preserve the master data to provide a single-source-of-truth view.
Data Synchronization: Data synchronization and clean-up would enable cleaning of data entities captured from, say, free-form
text entry; this would also support metadata standard compliance.
Data Conflict Resolution: The tool or capability would display data conflicts among the various sources of the same master
data to help resolve them and preserve the right data.
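The mapping and conflict display capabilities above can be sketched together. The Python example below keeps a cross-reference from each ministry's local code to a single master district and then reports attributes on which the sources disagree. Codes 4 and 6 follow the Khulna example in the text; the `CODE_MAP` structure, the function names and the attribute values are hypothetical and serve only as an illustration.

```python
# Hypothetical cross-reference: (ministry, local code) -> master district.
# Codes 4 and 6 follow the Khulna example; the structure itself is an assumption.
CODE_MAP = {
    ("finance", 4): "Khulna",
    ("social_welfare", 6): "Khulna",
}


def to_master(ministry, local_code):
    """Translate a ministry-specific code to the single master entity."""
    return CODE_MAP[(ministry, local_code)]


def find_conflicts(source_records):
    """Report attributes whose values differ between sources of the same master.

    source_records: iterable of (ministry, local_code, attributes dict).
    Returns {master: {attribute: {ministry: value}}} for conflicting attributes only.
    """
    collected = {}
    for ministry, code, attrs in source_records:
        master = to_master(ministry, code)
        for key, value in attrs.items():
            collected.setdefault(master, {}).setdefault(key, {})[ministry] = value

    conflicts = {}
    for master, attrs in collected.items():
        for key, by_source in attrs.items():
            # More than one distinct value means the sources disagree.
            if len(set(by_source.values())) > 1:
                conflicts.setdefault(master, {})[key] = by_source
    return conflicts
```

The function only displays the disagreement, listing each ministry's value side by side; deciding which value is preserved in the master record remains a data stewardship decision.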
Data Warehouse
A data warehouse is a collection of data designed to support decision-making and analytical processing. Data
warehouses contain a wide variety of data, usually from multiple data sources, presenting a comprehensive view of
a particular business environment. Due to the nature of the data stored in a data warehouse, the size of the data
warehouse is usually very large, so it requires special design and planning.
A data mart is a subset of a data warehouse. Where data warehouses are designed to support many requirements
for multiple business needs, data marts are designed to support specific requirements for specific decision support
applications (i.e., particular business needs). Although a data mart is a subset of a data warehouse, it is not
necessarily smaller than a data warehouse. Specific decision support needs may still require large amounts of data.
Data marts are typically considered a solution for distributed users who want exclusive control of the information
required for their business need.
Data warehouse efforts should begin with a specific requirement for a specific decision support application, similar
to the practices of a data mart design. For scalability, the tools and databases used should be designed to support a
very large data warehouse, instead of using data mart specific products.
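The relationship between a data warehouse and a data mart can be illustrated with a small sketch. The Python example below uses the standard `sqlite3` module and defines a data mart as a view that exposes only the slice of a warehouse table needed by one decision support application. The table, view and column names, and the sample rows, are hypothetical; a production warehouse would use a platform sized for very large data volumes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical warehouse table holding data from multiple sources.
CREATE TABLE warehouse_fact (
    ministry TEXT,
    subject  TEXT,
    year     INTEGER,
    amount   REAL
);
INSERT INTO warehouse_fact VALUES
    ('finance', 'budget',   2020, 10.0),
    ('finance', 'budget',   2021, 12.5),
    ('health',  'staffing', 2021, 300.0);

-- The data mart exposes only what one decision support application needs.
CREATE VIEW budget_mart AS
    SELECT year, SUM(amount) AS total_amount
    FROM warehouse_fact
    WHERE subject = 'budget'
    GROUP BY year;
""")

# The application queries the mart, not the full warehouse.
rows = conn.execute(
    "SELECT year, total_amount FROM budget_mart ORDER BY year"
).fetchall()
```

Because the mart is defined on top of the warehouse rather than in a separate product, further marts can be added for other requirements without duplicating the underlying data.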
Future State Data Architecture Model