Enterprise Knowledge Management – 1st Edition
Table of Contents
Preface
Chapter 1 – Introduction
Data Quality Horror Stories
Knowledge Management and Data Quality
Reasons for Caring about Data Quality
Knowledge Management and Business Rules
Structure of this Book
Chapter 2 – Who Owns Information?
The Information Factory
Complicating Notions
Responsibilities of Ownership
Ownership Paradigms
Centralization, Decentralization, and Data Ownership Policies
Ownership and Data Quality
Summary
Chapter 3 – Data Quality in Practice
Data Quality Defined: Fitness for Use
The Quality Improvement Program
Data Quality and Operations
Data Quality and Databases
Data Quality and the Data Warehouse
Data Mining
Data Quality and Electronic Data Interchange
Data Quality and the World Wide Web
Summary
Chapter 4 – Economic Framework of Data Quality and the Value Proposition
Evidence of Economic Impact
Data Flows and Information Chains
Examples of Information Chains
Impacts
Economic Measures
Impact Domains
Operational Impacts
Tactical and Strategic Impacts
Putting It All Together – the Data Quality Scorecard
Adjusting the Model for Solution Costs
Example
Summary
Chapter 5 – Dimensions of Data Quality
Sample Data Application
Data Quality of Data Models
Data Quality of Data Values
Data Quality of Data Domains
Data Quality of Data Presentation
Data Quality of Information Policy
Summary: Importance of the Dimensions of Data Quality
Chapter 6 – Statistical Process Control and the Improvement Cycle
Variation and Control
Control Chart
The Pareto Principle
Building a Control Chart
Kinds of Control Charts
Example: Invalid Records
The Goal of Statistical Process Control
Interpreting a Control Chart
Finding Special Causes
Maintaining Control
Summary
Chapter 7 – Domains, Mappings, and Enterprise Reference Data
Data Types
Operations
Domains
Mappings
Example: Social Security Numbers
Domains, Mappings, and Metadata
The Publish/Subscribe Model of Reference Data Provision
Summary
Chapter 8 – Data Quality Assertions and Business Rules
Data Quality Assertions as Business Rules
The 9 Classes of Data Quality Rules
Null Value Rules
Value Manipulation Operators and Functions
Value Rules
Domain Membership Rules
Domain Mappings and Relations on Finite Defined Domains
Relation Rules
Table, Cross-Table, and Cross-Message Assertions
In-Process Rules
Operational Rules
Other Rules
Rule Management, Compilation, and Validation
Rule Ordering
Summary
Chapter 9 – Measurement and Current State Assessment
Identify Each Data Customer
Mapping the Information Chain
Choose Locations in the Information Chain
Choose a Subset of the DQ Dimensions
Identify Sentinel Rules
Measuring Data Quality
Measuring Data Quality of Data Models
Measuring Data Quality of Data Values
Measuring Data Quality of Data Domains
Measuring Data Quality of Data Presentation
Measuring Data Quality of Information Policy
Static vs. Dynamic Measurement
Compiling Results
Summary
Chapter 10 – Data Quality Requirements
The Assessment Process, Reviewed
Reviewing the Assessment
Determining Expectations
Use Case Analysis
Assignments of Responsibility
Creating Requirements
The Data Quality Requirements
Summary
Chapter 11 – Metadata, Guidelines, and Policy
Generic Elements
Data Types and Domains
Schema Metadata
Use and Summarization
Historical
Managing Data Domains
Managing Domain Mappings
Managing Rules
Metadata Browsing
Metadata as a Driver of Policy
Summary
Chapter 12 – Rule-Based Data Quality
Rule Basics
What is a Business Rule?
Data Quality Rules are Business Rules (and Vice-Versa)
Advantages of the Rule-Based Approach
Integrating a Rule-Based System
Rule Execution
Deduction vs. Goal-Orientation
Evaluation of a Rules System
Limitations of the Rule-based Approach
Rule Based Data Quality
Summary
Chapter 13 – Metadata and Rule Discovery
Domain Discovery
Mapping Discovery
Clustering for Rule Discovery
Key Discovery
Decision and Classification Trees
Association Rules and Data Quality Rules
Summary
Chapter 14 – Data Cleansing
Standardization
Common Error Paradigms
Record Parsing
Metadata Cleansing
Data Correction and Enhancement
Approximate Matching and Similarity
Consolidation
Updating Missing Fields
Address Standardization
Summary
Chapter 15 – Root Cause Analysis and Supplier Management
What is Root Cause Analysis?
Debugging the Process
Debugging the Problem
Corrective Measures – Resolve or Not?
Supplier Management
Summary
Chapter 16 – Data Enrichment/Enhancement
What is Data Enrichment?
Examples of Data Enhancement
Enhancement through Standardization
Enhancement through Provenance
Enhancement through Context
Enhancement through Data Mining
Data Matching, Merging, and Record Linkage
Large Scale Data Aggregation and Linkage
Improving Linkage with Approximate Matching
Enhancement through Inference
Data Quality Rules for Enhancement
Business Rules for Enhancement
Summary
Chapter 17 – Data Quality and Business Rules in Practice
Turning Rules into Implementation
Operational Directives
Data Quality and the Transaction Factory
Data Quality and the Data Warehouse
Rules and EDI
Data Quality Rules and Automated UIs
Summary
Chapter 18 – Building the Data Quality Practice
Recognize the Problem
Management Support and the Data Ownership Policy
Spread the Word
Mapping the Information Chain
Data Quality Scorecard
Current State Assessment
Requirements Assessment
Choose a Project
Build Your Team
Build Your Arsenal
Metadata Model
Define Data Quality Rules
Archaeology/Data Mining
Manage Your Suppliers
Execute the Improvement
Measure Improvement
Build on Each Success
Conclusion
Description
Today, companies capture and store tremendous amounts of information about every aspect of their business: their customers, partners, vendors, markets, and more. But with the rise in the quantity of information has come a corresponding decrease in its quality, a problem businesses recognize and are working feverishly to solve.
Enterprise Knowledge Management: The Data Quality Approach presents an easily adaptable methodology for defining, measuring, and improving data quality. Author David Loshin begins by presenting an economic framework for understanding the value of data quality, then proceeds to outline data quality rules and domain- and mapping-based approaches to consolidating enterprise knowledge. Written for both a managerial and a technical audience, this book will be indispensable to the growing number of companies committed to wresting every possible advantage from their vast stores of business information.
Key Features
- Expert advice from a highly successful data quality consultant
- The only book on data quality offering the business acumen to appeal to managers and the technical expertise to appeal to IT professionals
- Details the high costs of bad data and the options available to companies that want to transform mere data into true enterprise knowledge
- Presents conceptual and practical information complementing companies’ interest in data warehousing, data mining, and knowledge discovery
Readership
IT, Database, and Business Managers