The following is the current version of the underlying concepts for each of the Conformed Dimensions of Data Quality (r4.3). The definition of each of the dimensions is available here.
Click on either a dimension (e.g. Completeness) or an underlying concept (e.g. Record Population) to see a list of blogs covering that topic. Alternatively, use the site search box (top right) to search for keywords you're interested in, including any of the dimensions or underlying concepts.
Conformed Dimension | Underlying Concepts | Definition of Underlying Concept |
This measures whether a row is present in a data set (table). | ||
This measures whether a value is present (not null) for an attribute (column). | ||
This measures whether the value contains all characters of the correct value. | ||
Existence identifies whether a real-life fact has been captured as data. | ||
Agree with Real-world | Degree that data factually represents its associated real-world object, event, or concept. | |
Measure of agreement between data and the source of that data. This is used when the data represent intangible objects or transactions that can't be observed visually. | ||
Equivalence of Redundant or Distributed Data | The measure of similarity with other sources of data that represent the same concept. | |
Format Consistency | This measures the conformity of format of the same data in different places. | |
Logical Consistency | Logical consistency measures whether two attributes of related data are conceptually in agreement, even though they may not record the same characteristic of a fact. | |
The measure of uniformity of the data compared to historical values. | ||
Values in Specified Range | Values must be between some lower number and some higher number. | |
Values Conform to Business Rule | Validity measures whether values adhere to some declarative formula. | |
Domain of Predefined Values | This is a set of permitted values. | |
Values Conform to Data Type | Validity measures whether values have a specific characteristic (e.g. Integer, Character, Boolean). Data types restrict what values can exist, the operations that can be used on them, and the way that the data is stored. | |
Values Conform to Format | Validity measures whether the data are arranged or composed in a predefined way. | |
The measure of time between when data is expected versus made available. | ||
Manual float is a measure of the time from when an observation is made to the point it is recorded in electronic format. | ||
Electronic float is a measure of the time from when data is captured in an electronic format until it is accessed by a person. | ||
Data is current if it reflects the present state of the concept it models. | ||
Ease of Obtaining Data | This measures how easy it is to obtain data. | |
Access Control | Access control includes the identification of a person that wants to access data, authentication of their identity, review and authorization to access required data, and lastly auditing the access of that data. | |
Retention | Retention refers to the period of time that data is kept before being removed from a database through purge or archive processing. | |
Referential integrity measures whether a value used as a foreign key references an existing key (primary key) in the parent table. | ||
Uniqueness | Uniqueness measures whether each fact is uniquely represented. | |
Cardinality | Cardinality describes the relationship between one data set and another, such as one-to-one, one-to-many, or many-to-many. | |
The measure of preciseness of numeric data using decimal places, rounding and truncation. | ||
Granularity is the level of detail or summarization of the data, measured by the number of attributes used to represent a single concept. | ||
Domain precision is the granularity at which a concept is represented as an attribute. | ||
Source documentation provides data provenance, which describes the origin of the data. | ||
Segment documentation describes how data is transformed and transported from one location to another. | ||
Documentation about the target explains where the data is moved to and how it is stored. | ||
End-to-end documentation provides a diagrammatic representation of how the data flows from beginning to end. | ||
Illustrations and charts should be self-explanatory and presented with appropriate labels, providing context. | ||
Data that is represented well is simple but elegantly formed with good grammar and presented in a standard way. | ||
The appropriate media (e.g. web-based, hardcopy, or audio) are provided. | ||
Comprehensive descriptions and other information about the characteristics of the data are provided in plain language. | ||
Well-represented data includes the scale of measurement, such as weight, height, or distance. |
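
Several of the underlying concepts above lend themselves to simple automated checks. The following is a minimal Python sketch, not part of the Conformed Dimensions standard, showing how a few of them might be measured: value presence for an attribute, values in a specified range, domain of predefined values, conformance to format, uniqueness, referential integrity, and the manual and electronic float. The sample tables, column names, function names, and the email pattern are all hypothetical and chosen only for illustration.

```python
import re
from datetime import datetime

# Hypothetical sample data; table and column names are illustrative only.
customers = [
    {"customer_id": 1, "email": "a@example.com", "age": 34, "status": "active"},
    {"customer_id": 2, "email": None, "age": 151, "status": "active"},
    {"customer_id": 2, "email": "c@example", "age": 28, "status": "unknown"},
]
orders = [
    {"order_id": 10, "customer_id": 1},
    {"order_id": 11, "customer_id": 3},
]


def attribute_population(rows, column):
    """Completeness: fraction of rows with a non-null value for the column."""
    if not rows:
        return 0.0
    return sum(r.get(column) is not None for r in rows) / len(rows)


def out_of_range(rows, column, low, high):
    """Validity: rows whose non-null value falls outside [low, high]."""
    return [r for r in rows
            if r.get(column) is not None and not (low <= r[column] <= high)]


def outside_domain(rows, column, allowed):
    """Validity: rows whose value is not in the set of permitted values."""
    return [r for r in rows if r.get(column) not in allowed]


def bad_format(rows, column, pattern):
    """Validity: rows whose non-null value does not fully match the pattern."""
    regex = re.compile(pattern)
    return [r for r in rows
            if r.get(column) is not None and not regex.fullmatch(r[column])]


def duplicate_keys(rows, key):
    """Uniqueness: key values that appear in more than one row."""
    seen, dupes = set(), set()
    for r in rows:
        if r[key] in seen:
            dupes.add(r[key])
        seen.add(r[key])
    return dupes


def orphaned_rows(child_rows, fk, parent_rows, pk):
    """Referential integrity: child rows whose foreign key has no parent key."""
    parent_keys = {r[pk] for r in parent_rows}
    return [r for r in child_rows if r[fk] not in parent_keys]


def float_hours(start, end):
    """Manual or electronic float: elapsed hours between two timestamps."""
    return (end - start).total_seconds() / 3600
```

As a usage example under the same assumptions, `out_of_range(customers, "age", 0, 120)` returns the row with an age of 151, and `orphaned_rows(orders, "customer_id", customers, "customer_id")` returns the order that points at a customer that does not exist. Each function reports the offending rows or values rather than a pass/fail flag, so the results can feed either a summary metric or a remediation list.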
This work is licensed under a Creative Commons Attribution 4.0 International License.