Conformed Dimensions of Data Quality: Open Standard

Because we see the general value of using the dimensions of data quality, as outlined in About Dimensions of Data Quality, we believe that those same reasons, along with the additional reasons below, should encourage us to adopt a conformed, cross-industry standard set of dimensions. Here are a few reasons why:
 

  • If the dimensions are created for the purpose of ‘communicating’ the characteristics of data, then why would we want dimensions with conflicting definitions or overlapping terminology? A single, generally agreed-upon standard is preferable.
  • Arguing about what should be in a set of enterprise-level, or even department-level, DQ dimensions wastes time and confuses people who are beginning to learn about DQ. With a standard set of dimensions, organizations can skip the first wave of arguments and begin using the terminology and concepts to measure data quality from day one.
  • As new concepts are defined and used, there are reasons to protect one’s intellectual property, but in the case of the dimensions of data quality that period ended 20 or more years ago.1 There is little value in using one author’s version over another; on the contrary, using the most easily understood and widely used version is desirable.
  • The reason that methodologies such as Six Sigma and Lean are so valuable is that they seek to maximize repeatability and standardize inputs, outputs, and processes. Any organization that is test-and-learn focused is also focused on the scientific method of controlling for change in an environment in order to measure change and thereby improvement. This is nearly impossible without standardization, such as that provided by the Conformed Dimensions of Data Quality.
  • If, over time, organizations use this set of Conformed Dimensions of Data Quality, then comparison between departments, companies, and perhaps even industries may become feasible.3

Why Now:

So why doesn’t an industry standard already exist? The answer is complicated, but in short, organizations have published recommended lists of dimensions,2 and those lists haven’t been successful because they reflect the work of only a few individuals rather than a reconciliation of the majority of the research on the dimensions of data quality, as these Conformed Dimensions do. The goal of this set is to normalize all authors’ valuable insights into one set that is both easy to understand and applicable to the day-to-day data quality work done by professionals like yourself.

So what is the solution? It is the Conformed Dimensions. The Conformed Dimensions of Data Quality are composed of the following parts:

Dimensions:

The highest level of description is used to broadly categorize observations of quality.

Access the list of Conformed Dimensions and full descriptions here

Underlying Concepts:

The second level is used to break out the distinct components of a dimension.

Access the Underlying Concepts and full descriptions here

Metrics:

The third level is a metric which quantifies a specific aspect of a concept.

Download a list of example metrics (one per Underlying Concept) with full documentation here
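
The three levels described above (Dimension, Underlying Concept, Metric) can be sketched as a small data model. This is only an illustrative sketch, not the standard's official representation: the class names, the `populated_ratio` helper, and the example labels ("Completeness", "Attribute population", "email populated ratio") are assumptions chosen for the example and should be replaced with entries from the published lists.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Sequence, Tuple

Row = Dict[str, object]  # one record, mapping column name -> value


@dataclass
class Metric:
    """Third level: quantifies one specific aspect of a concept."""
    name: str
    compute: Callable[[Sequence[Row]], float]


@dataclass
class UnderlyingConcept:
    """Second level: one distinct component of a dimension."""
    name: str
    metrics: List[Metric] = field(default_factory=list)


@dataclass
class Dimension:
    """Highest level: a broad category of quality observations."""
    name: str
    concepts: List[UnderlyingConcept] = field(default_factory=list)


def populated_ratio(column: str) -> Callable[[Sequence[Row]], float]:
    """Hypothetical metric: share of rows where `column` is not empty."""
    def compute(rows: Sequence[Row]) -> float:
        if not rows:
            return 0.0
        filled = sum(1 for r in rows if r.get(column) not in (None, ""))
        return filled / len(rows)
    return compute


# Illustrative labels only; not copied from the official lists.
completeness = Dimension(
    name="Completeness",
    concepts=[
        UnderlyingConcept(
            name="Attribute population",
            metrics=[Metric("email populated ratio", populated_ratio("email"))],
        )
    ],
)


def report(dimension: Dimension, rows: Sequence[Row]) -> Dict[Tuple[str, str, str], float]:
    """Evaluate every metric, keyed by (dimension, concept, metric)."""
    return {
        (dimension.name, concept.name, metric.name): metric.compute(rows)
        for concept in dimension.concepts
        for metric in concept.metrics
    }


rows = [
    {"email": "a@example.com"},
    {"email": ""},
    {"email": None},
    {"email": "b@example.com"},
]
result = report(completeness, rows)
```

Keying each result by its dimension, concept, and metric name keeps every measurement traceable to the shared vocabulary, which is what would make comparisons across departments or organizations possible.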

 

Citation
1. See the history of the Dimensions of Data Quality page for a comprehensive perspective of additions to this area of study over time.
2. The Data Administration Management Association (DAMA) publication titled the Data Management Body of Knowledge (DM-BOK) lists a set of dimensions. That set was primarily created by David Loshin (esteemed data management author and consultant) and, like other authors’ works, lacks the comparative, value-added aspects that you find in the Conformed Dimensions of Data Quality. Our hope is that at a future date the DM-BOK uses this standard in its entirety.
3. Lee, Pipino, Funk, Wang. Journey to Data Quality, 2006. P. 34 [Log. 47]