Data Governance · Quality Framework

Data Quality

Data quality is the degree to which data is fit for its intended use across nine measurable dimensions — Accuracy, Completeness, Consistency, Timeliness, Validity, Uniqueness, Integrity, Reliability, and Accessibility. This page covers each dimension in depth: what it means, how to measure it, and how to govern it in enterprise systems.

[Figure: Data Quality Dimensions Framework — a visual overview of all 9 quality properties: Accuracy, Completeness, Consistency, Timeliness, Validity, Uniqueness, Integrity, Reliability, and Accessibility]

The Cost of Poor Data Quality

Poor data quality costs the average organization an estimated $12.9 million per year, by Gartner's estimate.

  • Inaccurate data leads to wrong business decisions — wrong customer targeting, incorrect financial reporting, misguided strategy.
  • Incomplete records cause failed transactions, missed compliance requirements, and broken downstream pipelines.
  • Duplicate customer records distort analytics, inflate marketing costs, and frustrate end users.
  • Stale data means decisions are made on yesterday's reality — dangerous in finance, healthcare, and supply chains.
  • Inconsistent data across systems creates contradictory reports and destroys trust in analytics platforms.

With a Data Quality Framework

Nine dimensions give you a measurable, actionable quality scorecard.

  • Automated quality rules enforce accuracy and validity constraints at ingestion — bad data never enters the pipeline.
  • Completeness checks alert teams to missing critical fields before downstream processes consume incomplete records.
  • Deduplication logic enforces uniqueness — every entity is represented exactly once across all systems.
  • Timeliness monitoring tracks data latency SLAs — stakeholders get alerts when freshness drops below thresholds.
  • Cross-system consistency reconciliation detects and resolves contradictions before they reach dashboards and reports.

The Nine Dimensions

Data quality attributes in depth

Each dimension addresses a distinct type of data problem. Together they form a complete quality scorecard that gives data teams, analysts, and governance officers a shared vocabulary for measuring and improving data fitness.

Framework

Data Quality Dimensions

A structured, multi-dimensional framework for measuring whether data is fit for its intended business use — spanning correctness, completeness, structure, freshness, uniqueness, and trustworthiness.

9 Quality Dimensions · 6 Core DAMA Dimensions

Applies To

Structured Data · Unstructured Data · Streaming Data · Master Data · Analytics · AI/ML Pipelines · Compliance · Reporting
🎯
Dimension 01
Accuracy
"Does the data correctly reflect reality?"

Accuracy is the degree to which data values correctly represent the real-world entities, events, or facts they describe. A customer's phone number is accurate if it matches their actual number. A transaction amount is accurate if it equals what was actually charged. Accuracy is often the first dimension organizations focus on — and the hardest to automate, because it requires comparison against a trusted ground-truth source.

How to measure:
  • % of values matching a trusted reference source
  • Error rate vs. an audit sample or ground truth
  • Cross-validation against external authoritative data

Common failure modes:
  • Manual data entry errors and typos
  • Stale cached values after real-world changes
  • Transformation logic bugs in ETL pipelines

Typical target: 98%
Related: Ground Truth · Reference Matching · Error Rate · Validation
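
In code, reference matching can be as simple as joining records to a trusted source on an entity key and counting agreements. A minimal Python sketch (the function name and sample data are illustrative, not part of any specific product):

```python
def accuracy_score(records, reference, key, field):
    """Share of records whose `field` matches the trusted reference, keyed by `key`."""
    checked = [r for r in records if r[key] in reference]
    if not checked:
        return 0.0
    matches = sum(1 for r in checked if r[field] == reference[r[key]][field])
    return matches / len(checked)

reference = {1: {"phone": "555-0100"}, 2: {"phone": "555-0101"}}
records = [
    {"id": 1, "phone": "555-0100"},   # matches ground truth
    {"id": 2, "phone": "555-9999"},   # stale cached value
]
print(accuracy_score(records, reference, "id", "phone"))  # 0.5
```

In practice the reference source is often an audit sample or an external authoritative dataset, and the comparison runs as a scheduled job rather than in-line.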
Dimension 02
Completeness
"Is all required data present?"

Completeness measures whether all required data is present — no mandatory fields are null, no expected records are missing, and all attributes that should have values are populated. Evaluated at three levels: attribute-level (is this field populated?), record-level (is this row complete?), and dataset-level (are all expected records present?). Incompleteness in critical data has cascading effects on downstream analytics and compliance reporting.

How to measure:
  • % of required fields populated (inverse of null rate)
  • % of expected records present in the dataset
  • Referential completeness — all foreign keys resolve

Common failure modes:
  • Optional fields left blank in intake forms
  • Pipeline failures that drop records silently
  • Missing records from failed API calls

Typical target: 95%
Related: Null Rate · Record Count · Mandatory Fields · Referential
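
The attribute-level and record-level checks described above can be sketched in a few lines of Python (the field names and sample records are hypothetical):

```python
def completeness(records, required_fields):
    """Per-field populated rate, plus the share of fully complete records."""
    field_rates = {}
    for f in required_fields:
        populated = sum(1 for r in records if r.get(f) not in (None, ""))
        field_rates[f] = populated / len(records)
    complete_records = sum(
        1 for r in records
        if all(r.get(f) not in (None, "") for f in required_fields)
    )
    return field_rates, complete_records / len(records)

records = [
    {"name": "Ada", "email": "ada@example.com"},
    {"name": "Grace", "email": None},          # missing mandatory field
]
rates, record_rate = completeness(records, ["name", "email"])
print(rates["email"], record_rate)  # 0.5 0.5
```

Dataset-level completeness (are all expected records present?) additionally needs an expected record count or a reconciliation against the source system.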
🔗
Dimension 03
Consistency
"Does data agree across systems and time?"

Consistency ensures that data does not contradict itself across datasets, systems, formats, or time periods. A customer's address must match between CRM, ERP, and billing systems. Date formats must be uniform throughout a dataset. Value representations (units, currency, naming conventions) must align. Inconsistency is especially prevalent in multi-system environments where the same entity is stored in multiple places without a master record to synchronize them.

How to measure:
  • Cross-system reconciliation — value agreement %
  • Format consistency checks (date, currency, units)
  • Naming convention adherence across schemas

Common failure modes:
  • Same entity updated in one system but not others
  • Format drift (MM/DD vs DD/MM) across sources
  • Schema naming changes breaking joins

Typical target: 90%
Related: Cross-system · Format Standards · Reconciliation · MDM
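
Cross-system reconciliation boils down to comparing shared entities field by field. An illustrative Python sketch, assuming two systems keyed by the same entity ID (system names and data are made up):

```python
def reconcile(system_a, system_b, fields):
    """Agreement rate and conflict list for entities present in both systems."""
    shared = system_a.keys() & system_b.keys()
    agreements, total, conflicts = 0, 0, []
    for k in sorted(shared):
        for f in fields:
            total += 1
            if system_a[k].get(f) == system_b[k].get(f):
                agreements += 1
            else:
                conflicts.append((k, f, system_a[k].get(f), system_b[k].get(f)))
    return (agreements / total if total else 1.0), conflicts

crm     = {"c1": {"address": "1 Main St"}, "c2": {"address": "9 Oak Ave"}}
billing = {"c1": {"address": "1 Main St"}, "c2": {"address": "9 Oak Avenue"}}
rate, conflicts = reconcile(crm, billing, ["address"])
print(rate)  # 0.5 — c2's address disagrees between CRM and billing
```

Real reconciliation usually normalizes values first (casing, abbreviations, units), which is exactly where a master data management layer earns its keep.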
⏱️
Dimension 04
Timeliness
"Is data available when it's needed?"

Timeliness measures whether data is available and up-to-date at the time it is needed for its intended use. This is a user-expectation dimension: if a dashboard promises daily refresh but data is 3 days old when consulted, it fails the timeliness requirement. Distinct from currency (which measures whether data reflects the current real-world state), timeliness focuses on whether the data was ready when the decision needed to be made.

How to measure:
  • Data latency — the event-to-availability gap
  • SLA compliance rate for refresh schedules
  • Freshness age at point of consumption

Common failure modes:
  • Pipeline delays or failures missing refresh windows
  • Manual batch processes creating data gaps
  • Downstream caches not invalidating after updates

Typical target: 88%
Related: Latency · SLA Monitoring · Freshness · Real-time
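
A freshness check compares a dataset's last refresh time against its SLA window at the point of consumption. A small Python sketch of the idea, using the "daily dashboard that is three days stale" failure described above:

```python
from datetime import datetime, timedelta, timezone

def freshness_check(last_refresh, sla, now=None):
    """Return (age, within_sla) for a dataset with a given freshness SLA."""
    now = now or datetime.now(timezone.utc)
    age = now - last_refresh
    return age, age <= sla

now = datetime(2024, 6, 1, 12, 0, tzinfo=timezone.utc)
last_refresh = datetime(2024, 5, 29, 12, 0, tzinfo=timezone.utc)  # 3 days old
age, ok = freshness_check(last_refresh, sla=timedelta(days=1), now=now)
print(age.days, ok)  # 3 False — the dashboard fails its daily-refresh SLA
```

A monitoring job typically runs this check per dataset on a schedule and alerts stakeholders when `ok` flips to False.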
📐
Dimension 05
Validity
"Does data conform to defined rules and formats?"

Validity measures whether data values conform to predefined data types, formats, value ranges, and business rules. A US ZIP code is valid if it contains exactly 5 or 9 digits. An email address is valid if it matches the RFC 5322 format. A transaction amount is valid if it falls within an acceptable business range. Validity is structural correctness — separate from accuracy (whether the value is actually correct). A value can be valid but wrong, or wrong in format but factually correct.

How to measure:
  • % of values passing format/type/range rules
  • Business rule violation count per dataset
  • Schema constraint violation rate

Common failure modes:
  • Wrong data types in ingested fields
  • Values outside permitted ranges
  • Business rule violations missed at entry

Typical target: 92%
Related: Business Rules · Data Types · Format Checks · Ranges
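
Format and range rules like the ZIP, email, and amount examples above can be expressed as per-field predicates and applied at ingestion. A simplified Python sketch (the email pattern below is a deliberate simplification, not full RFC 5322, and the amount range is a made-up business rule):

```python
import re

RULES = {
    "zip":    lambda v: bool(re.fullmatch(r"\d{5}(\d{4})?", v)),  # 5 or 9 digits
    "email":  lambda v: bool(re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", v)),
    "amount": lambda v: 0 < v <= 100_000,                         # business range
}

def validity_rate(records):
    """Share of field values passing their rules, plus the failing (record, field) pairs."""
    checks = [(r, f, rule(r[f])) for r in records for f, rule in RULES.items() if f in r]
    passed = sum(1 for _, _, ok in checks if ok)
    return passed / len(checks), [(r, f) for r, f, ok in checks if not ok]

records = [{"zip": "94105", "email": "a@b.co",       "amount": 42.0},
           {"zip": "9410",  "email": "not-an-email", "amount": 42.0}]
rate, violations = validity_rate(records)
print(rate)  # 4 of 6 checks pass
```

Note that both failing values here could still be *accurate* (the real ZIP and email, mistyped or truncated) — which is exactly the validity/accuracy distinction drawn above.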
🔑
Dimension 06
Uniqueness
"Is each entity represented exactly once?"

Uniqueness ensures that each real-world entity — customer, product, transaction — appears exactly once in a dataset. Duplicate records distort analytics (inflated customer counts, skewed revenue figures), cause operational failures (sending the same email twice), and corrupt machine learning training sets. Uniqueness is measured across records in a single dataset and across datasets that share entity references.

How to measure:
  • Duplication rate — % of records that are duplicates
  • Primary key uniqueness violation count
  • Fuzzy match score for near-duplicate detection

Common failure modes:
  • Same record inserted multiple times
  • Migration without deduplication logic
  • Multiple source systems creating overlapping IDs

Typical target: 87%
Related: Deduplication · Primary Keys · MDM · Fuzzy Match
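
Exact-duplicate detection normalizes key fields before comparing; near-duplicate detection needs fuzzy matching. A naive Python sketch using only the standard library (real MDM tooling uses blocking to avoid the O(n²) pairwise comparison):

```python
from difflib import SequenceMatcher

def duplication_rate(records, key_fields):
    """Exact-duplicate rate over normalized key fields."""
    seen, dups = set(), 0
    for r in records:
        k = tuple(str(r[f]).strip().lower() for f in key_fields)
        if k in seen:
            dups += 1
        seen.add(k)
    return dups / len(records)

def near_duplicates(names, threshold=0.85):
    """Flag string pairs whose similarity ratio exceeds the threshold."""
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold]

records = [{"email": "ada@example.com"}, {"email": "ADA@example.com "},
           {"email": "grace@example.com"}]
print(duplication_rate(records, ["email"]))        # 1 of 3 records is a duplicate
print(near_duplicates(["Jon Smith", "John Smith"]))  # flagged as near-duplicates
```

The threshold is a tuning knob: too low and distinct entities get merged, too high and real duplicates slip through.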
🏛️
Dimension 07
Integrity
"Are data relationships correctly maintained?"

Data integrity ensures that relationships between data elements are correctly maintained — particularly referential integrity (foreign keys that point to valid parent records), relational constraints (one-to-many relationships obeyed), and cross-system coherence (related records in different databases are synchronized). A dataset can pass accuracy, completeness, and uniqueness checks but still have broken integrity if an order references a non-existent customer ID.

How to measure:
  • Referential integrity violations (orphaned FK count)
  • Constraint violation rate (NOT NULL, UNIQUE, FK)
  • Cross-table relationship consistency checks

Common failure modes:
  • Deleting parent records without cascading
  • Cross-system sync failures creating orphans
  • Data migrations without FK validation

Typical target: 85%
Related: Referential Integrity · FK Constraints · Relational · Orphan Detection
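
Orphan detection is essentially an anti-join: find child rows whose foreign key has no matching parent. A minimal Python sketch with hypothetical tables (in SQL this would be a LEFT JOIN filtered on the parent key being NULL):

```python
def orphaned_fks(child_rows, fk_field, parent_ids):
    """Rows whose foreign key does not resolve to a valid parent record."""
    return [r for r in child_rows if r[fk_field] not in parent_ids]

customers = {"c1", "c2"}
orders = [{"order_id": "o1", "customer_id": "c1"},
          {"order_id": "o2", "customer_id": "c9"}]  # parent deleted without cascade
print(orphaned_fks(orders, "customer_id", customers))  # the c9 order is orphaned
```

Inside a single database, declared FK constraints prevent this class of error; the check above matters most in pipelines and multi-system landscapes where no database engine enforces the relationship.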
🛡️
Dimension 08
Reliability
"Is this data from a trustworthy, credible source?"

Reliability measures the trustworthiness and credibility of data — whether it consistently meets quality expectations over time and comes from verifiable, authoritative sources. Reliable data is produced by consistent, well-governed processes with clear provenance. It is the expectation that quality will be maintained across data refresh cycles, not just at a point in time. Reliability is closely related to data lineage — you need to know where data came from to trust it.

How to measure:
  • % of time data meets defined quality thresholds
  • Source provenance and lineage coverage
  • Historical quality score variance (stability)

Common failure modes:
  • Undocumented or unknown data sources
  • Inconsistent quality across refresh cycles
  • Broken data lineage after schema changes

Typical target: 83%
Related: Data Lineage · Provenance · Trust Score · Governance
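
Tracking threshold compliance and score stability across refresh cycles can be sketched as follows (the scores and threshold are illustrative):

```python
from statistics import pstdev

def reliability(scores, threshold=0.9):
    """Share of refresh cycles meeting the threshold, plus score stability (stdev)."""
    within = sum(1 for s in scores if s >= threshold) / len(scores)
    return within, pstdev(scores)

# Daily completeness scores over one week of refreshes — one bad cycle
scores = [0.97, 0.96, 0.95, 0.72, 0.97, 0.96, 0.98]
within_sla, stability = reliability(scores)
print(round(within_sla, 2), round(stability, 3))
```

A high average with high variance is itself a reliability problem: consumers cannot predict whether today's refresh is trustworthy.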
🔓
Dimension 09
Accessibility
"Can authorized users get the data they need?"

Accessibility measures whether data is available to authorized users in the format they need, at the time they need it, without unnecessary barriers. High-quality data that cannot be accessed by the people who need it delivers zero value. Accessibility encompasses query performance, API availability, data format usability, permission management, and documentation quality. It is especially critical for self-service analytics environments where non-technical users consume data directly.

How to measure:
  • Data access fulfillment time per request
  • % of datasets with up-to-date data dictionaries
  • Access permission coverage and RBAC completeness

Common failure modes:
  • Slow query performance on large datasets
  • Missing documentation preventing self-service
  • Overly restrictive permissions blocking valid use

Typical target: 80%
Related: RBAC · Data Catalog · Query Performance · Self-service
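
Two of the metrics above — access fulfillment time and data-dictionary coverage — can be computed directly. An illustrative Python sketch with hypothetical sample data:

```python
from statistics import median

def access_metrics(fulfillment_hours, catalog):
    """Median fulfillment time (hours) and share of documented datasets."""
    coverage = sum(1 for d in catalog if d.get("has_dictionary")) / len(catalog)
    return median(fulfillment_hours), coverage

hours = [2, 4, 48, 3, 6]           # one request stuck behind manual approval
catalog = [{"name": "orders", "has_dictionary": True},
           {"name": "clicks", "has_dictionary": False}]
med, cov = access_metrics(hours, catalog)
print(med, cov)  # 4 0.5
```

The median is deliberately used instead of the mean so a single stuck request does not mask an otherwise healthy fulfillment process; the long tail is better tracked separately as a p95/p99 metric.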

Comparison Matrix

All 9 dimensions at a glance

A structured comparison of all nine data quality dimensions — showing who cares most, the primary metric, and the priority level for typical enterprise deployments.

| Dimension | Core Question | Primary Metric | Who Cares Most | Primary Failure | Priority |
|---|---|---|---|---|---|
| Accuracy | Is the value correct? | % matching ground truth | Analytics, finance, compliance | Wrong business decisions | Critical |
| Completeness | Is anything missing? | Null rate by field | Data engineering, compliance | Broken pipelines, failed reporting | Critical |
| Consistency | Does data agree across systems? | Cross-system match rate | BI teams, MDM, integration | Contradictory reports | Critical |
| Timeliness | Is data fresh when needed? | Data latency / SLA compliance | Operations, trading, logistics | Stale decisions | High |
| Validity | Does data follow the rules? | Rule violation rate | Data engineers, governance | Format errors breaking systems | Critical |
| Uniqueness | Are there duplicates? | Duplication rate | Marketing, CRM, ML teams | Inflated counts, waste | High |
| Integrity | Are relationships intact? | Orphaned FK count | DBAs, data engineers | Broken joins and queries | High |
| Reliability | Is the source trustworthy? | Quality score stability | Data governance, executives | Loss of data trust | High |
| Accessibility | Can users get the data? | Access fulfillment time | Self-service users, analysts | Unused high-quality data | Medium |
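
Per-dimension scores can be rolled up into a single quality score for a scorecard. One possible weighting scheme, sketched in Python using illustrative scores and doubling the weight of the Critical-priority dimensions:

```python
def scorecard(dimension_scores, weights=None):
    """Weighted overall quality score from per-dimension scores in [0, 1]."""
    weights = weights or {d: 1.0 for d in dimension_scores}
    total = sum(weights[d] for d in dimension_scores)
    return sum(s * weights[d] for d, s in dimension_scores.items()) / total

scores = {"accuracy": 0.98, "completeness": 0.95, "consistency": 0.90,
          "timeliness": 0.88, "validity": 0.92, "uniqueness": 0.87,
          "integrity": 0.85, "reliability": 0.83, "accessibility": 0.80}
weights = {d: 2.0 if d in {"accuracy", "completeness", "consistency", "validity"}
           else 1.0 for d in scores}
print(round(scorecard(scores, weights), 3))  # 0.902
```

The weighting itself is a governance decision: a trading desk might double-weight timeliness instead, while a compliance team might weight completeness and integrity highest.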

DataKnobs Platform

All 9 dimensions — monitored, enforced, and governed in one platform

DataKnobs Kreate, Kontrols, and Knobs embed data quality enforcement across every dimension — automatically — so that high-quality data reaches every downstream consumer, AI model, and decision-maker.

  • Kreate builds data pipelines with quality gates at every transformation step — accuracy checks, completeness alerts, validity rule enforcement, and deduplication built in.
  • Kontrols enforces dimensional quality policies, maintains audit trails, monitors timeliness SLAs, and generates compliance-ready quality reports across all datasets.
  • Knobs tunes quality thresholds, alert sensitivity, and rule parameters in production — without pipeline redeployment — as business requirements evolve.

Kreate

Build data pipelines with quality enforcement — accuracy validation, completeness checks, validity rules, and uniqueness deduplication at every stage.

Kontrols

Policy-driven quality governance — dimensional scoring, timeliness SLA monitoring, audit trails, and compliance-ready quality reports.

Knobs

Tune quality thresholds, alert rules, and validation parameters in production — adapting to evolving business requirements without redeployment.

FAQ

Data Quality FAQ

Common questions about data quality dimensions and how to operationalize them.

What are the key dimensions of data quality?
The key dimensions of data quality are Accuracy, Completeness, Consistency, Timeliness, Validity, Uniqueness, Integrity, Reliability, and Accessibility. These nine dimensions provide a structured framework for measuring whether data is fit for its intended purpose. The six most universally adopted (per IBM, DAMA, and the DQAF framework) are Accuracy, Completeness, Consistency, Timeliness, Validity, and Uniqueness — with Integrity, Reliability, and Accessibility adding important coverage for enterprise data governance contexts.

What is the difference between accuracy and validity?
Accuracy measures whether a data value correctly reflects the real-world entity or fact it describes — whether it is actually right. Validity measures whether a data value conforms to a predefined format, data type, value range, or business rule — whether it follows the correct structure. A value can be valid but inaccurate (a correctly formatted but wrong date of birth), or accurate but invalid (a correct value stored in the wrong format). Both dimensions are necessary: validity without accuracy gives you well-formed garbage, and accuracy without validity gives you correct data that breaks downstream systems.

How is data quality measured?
Data quality is measured using dimension-specific quantitative metrics: Accuracy as the percentage of values matching a trusted source. Completeness as the percentage of required fields populated (inverse of null rate). Uniqueness as the percentage of records without duplicates. Timeliness by data latency — the gap between event occurrence and availability. Validity as the percentage of values passing defined format and business rules. Consistency by the count of contradictions across systems. These metrics combine into a dimensional quality scorecard that can be tracked over time and broken down by dataset, pipeline stage, and business domain.

What does integrity mean as a data quality dimension?
Data integrity in data quality specifically refers to the correctness of relationships between data elements — particularly referential integrity (foreign keys that point to valid parent records), relational constraints, and cross-system coherence. A dataset can score well on accuracy, completeness, and uniqueness but still have low integrity if the relationships between entities are broken. For example, an order table that references customer IDs from a customer table has a referential integrity violation if any customer ID in the orders table doesn't exist in the customer table. Integrity is especially critical in normalized relational databases and multi-system data pipelines.

Which data quality dimensions matter most for AI and machine learning?
For AI and machine learning, the most critical data quality dimensions are Accuracy (garbage in, garbage out — inaccurate training data produces inaccurate models), Completeness (missing values either require imputation or cause training failures), Consistency (inconsistent representations confuse models and reduce generalization), Uniqueness (duplicate training examples over-weight certain patterns and introduce bias), and Validity (invalid values — wrong types, out-of-range numbers — break training pipelines). Timeliness matters for models that require fresh training data to remain accurate on current patterns. Integrity matters when models need to join multiple tables for feature engineering.

How does DataKnobs support data quality management?
DataKnobs provides three integrated layers for comprehensive data quality management. Kreate builds data pipelines with quality gates embedded at every transformation step — accuracy validation against reference sources, completeness checks on mandatory fields, validity rule enforcement, and deduplication for uniqueness. Kontrols provides the governance layer: dimensional quality policy enforcement, timeliness SLA monitoring with automated alerting, cross-system consistency reconciliation, audit trails for regulatory compliance, and quality scoring dashboards for data stewards. Knobs allows quality thresholds, alert sensitivity, and rule parameters to be tuned in production without pipeline redeployment — keeping the system calibrated as business requirements evolve and data patterns shift.

Start Improving Data Quality

Ready to measure and improve data quality across all nine dimensions?

DataKnobs helps data teams move from reactive fire-fighting to proactive, governed data quality management — with automated monitoring, enforcement, and governance built in from day one.

  • Free data quality assessment across your critical datasets
  • Dimensional quality scorecard for your key data domains
  • Automated quality pipeline pilot in 2–4 weeks

Talk to our data quality team

We'll assess your current data quality posture and help you build a dimensional monitoring framework tailored to your business.