Handbook on Data Quality Assessment Methods and Tools

EUROPEAN COMMISSION

EUROSTAT

Mats Bergdahl, Manfred Ehling, Eva Elvers, Erika Földesi, Thomas Körner, Andrea Kron, Peter Lohauß, Kornelia Mag, Vera Morais, Anja Nimmergut, Hans Viggo Sæbø, Ulrike Timm, Maria João Zilhão

Manfred Ehling and Thomas Körner (eds)

Cover design: Siri Boquist

Photo: Crestock

Contributors to the handbook:
Manfred Ehling (chair), Federal Statistical Office Germany
Thomas Körner (chair till 6.2.2007), Federal Statistical Office Germany
Mats Bergdahl, Statistics Sweden
Eva Elvers, Statistics Sweden
Erika Földesi, Hungarian Central Statistical Office
Andrea Kron, Federal Statistical Office Germany
Peter Lohauß, State Statistical Institute Berlin-Brandenburg
Kornelia Mag, Hungarian Central Statistical Office
Vera Morais, National Statistical Institute of Portugal
Anja Nimmergut, Federal Statistical Office Germany
Hans Viggo Sæbø, Statistics Norway
Katalin Szép, Hungarian Central Statistical Office
Ulrike Timm, Federal Statistical Office Germany
Maria João Zilhão, National Statistical Institute of Portugal

Wiesbaden, 2007

Reproduction and free distribution, also of parts, for non-commercial purposes are permitted provided that the source is mentioned. All other rights reserved.

Contents

1 Introduction
  1.1 Scope of the Handbook
  1.2 Aspects of Data Quality

2 Data Quality Assessment Methods and Tools
  2.1 Quality Reports and Indicators
  2.2 Measurement of Process Variables
  2.3 User Surveys
  2.4 Self-assessment and Auditing

3 Labelling and Certification
  3.1 Labelling
  3.2 Certification to the International Standard on Market, Opinion and Social Research (ISO 20252:2006)

4 Towards a Strategy for the Implementation of Data Quality Assessment
  4.1 The Fundamental Package
  4.2 The Intermediate Package
  4.3 The Advanced Package
  4.4 Recommendations

ANNEX A: General Framework of Data Quality Assessment

ANNEX B: Examples
  Examples for Chapter 2.1: Quality Reports and Indicators
  Examples for Chapter 2.2: Measurement of Process Variables
  Examples for Chapter 2.3: User Surveys
  Examples for Chapter 2.4: Self-assessment and Auditing
  Examples for Chapter 3.1: Labelling
  Examples for Chapter 3.2: Certification to the International Standard on Market, Opinion and Social Research (ISO 20252:2006)

ANNEX C: Basic Quality Tools

ANNEX D: Glossary

Abbreviations

List of Figures and Tables

References

1 Introduction

The production of high-quality statistics depends on the assessment of data quality. Without a systematic assessment of data quality, a statistical office risks losing control of its statistical processes, such as data collection, editing or weighting. Forgoing data quality assessment amounts to assuming that the processes cannot be further improved and that problems will always be detected without systematic analysis. At the same time, data quality assessment is a precondition for informing users about the possible uses of the data and about which results can be published with or without a warning. Indeed, without good approaches to data quality assessment, statistical institutes are working blind and can make no justified claim to being professional or to delivering quality in the first place.

Assessing data quality is therefore one of the core aspects of a statistical institute's work. Consequently, the European Statistics Code of Practice highlights the importance of data quality assessment in several instances. Its principles require an assessment of the various product quality components like relevance, accuracy (sampling and non-sampling errors), timeliness and punctuality, accessibility and clarity as well as comparability and coherence. The code at the same time requires systematic assessments of the processes, including the operations in place for data collection, editing, imputation and weighting as well as the dissemination of statistics.

Several efforts to implement data quality assessment methods have been undertaken in recent years. Following the work of the Leadership Expert Group (LEG) on Quality, a number of development projects have been carried out on assessment methods such as self-assessment, auditing and user satisfaction surveys (Karlberg and Probst 2004). A number of National Statistical Institutes (NSIs) have also developed national approaches (see, e.g., Bergdahl and Lyberg 2004). Nevertheless, although the importance of the topic is generally agreed, there is no coherent system for data quality assessment in the European Statistical System (ESS). The report on the ESS self-assessment against the European Statistics Code of Practice points in this direction and suggests that quality control and quality assurance in the production processes are not very well developed in most NSIs (Eurostat 2006c).

This Handbook on Data Quality Assessment Methods and Tools (DatQAM) aims to facilitate a systematic implementation of data quality assessment in the ESS. It presents the most important assessment methods: quality reports, quality indicators, measurement of process variables, user surveys, self-assessment and auditing, as well as the labelling and certification approaches. The handbook provides a concise description of the data quality assessment methods currently in use. Furthermore, it gives recommendations on how these methods and tools should be implemented and how they can reasonably be combined, since an efficient and cost-effective use of the methods requires that they are used in combination with each other. For example, quality reports could be the basis for audits and user feedback. The handbook presents numerous successful examples of such combinations. Through its recommendations, the handbook also aims at a further harmonisation of data quality assessment in the ESS and at a coherent implementation of the European Statistics Code of Practice.

The handbook is primarily targeted at quality managers in the ESS. It should enable them to introduce, systematise and improve the work carried out in the field of data quality management in the light of the experiences of colleagues from other statistical institutes within the ESS. The handbook should also help to avoid overburdening subject matter statisticians with assessment work and to make data quality assessment an effective support for their work. Finally, the handbook should support top management in their managerial planning in the quality field.

After a short presentation of the basic quality components for products, processes and user perception, chapters 2 and 3 give concise descriptions of each of the methods. The presentation focuses on the practical implementation of the methods and, where applicable, on the interlinkages between them. The handbook also names up-to-date examples from statistical institutes (see ANNEX B). To facilitate the use of the handbook, the chapters presenting the methods follow a standardised structure covering the following items:

• Definition and objectives of the method(s)

• Description of the method(s)

• Experiences in statistical institutes

• Recommendations for implementation

• Interlinkages with other methods (where applicable)

• Recommended readings

Chapter 4 proposes a strategy for the implementation of the methods in different contexts. The handbook recommends a sequential implementation of the methods, identifying three packages with increasing levels of ambition. Of course, a particular NSI may apply methods and tools from different packages at the same time, depending on the circumstances in which it operates.

Because the length of the handbook is heavily restricted, it cannot go into much detail. To present more examples and to elaborate certain aspects further, a comprehensive annex is provided together with the handbook. First, it includes a background paper on the position of data quality assessment in the general framework of quality management (ANNEX A). ANNEX B presents good practice examples in more detail. Furthermore, the annex provides a systematic presentation of basic quality tools (ANNEX C) and a glossary (ANNEX D).

1.1 Scope of the Handbook

Data quality assessment is an important part of the overall quality management system of a statistical agency (see ANNEX A for more details). However, its scope is limited to the statistical products and certain aspects of the processes leading to their production. Thus, the handbook does not cover areas like the support processes, management systems or leadership. Neither does it cover the institutional environment of statistics production.

Figure 1 shows the issues covered by DatQAM within the context of quality management. It also refers to the relevant principles in the European Statistics Code of Practice.

Figure 1: Scope of the handbook within the context of quality management

[Figure: elements of a quality management system (user needs, management systems & leadership, support processes, statistical products, production processes, institutional environment) set against the corresponding principles from the European Statistics Code of Practice:

Statistical products: relevance, accuracy and reliability, timeliness and punctuality, coherence and comparability, accessibility and clarity

Production processes: sound methodology, appropriate statistical procedures, non-excessive burden on respondents, cost effectiveness

Institutional environment: professional independence, mandate for data collection, adequacy of resources, quality commitment, statistical confidentiality, impartiality and objectivity]
