Data corrections


A broad range of standardised quality assurance measures ensures the high quality of the data supplied by the RDC. Should a dataset nonetheless need to be revised, data users are informed according to a standardised procedure.

Before a data product is published, the Research Data Centres of the Statistical Offices of the Federation and the Federal States apply quality assurance measures that include, among others, checks under the dual control principle and close collaboration with the specialised departments responsible for the respective statistics. However, because of the large number of accessible data products and the ongoing extension of the data supply, data may still have to be revised, either as scheduled or because of an error in the data. We aim to inform all data users promptly, comprehensively and in detail about necessary changes to a data product. For this purpose, the RDC have implemented a standardised procedure.

Reasons for potentially necessary corrections

At the specialised departments:

Keeping our data supply up to date is important to us. Consequently, RDC staff often start to prepare the data for on-site use before the specialised department can provide all characteristics (e.g. individual modules or weights). Data products that are already available are supplemented with the missing variables as soon as these are delivered.

During data collection and data processing at the specialised departments, errors may occur that only become visible later, during analyses for publications of the statistical offices or through RDC use. Where possible, these errors are corrected. In any case, they are documented.

At the RDC:

The standard data supply of the RDC is continually extended with new products. For this purpose, the data are checked, validated for plausibility and labelled. Sometimes, additional characteristics are derived. During these complex procedures, errors may occur that have to be corrected subsequently.

Some data uses require additional project-specific processing, such as case selection or data merging. If an error occurred during one of these procedures, usually only individual projects are affected. The contact persons of these projects are then informed quickly and transparently.

Informing data users

The RDC and the specialised departments promptly follow up on reports of data errors. As soon as a suspicion is substantiated, all projects working with the affected data are informed. E-mail gives us quick access to all contact persons. In this way, the release and publication of faulty results can be prevented. Data users are kept informed consistently, continuously and in detail about the investigation of the cause and about possible corrections.

Correction and documentation

All errors and correction processes are documented. All content-related changes are immediately visible because every product is versioned with a Digital Object Identifier (DOI). This applies, for example, to labels whose content has changed and to all changes to variables and cases. The product-specific metadata reports and/or codebooks show what kind of change has been made. Errors that cannot be corrected, for example because they originated during data collection, are also documented in the metadata and are thus publicly accessible.

Data provision

The supervising RDC location provides all affected users with corrected data and detailed information as soon as possible. Version changes are made visible in the name of the dataset. The period of use can be extended at no additional cost if the flawed data have caused considerable delays to the affected projects.


What can I as a data user do if I think that a dataset is faulty?

Please contact your supervising RDC location. The local staff will be glad to help you. If necessary, they will consult the RDC location that specialises in the statistic concerned so that the required measures can be taken.

Where can I find information on data corrections?

A new version number is assigned whenever a product is changed and the changes affect its content. The version can be identified by the Digital Object Identifier (DOI). There is an RDC subsite for every DOI. Earlier versions can be found on the RDC homepage. Changes are documented in the metadata reports of the respective product.

I have already received output based on faulty data. What can I do?

Please check whether your analyses are affected by the changes (e.g. whether affected variables were used). If necessary, we will gladly assist you in re-running your program code with the corrected data.
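
As a rough illustration of such a check, the following minimal Python sketch compares the variables used in an analysis with the variables listed as changed in a metadata report. The function name and all variable names are hypothetical and not part of any RDC tooling, and the sketch covers changes to variables only; changes to cases would need a separate check.

```python
def affected_variables(used, changed):
    """Return the variables that appear in both collections, sorted by name."""
    return sorted(set(used) & set(changed))


# Hypothetical variable names, for illustration only.
used_in_analysis = ["income", "region", "weight"]
changed_in_new_version = ["weight", "household_size"]

overlap = affected_variables(used_in_analysis, changed_in_new_version)
if overlap:
    print("Re-run needed; affected variables:", ", ".join(overlap))
else:
    print("No variable used in the analysis is listed as changed.")
```

If the overlap is empty, the existing output may still be affected by changes to cases, so the metadata report should be read in full.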

Analyses had to be redone because of faulty data, and the end of my project is delayed as a result. Do I have to request a fee-based extension?

Please contact your supervising RDC location. We will gladly offer you a free extension of your project to compensate for delays that you did not cause.