New Stata module: SCANDATA published - Module to ease data preparation

Malte Kaukal

24 May 2016, 08:49

Dear all,

Thanks to Kit Baum, my new Stata module "SCANDATA" is now available on SSC.

The module is intended to ease data preparation by helping to identify variables that need attention. SCANDATA scans a dataset and reports variables that meet any of several criteria. These criteria are:

- variable names, variable labels and value labels of a certain length (all thresholds have default settings but can be adjusted to individual needs)

- having unlabeled values although a value label is attached

- variable names containing uppercase characters (the variables found can be converted to lowercase)

- variable and/or value labels containing umlauts (mutated vowels); these can be corrected as well, e.g. "ä" becomes "ae"

- meeting one of three criteria hinting at an odd distribution of a variable (e.g. containing wild codes, having far fewer distinct values than the value label suggests, or having most observations concentrated on only a few values). Note that this approach takes a dataset perspective rather than the respondent perspective of classic data quality measures.
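
To give an idea of what such checks involve, a few of them can be approximated by hand with built-in Stata commands. This is only a rough sketch for the dataset currently in memory, not SCANDATA's own code or syntax; the 40-character threshold is an arbitrary assumption, and ustrregexm() needs Stata 14 or newer.

```stata
* Variable names containing uppercase characters:
* report each one, then convert it to lowercase
foreach v of varlist _all {
    local lc = lower("`v'")
    if "`v'" != "`lc'" {
        display as text "uppercase in name: `v'"
        rename `v' `lc'
    }
}

* Variable labels that are long or contain umlauts
* (40 is an arbitrary threshold chosen for this sketch)
foreach v of varlist _all {
    local lbl : variable label `v'
    if strlen(`"`lbl'"') > 40 {
        display as text "long variable label: `v'"
    }
    if ustrregexm(`"`lbl'"', "[äöüÄÖÜ]") {
        display as text "umlaut in variable label: `v'"
    }
}

* Built-in reports that flag, among other things,
* variables whose values are only partly labeled
codebook, problems
labelbook, problems
```

The rename step fails if the lowercase name already exists in the dataset, so it is worth checking the reported names first.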

Please be aware that the module cannot replace a detailed inspection of all variables to make sure that everything is covered. But it can be used to get a quick overview or to find cases that have slipped through the data preparer's net.

For more details, please see the help file or contact me. Any comments, especially regarding the identification of odd distributions, are very welcome.
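
Installing the module and opening its help file should work with the usual SSC commands. This assumes the package is listed on SSC under the lowercase name scandata:

```stata
* Show the package description before installing
ssc describe scandata

* Install the module from SSC
ssc install scandata

* Open the help file for syntax and options
help scandata
```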

All the best,

Malte