  • Announcing labeldatasyntax: Stata module to produce syntax to label variables and values, given a data dictionary

    Dear Statalisters,

    I have just posted my program labeldatasyntax on SSC.

    It creates a syntax file (with syntax like that below) to label variables and/or values, provided a data dictionary is supplied in one of a few specific formats.

    label define regionlbl 1 "North" 2 "East" 3 "South" 4 "West"
    label define sexlbl 1 "Male" 2 "Female"
    label define yesnolbl 0 "No" 1 "Yes"

    label values sex sexlbl
    label values region regionlbl
    label values badears yesnolbl
    label values respprobs yesnolbl

    label variable sex "Gender"
    label variable region "Where the child lives"
    label variable age "Age of child in years"
    label variable dob "Date of birth"
    label variable badears "Has bad ears?"
    label variable respprobs "Has respiratory problems?"
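
    As a quick check of what such a file does, the sketch below applies a few of the commands above to a small made-up dataset. The variable and label names come from the example; the three observations are invented purely for illustration.

    ```stata
    clear
    * toy data: three invented observations
    input byte(sex region badears)
    1 1 0
    2 3 1
    2 4 0
    end

    * define the value labels, attach them, then label the variables
    label define sexlbl 1 "Male" 2 "Female"
    label define regionlbl 1 "North" 2 "East" 3 "South" 4 "West"
    label define yesnolbl 0 "No" 1 "Yes"
    label values sex sexlbl
    label values region regionlbl
    label values badears yesnolbl
    label variable sex "Gender"
    label variable region "Where the child lives"
    label variable badears "Has bad ears?"

    describe
    list, clean
    ```

    After running it, describe should show the value label attached to each variable, and list should display the label text rather than the numeric codes.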

    I also hope the .csv files provided in this package will help users show data providers a convenient format in which to supply the data dictionary for a dataset.

    Feedback welcome.

    Best wishes, Mark

  • #2
    Dear Mark,

    I am having a problem using -labeldatasyntax- on a file imported from Excel that I created with the -codebookout- package. -labeldatasyntax- does not write the label values commands for the variables, because -codebookout- does not put the first value and label on the same line as the variable name.

    Is there any way to use -labeldatasyntax- when the values and labels start on the line after the variable name? Or could you modify the code of -codebookout- so that it produces an Excel file compatible with -labeldatasyntax-?

    Thanks & best regards,
    Rasool Bux


    • #3
      Hi Rasool

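      The key is to fill the blank variable names down from the row above, so that every value and label sits on a row with its variable name: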

      replace variable = variable[_n-1] if variable==""
      sort variable value


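      * build a codebook for the auto data with -codebookout-
      clear
      sysuse auto
      codebookout "N:\auto codebook.xls"
      * read the codebook back in and rename the columns to variable, description, value and label
      import excel "N:\auto codebook.xls", sheet("Sheet1") firstrow case(lower) clear
      rename variablename variable
      rename variablelabel description
      rename answercode value
      rename answerlabel label
      * fill blank variable names down from the row above
      replace variable = variable[_n-1] if variable==""
      sort variable value
      * write the labelling syntax to a do-file
      labeldatasyntax, saving("N:\rblabel.do") replace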
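
      To see what the fill-down step does on its own, here is a small self-contained sketch on a made-up dictionary laid out the way you describe, with the values and labels starting on the row after the variable name. The rows are invented for illustration and are not actual -codebookout- output.

      ```stata
      clear
      * toy dictionary: the variable name appears only on the first row of each block
      input str8 variable str15 description str2 value str10 label
      "foreign" "Car type"      ""  ""
      ""        ""              "0" "Domestic"
      ""        ""              "1" "Foreign"
      "rep78"   "Repair record" ""  ""
      ""        ""              "1" "Poor"
      ""        ""              "2" "Fair"
      end

      * replace works through the observations in order, so each blank name
      * picks up the (possibly just filled) name from the row above
      replace variable = variable[_n-1] if variable == ""
      sort variable value
      list, clean noobs
      ```

      Every row should now carry a variable name, which is what lets the values and labels be matched to their variable.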


      • #4
        I have just added an .xlsx ancillary file to the package: "Guide. How to share data from a project in Excel v2.xlsx", which has been renamed to "labeldatasyntax.xlsx".

        Overview
        This Excel workbook is designed to explain how a statistician would like to receive data from a project (if data is to be sent in Excel).
        On this sheet, we provide some broad advice.
        On the other sheets (two* Data sheets, corresponding Data Dictionary sheets, and one Data notes sheet), we show a good example of sharing data from a project, and give some tips.
        *this number will vary according to the complexity of the project / data.
        While Tips appear underneath the data on these Data and Data Dictionary sheets, we do not want to see anything but data on your Data and Data Dictionary sheets.
        Generic advice is given, but you could ask us for advice/review of your proposed method of storing data before you begin entering data, e.g. share with us your Excel template.
        Data should be screened for errors before being submitted. It is not the statistician's job to screen and clean data.
        Please do NOT password protect Excel files. Otherwise we will have to copy and paste each sheet into a second Excel file (or save as .csv files) before our statistical software can read in your data.
        This template has been created by Mark Chatfield and others in the Statistics Unit at QIMR Berghofer Medical Research Institute, Oct 2017. Minimal changes made Apr 2018 at the University of Queensland.

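        * install the package and download its ancillary files into the current directory
        cd "C:\ado"
        net install labeldatasyntax.pkg
        net get labeldatasyntax.pkg
        * create the labelling do-file from the data dictionary sheet
        import excel "labeldatasyntax.xlsx", sheet("Data Dictionary1") cellrange(A2:D27) firstrow clear
        labeldatasyntax, saving("ex0_label.do")
        * read in the data, then compare -describe- before and after applying the labels
        import excel "labeldatasyntax.xlsx", sheet("Data1") cellrange(B2:K8) firstrow clear
        describe
        do "ex0_label.do"
        describe
        browse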
