Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • The fastest way to import dates variables from Excel with an appropriate variable name

    Dear Statalisters,

    I hope you are fine. I am new to importing datasets from Excel, and I'm sure that with a very long code I could find a solution to my problem but I'd like to learn fast ways to code which is why I am asking for your help. I basically have a dataset from Excel where each column is a day coded either in the 01/20/2011 format, or the 20Jan2011 one (I assume this is due to some mistake in the coding of the raw data). I'd like to import that dataset to Stata without changing anything in my raw dataset in the first place. Ideally, I'd like to have this format:
    Country Index_01Jan2020 Index_02Jan2020 Index_03Jan2020
    xxxx data data data
    Code:
    global path "C:/Users/xxxx/Desktop/xxxx/yyyy"
    global inp_str "$path/Inputs/Part II/Chap I"
    
    import excel "$inp_str/file.xlsx", first clear cellrange(B2:AKL264)
    * please note that the cellrange option is not the source of my problem. It is simply to avoid importing a line full of "Column 1", "Column 2", etc.
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str28 country_name byte(G H I J) str5 AH
    "Aruba"       . . . . "0"
    "Afghanistan" . . . . "0"
    "Angola"      . . . . "0"
    end
    label var country_name "country_name"
    label var G "  1/1/2020"
    label var H "  1/2/2020"
    label var I "  1/3/2020"
    label var J "  1/4/2020"
    label var AH "28Jan2020"
    As you can see, there are three problems in this dataset:

    1) First, dates are coded differently as you can see in the variables.
    2) For the dates that are coded in the "1/1/2020" format, there are these two annoying spaces that I can't seem to remove
    3) Variables names are letters because Stata cannot import variables names that start with a number.

    I would be grateful if you could find an efficient way to harmonize the dates name for every label in a way that looks like "Index_(the date in whatever format)" and then to give the variables the name of its label. Thanks a lot for the help!

    Regards,
    Adam
    Last edited by Adam Sadi; 26 Aug 2022, 08:11.

  • #2
    Sometimes one can find the answer by themself! Here's a code for beginners who might face the same issue:

    For problems number 1 and 2

    Code:
    foreach var of varlist * /* YOUR VARLIST HERE */ {
        local x : variable label `var'
        label var `var' "`=date("`=strtrim("`x'")'", "MDY", 2000)'"
        local x : variable label `var'
        local y = string(`x', "%td")
        local z = strproper("`y'")
        label var `var' "`z'"
    }
    For problem number 3:

    Code:
    foreach var of varlist G-AKL {
        local x : variable label `var'
        rename `var' index_`x'
    }

    Comment

    Working...
    X