  • Variable Name as Label in Date Format

    I have the following data:

    Code:
    clear
    input float(id dtt20195 dtt20196 dtt20197 dtt20198 dtt20199 dtt20200 dtt20201 dtt20202 dtt20203 dtt20204 dtt20205 dtt20206 dtt20207 dtt20208 dtt20209 dtt20210 dtt20211 dtt20212 dtt20213 dtt20214 year)
    0 0 5  8 12 14 18 25 31 32 39 39 34 26 23 19 13 11  9  5 2 1
    1 5 8 10 11 17 21 25 29 32 34 34 30 28 25 21 19 17 15 10 5 1
    2 2 4  8 12 16 18 19 22 24 27 27 24 20 18 17 15 13 12  6 4 1
    end
    The variables denoted dtt are dates - that is, the number after dtt is a date in Stata format.

    What I would like to do is label each variable in a human-readable format using that number.

    I can format the substring extracted from the variable name so that it displays correctly as a date, but I can't figure out how to save it as a correctly formatted date string. I can't apply a display format to a local macro; otherwise I would do that.

    Code:
    foreach vl of varlist dtt* {
    
        local day = substr("`vl'", 4, .)
        
    * The following line will display the correctly formatted date, e.g., 17apr2015
        di %td `day'
    
    * Here I would like to label using the date
        label variable `vl'  " (correctly formatted day) "
    }
    I cannot do
    Code:
     label variable `vl' " `day' %td"
    , since that inserts the string "%td" after the number `day', though that is also something akin to what I want to do.

    Thanks for your help!

  • #2
    The data structure here is perverse for most Stata purposes and getting more informative variable labels is palliative at best. How would you propose to work with that structure? I would strongly advise a reshape.

    Code:
    reshape long dtt, i(id) j(date)  
    format date %td
    Last edited by Nick Cox; 30 Aug 2018, 07:58.



    • #3
      I have a collection of units, each denoted by "ID".
      For each unit, I have a collection of counts for each day; these are the values in each dtt variable. E.g., unit 1 has a count of 5 on 17apr2015 (dtt20195), and on the same day unit 2 has a count of 2.
      I don't think it is that strange to have this in wide form.
      I can solve my own problem by defining a new variable as a date label if the data is in long form.
      Do I need to propose some kind of purpose for this?
      If you want me to invent one, suppose I want to do:

      Code:
      graph bar dtt*, over(id)
      and I'd like the labels of the graph to be the labels of each variable. It would be nice to have human-readable labels for each "dtt" variable.
      I don't need to do that, but if you can tell me how to label the variables with a readable date with that problem in mind I would be grateful.



      • #4
        The following should get you started in the right direction:

        Code:
        local x=string(20195, "%td")
        label var dtt20195 "`x'"
        Stata/MP 14.1 (64-bit x86-64)
        Revision 19 May 2016
        Win 8.1



        • #5
          Awesome. Works perfectly. Thanks.

          In case anyone else looks at this thread in the future:

          Code:
          foreach vl of varlist dtt* {
          
              local day = substr("`vl'", 4, .)
              local xx = string(`day', "%td")
              label variable `vl' "`xx'"
          
          }
          is one solution to this particular variable-labeling problem.
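For future readers, here is a hedged variant of the same loop. It uses the extended macro function `: display`, which applies a display format while the local is being filled, and so speaks to the "can't format a local" point in #1. The two-variable dataset is made up for illustration only, and the loop assumes that every variable matching dtt* carries a Stata daily date after the three-letter prefix.

```stata
* Toy data, made up for illustration: two dtt variables, one observation
clear
set obs 1
generate dtt20195 = 5
generate dtt20196 = 8

foreach vl of varlist dtt* {
    * Characters 4 onward of the variable name hold the daily date number
    local day = substr("`vl'", 4, .)
    * ": display" stores the formatted text, not the raw number, in the local
    local lbl : display %td `day'
    label variable `vl' "`lbl'"
}

* Inspect the resulting variable labels
describe dtt*
```

Swapping %td for a more detailed format such as %tdDD_Mon_CCYY should change only the text of the labels; the variable names and the data are untouched.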



          • #6
            I just said "for most Stata purposes" and think that's a good summary of not just my experience but more crucially of collective Stata experience. I've been reading about such problems in Stata forums for nearly 25 years. It's part of good practice here to give general advice, which sometimes means going beyond the immediate question.

            Your graph bar example is not convincing: getting a good graph from a wide structure rather than a long one is likely to be harder work. It's certainly not essential to have a wide structure for such a graph.
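As a rough sketch of what a graph from the long layout could look like: the block below takes the first three dtt columns of the example in #1 (a subset chosen here for brevity), reshapes them as in #2, and overlays one line per id. Using xtset and xtline is an assumption about what a suitable graph would be, not something proposed in the thread.

```stata
* First three dtt columns of the example in #1 (subset for brevity)
clear
input float(id dtt20195 dtt20196 dtt20197)
0 0 5  8
1 5 8 10
2 2 4  8
end

* Long layout: one row per id and day; date comes from the variable suffix
reshape long dtt, i(id) j(date)
format date %td

* Declare the panel, then draw one line per id on a shared date axis
xtset id date
xtline dtt, overlay
```

Because date is an ordinary variable with a %td format after the reshape, the axis labels come out as readable dates without any variable labels having to be set.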



            • #7
              Adding to Nick's comment in post #6, the data you have - a collection of "units" with observations of each on a set of "days" - constitute longitudinal data as described in the Stata Longitudinal-Data/Panel-Data Reference Manual PDF. If the commands documented therein describe the sort of analyses you hope to undertake, then the first step is reshaping your data to a long layout. Or perhaps, returning your data to a long layout, since on reflection what you have looks like the result of reshape wide dtt, i(id) j(date).

              Among Statalist members, especially those like Nick with deep experience answering questions, there's a strong awareness of what's known as "the XY problem" - as described at https://en.wikipedia.org/wiki/XY_problem - where the user's stated objective masks a deeper objective better addressed in a different way. Questions like your stated question often - but certainly not always - show the user misunderstands how best to analyze longitudinal data in Stata, perhaps because of previous experience with other statistical software that works differently. We do users no favor by unquestioningly advancing them on a path that will not lead to their ultimate destination.
              Last edited by William Lisowski; 30 Aug 2018, 14:28. Reason: Substituted Wikipedia link for XY problem description for previous link to xyproblem.info
