
  • Create local macros with value labels as local names

    Hello,

    In the data extract below, cell_x is a variable with value labels. I want to create local macros (one for each observation in the data) such that the names of the local macros are the value labels of cell_x and the contents of the local macros are the values of the unemp variable for the corresponding observations.

    E.g., I would like local macros as follows but I want to do this in a loop.

    Code:
    local x_e1-x_n1-x_l1 = .0664063
    local x_e1-x_n1-x_l2 = .1428571
    local x_e1-x_n2-x_l1 = .04
    local x_e1-x_n2-x_l2 = .1794872
    local x_e1-x_n2-x_l3 = .125
    local x_e2-x_n1-x_l1 = .031746  
    .
    .
    .
    Any advice on how I can achieve this?

    Thanks!

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(cell_x unemp) double count
     1  .06640625 256
     2  .14285715  14
     4        .04  25
     5   .1794872  39
     6       .125  16
    10 .031746034 630
    11 .033898305 177
    13   .0591716 169
    14 .024904214 522
    15 .015384615 195
    17  .03517588 199
    18 .028688524 488
    19          0  14
    22  .14285715   7
    23          0  64
    24          0  53
    26   .0754717  53
    27 .025787966 349
    end
    label values cell_x cell_x
    label def cell_x 1 "x_e1-x_n1-x_l1", modify
    label def cell_x 2 "x_e1-x_n1-x_l2", modify
    label def cell_x 4 "x_e1-x_n2-x_l1", modify
    label def cell_x 5 "x_e1-x_n2-x_l2", modify
    label def cell_x 6 "x_e1-x_n2-x_l3", modify
    label def cell_x 10 "x_e2-x_n1-x_l1", modify
    label def cell_x 11 "x_e2-x_n1-x_l2", modify
    label def cell_x 13 "x_e2-x_n2-x_l1", modify
    label def cell_x 14 "x_e2-x_n2-x_l2", modify
    label def cell_x 15 "x_e2-x_n2-x_l3", modify
    label def cell_x 17 "x_e2-x_n3-x_l2", modify
    label def cell_x 18 "x_e2-x_n3-x_l3", modify
    label def cell_x 19 "x_e3-x_n1-x_l1", modify
    label def cell_x 22 "x_e3-x_n2-x_l1", modify
    label def cell_x 23 "x_e3-x_n2-x_l2", modify
    label def cell_x 24 "x_e3-x_n2-x_l3", modify
    label def cell_x 26 "x_e3-x_n3-x_l2", modify
    label def cell_x 27 "x_e3-x_n3-x_l3", modify

  • #2
    So essentially you want to replicate the information that you already have in the dataset in a series of local macros? That seems odd. May I ask why you want to do that? Probably there are better ways of achieving your ultimate goal.



    • #3
      Hi Daniel, thanks for your response. I am looping over a bunch of states and computing various statistics of households in these states. I save each statistic as a local macro and then eventually collect all these local macros in a matrix. The data sample that I shared has unemployment rates of different household groups (HH groups are in cell_x) in a given country. I want to save these in a local macro so that I can add these to my matrix of statistics later on in my code.



      • #4
        I agree with daniel klein and his question hasn't yet been answered fully, as "computing various statistics" could mean anything from utterly standard to fairly unusual.

        If you are compiling a matrix, in general, it is simplest and best to put saved numerical results in as soon as you get them. There is no advantage, indeed some indirection and some possible loss of precision, in storing them in local macros temporarily.

        It's still true that some other method of collating results, such as table, statsby, collapse, or tabstat, is likely to be simpler than what you are doing. Your unemployment rates sound like means to me.
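
        As a rough illustration of the collapse route, here is a minimal sketch using the auto data rather than your data; the matrix name means and the choice of variables are just for illustration. collapse computes the group statistics in one step and mkmat copies them into a matrix, with row names taken from the grouping variable.

        ```stata
        sysuse auto, clear
        drop if missing(rep78)

        * one observation per group, holding the group mean and the group size
        collapse (mean) mpg (count) n = mpg, by(rep78)

        * copy the results into a matrix; row names come from the grouping variable
        mkmat mpg n, matrix(means) rownames(rep78)
        matrix list means
        ```

        Groups that do not occur in the data are simply absent from the collapsed dataset, and each remaining row carries its own group identifier.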



        • #5
          Hi Daniel/Nick:

          The other statistics range from estimates from regressions to means, SDs, correlations etc. Some statistics are by a group var within the state (such as does of workers or types of jobs).

          The unemployment rates are means taken for each group (defined by cell_x) for each state:

          Code:
          preserve
          collapse unemp (sum) count, by(cell_x) 
          restore
          I prefer storing statistics in local macros first and then saving them into a matrix for the following reasons:
          1. For some states there are no observations in some groups defined by cell_x. So I need to keep track of which cell_x group a statistic corresponds to. I am currently doing this using local macros.
          2. I am assigning each row in my matrix a row name (code below) and this is convenient to do if each statistic has a corresponding local macro so I can just assign the local macro name as the row name.
          Code:
          mat rownames mat
          Open to suggestions if you still think I should not use local macros and have any suggestions on alternative approaches.



          • #6
            Thanks for the further detail, which at least makes clear that your overall task is quite complicated. But my impression remains that local macros have no obvious advantages here and indeed can be avoided altogether. I will try to make this clear with a silly example.

            No experienced user would (or rather should) try to create this matrix this way. The example is deliberately simple enough to make this point but also realistic as

            * posing a complication that must be matched in code, namely that values for 0 and 1 need to be placed in columns 1 and 2

            * looping over cross-combinations of values, some of which don't occur in the data

            Code:
            sysuse auto, clear 
            
            matrix counts = J(5, 2, .)
            
            forval i = 1/5 { 
                forval j = 0/1 { 
                    quietly count if foreign == `j' & rep78 == `i'
                    matrix counts[`i', `j' + 1] = r(N)
                }
            }
            
            matrix colnames counts = 0 1 
            matrix rownames counts = 1 2 3 4 5 
            
            matrix list counts 
            
            counts[5,2]
                0   1
            1   2   0
            2   8   0
            3  27   3
            4   9   9
            5   2   9
            In short, the principle is to populate a matrix directly as soon as you get a result to put inside. No need for local macros!

            Counting is simple as observations that don't exist correspond to a count result of 0. But the logic is not much different if the corresponding statistic is missing.
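
            To make that last point concrete, here is a minimal sketch of the same loop with a mean in place of a count; the matrix name means and the use of price are just choices for illustration. The matrix starts out filled with missings, so a cell is only overwritten when the combination occurs in the data.

            ```stata
            sysuse auto, clear

            matrix means = J(5, 2, .)

            forval i = 1/5 {
                forval j = 0/1 {
                    summarize price if foreign == `j' & rep78 == `i', meanonly
                    * r(N) is 0 for combinations that never occur; leave those cells missing
                    if r(N) > 0 matrix means[`i', `j' + 1] = r(mean)
                }
            }

            matrix colnames means = 0 1
            matrix rownames means = 1 2 3 4 5

            matrix list means
            ```

            Row and column position carry the information about which group a result belongs to, so nothing needs to be tracked separately.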



            • #7
              Nick has pointed to potential problems with local macros. I will add one obvious problem: value labels need not conform to naming conventions in Stata. For example, x_e1-x_n1-x_l1 is not a valid (local macro) name. In this unfortunate case, Stata will not complain (where I think it should) and will store "-x_n1-x_l1" and anything following in the local macro x_e1. Also, for statistics that are not calculated by cell_x, and assuming that cell_x has repeated values (otherwise by would not be needed), you would end up assigning different values (strings) to the same local macro name. So in general, I do not think that your approach is easily feasible.
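
              If macro names derived from value labels were wanted regardless, the labels would first have to be turned into legal names. A minimal sketch, using the first three observations from #1 and assuming that strtoname() yields a distinct name for every label (which is not guaranteed in general); the macro names lbl and name are only illustrative:

              ```stata
              clear
              input float(cell_x unemp)
              1 .06640625
              2 .14285715
              4 .04
              end
              label define cell_x 1 "x_e1-x_n1-x_l1" 2 "x_e1-x_n1-x_l2" 4 "x_e1-x_n2-x_l1"
              label values cell_x cell_x

              forvalues i = 1/`=_N' {
                  * value label of cell_x in observation i
                  local lbl : label (cell_x) `=cell_x[`i']'
                  * replace characters that are illegal in names with underscores
                  local name = strtoname("`lbl'")
                  local `name' = unemp[`i']
              }

              display `x_e1_x_n1_x_l1'
              ```

              The hyphens become underscores, so the macro is x_e1_x_n1_x_l1 rather than the label itself, and the value is held as text, with the possible loss of precision that Nick mentioned.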

              Nick has also pointed to commands, some of which you already use. From what I understand, I would probably collect the statistics in frames or (temporary) datasets, then merge them together and create the matrix from the combined dataset. But it remains hard to suggest specific code for generic problem descriptions.
              Last edited by daniel klein; 22 Aug 2022, 04:45.



              • #8
                Stata has a command devised specifically to accumulate results from analysis. You can see
                Code:
                help post
                for documentation of this command, which has been around for quite some time.

                In Stata 16 the frame command was introduced, and documented in
                Code:
                help frame
                including the frame post subcommand, an improved version of the post command, which I demonstrate in the example below.
                Code:
                sysuse auto, clear
                
                local ylist weight length mpg
                
                frame create results str20 y str20 stat float value
                
                foreach y of local ylist {
                    summarize `y'
                    frame post results ("`y'") ("mean") (r(mean))
                    regress `y' price
                    frame post results ("`y'") ("beta") (e(b)[1,1])
                }
                
                frame results: list, clean noobs
                Code:
                . frame results: list, clean noobs
                
                         y   stat       value  
                    weight   mean    3019.459  
                    weight   beta    .1419244  
                    length   mean    187.9324  
                    length   beta      .00326  
                       mpg   mean     21.2973  
                       mpg   beta   -.0009192
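
                Should a matrix still be needed at the end, the posted results can be copied over from the frame. A minimal sketch, assuming one numeric group identifier and one value per group (the names results, group, avg, and R are only illustrative, and the frame results must not already exist):

                ```stata
                sysuse auto, clear

                frame create results int group float avg

                forvalues i = 1/5 {
                    summarize mpg if rep78 == `i' & foreign == 1, meanonly
                    * post only the groups that occur, together with their identifier
                    if r(N) > 0 frame post results (`i') (r(mean))
                }

                * copy the frame into a matrix, using the group identifier for the row names
                frame results: mkmat avg, matrix(R) rownames(group)
                matrix list R
                ```

                Here only groups 3, 4, and 5 occur among foreign cars, so the matrix has three rows, each named after its group.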



                • #9
                  Thanks so much, William! My code is so much better now after using frame create/post.

