  • Reshaping long for variables with two indexes

    Dear Statalisters,

    I am having some trouble with -reshape long-, which I am using for the first time and do not yet fully understand. Briefly: I ran a regression with an interaction between my first variable (var1, 8 values) and my second variable (var2, 7 values), and I generated 8*7 = 56 coefficient variables named coef_`i'_`j', where i is the value of var1 and j is the value of var2. I would now like to reshape these to long form, that is, to end up with a single variable coef whose value in each observation is the coefficient matching that observation's var1 and var2. Here is a data example, although I cannot show the full picture because of memory restrictions:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float var1 byte var2 float(coef_3_1 coef_4_5 coef_2_7)
    1 5 .2391724 -.037122227 .021247435
    1 6 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 8 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 9 .2391724 -.037122227 .021247435
    1 9 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 3 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 9 .2391724 -.037122227 .021247435
    1 8 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 9 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 1 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    1 5 .2391724 -.037122227 .021247435
    1 2 .2391724 -.037122227 .021247435
    1 7 .2391724 -.037122227 .021247435
    end

    Long story short, I'd like my data to look like this:
    var1 var2 coef
    1 1 coef_1_1
    1 2 coef_1_2
    1 3 coef_1_3
    ... ... ...
    2 1 coef_2_1
    ... ... ...
    4 1 coef_4_1
    ... ... ...
    8 7 coef_8_7
    If my coef variable had only one index, namely coef_`i', I would know how to do this, but I can't seem to find the appropriate code for a combination of two indexes. Any help would be much appreciated!

    Regards,

    Hugo

  • #2
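    Is this useful?
    Code:
    * the coefficient variables are constant across observations, so keep one row per (var1, var2) pair
    duplicates drop
    * double the stub so that _j keeps the full name, e.g. "coef_3_1"
    rename coef* coefcoef*
    reshape long coef, i(var1 var2) string
    * keep only the coefficient whose indexes match this row's var1 and var2
    keep if var1 == real(substr(_j,6,1)) & var2 == real(substr(_j,8,1))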

    • #3
      Dear Øyvind,

      Thank you for your reply. There is actually something I forgot to mention: var1 stands for country numbers. I wrote that it can take 8 different values, but it may take more depending on how my assignment goes. Your code worked perfectly well for 8 countries, but when I tried it on another sample of 20 countries, the coef variable was sorted wrongly. Please look at the data:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input float var1 byte var2 str9 _j float coef
      1 1 "coef_10_1"   .22906532
      1 1 "coef_11_1"    .4191665
      1 1 "coef_12_1"    .8610222
      1 1 "coef_13_1"    .3474927
      1 1 "coef_14_1"   .13066871
      1 1 "coef_15_1"    .4573288
      1 1 "coef_16_1"    .2507517
      1 1 "coef_17_1"   .11583392
      1 1 "coef_18_1"   .46527685
      1 1 "coef_19_1"   .09675107
      1 2 "coef_10_2"    .2396189
      1 2 "coef_11_2"   .35154355
      1 2 "coef_12_2"    .4471683
      1 2 "coef_13_2"    .3643037
      1 2 "coef_14_2"    .1411443
      1 2 "coef_15_2"   .20535725
      1 2 "coef_16_2"    .5040531
      1 2 "coef_17_2"   .11987325
      1 2 "coef_18_2" -.017533852
      1 2 "coef_19_2"    .2263153
      1 3 "coef_10_3"    .2470427
      1 3 "coef_11_3"   .15849788
      1 3 "coef_12_3"    .3568971
      1 3 "coef_13_3"     .230723
      1 3 "coef_14_3"   .04476113
      end
      As you can see, Stata puts coef_1x_1 (x = 0, 1, 2, ..., 9) before coef_1_1. Again, your code is not the problem: I forgot to mention that the number of var1 values could change, apologies. Could you please indicate how your code should be modified for more than 10 values of var1 (22, to be precise)?

      Regards,

      Hugo
      Last edited by Hugo Denis; 27 May 2022, 03:58.
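
      A note on the likely cause, offered as a reading of the code in #2 rather than something stated in the thread: substr(_j,6,1) and substr(_j,8,1) read one character at fixed positions, so they only line up with the indexes while the first index has a single digit. The ordering shown above is ordinary alphabetical sorting of the string _j. A minimal check on one two-digit name:

      Code:
      * character positions in "coef_10_1": c=1 o=2 e=3 f=4 _=5 1=6 0=7 _=8 1=9
      display substr("coef_10_1", 6, 1)        // first index, cut down to a single digit
      display substr("coef_10_1", 8, 1)        // an underscore, not the second index
      display real(substr("coef_10_1", 8, 1))  // real() of a non-numeric string is missing

      Because the second comparison is against a missing value, the -keep if- condition in #2 cannot be true for any name whose first index has two digits.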

      • #4
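        Try moss (from SSC):
        Code:
        * moss is user-written; install it once with: ssc install moss
        duplicates drop
        rename coef* coefcoef*
        reshape long coef, i(var1 var2) string
        * extract each run of digits in _j into _match1, _match2, ...
        moss _j, match("([0-9]+)") regex
        keep if var1 == real(_match1) & var2 == real(_match2)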
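
        If installing moss is not an option, the same idea can be sketched with only the built-in regexm() and regexs() functions. The names idx1 and idx2 are hypothetical helpers, and the pattern assumes every _j has the form coef_<digits>_<digits>:

        Code:
        duplicates drop
        rename coef* coefcoef*
        reshape long coef, i(var1 var2) string
        * idx1 and idx2 are hypothetical names for the two parsed indexes
        generate idx1 = real(regexs(1)) if regexm(_j, "^coef_([0-9]+)_([0-9]+)$")
        generate idx2 = real(regexs(2)) if regexm(_j, "^coef_([0-9]+)_([0-9]+)$")
        * keep only the coefficient that belongs to this row's var1 and var2
        keep if var1 == idx1 & var2 == idx2
        drop idx1 idx2

        Since the digits are matched as whole runs, this should not depend on how many digits either index has.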

        • #5
          Dear Øyvind,

          Thank you so much for your reply. I was able to solve it by simply adding leading zeros to the indexes before generating the coef_i_j variables. The problem is solved!
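
          The exact code used is not shown in the thread. As a rough sketch of the leading-zero idea, applied here by renaming variables that already exist rather than at generation time, and assuming var1 runs to at most 99 and var2 is a single digit:

          Code:
          * pad single-digit var1 indexes: coef_3_1 becomes coef_03_1
          forvalues i = 1/9 {
              forvalues j = 1/7 {
                  * capture skips any (i, j) combination that does not exist
                  capture rename coef_`i'_`j' coef_0`i'_`j'
              }
          }
          duplicates drop
          rename coef* coefcoef*
          reshape long coef, i(var1 var2) string
          * names now have fixed width, e.g. "coef_03_1": var1 at positions 6-7, var2 at position 9
          keep if var1 == real(substr(_j, 6, 2)) & var2 == real(substr(_j, 9, 1))

          With fixed-width names, alphabetical order of _j also agrees with numeric order of the first index.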
