Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping Data yearly

    Hello everyone I want to reshape my data so I can have one Collumn for each one of my 4 variables in Variables variable(TRESG, ENSCOR, CGSCOR, SOSCOR) I want to keep Company Code as it is and I want to make it so I can have the years on one collumn as well. I tried by reshape long Q, i( CompanyCode Variables) j(Year) but it says variable id does not uniquely identify the observations. Variable id is CompanyCode and Variables right? Why does it give me this message? and how do i solve this problem. Thanks in advance



    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str6(CompanyCode Variables) float(Q2004 Q2005 Q2006 Q2007 Q2008 Q2009 Q2010 Q2011 Q2012 Q2013 Q2014 Q2015 Q2016 Q2017 Q2018 Q2019 Q2020 Q2021 Q2022)
    "997863" "TRESG"  17.86  16.4 18.73 31.19 45.45 62.79 61.86 60.46 59.82 57.08 51.95  61.1 56.55 57.74 61.58 66.94 66.83 68.59 69.82
    "997863" "ENSCOR"  7.52 11.68 11.03  40.2 41.08 71.52 74.44  79.5 80.75 79.04 70.36 72.67 70.85 74.22 73.67 73.53 69.81 72.67 74.69
    "997863" "CGSCOR" 26.86 17.17  25.5  23.6 62.74 63.71 57.86 33.52 28.63  38.1 31.26 37.79 32.53 30.43 55.22 69.08 73.97 75.91 76.24
    "997863" "SOSCOR" 20.47 20.15 20.63 28.72 36.18 54.02 53.32  63.6 64.48 51.42 50.89 68.36 61.83 63.53  55.3 59.22 58.58  59.2 60.39
    "679821" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 27.27     .     .
    "679821" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 21.31     .     .
    "679821" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  49.3     .     .
    "679821" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.93     .     .
    "2800K7" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 15.07  18.7 17.16
    "2800K7" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 11.97 14.38 14.67
    "2800K7" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 31.23 40.45 37.28
    "2800K7" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  4.82  4.99  3.67
    "31938W" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     . 41.09 57.07 67.35 70.74  72.8
    "31938W" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 37.21 63.48 66.25  75.1 83.45
    "31938W" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 31.86 51.06 73.32 72.94 64.96
    "31938W" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .  48.9 56.47 64.59  66.7 70.56
    "749352" "TRESG"      .     .     .     .     .     .  7.18  7.46 11.21 11.79 12.36 17.96 25.61 54.14 57.93 60.83 64.25 67.05 65.03
    "749352" "ENSCOR"     .     .     .     .     .     .  2.38   .28  1.83  2.45  3.95 12.28 21.34 54.99 56.03 58.01 60.87 68.79  68.1
    "749352" "CGSCOR"     .     .     .     .     .     .  3.57 11.09 17.17 17.26 14.15 10.63 24.12 25.33 35.88 39.58  49.1 56.45 50.87
    "749352" "SOSCOR"     .     .     .     .     .     . 14.33 11.27 15.26 16.18 18.69 28.77 30.65 75.32  76.5 79.63  78.9 73.54 73.01
    "15437R" "TRESG"      .     .     .     . 36.47    51 42.44  39.4 55.14 54.62 60.96 60.64 67.48 72.04  69.7  73.4 73.54 71.74     .
    "15437R" "ENSCOR"     .     .     .     . 19.14  17.8 20.61 19.71 58.48 55.77  63.6 64.72 72.02 37.12 36.07 32.61  35.5 43.27     .
    "15437R" "CGSCOR"     .     .     .     . 31.86 69.13 54.38 51.05 55.19 65.73 58.93 61.85 72.24  77.7 81.55 89.43 92.02  83.9     .
    "15437R" "SOSCOR"     .     .     .     . 50.13 52.27 45.07  41.5 61.67  54.6 67.79 66.59 72.48 78.07 70.86  73.6 71.18 71.18     .
    "88676H" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 28.32 34.13 41.27 46.12
    "88676H" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .   .55   .53 16.19 31.84
    "88676H" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 44.98 72.09 76.72 68.18
    "88676H" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 41.35 39.46 41.07 44.92
    "69212W" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  8.26 14.96 18.56  15.3
    "69212W" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0     0     0
    "69212W" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 16.69 29.62 38.22 33.62
    "69212W" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  6.67 12.49  14.5 10.55
    "922924" "TRESG"  64.56 89.18 86.84 89.91 90.47 91.13 89.24 93.42 91.44 89.72 92.89  88.9 87.22 89.62 92.74 93.79 94.06 94.36 91.81
    "922924" "ENSCOR" 71.16 72.04 66.73 88.55 95.08 90.27  91.6 93.77 92.51 92.46 98.12 97.59 97.71 98.05 97.85 96.49  96.6  97.4 97.48
    "922924" "CGSCOR" 80.92 98.12 98.12 95.16 94.68 94.36 88.31 95.49 90.72 87.95 94.03 84.52 67.99 73.13  83.7 90.01 91.47  90.8 88.54
    "922924" "SOSCOR" 46.05 98.04 96.65 87.15 83.05 89.46 87.79 91.52 91.01 88.57 87.24 84.29 92.29 94.48 94.97  94.2 93.71 94.29 89.13
    "692458" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     . 18.96 25.25 29.77 30.92 33.92
    "692458" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .  6.95  9.18  9.35  9.24  11.1
    "692458" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 11.72 26.87 34.41 34.39 41.61
    "692458" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 31.79 29.05 31.53  34.5 32.91
    "928896" "TRESG"      .     .     .     .     .     .     .     .     .     .     . 20.47 32.12 28.27 29.01 29.15 28.65 32.84 26.22
    "928896" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     0 13.64 12.28    10 11.11 11.38 21.97     0
    "928896" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     . 12.74 18.94 11.35 16.01 15.74 21.26 29.67 22.53
    "928896" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     . 36.47 54.96 55.49 51.75 52.18 43.33 40.07 38.31
    "54156J" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  8.94 21.39 28.63
    "54156J" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0     0
    "54156J" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 14.08 45.36 59.02
    "54156J" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  9.84 10.72 16.22
    "8721AT" "TRESG"      .     .     .     .     .     .     .     .     .     .     . 65.72  65.1 71.36 71.27 64.81 71.02 72.75 73.59
    "8721AT" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     . 85.17  86.3 80.58 88.21 89.66 90.23 92.81 91.62
    "8721AT" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     . 42.61 48.41 64.38 72.62  52.9    72 74.94 78.16
    "8721AT" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     . 78.01 72.59 73.75 65.38 66.24 64.73 65.33 65.04
    "87677L" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 18.27 23.71 22.79
    "87677L" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 18.01 21.21 20.83
    "87677L" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 19.54 29.82 29.61
    "87677L" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.06 19.63 17.07
    "2580QL" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 49.56
    "2580QL" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 69.23
    "2580QL" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 44.54
    "2580QL" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 37.78
    "36228U" "TRESG"   32.9 26.12  38.2  57.3 81.45  85.8 79.06 73.49 74.94 73.23 74.82 74.42 77.48 73.16 80.19 82.75 84.16 84.06 86.19
    "36228U" "ENSCOR" 37.06 33.41 62.23 51.16  90.8 88.71 88.66 87.07 86.92  86.5 87.98 87.69 87.89 63.38 48.44  50.3 48.84 54.41 54.23
    "36228U" "CGSCOR" 39.35 24.61 38.31 45.83  85.6 93.12 95.26  90.5 90.71 88.21 87.05 81.73 84.48 67.78 87.56 91.14 93.93  91.7 92.71
    "36228U" "SOSCOR" 33.19 34.67 39.01 80.84 78.47 81.09 63.02 54.78 57.99 57.01 61.84 66.84 71.23 83.01 82.75 84.33 85.14 85.56 89.83
    "28239C" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     . 23.03 23.21 30.09 31.63  27.3
    "28239C" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .  5.42  5.82 31.69 39.02 15.88
    "28239C" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 48.03 50.69 43.71 41.16 55.33
    "28239C" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 18.27 15.53  13.6 12.28 11.65
    "9084YJ" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 18.25 27.22 27.11 28.11
    "9084YJ" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 31.75  32.6 31.47 37.21
    "9084YJ" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  9.29 16.05 15.89 16.56
    "9084YJ" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.88 33.01 33.51 32.15
    "75626G" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 38.55 39.43 47.73    35
    "75626G" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 24.67 30.75  49.1  22.1
    "75626G" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 32.85 35.31 33.64 34.26
    "75626G" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 57.62 51.97 62.04 47.69
    "2812RG" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.37 16.95  18.9
    "2812RG" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0     0
    "2812RG" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 46.72 46.14 47.66
    "2812RG" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  1.23   .78  3.76
    "2580D4" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  20.2 21.15 35.39
    "2580D4" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0 11.37
    "2580D4" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 28.99 32.86 50.13
    "2580D4" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.03 14.92 26.63
    "257943" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     . 17.54 19.45 30.91 39.59 40.42
    "257943" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0 45.44 46.29 54.79
    "257943" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 10.01  13.8 24.88 40.77 40.12
    "257943" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     . 32.38 32.79 32.86 35.89 35.76
    "741112" "TRESG"  63.96 73.51 80.79 88.66 78.32 80.74 82.52 85.03 78.94 77.42 78.68 77.94    83 84.15 89.66 87.89 88.18 88.29 88.23
    "741112" "ENSCOR" 45.63 76.86 71.52 93.08 84.43 84.72 84.03 88.59 88.73 81.26 83.53 85.78  83.4 89.85 92.72 92.49 91.27 91.47 90.78
    "741112" "CGSCOR" 81.05 83.49  79.5 77.71  59.5 64.02 66.72 69.17 52.56 60.96 58.97    56 68.83 64.78 78.12 67.43 70.35 69.96 71.26
    "741112" "SOSCOR" 70.17 62.88 91.14 92.12 85.77 88.83 92.46 92.92 88.15 85.47 88.06 86.01 92.97 92.43 94.91 98.08 97.98 98.35 97.96
    "9224GN" "TRESG"      .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  2.92   3.9 12.32
    "9224GN" "ENSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     0     0     0
    "9224GN" "CGSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  3.19  7.14  8.15
    "9224GN" "SOSCOR"     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .     .  6.17  5.59 30.94
    "929363" "TRESG"   52.8 65.81 62.12 59.56 53.64  51.9 61.22 69.42 64.25 69.02 68.73 75.16 75.08 79.48 77.96 81.13 82.54 82.85 81.99
    "929363" "ENSCOR" 81.48 87.99  92.3 83.52  80.9 79.74    83 87.08 87.23 83.69 91.02 90.06 88.18 78.92 74.51 75.42 79.96 77.91 75.21
    "929363" "CGSCOR" 47.77 37.46 36.14 47.39 39.26 23.17  45.2  62.8 52.65 71.97 52.74 42.99 45.05 66.44 69.89 71.47  76.2 78.53 77.48
    "929363" "SOSCOR" 38.95  71.4 61.33 53.29 46.88 54.34 58.83 63.24  58.2 58.25 66.01 87.67 87.23 88.51 85.41 90.99 88.32  88.7 89.06
    end

  • #2
    In addittion I have a question If I want to merge this dataset by string(CompanyCode) with another dataset will stata let me do it?

    Comment


    • #3
      I cannot replicate your problem using your example data in my setup. -reshape long Q, i(CompanyCode Variables) j(Year)- works without error messages, and is the right first step towards what you are looking for. And the complete code for your task is:
      Code:
      reshape long Q, i(CompanyCode Variables) j(Year)
      reshape wide Q, i(CompanyCode Year) j(Variables) string
      rename Q* *
      That said, even though it is not happening in your example data, I have never known Stata to be wrong when it tells you that some variable(s) it expects will uniquely identify observations fails to do so. So I am quite confident that somewhere in your full data set there is one or more combinations of CompanyCode and Variables that recurs. To find it:
      Code:
      duplicates tag CompanyCode Variables, gen(flag)
      browse if flag
      The next question is how to resolve the problem. This situation may represent an error in data management leading to the creation of this data set. The question is how the surplus observations got there, and what to do about them. There are several possibilities.
      1. The surplus observations are exact duplicates of each other in all variables, but they are supposed to be there. This could arise, for example, if some companies have subdivisions that contributed separate observations to the data set. The surplus observations correspond to different subdivisions, but the subdivision variable is not present, and it doesn't matter because the variables that are present are all defined on the company level. In that case, you can happily run -duplicates drop- and then proceed to -reshape-.
      2. The surplus observations are exact duplicates of each other in all variables but they are not supposed to be there. It is simple enough, in this case, to eliminate them with the -duplicates drop- command. But before doing that, you should review your data management that led to the production of this data set. Proper data management would not have produced these surplus observations, so something went wrong somewhere along the line. And where one thing goes wrong, other errors may lurk. So it is best to try to find the source of the problem, and fix that and re-generate the data rather than just eliminating the surplus observations and being left with a data set that will pass the -reshape- command's tests but might be wrong data. If, however, the data set was created by others and you have no way of reviewing its data mangement or getting the purveyor to do so, then -duplicates drop- is your best bet.
      3. The surplus observations disagree on some variables other than CompanyCode and Year. Now you know for certain that the data are wrong because they are internally contradictory. Here you have no choice but to resolve the inconsistencies. How to do that depends on what produced them, and you have to do a detailed review of how this data set came to be and fix those problems. If the data set was provided by somebody else, you should make them aware that their data has internal contradictions and insist that they resolve them (or provide you with enough information that you can do so.)

      Comment


      • #4
        Re #2, the answer is that it depends.

        The data set that the code in #3 will create will have observations uniquely identified by the combination of CompanyCode and Year. First, to state the obvious, the other data set must contain a variable named CompanyCode, and its system of codes for companies must be the same as that of the first data set in order to assure that appropriate observations from them are paired. Also important is the organization of the data.

        If the data set you wish to merge it with has that same organization, then you will want to -merge 1:1 CompanyCode Year using other_data_set-, and that will work just fine.

        If the other data set is not longitudinal but has only one observation per company, then you will want to -merge m:1 CompanyCode using other_data_set-. And this will also work just fine.

        Where you will hit a snag is if the other data set has multiple observations per CompanyCode but those multiple observations are not identifiable by a Year variable but represent something else (geographic regions, subdivisions, time periods other than Year, etc.) In the case of the identifying additional variable(s) being non-chronological, these data sets cannot be -merged- because there is no way to know which observation for a given CompanyCode in the second data set should be paired with which Year for that ComopanyCode in the first data set. It might be the case that what you need is to pair each of the CompanyCode's observations in the second data set with each Year of that CompanyCode's observations in the first. If that is the case, you must use the -joinby- command. If you are not familiar with it, see -help joinby- to learn its syntax, which is somewhat similar to -merge-'s.

        If, however, the identifying additional variable(s) in the second data set are chronological, but just not years, then what you need to do is harmonize the time in the two data sets. Assuming that the time periods in the second data set are shorter than years (e.g. months, or quarters, or days) then you can create a new Year variable in the second data set by extracting the year from the shorter time variable (see -help yofd()-, -help dofm()-, and -help dofq()- if you don't know how to do this). Then you can -merge 1:m CompanyCode Year using other_data_set-. Note that this process of extracting the Year from a date variable that is denominated in a smaller time unit is predicated on the date variable's being a true Stata internal format numeric date variable. If what you have is a string variable that looks like "Q3 2006" or "Sep 2021", you will first need to convert that to a Stata internal format date variable. (If this is your situation and you need assistance doing that, post back with example data. If you are going to work with longitudinal (aka panel) data on a regular basis you really need to learn generally about Stata internal format date variables and the functions that create them, and transform among them, as well as how they can be displayed in a way that makes them readable by humans--but that is going to take you some time, and if you are pressed for quick results on this project, you can get help here to handle the immediate problem for now.)

        Comment

        Working...
        X