Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problems reshaping data

    Hello,

    I am trying to reshape in a way that I can have separate columns for Country, Variable. IND, Industry,Time, Unit and PowerCode. Value column shows the number for each variable. However, when I do the following code:

    Code:
     reshape wide Variable IND Industry Time Unit PowerCode Value, i(Country) j(Variable) string
    I get the error that Variable has missing values (though Variable just describes each variable). Then, I use trim() to remove any leading or trailing spaces and then proceed with the reshape command. However, I get "missing values". I suspect that the commas are disorganizing things. Thank you very much!

    The dataex is

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str14 Country str85 Variable str7 IND str173 Industry int Time str18 Unit str9 PowerCode double Value
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1970 "Czech Koruna" "Millions"    434807
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1971 "Czech Koruna" "Millions"  455923.9
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1972 "Czech Koruna" "Millions"    475945
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1973 "Czech Koruna" "Millions"  502146.4
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1974 "Czech Koruna" "Millions"  535840.8
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1975 "Czech Koruna" "Millions"  580369.8
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1976 "Czech Koruna" "Millions"  610374.2
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1977 "Czech Koruna" "Millions"  631644.7
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1978 "Czech Koruna" "Millions"  667986.4
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1979 "Czech Koruna" "Millions"  690771.4
    "Czech Republic" "Value added, volumes"                      "D05T39" " Industry including energy [B-E]"                                                                        1980 "Czech Koruna" "Millions"  696086.8
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1970 "Czech Koruna" "Millions"    5593.6
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1971 "Czech Koruna" "Millions"    6163.7
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1972 "Czech Koruna" "Millions"    6691.9
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1973 "Czech Koruna" "Millions"    7096.7
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1974 "Czech Koruna" "Millions"    6753.2
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1975 "Czech Koruna" "Millions"    7345.3
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1976 "Czech Koruna" "Millions"    7832.1
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1977 "Czech Koruna" "Millions"    7400.9
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1978 "Czech Koruna" "Millions"    7738.7
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1979 "Czech Koruna" "Millions"    7261.6
    "Czech Republic" "Intermediate inputs, volumes"              "D55T56" " Accommodation and food service activities [I]"                                                          1980 "Czech Koruna" "Millions"    7681.7
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1970 "Czech Koruna" "Millions" 3302213.5
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1971 "Czech Koruna" "Millions"   3497501
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1972 "Czech Koruna" "Millions" 3705991.4
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1973 "Czech Koruna" "Millions" 3891624.2
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1974 "Czech Koruna" "Millions" 4058088.5
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1975 "Czech Koruna" "Millions"   4263618
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1976 "Czech Koruna" "Millions" 4404485.3
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1977 "Czech Koruna" "Millions" 4579830.7
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1978 "Czech Koruna" "Millions"   4771747
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1979 "Czech Koruna" "Millions" 4868862.2
    "Czech Republic" "Production (gross output), volumes"        "D01T99" "TOTAL"                                                                                                   1980 "Czech Koruna" "Millions" 4974724.3
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1970 "Czech Koruna" "Millions"    1153.1
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1971 "Czech Koruna" "Millions"    1306.1
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1972 "Czech Koruna" "Millions"    1154.2
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1973 "Czech Koruna" "Millions"     801.8
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1974 "Czech Koruna" "Millions"    1088.1
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1975 "Czech Koruna" "Millions"    1764.6
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1976 "Czech Koruna" "Millions"      1736
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1977 "Czech Koruna" "Millions"      2200
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1978 "Czech Koruna" "Millions"    2423.2
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1979 "Czech Koruna" "Millions"    3593.9
    "Czech Republic" "Value added, current prices"               "D58T63" " Information and communication [J]"                                                                      1980 "Czech Koruna" "Millions"    5591.6
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1970 "Czech Koruna" "Millions"    2589.4
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1971 "Czech Koruna" "Millions"    2845.9
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1972 "Czech Koruna" "Millions"    3070.6
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1973 "Czech Koruna" "Millions"    3308.4
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1974 "Czech Koruna" "Millions"    3510.1
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1975 "Czech Koruna" "Millions"    3783.8
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1976 "Czech Koruna" "Millions"    3982.6
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1977 "Czech Koruna" "Millions"    4048.3
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1978 "Czech Koruna" "Millions"    4211.6
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1979 "Czech Koruna" "Millions"    4490.8
    "Czech Republic" "Intermediate inputs, current prices"       "D85"    " Education [P]"                                                                                          1980 "Czech Koruna" "Millions"    4684.9
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1970 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1971 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1972 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1973 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1974 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1975 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1976 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1977 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1978 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1979 "Czech Koruna" "Millions"         0
    "Czech Republic" "Intermediate inputs, current prices"       "D99"    " Activities of extraterritorial organizations and bodies [U]"                                            1980 "Czech Koruna" "Millions"         0
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1970 "Czech Koruna" "Millions"   20907.2
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1971 "Czech Koruna" "Millions"   22408.7
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1972 "Czech Koruna" "Millions"   23300.9
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1973 "Czech Koruna" "Millions"     24386
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1974 "Czech Koruna" "Millions"   25351.5
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1975 "Czech Koruna" "Millions"   26875.2
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1976 "Czech Koruna" "Millions"   29168.3
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1977 "Czech Koruna" "Millions"   29348.8
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1978 "Czech Koruna" "Millions"   30302.1
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1979 "Czech Koruna" "Millions"     31440
    "Czech Republic" "Value added, current prices"               "D68"    " Real estate activities [L]"                                                                             1980 "Czech Koruna" "Millions"   33267.5
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1970 "Czech Koruna" "Millions"    292443
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1971 "Czech Koruna" "Millions"  317756.7
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1972 "Czech Koruna" "Millions"  352325.3
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1973 "Czech Koruna" "Millions"  370906.5
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1974 "Czech Koruna" "Millions"  400220.5
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1975 "Czech Koruna" "Millions"  430987.2
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1976 "Czech Koruna" "Millions"    453343
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1977 "Czech Koruna" "Millions"  451410.1
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1978 "Czech Koruna" "Millions"  462665.7
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1979 "Czech Koruna" "Millions"  464004.8
    "Czech Republic" "Production (gross output), volumes"        "D41T43" " Construction [F]"                                                                                       1980 "Czech Koruna" "Millions"    478697
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1970 "Index"        "Units"    14.157515
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1971 "Index"        "Units"    14.314716
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1972 "Index"        "Units"    14.530975
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1973 "Index"        "Units"    14.908714
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1974 "Index"        "Units"    15.244295
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1975 "Index"        "Units"    15.537503
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1976 "Index"        "Units"    16.145145
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1977 "Index"        "Units"     16.16526
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1978 "Index"        "Units"     16.43737
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1979 "Index"        "Units"    16.943795
    "Czech Republic" "Production (Gross Output), deflators"      "D69T82" " Professional, scientific and technical activities; administrative and support service activities [M-N]" 1980 "Index"        "Units"    17.847356
    "Czech Republic" "Production (gross output), current prices" "D35T39" " Electricity, gas and water supply; sewerage, waste management and remediation activities [D-E]"         1970 "Czech Koruna" "Millions"   19330.4
    end




  • #2
    The -reshape- command is failing for a number of reasons. First, the variable Variable cannot serve as the -j()- option because its values are not suitable as variable names. So the first thing that must happen is the Variable must be "edited" into strings that can serve as variable names: no more than 31 characters,* letters and digits only, and the first character may not be a digit. Next, the -i()- option must consist of a variable or group of variables which, together with the -j()- variable, uniquely identifies observations. Looking at the data, it appears that -i()- should be Country IND Industry Time. It is the variable Value that is truly to be reshaped wide.

    Code:
    gen variable = strtoname(substr(trim(itrim(Variable)), 1, 31))
    
    drop Variable
    rename Value _
    reshape wide _, i(Country IND Industry Time) j(variable) string
    rename _* *
    *A variable name can be 32 characters long. However, -reshape- forms the variable names by appending the contents of the -j()- variable to whatever variable(s) is (are)named in the -reshape- varlist. Since such the variable(s) in the varlist are necessarily at least 1 character long, the contents of the -j()- variable cannot be more than 31 characters long.

    Comment


    • #3
      Thanks! I did not about the actual proper variable names. The variables that uniquely identify each observation are Country, Industry and Time. IND is the code for Industry. But I still want to keep it (it will make the plots easier when I have to divide by groups). When I do your code, particularly in the last part

      Code:
        reshape wide _, i(Country IND Industry Time) j(variable) string
      I get that variable contains missing value but when I tab it

      Code:
       tab Variable if missing(Variable)
      I find no missing observation. What could be the reason?

      Thank you so much!

      Comment


      • #4
        The variable variable is lower case. I do make that adjustment.
        Code:
         tab variable if missing(variable)

        Comment


        • #5
          -tab Variable if missing(Variable)- is not going to show you the missing values, because -tab- excludes missing values. To get -tab- to show you the missings, you have to use its -missing- option.

          Code:
          tab Variable, missing
          will show you the counts of both the non-missing and missing values of Variable.

          In any case, it is clear that if the variable Variable has a missing value, that observation has no usable information. So I would -drop if missing(variable) before the-reshape- in my code in #2.

          Comment


          • #6
            When I do

            Code:
             reshape wide _, i(Country IND Industry Time) j(variable) string
            in your #2. I get that there are observations with Country IND Industry and Time that do not uniquely identify observations I drop duplicates

            Code:
             duplicates drop
            But this still the resphape does not work. When I do reshape error, the problem seems to be in Gross_fixed_capital_formation__ where are duplicates. Is there a way I can rename the variables differently rather than __. Because I have Gross Fixed Capital Formation in dollar value and in volumes. That seems to be the issue. The names are becoming the same. also, by the descriptive stats, I get an abnormally high number given that the two seemed to be added by the ___

            Code:
              tab variable
            
                                   variable |      Freq.     Percent        Cum.
            --------------------------------+-----------------------------------
               Consumption_of_fixed_capital |      5,930        3.15        3.15
            Gross_Fixed_Capital_Formation__ |      6,230        3.31        6.47
            Gross_capital_stock__ICT_equipm |      2,965        1.58        8.05
            Gross_capital_stock__current_re |      3,411        1.81        9.86
               Gross_capital_stock__volumes |      3,360        1.79       11.65
            Gross_fixed_capital_formation__ |     23,463       12.48       24.13
            Gross_operating_surplus_and_mix |      6,132        3.26       27.39
                   Hours_worked___employees |      4,445        2.36       29.76
               Hours_worked___total_engaged |      4,757        2.53       32.29
            Intermediate_inputs__current_pr |      7,902        4.20       36.49
             Intermediate_inputs__deflators |      5,660        3.01       39.50
               Intermediate_inputs__volumes |      5,729        3.05       42.55
            Labour_costs__compensation_of_e |      8,036        4.27       46.82
            Net_capital_stock__ICT_equipmen |      3,805        2.02       48.85
            Net_capital_stock__current_repl |      6,263        3.33       52.18
                 Net_capital_stock__volumes |      6,190        3.29       55.47
            Net_operating_surplus_and_mixed |      5,165        2.75       58.22
                        Number_of_employees |      4,987        2.65       60.87
            Number_of_persons_engaged__tota |      7,036        3.74       64.62
            Other_taxes_less_subsidies_on_p |      5,835        3.10       67.72
            Production__Gross_Output___defl |      5,949        3.16       70.89
            Production__gross_output___curr |      7,912        4.21       75.10
            Production__gross_output___volu |      5,982        3.18       78.28
                              Self_employed |      4,874        2.59       80.87
                Value_added__current_prices |      9,741        5.18       86.05
                     Value_added__deflators |      7,634        4.06       90.11
                       Value_added__volumes |      7,790        4.14       94.26
            Value_added_at_factor_costs__cu |      6,128        3.26       97.52
                         Wages_and_salaries |      4,666        2.48      100.00
            --------------------------------+-----------------------------------
                                      Total |    187,977      100.00



            I do have the variables Unit (particular currencies or PowerCode reflecting amounts). I also try the reshape there but it does seem to work
            Last edited by Hugo Rocha; 16 Oct 2023, 11:20.

            Comment


            • #7
              For the gross fixed capital formation problem, you need to edit those so as to retain the distinction and also bring the total length down. You might try this:
              Code:
              replace Variable = subinstr(Variable, "Gross Fixed Capital Formation", ///
                  "Gr Fxd Cap Frmtn", .)
              at the very start of the code.

              I do have the variables Unit (particular currencies or PowerCode reflecting amounts). I also try the reshape there but it does seem to work
              Does work or doesn't? And if it doesn't, what in particular goes wrong?

              All in all, I think the code needs to be extended along these lines:
              Code:
              replace Variable = subinstr(Variable, "Gross Fixed Capital Formation", ///
                  "Gr Fxd Cap Frmtn", .)
              gen variable = strtoname(substr(trim(itrim(Variable)), 1, 31))
              
              drop Variable
              rename Value _
              rename Unit U
              rename PowerCode P
              reshape wide _ U P, i(Country IND Industry Time) j(variable) string
              rename _* *



              Last edited by Clyde Schechter; 16 Oct 2023, 12:13.

              Comment


              • #8
                I still cannot reshape when I try Unit or PowerCode (that was just me trying out, that's what I meant). This is what I tried (just to answer your question, I don't think that's the way to go).

                Code:
                  reshape wide _, i(Country IND Industry Time PowerCode Unit) j(variable) string

                Code:
                  describe
                
                Contains data
                 Observations:       237,903                  
                    Variables:             8                  
                ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                Variable      Storage   Display    Value
                    name         type    format    label      Variable label
                ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                Country         str14   %14s                  Country
                Variable        str85   %85s                  Variable
                IND             str7    %9s                   IND
                Industry        str173  %173s                 Industry
                Time            int     %10.0g                Time
                Unit            str18   %18s                  Unit
                PowerCode       str9    %9s                   PowerCode
                Value           double  %10.0g                Value
                ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
                Sorted by: 
                     Note: Dataset has changed since last saved



                I still get reshape error. Since I have capital stock data, I am thinking about deleting any reference of Gross Fixed Capital Formation below

                Code:
                 tab Variable
                
                                               Variable |      Freq.     Percent        Cum.
                ----------------------------------------+-----------------------------------
                           Consumption of fixed capital |      5,930        3.15        3.15
                Gross Fixed Capital Formation, deflat.. |      6,230        3.31        6.46
                Gross capital stock, ICT equipment, s.. |      1,711        0.91        7.37
                Gross capital stock, ICT equipment, s.. |      1,262        0.67        8.04
                Gross capital stock, current replacem.. |      3,411        1.81        9.86
                           Gross capital stock, volumes |      3,360        1.79       11.64
                Gross fixed capital formation, ICT eq.. |      3,805        2.02       13.67
                Gross fixed capital formation, ICT eq.. |      3,096        1.65       15.31
                Gross fixed capital formation, ICT eq.. |      2,995        1.59       16.90
                Gross fixed capital formation, curren.. |      7,435        3.95       20.86
                 Gross fixed capital formation, volumes |      6,270        3.33       24.19
                Gross operating surplus and mixed inc.. |      6,132        3.26       27.45
                               Hours worked - employees |      4,445        2.36       29.81
                           Hours worked - total engaged |      4,757        2.53       32.34
                    Intermediate inputs, current prices |      7,902        4.20       36.54
                         Intermediate inputs, deflators |      5,660        3.01       39.55
                           Intermediate inputs, volumes |      5,729        3.05       42.59
                Labour costs (compensation of employe.. |      8,036        4.27       46.87
                Net capital stock, ICT equipment, sof.. |      3,805        2.02       48.89
                Net capital stock, current replacemen.. |      6,263        3.33       52.22
                             Net capital stock, volumes |      6,190        3.29       55.51
                 Net operating surplus and mixed income |      5,165        2.75       58.25
                                    Number of employees |      4,987        2.65       60.90
                Number of persons engaged (total empl.. |      7,036        3.74       64.64
                Other taxes less subsidies on product.. |      5,835        3.10       67.75
                   Production (Gross Output), deflators |      5,949        3.16       70.91
                Production (gross output), current pr.. |      7,912        4.21       75.11
                     Production (gross output), volumes |      5,982        3.18       78.29
                                          Self-employed |      4,874        2.59       80.89
                Value added at factor costs, current .. |      6,128        3.26       84.14
                            Value added, current prices |      9,741        5.18       89.32
                                 Value added, deflators |      7,634        4.06       93.38
                                   Value added, volumes |      7,790        4.14       97.52
                                     Wages and salaries |      4,666        2.48      100.00
                ----------------------------------------+-----------------------------------
                                                  Total |    188,123      100.00
                How can I delete all the Variables that reference anything related to Gross Fixed Capital formation?
                How Can I keep all the capital stocks separately (without adding up to avoid duplicates)
                After this I will do the reshape as you indicated, deleting the missing values before.
                Last edited by Hugo Rocha; 16 Oct 2023, 13:23.

                Comment


                • #9
                  How can I delete all the Variables that reference anything related to Gross Fixed Capital formation?
                  Code:
                  drop if strmatch(lower(Variable), "*gross fixed capital formation*")
                  How Can I keep all the capital stocks separately (without adding up to avoid duplicates)
                  You will have to use -replace- statements to change all of those values of Variable to something that is short, but doesn't reduce any originally different values to the same value. I can't give you specific advice on that because the -tabulate- output you show truncates all of the values, so I can't see the full names you are starting with.

                  I suggest you try to do this yourself as follows:
                  Code:
                  contract Variable if strmatch(lower(Variable), "*capital stock*")
                  browse
                  Examine the different values of Variable you see, which will be all the ones that refer to capital stock. Then figure out a set of short names that will retain all of the differences between them. Write a bunch of -replace- statements similar to the one in #7, and put those at the beginning of the code.

                  If you can't work it out on your own, post back using -dataex- showing the data as it is after the -contract- command above and I'll give it a try.

                  Comment


                  • #10
                    Actually, you have been very helpful. Thank you so much! It worked out at the end. Let me show you my do-file because I think you are curious to know how it ended up.

                    Code:
                       // Cleaning STAN into Stata//
                    
                     import excel "/Users/hvacapereirarocha/Desktop/EU/STAN database/STAN_1970.xlsx", sheet("1970 to 1980") firstrow clear //Import Data  
                    
                     //Check all capital stock//
                     
                    contract Variable if strmatch(lower(Variable), "*capital stock*")
                    browse //Check all variables related to capital stock
                    
                     import excel "/Users/hvacapereirarocha/Desktop/EU/STAN database/STAN_1970.xlsx", sheet("1970 to 1980") firstrow clear //Import Data again knowing the full names for capital stock
                     drop if strmatch(lower(Variable), "*gross fixed capital formation*") // Since I have capital stock, I drop all gross fixed capital
                    
                    //Substitute names for capital stock// 
                    
                    replace Variable = subinstr(Variable, "Gross capital stock, ICT equipment, software and databases, current replacement costs",  "GCS_ICT_cost", .)
                    
                    replace Variable = subinstr(Variable,"Gross capital stock, ICT equipment, software and databases, volumes" ,  "GCS_ICT_vol", .)
                    
                    replace Variable = subinstr(Variable,"Gross capital stock, current replacement costs" ,  "GCS_cost", .)
                    
                    replace Variable = subinstr(Variable,"Gross capital stock, volumes" ,  "GCS_vol", .)
                    
                    replace Variable = subinstr(Variable,"Net capital stock, ICT equipment, software and databases, current replacement costs" ,  "NCS_ICT_cost", .)
                    
                    replace Variable = subinstr(Variable,"Net capital stock, current replacement costs" ,  "NCS_cost", .)
                    
                    replace Variable = subinstr(Variable,"Net capital stock, volumes" ,  "NCS_vol", .)
                    
                    
                    drop if missing(Variable)
                    
                     gen variable = strtoname(substr(trim(itrim(Variable)), 1, 31))
                     
                     drop Variable
                     
                     rename Value _
                    
                    
                     reshape wide _, i(Country IND Industry Time Unit PowerCode) j(variable) string
                     
                     
                     save "/Users/hvacapereirarocha/Desktop/EU/STAN database/STAN_1970.dta" //I will repeat the same process for the rest of the panels.
                    When I summarize the data

                    Code:
                     sum
                    
                        Variable |        Obs        Mean    Std. dev.       Min        Max
                    -------------+---------------------------------------------------------
                         Country |          0
                             IND |          0
                        Industry |          0
                            Time |     34,424    1975.468    3.150395       1970       1980
                            Unit |          0
                    -------------+---------------------------------------------------------
                       PowerCode |          0
                    _Consumpti~l |      5,930    218089.3     1788000          0   4.33e+07
                    _GCS_ICT_c~t |      1,711    1030.495     3091.93          0      33681
                    _GCS_ICT_vol |      1,262    1206.716    6048.499          0      71050
                       _GCS_cost |      3,411    593110.3     3834690          0   1.07e+08
                    -------------+---------------------------------------------------------
                        _GCS_vol |      3,360     4540173    2.37e+07          0   4.82e+08
                    _Gross_ope~x |      6,132    577907.5     6862416  -2.66e+08   2.67e+08
                    _Hours_wor~s |      4,445    1062.399    8041.569          0     167679
                    _Hours_wor~d |      4,757    915.5721    3203.952          0   42612.09
                    _Intermedi~r |      7,902     1375218    1.12e+07          0   2.92e+08
                    -------------+---------------------------------------------------------
                    _Intermed~rs |      5,660    33.13423    34.80268   2.396706    1397.26
                    _Intermed~es |      5,729     1781290    1.62e+07          0   2.87e+08
                    _Labour_co~e |      8,036    620688.1     5045124          0   1.28e+08
                    _NCS_ICT_c~t |      3,805    21379.68    386537.2          0   1.31e+07
                       _NCS_cost |      6,263    878362.9    1.42e+07          0   7.25e+08
                    -------------+---------------------------------------------------------
                        _NCS_vol |      6,190     2912377    2.22e+07          0   8.69e+08
                    _Net_opera~d |      5,165    380899.8     6491405  -2.66e+08   2.67e+08
                    _Number_of~s |      4,987    1573.541    6611.577          0      98394
                    _Number_of~a |      7,036    1706.873     6423.33          0     107052
                    _Other_tax~p |      5,835    145272.8     5637421  -2.65e+08   2.67e+08
                    -------------+---------------------------------------------------------
                    _Productio~l |      5,949    32.43458    36.11378   2.176136   863.8228
                    _Productio~r |      7,912     2628625    2.05e+07          0   5.45e+08
                    _Productio~u |      5,982     3513933    2.91e+07          0   5.69e+08
                    _Self_empl~d |      4,874    216.5059    1109.136          0      19534
                    _Value_a~ces |      9,741     1033498     8800130         -4   2.53e+08
                    -------------+---------------------------------------------------------
                    _Value_ad~rs |      7,634    35.19206    94.75956    .014887   2318.426
                    _Value_a~mes |      7,790     3134124    1.64e+07          0   2.85e+08
                    _Value_add~u |      6,128     1322470    1.12e+07  -2.63e+08   2.68e+08
                    _Wages_and~s |      4,666    43712.16     1741295          0   1.17e+08.

                    Comment


                    • #11
                      Glad you worked it out, and thank you for showing your solution!

                      Comment

                      Working...
                      X