Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • generating new variable based on observations from other variables (panel data)

    Dear Stata experts,

    I am trying to generate new variable based on the following command
    bysort firmid year: gen finlev = (debt_n/ TA)

    But, the results are not correct, i.e., if doing this calculations (finlev = (debt_n/ TA)) mannually the results are different that those generated by the above command.
    I would appreciate your suggestions.

    below is the dataex of mine.


    Code:
    input int(firmid year) str13 identifier int(TA debt_n)
     1 2011 "KABO.CA"  3201 6755
     1 2012 "KABO.CA"  2438 6241
     1 2013 "KABO.CA"  2410 5870
     1 2014 "KABO.CA"  1925 5250
     1 2015 "KABO.CA"  1491 3550
     1 2016 "KABO.CA"  7068  632
     1 2017 "KABO.CA"  7435 1198
     1 2018 "KABO.CA"  7610  631
     1 2019 "KABO.CA"  7657 1025
     1 2020 "KABO.CA"  7661  978
     2 2011 "AKSA.IS"  8167 3327
     2 2012 "AKSA.IS"  8162 1850
     2 2013 "AKSA.IS"  8048 1803
     2 2014 "AKSA.IS"  8088 1817
     2 2015 "AKSA.IS"  7733 2813
     2 2016 "AKSA.IS"  7638 3041
     2 2017 "AKSA.IS"  8205 3999
     2 2018 "AKSA.IS"  7805 4367
     2 2019 "AKSA.IS"  7458 3351
     2 2020 "AKSA.IS"  7181 3262
     3 2011 "IHEVA.IS" 1852 1571
     3 2012 "IHEVA.IS" 2213 2859
     3 2013 "IHEVA.IS" 1227 1990
     3 2014 "IHEVA.IS"  702 6279
     3 2015 "IHEVA.IS" 8509 4088
     3 2016 "IHEVA.IS" 7537 3051
     3 2017 "IHEVA.IS" 7264 1960
     3 2018 "IHEVA.IS" 6269  891
     3 2019 "IHEVA.IS" 7798 7139
     3 2020 "IHEVA.IS" 7211  410

  • #2
    The by: prefix does no harm the calculation is the same without it.

    .
    Code:
      gen finlev = (debt_n/ TA) 
    
    .  
    .  list finlev 
    
         +----------+
         |   finlev |
         |----------|
      1. | 2.110278 |
      2. | 2.559885 |
      3. | 2.435685 |
      4. | 2.727273 |
      5. | 2.380952 |
         |----------|
      6. | .0894171 |
      7. | .1611298 |
      8. | .0829172 |
      9. | .1338644 |
     10. | .1276596 |
         |----------|
     11. | .4073711 |
     12. | .2266601 |
     13. | .2240308 |
     14. | .2246538 |
     15. | .3637657 |
         |----------|
     16. | .3981409 |
     17. | .4873857 |
     18. | .5595132 |
     19. | .4493162 |
     20. | .4542543 |
         |----------|
     21. | .8482721 |
     22. | 1.291911 |
     23. | 1.621842 |
     24. | 8.944445 |
     25. | .4804325 |
         |----------|
     26. |  .404803 |
     27. | .2698238 |
     28. | .1421279 |
     29. | .9154912 |
     30. | .0568576 |
         +----------+
    What do you get differently? Perhaps it is a matter of display format.

    Comment


    • #3
      Dear @Nick Cox, thanks for your usual help and support.
      In doing the calculations manually, the first observation (displayed above (2.110278) should be (0.338635) as it is the result of dividing debt_n (76,422,524.77) by TA (225,677,838.51).

      However, the first values for the debt_n and TA variables are not displayed as when you browse values, debt_n (76,422,524.77) and TA (225,677,838.51).




      Comment


      • #4
        I guess you imported string variables and pushed them through encode. The data shown in #1 are typically in thousands not millions. If so, the encode was wrong. You need destring.

        Comment


        • #5
          You were right as always!

          I have got the below results!

          Code:
          destring debt , replace
          debt: contains nonnumeric characters; no replace

          Comment


          • #6
            I meant, I had tried the original variable.

            Comment


            • #7
              Got this also!
              Code:
              destring debt, generate(totaldebt) ignore("NA")
              debt: contains characters not specified in ignore(); no generate

              Comment


              • #8
                Code:
                tab debt if missing(real(debt))

                Comment


                • #9
                  I did this, what should I do next, please?

                  Comment


                  • #10
                    Code:
                                 debt |      Freq.     Percent        Cum.
                    ------------------+-----------------------------------
                         1,004,151.98 |          1        0.01        0.01
                     1,007,065,579.60 |          1        0.01        0.03
                     1,008,266,123.85 |          1        0.01        0.04
                         1,008,462.62 |          1        0.01        0.05
                         1,012,093.32 |          1        0.01        0.07
                         1,012,847.93 |          1        0.01        0.08
                         1,015,376.97 |          1        0.01        0.09
                     1,017,862,589.03 |          1        0.01        0.11
                         1,018,566.67 |          1        0.01        0.12
                     1,018,948,213.90 |          1        0.01        0.13
                         1,018,983.05 |          1        0.01        0.15
                         1,020,268.36 |          1        0.01        0.16
                     1,020,297,939.45 |          1        0.01        0.17
                         1,022,618.85 |          1        0.01        0.19
                     1,024,606,487.73 |          1        0.01        0.20
                     1,027,299,582.05 |          1        0.01        0.21
                     1,028,244,562.50 |          1        0.01        0.23
                     1,032,587,490.10 |          1        0.01        0.24

                    Comment


                    • #11
                      Add the comma to the list of characters that destring is told to ignore.
                      Code:
                      destring debt, generate(totaldebt) ignore("NA,")

                      Comment


                      • #12
                        Thanks William Lisowski for your recommendation, it works fine.
                        But still having the rest of variables in the same format, preventing me from doing the right calculations.

                        The below variables are in numeric so I couldn't do the destring

                        Code:
                        TA    TE    CA
                        225,677,838.51    71,531,915.46    68,761,258.26
                        186,340,969.90    54,093,932.44    52,137,962.47
                        183,393,360.84    48,211,601.40    55,250,411.19
                        159,182,750.98    43,932,309.31    47,680,825.69
                        135,039,072.51    37,616,503.56    42,034,194.08
                        64,520,309.35    24,126,851.69    23,903,912.01
                        70,133,876.68    26,099,369.39    30,008,658.07
                        74,164,069.67    28,906,497.90    30,596,889.49
                        75,753,194.29    27,309,362.51    29,403,338.92
                        75,896,887.61    27,078,114.30    29,085,369.73
                        877,157,643.31    446,618,365.18    407,735,668.79
                        873,859,796.82    539,160,913.73    388,537,913.23
                        844,571,095.57    488,315,151.52    399,621,911.42
                        856,437,968.72    477,097,064.50    418,721,662.74

                        Comment


                        • #13
                          Code:
                           tab TA
                          
                                         TA |      Freq.     Percent        Cum.
                          ------------------+-----------------------------------
                           1,000,327,868.53 |          1        0.01        0.01
                           1,000,625,783.25 |          1        0.01        0.02
                           1,003,215,584.42 |          1        0.01        0.03
                           1,004,558,278.45 |          1        0.01        0.05
                           1,004,752,102.53 |          1        0.01        0.06
                           1,005,999,019.50 |          1        0.01        0.07
                           1,006,344,292.46 |          1        0.01        0.08
                           1,006,371,207.27 |          1        0.01        0.09
                           1,006,708,595.39 |          1        0.01        0.10
                           1,007,167,630.00 |          1        0.01        0.12
                           1,007,439,788.85 |          1        0.01        0.13
                           1,009,458,968.07 |          1        0.01        0.14
                           1,011,183,689.63 |          1        0.01        0.15


                          Code:
                           destring TA , generate(totalasset) ignore("NA,")
                          TA already numeric; no generate

                          Comment


                          • #14
                            At this point please run
                            Code:
                            dataex firmid year identifier TA TE CA debt_n, count(10)
                            on the dataset you describe in post #3 an present the results here.

                            With the count option used, dataex will output the first 10 observations. The output of dataex will look something like the following.
                            Code:
                            ----------------------- copy starting from the next line -----------------------
                            [CODE]
                            * Example generated by -dataex-. To install: ssc install dataex
                            clear
                            input int(x1 x2 x3) float x4 int x5 byte x6
                             4195 24 1   2 10 0
                            10371 16 3 3.5 17 0
                             4647 28 3   2 11 0
                            ...
                             5079 24 4 2.5  8 1
                             8129 21 4 2.5  8 1
                             4296 21 3 2.5 16 1
                            end
                            label values x6 yesno
                            label def yesno 0 "No", modify
                            label def yesno 1 "Yes", modify
                            [/CODE]
                            ------------------ copy up to and including the previous line ------------------
                            In your dataex output you will select the lines between, but not including, "copy starting from the next line" and "copy up to and including the previous line" and then paste that into your reply. The result presented in your post will look something like the following.
                            Code:
                            * Example generated by -dataex-. To install: ssc install dataex
                            clear
                            input int(x1 x2 x3) float x4 int x5 byte x6
                             4195 24 1   2 10 0
                            10371 16 3 3.5 17 0
                             4647 28 3   2 11 0
                            ...
                             5079 24 4 2.5  8 1
                             8129 21 4 2.5  8 1
                             4296 21 3 2.5 16 1
                            end
                            label values x6 yesno
                            label def yesno 0 "No", modify
                            label def yesno 1 "Yes", modify
                            Last edited by William Lisowski; 05 Feb 2022, 17:41. Reason: Reduced the number of observations output by dataex

                            Comment


                            • #15
                              Thanks for your replies, William Lisowski I got the below results.

                              Code:
                              * Example generated by -dataex-. To install: ssc install dataex
                              clear
                              input int(firmid year) str13 identifier int(TA TE CA debt_n)
                              1 2011 "KABO.CA" 3201 7548 7154 6755
                              1 2012 "KABO.CA" 2438 6479 6305 6241
                              1 2013 "KABO.CA" 2410 6008 6450 5870
                              1 2014 "KABO.CA" 1925 5734 5887 5250
                              1 2015 "KABO.CA" 1491 5048 5593 3550
                              1 2016 "KABO.CA" 7068 3565 3355  632
                              1 2017 "KABO.CA" 7435 3786 4399 1198
                              1 2018 "KABO.CA" 7610 4057 4437  631
                              1 2019 "KABO.CA" 7657 3895 4078 1025
                              1 2020 "KABO.CA" 7661 3886 4051  978
                              end
                              format %ty year
                              label values TA TA
                              label def TA 1491 "135,039,072.51", modify
                              label def TA 1925 "159,182,750.98", modify
                              label def TA 2410 "183,393,360.84", modify
                              label def TA 2438 "186,340,969.90", modify
                              label def TA 3201 "225,677,838.51", modify
                              label def TA 7068 "64,520,309.35", modify
                              label def TA 7435 "70,133,876.68", modify
                              label def TA 7610 "74,164,069.67", modify
                              label def TA 7657 "75,753,194.29", modify
                              label def TA 7661 "75,896,887.61", modify
                              label values TE TE
                              label def TE 3565 "24,126,851.69", modify
                              label def TE 3786 "26,099,369.39", modify
                              label def TE 3886 "27,078,114.30", modify
                              label def TE 3895 "27,309,362.51", modify
                              label def TE 4057 "28,906,497.90", modify
                              label def TE 5048 "37,616,503.56", modify
                              label def TE 5734 "43,932,309.31", modify
                              label def TE 6008 "48,211,601.40", modify
                              label def TE 6479 "54,093,932.44", modify
                              label def TE 7548 "71,531,915.46", modify
                              label values CA CA
                              label def CA 3355 "23,903,912.01", modify
                              label def CA 4051 "29,085,369.73", modify
                              label def CA 4078 "29,403,338.92", modify
                              label def CA 4399 "30,008,658.07", modify
                              label def CA 4437 "30,596,889.49", modify
                              label def CA 5593 "42,034,194.08", modify
                              label def CA 5887 "47,680,825.69", modify
                              label def CA 6305 "52,137,962.47", modify
                              label def CA 6450 "55,250,411.19", modify
                              label def CA 7154 "68,761,258.26", modify
                              label values debt_n debt_n
                              label def debt_n 631 "10,676,483.48", modify
                              label def debt_n 632 "10,679,273.93", modify
                              label def debt_n 978 "12,093,974.46", modify
                              label def debt_n 1025 "12,490,445.07", modify
                              label def debt_n 1198 "13,012,446.19", modify
                              label def debt_n 3550 "29,412,404.42", modify
                              label def debt_n 5250 "48,973,323.72", modify
                              label def debt_n 5870 "59,489,220.98", modify
                              label def debt_n 6241 "64,812,811.82", modify
                              label def debt_n 6755 "76,422,524.77", modify

                              Comment

                              Working...
                              X