  • Merging data sets: variables do not uniquely identify observations in the master data

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str384 DirectorName double DirectorID long cikcode_n float fyear double(ExVol StockPrice ExercisePrice)
    "David Storch"     27468 1750 2006 1320019 24.91 19.16
    "Tim Romenesko"   320199 1750 2006  148484 24.91 18.19
    "Jim Clark"       320204 1750 2006   98094 24.91 17.25
    "J McDonald"      324002 1750 2006   64140 24.91  13.5
    "Howard Pulsifer" 320200 1750 2006  114869 24.91  17.8
    "Holger Liepmann" 452869 1800 2006   18666 48.92 41.68
    "Miles White"      33500 1800 2006  131125 48.92 47.87
    "Rick Gonzalez"    33494 1800 2006   31921 48.92 39.49
    "Miles White"      33500 1800 2006  675674 48.92 53.62
    "Tom Freyman"      33484 1800 2006  106405 48.92 45.44
    "Miles White"      33500 1800 2006   37046 48.92 47.87
    "Bill Dempsey"     33481 1800 2006       . 48.92 44.15
    "Bill Dempsey"     33481 1800 2006   21267 48.92 46.33
    "Rick Gonzalez"    33494 1800 2006  532026 48.92 45.44
    "Rick Gonzalez"    33494 1800 2006       . 48.92 44.15
    "Tom Freyman"      33484 1800 2006   95764 48.92 53.62
    "Holger Liepmann" 452869 1800 2006   21267 48.92 46.33
    "Tom Freyman"      33484 1800 2006   24684 48.92 41.05
    "Miles White"      33500 1800 2006       . 48.92 44.15
    "Rick Gonzalez"    33494 1800 2006    2142 48.92 46.64
    "Rick Gonzalez"    33494 1800 2006   22861 48.92 45.78
    "Bill Dempsey"     33481 1800 2006    2279 48.92 43.86
    "Holger Liepmann" 452869 1800 2006    5479 48.92 52.24
    "Holger Liepmann" 452869 1800 2006    2286 48.92 43.72
    "Rick Gonzalez"    33494 1800 2006   25298 48.92 40.94
    "Miles White"      33500 1800 2006   38007 48.92 47.87
    "Holger Liepmann" 452869 1800 2006    8761 48.92 41.01
    "Rick Gonzalez"    33494 1800 2006   74483 48.92 42.48
    "Rick Gonzalez"    33494 1800 2006  478824 48.92 53.62
    "Holger Liepmann" 452869 1800 2006   17395 48.92  42.9
    "Holger Liepmann" 452869 1800 2006   15960 48.92 44.15
    "Rick Gonzalez"    33494 1800 2006  313200 48.92 46.33
    "Tom Freyman"      33484 1800 2006   17828 48.92 46.19
    "Tom Freyman"      33484 1800 2006       . 48.92 44.15
    "Holger Liepmann" 452869 1800 2006    2819 48.92 40.92
    "Miles White"      33500 1800 2006  440800 48.92 46.33
    "Bill Dempsey"     33481 1800 2006   27328 48.92 42.82
    "Rick Gonzalez"    33494 1800 2006   20702 48.92 45.78
    "Rick Gonzalez"    33494 1800 2006   25353 48.92 51.05
    "Miles White"      33500 1800 2006  159608 48.92 40.56
    "Tom Freyman"      33484 1800 2006   31131 48.92 41.64
    "Bill Dempsey"     33481 1800 2006       . 48.92 44.15
    "Tom Freyman"      33484 1800 2006   31920 48.92 42.48
    "Tom Freyman"      33484 1800 2006       . 48.92 44.15
    "Bill Dempsey"     33481 1800 2006       . 48.92 44.15
    "Bill Dempsey"     33481 1800 2006   58522 48.92 41.03
    "Tom Freyman"      33484 1800 2006   11767 48.92 46.91
    "Tom Freyman"      33484 1800 2006   29245 48.92 43.72
    "Tom Freyman"      33484 1800 2006   24839 48.92 46.91
    "Bill Dempsey"     33481 1800 2006   28771 48.92 48.29
    "Miles White"      33500 1800 2006       . 48.92 47.09
    "Holger Liepmann" 452869 1800 2006   11703 48.92 42.48
    "Bill Dempsey"     33481 1800 2006   31172 48.92 41.64
    "Rick Gonzalez"    33494 1800 2006    3761 48.92 49.79
    "Miles White"      33500 1800 2006  558628 48.92 45.44
    "Rick Gonzalez"    33494 1800 2006   37256 48.92 46.64
    "Rick Gonzalez"    33494 1800 2006    1144 48.92 51.05
    "Tom Freyman"      33484 1800 2006   29290 48.92 46.64
    "Miles White"      33500 1800 2006  342493 48.92 33.23
    "Holger Liepmann" 452869 1800 2006   17954 48.92 42.93
    "Bill Dempsey"     33481 1800 2006  106405 48.92 45.44
    "Holger Liepmann" 452869 1800 2006       . 48.92 44.15
    "Holger Liepmann" 452869 1800 2006    6691 48.92 52.24
    "Tom Freyman"      33484 1800 2006    2292 48.92  43.6
    "Tom Freyman"      33484 1800 2006   63800 48.92 46.33
    "Holger Liepmann" 452869 1800 2006   17954 48.92 42.93
    "Rick Gonzalez"    33494 1800 2006   48145 48.92 49.79
    "Rick Gonzalez"    33494 1800 2006    2875 48.92 34.76
    "Holger Liepmann" 452869 1800 2006   31920 48.92 41.03
    "Holger Liepmann" 452869 1800 2006     822 48.92 33.23
    "Bill Dempsey"     33481 1800 2006   74483 48.92 42.48
    "Holger Liepmann" 452869 1800 2006   15428 48.92 45.44
    "Miles White"      33500 1800 2006  169064 48.92 34.76
    "Miles White"      33500 1800 2006       . 48.92 44.15
    "Rick Gonzalez"    33494 1800 2006  351235 48.92 33.23
    "Bill Dempsey"     33481 1800 2006   47411 48.92  46.4
    "Holger Liepmann" 452869 1800 2006       . 48.92 44.15
    "Holger Liepmann" 452869 1800 2006    2286 48.92 43.72
    "Holger Liepmann" 452869 1800 2006   53201 48.92 53.62
    "Miles White"      33500 1800 2006    2121 48.92 47.09
    "Holger Liepmann" 452869 1800 2006   83000 48.92 44.15
    "Holger Liepmann" 452869 1800 2006   26601 48.92 38.62
    "Miles White"      33500 1800 2006       . 48.92 44.15
    "Bill Dempsey"     33481 1800 2006    2054 48.92 48.66
    "Miles White"      33500 1800 2006   23664 48.92 49.54
    "Miles White"      33500 1800 2006  372419 48.92 42.48
    "Rick Gonzalez"    33494 1800 2006   95764 48.92 41.03
    "Rick Gonzalez"    33494 1800 2006    2142 48.92 46.64
    "Bill Dempsey"     33481 1800 2006   25566 48.92 50.38
    "Tom Freyman"      33484 1800 2006   58522 48.92 41.03
    "Miles White"      33500 1800 2006       . 48.92 47.09
    "Rick Gonzalez"    33494 1800 2006   71191 48.92 42.64
    "Miles White"      33500 1800 2006  404340 48.92 41.03
    "Rick Gonzalez"    33494 1800 2006   19814 48.92 51.05
    "Rick Gonzalez"    33494 1800 2006       . 48.92 44.15
    "Miles White"      33500 1800 2006   41579 48.92 47.09
    "Bill Dempsey"     33481 1800 2006   95764 48.92 53.62
    "Bill Dempsey"     33481 1800 2006   22365 48.92 50.38
    "Holger Liepmann" 452869 1800 2006    3965 48.92 52.24
    "Bill Dempsey"     33481 1800 2006    2082 48.92 48.66
    end
    The above is one of my datasets; it contains directors' options information for the years 2006 to 2020. I have another dataset that contains financial data for firms from 2006 to 2020. I am trying to merge these two datasets, but I get the following error: "variables cikcode_n fyear do not uniquely identify observations in the master data". Can you help me merge these two datasets? Because a director can be part of more than one company, I cannot drop my duplicates.

    Thank you in advance.

  • #2
    I could give a more confident answer if you had also shown some example data from the other set. Also, since you don't show the -merge- command you tried, there is no way for me to know which data set is the master in this case.

    If your second data set is financial information about the firms on an annual basis, and if the data set you do show contains a firm id variable (I don't see one, but maybe I don't grasp the meaning of some of the variable names) shouldn't the second data set be uniquely identified by firm id and year? If not, why not? On the assumption I have this right, you should
    Code:
    use first_data_set_as_shown_in_#1, clear
    merge m:1 firm_id_variable fyear using second_data_set_not_shown_in_#1
    If that's not right, when posting back, show example data from both data sets and also show the -merge- command you tried that is giving you the problem.
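
    As a minimal sketch of what m:1 means here, the toy example below uses made-up data (the variable names firm, director and assets and all values are hypothetical, not taken from the thread). The directors file may repeat a firm-year as often as needed; only the financial file has to be unique on the key, which -isid- checks.

    ```stata
    * Toy data, invented purely for illustration
    clear
    input firm fyear assets
    1 2006 100
    1 2007 110
    2 2006 200
    end
    isid firm fyear              // exits with an error unless firm-year is unique
    tempfile fin
    save `fin'

    clear
    input str10 director firm fyear
    "A" 1 2006
    "B" 1 2006
    "A" 2 2006
    end
    * Many director rows per firm-year in the master, one row per firm-year in the using file
    merge m:1 firm fyear using `fin'
    sort firm fyear director
    list, sepby(firm fyear)
    ```

    Director "A" sits on two boards in the same year and nothing has to be dropped from the master data; duplicates on the key are only a problem on the "1" side of the merge.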


    • #3
      Hello Clyde, thank you for your response. Unfortunately, the code didn't run. Sorry, I didn't include the other dataset earlier. I have tried using
      Code:
        merge m:1 cikcode_n fyear using financialdata
      and I got the following error: "variables cikcode_n fyear do not uniquely identify observations in the using data". The first dataset is my directors dataset and the second is the financial dataset. I am using cikcode_n (which is unique for each company and does not change across years; I am using it as the firm ID variable, correct me if that is wrong) and fyear (the year) to merge the two datasets. As I mentioned in my previous post, a director can be part of more than one company in a particular year, so I cannot drop my duplicates.

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input str384 DirectorName double(DirectorID StockPrice) long cikcode_n float fyear double(ExercisePrice ExVol)
      "Jim Clark"       320204 14.68 1 2005  7.07   50000
      "David Storch"     27468 14.68 1 2005  7.49  448984
      "J McDonald"      324002 14.68 1 2005 11.74   42933
      "Howard Pulsifer" 320200 14.68 1 2005   7.2   51400
      "Jim Clark"       320204 14.68 1 2005 12.95   83446
      "Tim Romenesko"   320199 14.68 1 2005 14.11  141620
      "Howard Pulsifer" 320200 14.68 1 2005    14  100253
      "Tim Romenesko"   320199 14.68 1 2005  7.66   56000
      "J McDonald"      324002 14.68 1 2005   5.9   46500
      "David Storch"     27468 14.68 1 2005 14.06 1565254
      "Howard Pulsifer" 320200 24.91 1 2006  17.8  114869
      "J McDonald"      324002 24.91 1 2006  13.5   64140
      "David Storch"     27468 24.91 1 2006 19.16 1320019
      "Jim Clark"       320204 24.91 1 2006 17.25   98094
      "Tim Romenesko"   320199 24.91 1 2006 18.19  148484
      "J McDonald"      324002 32.39 1 2007  17.5    7954
      "David Storch"     27468 32.39 1 2007 14.96  128556
      "J McDonald"      324002 32.39 1 2007 15.52     206
      "Tim Romenesko"   320199 32.39 1 2007  6.95   30000
      "David Storch"     27468 32.39 1 2007 14.96   33161
      "Jim Clark"       320204 32.39 1 2007 17.74    8750
      "J McDonald"      324002 32.39 1 2007  6.95   30000
      "Tim Romenesko"   320199 32.39 1 2007 16.05   20094
      "David Storch"     27468 32.39 1 2007 17.96   27607
      "Howard Pulsifer" 320200 32.39 1 2007 25.33    7056
      "Jim Clark"       320204 32.39 1 2007 22.61   12750
      "J McDonald"      324002 32.39 1 2007 27.41    4951
      "J McDonald"      324002 32.39 1 2007 27.41    2589
      "Jim Clark"       320204 32.39 1 2007 25.33   10986
      "J McDonald"      324002 32.39 1 2007 27.41     116
      "Howard Pulsifer" 320200 32.39 1 2007 25.33    5078
      "J McDonald"      324002 32.39 1 2007  14.9    1000
      "Jim Clark"       320204 32.39 1 2007 15.32    2801
      "Howard Pulsifer" 320200 32.39 1 2007 16.17    7963
      "Howard Pulsifer" 320200 32.39 1 2007 25.33   11454
      "Jim Clark"       320204 32.39 1 2007 16.15   13288
      "David Storch"     27468 32.39 1 2007  22.4   59599
      "Jim Clark"       320204 32.39 1 2007  14.9    3000
      "Howard Pulsifer" 320200 32.39 1 2007 22.61   13000
      "Tim Romenesko"   320199 32.39 1 2007 22.61   25000
      "J McDonald"      324002 32.39 1 2007  3.19    2000
      "David Storch"     27468 32.39 1 2007 16.17  124868
      "Jim Clark"       320204 32.39 1 2007  6.95   30000
      "Tim Romenesko"   320199 32.39 1 2007 17.74   26750
      "Tim Romenesko"   320199 32.39 1 2007 25.51    8425
      "J McDonald"      324002 32.39 1 2007 15.52    5977
      "Howard Pulsifer" 320200 32.39 1 2007 13.85    5025
      "David Storch"     27468 32.39 1 2007 14.96   25724
      "Howard Pulsifer" 320200 32.39 1 2007 17.96    3873
      "David Storch"     27468 32.39 1 2007 23.49  195000
      "Tim Romenesko"   320199 32.39 1 2007  17.5    3977
      "J McDonald"      324002 32.39 1 2007  17.5    5759
      "Tim Romenesko"   320199 32.39 1 2007 16.05    4281
      "Howard Pulsifer" 320200 32.39 1 2007  14.9    3000
      "Tim Romenesko"   320199 32.39 1 2007 25.51    9341
      "J McDonald"      324002 32.39 1 2007  17.5     183
      "Howard Pulsifer" 320200 32.39 1 2007 23.07   12000
      "Jim Clark"       320204 32.39 1 2007 27.95    6397
      "J McDonald"      324002 32.39 1 2007  17.5    3405
      "Tim Romenesko"   320199 32.39 1 2007 23.07   15000
      "Howard Pulsifer" 320200 32.39 1 2007    17    4269
      "Howard Pulsifer" 320200 32.39 1 2007 17.74    9100
      "Jim Clark"       320204 32.39 1 2007 23.07    6000
      "Howard Pulsifer" 320200 32.39 1 2007  6.95   30000
      "Tim Romenesko"   320199 32.39 1 2007  14.9    4000
      "David Storch"     27468 32.39 1 2007 23.49  170000
      "David Storch"     27468 32.39 1 2007  22.4  116668
      "Jim Clark"       320204 32.39 1 2007 27.95    9615
      "Howard Pulsifer" 320200 32.39 1 2007 13.85    1860
      "Howard Pulsifer" 320200 19.25 1 2008 22.62   13000
      "Jim Clark"       320204 19.25 1 2008 15.33    2801
      "Tim Romenesko"   320199 19.25 1 2008 16.04   20094
      "Jim Clark"       320204 19.25 1 2008  14.9    3000
      "Jim Clark"       320204 19.25 1 2008 16.04    4281
      "Tim Romenesko"   320199 19.25 1 2008 25.51    9341
      "David Storch"     27468 19.25 1 2008 22.41   59599
      "Howard Pulsifer" 320200 19.25 1 2008 17.96    3873
      "Howard Pulsifer" 320200 19.25 1 2008 25.33   11454
      "David Storch"     27468 19.25 1 2008 16.18  118868
      "Jim Clark"       320204 19.25 1 2008  6.69   30000
      "Tim Romenesko"   320199 19.25 1 2008  14.9    4000
      "Tim Romenesko"   320199 19.25 1 2008  6.95   30000
      "Tim Romenesko"   320199 19.25 1 2008 17.49    3977
      "Howard Pulsifer" 320200 19.25 1 2008 25.33    7056
      "David Storch"     27468 19.25 1 2008 23.49  195000
      "Howard Pulsifer" 320200 19.25 1 2008 25.33    5078
      "Jim Clark"       320204 19.25 1 2008 17.74    8750
      "Jim Clark"       320204 19.25 1 2008 22.62   12750
      "Tim Romenesko"   320199 19.25 1 2008 17.74   26750
      "Tim Romenesko"   320199 19.25 1 2008 22.62   25000
      "David Storch"     27468 19.25 1 2008 17.96   27607
      "Howard Pulsifer" 320200 19.25 1 2008  14.9    3000
      "Howard Pulsifer" 320200 19.25 1 2008 13.85    5025
      "Jim Clark"       320204 19.25 1 2008 16.16   13288
      "Howard Pulsifer" 320200 19.25 1 2008 13.85    1860
      "Jim Clark"       320204 19.25 1 2008 27.94    9615
      "Jim Clark"       320204 19.25 1 2008 25.33    5493
      "Howard Pulsifer" 320200 19.25 1 2008 16.18    7963
      "Jim Clark"       320204 19.25 1 2008 27.94    6397
      "Tim Romenesko"   320199 19.25 1 2008 25.51    8425
      end
      label values cikcode_n cikcode_n
      label def cikcode_n 1 "0000001750", modify

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long cikcode_n float fyear double(AssetsTotal LiabilitiesTotal CommonSharesOutstanding BookValuePerShare)
         2 2005   732.23  417.486  32.586   9.6589
         2 2006  978.819  556.102  36.654  11.5326
         2 2007 1067.633   573.39  37.729  13.0998
         2 2008  1362.01  776.755  38.773  15.0944
         2 2009 1377.511  720.616  38.884  16.8937
         2 2010 1501.042  754.692  39.484  18.9167
         2 2011 1703.727  868.438  39.781  21.0112
         2 2012 2195.653 1329.631  40.273  21.4697
         2 2013   2136.9   1217.4  39.382  23.3254
         2 2014   2199.5   1198.8   39.56  25.2654
         2 2015     1515    669.9  35.423  23.8574
         2 2016   1442.1    576.3  34.515  25.0847
         2 2017   1504.1    589.9  34.354  26.6112
         2 2018   1524.7    588.4  34.716  26.9703
         2 2019   1517.2    611.3  34.788  26.0406
         2 2020     2079   1176.4  35.097  25.7173
       366 2005     1535    761.1   116.5   6.6429
       366 2006   1611.4    737.9   117.2   7.4531
       366 2007   1764.8    757.2   117.6    8.568
       366 2008     1921   1006.8   111.3   8.2138
       366 2009   1343.6    987.4    96.6   3.6874
       366 2010   1474.5   1040.1    97.2   4.4198
      1068 2005 1623.383  705.305  54.078  16.9769
      1068 2006  927.239   203.24  43.099  16.7985
      1068 2007 1288.165  557.038  43.794  16.6947
        39 2005    29495    30973 182.732  -8.0883
        39 2006    29145    29751 222.224   -2.727
        39 2007    28571    25914 249.398  10.6537
        39 2008    25175    28110 278.949 -10.5216
        39 2009    25438    28927 332.624 -10.4893
        39 2010    25088    29033  333.45 -11.8309
        39 2011    23848    30959 335.268 -21.2099
        39 2012    23510    31497 335.292  -23.821
        39 2013    42278    45009 261.069 -10.4608
        39 2014    43771    41750 697.475   2.8976
        39 2015    48415    42780 624.622   9.0215
        39 2016    51274    47489 507.294   7.4612
        39 2017    51396    47470 475.508   8.2564
        39 2018    60580    60749 460.611   -.3669
        39 2019    59995    60113 428.203   -.2756
        39 2020    62008    68875  621.48 -11.0494
        13 2005     42.9   36.117   9.993    .6788
        13 2006   63.188   48.265  11.485   1.2993
        13 2007   96.535   55.609  14.789   2.7673
        13 2008  120.017   75.504  14.323   3.1078
        13 2009   77.515   45.755  14.289   2.2227
        13 2010   74.791   39.617  14.319   2.4565
        13 2011   79.345   36.355  14.516   2.9616
        13 2012   94.104    32.11  16.959   3.6555
        13 2013  348.536   178.13  25.587   6.6599
        13 2014  414.365  233.141  26.267   6.8993
        13 2015  598.819  353.798  33.918   7.0509
        13 2016  498.634  308.552  34.162   5.5641
        13 2017  438.549   251.98   34.57   5.3968
        13 2018  392.582  214.022  34.816   5.1287
        13 2019  408.637   215.62  35.137   5.4933
        13 2020  419.314  215.703  35.367   5.7301
         9 2005  589.849  145.869  74.614   5.9504
         9 2006  638.022  150.352   75.27   6.4789
      4664 2005  545.423    8.494     9.6  55.9301
      4664 2005  545.423    8.494       .        .
      4664 2006  720.523    8.256     9.6  74.1945
      4664 2006  720.523    8.256       .        .
      4664 2007  815.468    1.678       .        .
      4664 2007  815.468    1.678     9.6  84.7698
      4664 2008  355.664   14.569       .        .
      4664 2008  355.664   14.569     7.2  47.3743
      4664 2009  581.896    1.541    6.48   89.561
      4664 2009  581.896    1.541       .        .
      4664 2010   671.58    1.947       .        .
      4664 2010   671.58    1.947   19.44  34.4461
      4664 2011  632.715    6.635   19.29  32.4562
      4664 2011  632.715    6.635       .        .
      4664 2012  468.013     1.52       .        .
      4664 2012  468.013     1.52   19.29  24.1832
      4664 2013  252.142    1.795   19.29  12.9781
      4664 2013  252.142    1.795       .        .
      4664 2014  223.333    1.533       .        .
      4664 2014  223.333    1.533   19.29  11.4982
      4664 2015   162.35    1.606       .        .
      4664 2015   162.35    1.606   19.29    8.333
      4664 2016   245.02    1.791   19.29  12.6091
      4664 2016   245.02    1.791       .        .
      4664 2017  245.562     1.36   19.29  12.6595
      4664 2017  245.562     1.36       .        .
      4664 2018  196.072    1.238       .        .
      4664 2018  196.072    1.238   19.29  10.1003
      4664 2019  286.612     .733   19.29  14.8201
      4664 2019  286.612     .733       .        .
      4664 2020   464.74     .804       .        .
      4664 2020   464.74     .804   19.29  24.0506
      1822 2005 1689.749  250.498 172.955   8.3215
      1822 2006 1675.208  227.099 172.216   8.4087
      1822 2007 1899.536  264.257 171.674   9.5255
      1822 2008 2109.078  279.727 171.066  10.6938
      1822 2009 1872.529  202.776 170.384   9.7999
      1822 2010 2051.492  250.485 170.074  10.5895
      1822 2011 2319.482  280.065 170.142  11.9866
      1822 2012 2468.012  347.259 169.601  12.5044
      1822 2013 2601.995  629.065 168.633  11.6995
      end
      label values cikcode_n cikcode_n
      label def cikcode_n 2 "0000001750", modify
      label def cikcode_n 9 "0000002601", modify
      label def cikcode_n 13 "0000003197", modify
      label def cikcode_n 39 "0000006201", modify
      label def cikcode_n 366 "0000061478", modify
      label def cikcode_n 1068 "0000730469", modify
      label def cikcode_n 1822 "0000859163", modify
      label def cikcode_n 4664 "0001230869", modify


      • #4
        The problem is that your financials data is messed up. For some of the firms you have two observations for each year. For example cikcode_n 0001230869 has duplicate observations for every year from 2005 through 2020. So something went wrong when the financials data set was created and that needs to be fixed.
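
        A quick way to see how widespread the duplication is in the full file might look like the sketch below. It assumes the financial data are stored as financialdata (the name used in #3); dup is a hypothetical new variable.

        ```stata
        use financialdata, clear
        * Report, without halting the do-file, whether firm-year is a unique key
        capture noisily isid cikcode_n fyear
        * Count how many firm-years occur once, twice, and so on
        duplicates report cikcode_n fyear
        * Flag the surplus observations and look at them side by side
        duplicates tag cikcode_n fyear, generate(dup)
        sort cikcode_n fyear
        list cikcode_n fyear AssetsTotal-BookValuePerShare if dup > 0, sepby(cikcode_n fyear)
        ```

        -duplicates tag- stores the number of additional copies of each firm-year, so dup > 0 picks out every observation that belongs to a duplicated firm-year.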

        Now, you are fortunate in that, at least in the example data, these duplicate observations contain no contradictory information to resolve. In each case, the pair of observations agree on the values of AssetsTotal and LiabilitiesTotal. As for the other variables, CommonSharesOutstanding and BookValuePerShare, in each year one observation has a value and the other has a missing value. The data set shouldn't have these duplicate observations. If you created this dataset yourself, or if it was created by a colleague, then you should review the data management that created it and fix the error(s) that led to this condition. In doing that, you may uncover other errors as well, and you should fix those, too. Then you will end up with a proper data set that is uniquely identified by cikcode_n and fyear.

        If the financials dataset was not created in your "shop," then you may or may not be able to get the source to fix the file for you. In that case, if you trust that the data are actually correct (which I would be skeptical of: if they let mistakes like this get out, their quality control can't be very good, so why believe they did everything else correctly? But it is possible), then you can "fix" the data set with
        Code:
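        * Within each firm-year, every non-missing value must equal the first (smallest) one
        foreach v of varlist AssetsTotal-BookValuePerShare {
            by cikcode_n fyear (`v'), sort: assert `v' == `v'[1] | missing(`v')
        }
        * Keep one observation per firm-year, taking the first non-missing value of each variable
        collapse (firstnm) AssetsTotal-BookValuePerShare, by(cikcode_n fyear)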
        This code first verifies that, in the full data set, it remains true that you never have two observations of the same firm in the same year with contradictory values for any of the financial variables. Important: if the -assert- part of the code gives you error messages, it means that in the full data set some of these duplicate observations actually contradict each other. That's a more serious problem and you really cannot use the data in that condition; you will need to find some way to get the correct values and create a new, correct data set. But, assuming the data pass that check with no error message, the -collapse- command reduces the data set to one observation per firm-year, using the non-missing value of each variable. After that you will be able to:
        Code:
        use directors_data_set, clear
        merge m:1 cikcode_n fyear using fixed_version_of_financials_data_set
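
        One more thing may be worth checking before relying on that merge. Judging only from the example data in #3, cikcode_n carries a value label, and the same CIK "0000001750" is coded 1 in the directors data but 2 in the financial data. If the real files look the same, matching on cikcode_n would pair firms by arbitrary codes rather than by CIK. A hedged sketch of one way around that, assuming cikcode_n is value-labeled with the CIK in both files (-decode- fails otherwise); cik_str and cik_num are hypothetical new variables, and the file names are the placeholders used above:

        ```stata
        use fixed_version_of_financials_data_set, clear
        decode cikcode_n, generate(cik_str)     // label text, e.g. "0000001750"
        destring cik_str, generate(cik_num)     // numeric CIK, e.g. 1750
        drop cikcode_n cik_str
        isid cik_num fyear                      // still one observation per firm-year?
        tempfile fin
        save `fin'

        use directors_data_set, clear
        decode cikcode_n, generate(cik_str)
        destring cik_str, generate(cik_num)
        merge m:1 cik_num fyear using `fin'
        tabulate _merge                         // how many observations matched
        ```

        If cikcode_n already holds the actual CIK in both files, as the unlabeled value 1750 in #1 suggests it might, this step is unnecessary and the merge on cikcode_n and fyear is fine as written.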
        Last edited by Clyde Schechter; 29 Jul 2023, 19:38.


        • #5
          Thank you, Clyde Schechter. It worked perfectly for me.
