Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Merge data sets and keep specific observations

    Dear all,

    I have a problem merging two panel data sets (one data set with ESG Scores for firms and another data set containing firm fundamentals).
    I would like to keep the matched observations as well as the observations for the preceding year because I want to use lagged fundamentals.

    Example:
    obs. firm id year ESG Score firm assets
    1 1 2000 . 100
    2 1 2001 . 101
    3 1 2002 25 102
    4 1 2003 26 103
    5 2 2000 . 150
    6 3 2000 50 200

    Considering the above example, I would like to keep observations 2, 3, 4 and 6.

    Does anyone have an idea on how to solve my problem?

    I would greatly appreciate any help. I use Stata/SE 14.2.

    Many thanks in advance and kind regards from Germany

    Adriana

  • #2
    You don't explain what the data tableau you show represents. I'm guessing that it represents the results you have after -merge-ing the two data sets. I'm also guessing that the matched items are the ones where ESG score is not missing. Do I have that right? Presumably when you ran your -merge-, it generated a variable called _merge which identifies which observations are matched, which came from the data set originally in memory (master) and which came from the data set being merged in (using). If I have all of this correct:

    Code:
    xtset firm_id year
    by firm_id (year): keep if inlist(3, _merge, F._merge)
    ​​​​​​​should do the job.

    Comment


    • #3
      Hello Clyde,

      I am sorry I didn't explain my problem sufficiently. Yes, you guessed everything correctly

      I tried to apply your solution but unfortunately, I can't encode the firm_id variable in order to -xtset- the data because there are too many values.

      I think I found an amateurish solution to resolve my problem (?):

      Code:
      sort firm_id half_year
      gen y= 1 if _merge==3
      bysort firm_id: replace y=1 if half_year[_n+1]==half_year+2 & y[_n+1]==1
      keep if y==1
      Thank you very much for your help!

      Regards,

      Adriana

      Comment


      • #4
        I'm glad you found a solution that works.

        A couple of remarks for the future. In your data tableau, it appeared that firm_id is already a numeric variable. So the issue of -encode-ing it never crossed my mind. A big part of the problem is that by creating a tableau as you did, you obscure the metadata that is an important part of working with Stata data sets. The most helpful way to show example data is with the -dataex- command. If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        If you use -dataex- to show example data in the future, you will avoid misunderstandings like this one.

        Also, when -encode- fails because there are too many values, you can use the following alternative:

        Code:
        egen long n_firm_id = group(firm_id)
        The -egen, group()- function has no limit on the number of values, and it creates a numeric variable that is in one-one correspondence with the string variable firm_id, and that can be used by -xtset-

        Comment


        • #5
          Hello Clyde,

          I managed to implement your solution.

          Code:
          egen long id = group(firm_id)
          xtset id half_year
          by id (half_year): keep if inlist(3, _merge, F2._merge)
          Thank you so much. This is much more elegant than my solution.

          Adriana

          Comment


          • #6
            Hello Clyde,

            I just saw your reply. Thank you for your advice. I will use -dataex- next time. I did not know about it.

            Thank you very much for such a detailed answer. I appreciate it.

            Regards,

            Adriana

            Comment

            Working...
            X