  • dropping unmatched obs

    Hello~~

    I have a monthly dataset that runs from 2015 to 2020, with a list of companies for each month.
    How can I check whether the companies I have in the first month, for example, are the same companies in the second month, and so on?

    PS: my company names are codes that mix letters and numbers.

    TIA

  • #2
    It's not entirely clear what kind of results you are looking for. You could do something like this:
    Code:
    by firm (month), sort: gen byte appears_every_month = date[1] = tm(2015m1) & date[_N] == tm(2020m12) ///
        & _N == tm(2020m12)-tm(2015m1)+1
    Those firms that appear in every month will have appears_every_month = 1, and those that don't will have appears_every_month = 0.
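
    To see how many firms end up flagged, one could tabulate the indicator over a single observation per firm. This is only a minimal sketch: it assumes the firm identifier is called firm, the monthly date is called month, and appears_every_month has already been created. The names n_months and one_per_firm are hypothetical and introduced here for illustration.

    ```stata
    * number of observations each firm has (equals the number of months
    * only if there is at most one observation per firm per month)
    by firm (month), sort: gen int n_months = _N

    * mark exactly one observation per firm so each firm is counted once
    egen byte one_per_firm = tag(firm)

    * how many firms are flagged 1 versus 0
    tabulate appears_every_month if one_per_firm

    * distribution of the number of observations per firm
    summarize n_months if one_per_firm, detail
    ```

    Looking at n_months alongside the flag can help tell apart firms that are missing a few months from firms whose date variable is not what the code expects.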

    Is that what you have in mind?

    By the way, it is the norm in this community that we use our real first and last names as our username, to promote collegiality and professionalism. I'm guessing that Dummy is not your real family name--my apologies if it is. If I have guessed right, please press the Contact Us button at the bottom of the Forum window and message the system administrator to change the name on your account. Thank you.



    • #3
      I think the following changes are necessary in post #2, although Clyde had more courage than did I.

      I chose to pass by this topic because of the lack of usable example data presented with dataex. In particular, if the variable that specifies the date (called "month" in this code) is not a Stata Internal Format monthly date, this code will need some additional work. That would be the case if, for example, it is a daily date with the first day of the month, the last day of the month, the first business day of the month, or the last business day of the month, or if it is something else entirely, such as a string.

      Code:
      by firm (month), sort: gen byte appears_every_month = month[1] == tm(2015m1) & month[_N] == tm(2020m12) ///
          & _N == tm(2020m12)-tm(2015m1)+1
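
      If the date variable turns out to be a daily date or a string rather than a monthly date, it would need converting before the code above can work. The following is a sketch on made-up toy data, not the poster's data: the variable names daystr and ddate and the example values are assumptions for illustration only.

      ```stata
      * toy data: firm codes with end-of-month dates stored as strings
      clear
      input str4 firm str9 daystr
      "A1" "31jan2015"
      "A1" "28feb2015"
      "B2" "31jan2015"
      end

      * string -> Stata daily date
      gen ddate = daily(daystr, "DMY")
      format ddate %td

      * daily date -> Stata monthly date, comparable with tm(2015m1)
      gen month = mofd(ddate)
      format month %tm

      * if the string already looks like "2015m1", something like this
      * may apply instead (month_str is a hypothetical variable name):
      * gen month = monthly(month_str, "YM")

      list firm ddate month
      ```

      Running describe on the original date variable shows its storage type and display format, which indicates which conversion, if any, applies.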



      • #4
        @William Lisowski @Clyde Schechter thank you for the replies.
        I tried the code you proposed, but for some reason the new variable is 0 for every firm, which is impossible. I can see that many firms appear across many months.
        What could be the problem?



        • #5
          What could be the problem? Well, almost anything. To chase this down, you need to fire up the -dataex- command and show some example data. Be sure the example you show exhibits the problem you are encountering with the code. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
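
          As a rough illustration of what that might look like, assuming the firm identifier and date variables are named firm and month (names taken from the code earlier in the thread, not confirmed by the original poster):

          ```stata
          * only needed if dataex is not already part of the installation
          ssc install dataex

          * produce a copy-and-paste example of the first 20 observations
          * of the two variables the code depends on
          dataex firm month in 1/20
          ```

          The output can be pasted directly into a forum post between code delimiters.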


