
  • to convert data into unique observation

    I had a question, but the issue has been sorted out. I don't know how to delete the post.

    Last edited by Kiran Abro; 14 Jan 2019, 12:11.

  • #2
    It appears that your observations are distinctly identified by gvkey and qdate (I think; your screenshot is nearly unreadable, and advice about that follows below) and that the three observations in each group differ only in the value of public_date. If that is the case, the following code should retain just the first observation, that is, the one with the earliest public_date, for each combination of gvkey and qdate.
    bysort gvkey qdate (public_date): keep if _n==1
    Note that this code is not tested since no example data was given. It may be wrong. In particular, it does nothing to confirm the assertion that public_date is the only variable that differs; if others differ you will only be getting the value from the one observation kept.
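As a minimal sketch of how the command above behaves, the following builds a small toy dataset and applies it. The variable roa and all the values are hypothetical and are not taken from the original poster's data; only gvkey, qdate, and public_date come from the thread.

```stata
* Toy data: hypothetical values, NOT the original poster's dataset
clear
input gvkey qdate public_date roa
1001 220 21031 0.05
1001 220 21000 0.05
1001 220 21062 0.05
1001 221 21152 0.07
1001 221 21090 0.07
1001 221 21121 0.07
end

* Within each gvkey-qdate group, sort by public_date and keep the first row
bysort gvkey qdate (public_date): keep if _n == 1

* isid exits with an error unless gvkey and qdate uniquely identify the data
isid gvkey qdate
list
```

Putting public_date in parentheses makes it part of the sort order but not of the group definition, so _n == 1 picks the observation with the earliest public_date in each group.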

    An alternative approach would be to use the duplicates command as described in the output of help duplicates.
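A rough sketch of inspecting the data with duplicates before dropping anything, again on hypothetical toy data (roa and all values are assumptions made for illustration):

```stata
* Toy data: hypothetical values, NOT the original poster's dataset
clear
input gvkey qdate public_date roa
1001 220 21000 0.05
1001 220 21031 0.05
1001 220 21062 0.05
1001 221 21090 0.07
1001 221 21121 0.07
1001 221 21152 0.07
end

* Tabulate how many observations share each gvkey-qdate combination
duplicates report gvkey qdate

* dup holds the number of surplus copies in each observation's group
duplicates tag gvkey qdate, generate(dup)

* Look at the groups that are not unique
list if dup > 0, sepby(gvkey qdate)
```

Nothing is deleted here; the point is to see how many copies each gvkey and qdate combination has before deciding what to drop.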

    Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

    Note that screenshots are not very helpful, because (unlike samples of the data) they leave the person who wants to help you with the uninviting task of retyping the data to create a workable example. Instead, use the dataex command to present example data. If you are running version 15.1 or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. The output of dataex includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example and try out their code on it, which in turn makes it more likely that their answer will actually work in your data.
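For illustration, a minimal sketch of a dataex call on hypothetical toy data (roa and the values are assumptions); in practice you would run it on your own dataset and paste the output into your post:

```stata
* Only needed if dataex is not already installed (see above)
* ssc install dataex

* Toy data: hypothetical values, NOT the original poster's dataset
clear
input gvkey qdate public_date roa
1001 220 21000 0.05
1001 220 21031 0.05
1001 220 21062 0.05
end

* Write out the listed variables as a block others can copy and run
dataex gvkey qdate public_date roa
```

On a large dataset, restricting the variable list and adding an in or if qualifier keeps the example small.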

    When asking for help with code, always show example data. When showing example data, always use dataex.

    The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

    Added in edit: the original poster subsequently removed all the content from post #1, so for the record: it contained a very fuzzy screenshot of a dataset of financial data with a number of identification variables and a further number of data variables. For each combination of gvkey and qdate, all the data variables shown in the screenshot appeared to be identical across three observations with different values of public_date. With actual example data, I would have demonstrated using the duplicates command to confirm that the observations did not differ (other than on public_date) before dropping two out of each set of three observations.
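One possible way to do that confirmation, sketched on hypothetical toy data (roa and the values are assumptions, and this is not necessarily the exact demonstration that would have been given):

```stata
* Toy data: hypothetical values, NOT the original poster's dataset
clear
input gvkey qdate public_date roa
1001 220 21000 0.05
1001 220 21031 0.05
1001 220 21062 0.05
1001 221 21090 0.07
1001 221 21121 0.07
1001 221 21152 0.07
end

* Put the earliest public_date first within each gvkey-qdate group
sort gvkey qdate public_date

* Collect the names of all variables except public_date in r(varlist)
ds public_date, not

* Drop observations that duplicate an earlier one on all of those variables;
* the first occurrence in the current sort order is kept
duplicates drop `r(varlist)', force

* Fails with an error if any other variable differed within a group
isid gvkey qdate
```

If isid fails, some variable other than public_date differs within a gvkey and qdate combination, and those groups need to be examined before anything further is dropped.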
    Last edited by William Lisowski; 14 Jan 2019, 12:33.


    • #3
      In the Statalist FAQ we specifically ask you not to do that.

      You cannot delete a thread you started. Please don't mangle your own posts starting a thread, even if you solved your problem yourself or realised that the question was silly. Explain the solution, even if it was trivial. Often someone else will have the same problem.


      • #4
        Thank you, William Lisowski and Nick Cox. And yes, sure, I won't delete the post. I realized last night that Stata users should also post the solution to a problem once they have sorted it out, as it can benefit others.