Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Repeated variables

    I have a problem because I have a data with a variable called houseID, but the houseID is the same for diffetent members, for example:

    houseID income tax total_tax_by_house
    001 10 2.3 54.4
    001 200 52.1 54.4
    001 0 0 54.4
    002 ...

    and i just want to keep one for each house number, or create a new variable that list the members in the same houseID, any idea?

  • #2
    So the first question is why you have multiple records for the same houseID. I'm guessing that the houseID just identifies a house, and that there can be multiple people in that house. Is there no variable in your data set that identifies the people within the houseID?

    If there is no such variable, you might consider doing something like this:

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str3 houseid int income float(tax total_tax_by_house)
    "001"  10  2.3 54.4
    "001" 200 52.1 54.4
    "001"   0    0 54.4
    "002"   .    .    .
    end
    
    by houseid (income), sort: gen int person_num = _n
    reshape wide income tax total_tax_by_house, i(houseid) j(person_num)
    That said, why do you want to reduce the data set to one observation per houseID? If you are going to be doing calculations with the values of income, tax, and total_tax_by_house, you will probably find it is much easier to do if you leave the data in the multiple observations per houseID layout you already have.

    In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    Comment

    Working...
    X