
  • How to Remove Some Data Correctly in Wide Format in Stata?

    I have a wide format dataset like this,
    clear
    input byte (id state1 state2 state3)
    1 0 0 1
    2 0 1 1
    3 1 1 1
    4 0 0 0
    5 1 0 1
    end

    I want to remove some data according to the following rule:
    for any person, once the first 1 appears in the state variables, all later state values should be removed. For example,
    when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of missing values.
    when id==5, state1==1, then state2=0 and state3==1 should be dropped instead of missing values.
    when id==2, state1==0 & state2==1, then state3==1 should be dropped instead of missing values.
    The data for id==1 and id==4 should be kept without any change.
    I know I can reshape the data from wide to long to do this, but I want to do it directly in the wide format.

    Thank you!
    Last edited by smith Jason; 24 Jul 2022, 13:11.

  • #2
    By this point you should understand that "drop" has two technical meanings in Stata:
    • remove a variable from every observation of the dataset
    • remove an observation from the dataset
    Neither of these explains what you mean when you use "drop" in the following explanation:

    when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of missing values.
    You apparently don't want the observation with id==3 to be removed from the dataset, so you apparently want state2 and state3 to take some value other than 1, but not any of the Stata missing values:
    Code:
    . .a .b .c ... .x .y .z
    So what is it you want the resulting dataset to contain for the observation when id==3?


    • #3
      when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of containing missing values.


      • #4
        Originally posted by smith Jason:
        when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of containing missing values.
        As William Lisowski pointed out in #2, that is not possible. You can only drop an entire observation, or an entire variable. You cannot "drop" the values of a variable in only some observations. Stata is not a spreadsheet, and trying to work with it as if it were usually ends in tears. Whatever you think you might accomplish were this possible, you will need to find a way to do that with missing values instead. That, I promise, will not be difficult.
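A minimal sketch of the missing-values route in the wide layout, using the example data from #1. It assumes the variables are named state1 to state3 as in that post; seen1 is a helper variable introduced here for illustration only and does not come from the thread.

```stata
clear
input byte (id state1 state2 state3)
1 0 0 1
2 0 1 1
3 1 1 1
4 0 0 0
5 1 0 1
end

* seen1 (hypothetical helper) records whether a 1 has already
* occurred in an earlier state variable for this person
generate byte seen1 = 0
forvalues j = 1/3 {
    * blank out statej if a 1 appeared in state1 ... state(j-1)
    replace state`j' = . if seen1
    * update the flag using the (possibly blanked) current value
    replace seen1 = 1 if state`j' == 1
}
drop seen1
list, noobs
```

With the example data, this leaves id 1 and id 4 unchanged, sets state3 to missing for id 2, and sets state2 and state3 to missing for id 3 and id 5.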


        • #5
          Thank you! It seems that I have to reshape the data from wide to long and do it there.


          • #6
            You weren't paying attention when you read post #2. You can either drop an entire observation, or drop a variable from every observation. Neither of those is "drop a variable from just some observations".

            A Stata dataset consists of the same variables in every observation. You will need to reshape your data to a long layout if you want id==3 to not have any observation of state2 or state3, and you made it clear that you know you can do that but do not want to reshape long.

            What you seek to accomplish is not possible in Stata.
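For reference, the long-layout route mentioned in this thread might look roughly like the sketch below. It starts from the wide example data in #1; the names time and seen1 are assumptions made for illustration and do not appear in the original posts.

```stata
* one row per id and period; time (hypothetical name) takes the values 1, 2, 3
reshape long state, i(id) j(time)

* running count of 1s within each id, up to and including the current row
bysort id (time): generate byte seen1 = sum(state == 1)

* drop every row that comes after the first 1 for that id;
* the _n > 1 guard matters because seen1[_n-1] is missing on the
* first row, and missing compares as larger than any number
by id: drop if _n > 1 & seen1[_n-1] >= 1

drop seen1
list, sepby(id) noobs
```

With the example data, id 3 and id 5 keep only their first row, id 2 keeps its first two rows, and id 1 and id 4 keep all three. Reshaping back to wide afterwards would reintroduce missing values in the dropped positions.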
