How to Remove Some Data Correctly in Wide Format in Stata?

smith Jason

Join Date: Sep 2020

Posts: 380
#1

24 Jul 2022, 13:09

I have a wide-format dataset like this:
clear
input byte (id state1 state2 state3)
1 0 0 1
2 0 1 1
3 1 1 1
4 0 0 0
5 1 0 1
end

I want to remove some data according to the rule below:
for any person, once the first 1 appears in the state variables, every later state value should be dropped. For example,
when id==3, state1==1, then state2==1 and state3==1 should be dropped instead of missing values.
when id==5, state1==1, then state2==0 and state3==1 should be dropped instead of missing values.
when id==2, state1==0 & state2==1, then state3==1 should be dropped instead of missing values.
As for the data with id==1 and id==4, they should be kept without any change.
I can reshape the data from wide to long to do this, but I want to do it directly in the wide-format data.

Thank you!

Last edited by smith Jason; 24 Jul 2022, 13:11.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#2

24 Jul 2022, 14:53

By this point you should understand that "drop" has two technical meanings in Stata:
- remove a variable from every observation of the dataset
- remove an observation from the dataset

Neither of these explains what you mean when you use "drop" in the following explanation:

when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of missing values.

You apparently don't want the observation with id==3 to be removed from the dataset, so you apparently want state2 and state3 to take some value other than 1, but not any of the Stata missing values:

Code:

. .a .b .c ... .x .y .z

So what is it you want the resulting dataset to contain for the observation when id==3?
smith Jason

Join Date: Sep 2020

Posts: 380
#3

24 Jul 2022, 14:59

when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of containing missing values.
Clyde Schechter

Join Date: Apr 2014

Posts: 30355
#4

24 Jul 2022, 15:04

Originally posted by smith Jason:

when id==3, state1==1, then state2=1 and state3==1 should be dropped instead of containing missing values.

As William Lisowski pointed out in #2, that is not possible. You can only drop an entire observation, or an entire variable. You cannot "drop" the values of a variable in only some observations. Stata is not a spreadsheet, and trying to work with it as if it were usually ends in tears. Whatever you think you might accomplish were this possible, you will need to find a way to do that with missing values instead. That, I promise, will not be difficult.
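
A minimal sketch of the missing-value approach in the wide layout, assuming the example data from #1 and that "after the first 1" means every later state variable for that id. The helper variable seen1 is a hypothetical name introduced only for illustration; it is not part of the original data.

```stata
clear
input byte (id state1 state2 state3)
1 0 0 1
2 0 1 1
3 1 1 1
4 0 0 0
5 1 0 1
end

* seen1 (hypothetical helper) flags ids that have already shown a 1
* in an earlier state variable
generate byte seen1 = 0
forvalues j = 1/3 {
    * blank out this state if a 1 appeared in an earlier one
    replace state`j' = . if seen1
    * then record whether this state itself is the first 1
    replace seen1 = 1 if state`j' == 1
}
drop seen1
list, noobs
```

With the example data, id 2 ends up as 0 1 ., ids 3 and 5 end up as 1 . ., and ids 1 and 4 are unchanged. The order of the two replace commands matters: the flag is updated only after the current variable has been handled, so the first 1 itself is kept.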
smith Jason

Join Date: Sep 2020

Posts: 380
#5

24 Jul 2022, 15:06

Thank you! It seems that I have to reshape the data from wide to long to do that.
William Lisowski

Join Date: Dec 2014

Posts: 10150
#6

24 Jul 2022, 15:12

You weren't paying attention when you read post #2. You can either drop an entire observation, or drop a variable from every observation. Neither of those is "drop a variable from just some observations".

A Stata dataset consists of the same variables in every observation. You will need to reshape your data to a long layout if you want id==3 to not have any observation of state2 or state3, and you made it clear that you know you can do that but do not want to reshape long.
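
For reference, the long-layout route can be sketched as follows, assuming the example data from #1. The names wave and first1 are hypothetical, chosen here for illustration only.

```stata
clear
input byte (id state1 state2 state3)
1 0 0 1
2 0 1 1
3 1 1 1
4 0 0 0
5 1 0 1
end

* one observation per id and wave (wave is a hypothetical name for the suffix)
reshape long state, i(id) j(wave)

* first1 (hypothetical helper): first wave in which state == 1,
* missing for ids that never show a 1
egen first1 = min(cond(state == 1, wave, .)), by(id)

* remove the observations after the first 1; ids with no 1 keep everything,
* because wave > . is never true
drop if wave > first1
drop first1
list, sepby(id) noobs
```

In the long layout, "dropping" the later states really is removing observations, which is why it works there and not in the wide layout.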

What you seek to accomplish is not possible in Stata.
