Dropping very specific cells over multiple columns

Helen Liu

Join Date: Jun 2022
Posts: 2

10 Jun 2022, 17:19

Dear All,

I would like some help editing a dataset. An example of the data is included below:

v103	v106	v109	v112	v115	v118
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17
14	41	22	20	23	17

This is considered one group of data points.

What I would like the table to look like is this:

v103	v106	v109	v112	v115	v118
14	0	0	0	0	0
14	0	0	0	0	0
14	0	0	0	0	0
0	41	0	0	0	0
0	41	0	0	0	0
0	41	0	0	0	0
0	0	22	0	0	0
0	0	22	0	0	0
0	0	22	0	0	0
0	0	0	20	0	0
0	0	0	20	0	0
0	0	0	20	0	0
0	0	0	0	23	0
0	0	0	0	23	0
0	0	0	0	23	0
0	0	0	0	0	17
0	0	0	0	0	17
0	0	0	0	0	17

The issue is that this is a very large dataset, with 13,176 rows and 36 columns in total, and a single column can contain several groups like the one in the first table above. The resources I have found online only explain how to delete entire variables using drop. I did find an old Statalist post with a similar question, but that post asked about dropping missing values, whereas I am trying to remove a specific number of values.

Does anyone have any suggestions or advice on how to approach this problem? Thank you very much in advance!

Lastly, I'm using Stata 15.1.

Best,
Helen

Clyde Schechter

Join Date: Apr 2014

Posts: 30117
#2

10 Jun 2022, 17:48

Well, you are not looking to -drop- anything here. You are looking to replace the existing values with zeroes. That is a good thing, because in Stata you can drop whole observations or whole variables, but you cannot drop individual cells (or groups of cells other than observations or variables).
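To make the distinction concrete, here is a minimal sketch using the auto dataset that ships with Stata. The dataset and the conditions are chosen only for illustration and have nothing to do with the data in the question.

```stata
sysuse auto, clear

* -drop- with a condition removes whole observations (rows)
drop if foreign == 1

* -drop- with a variable name removes a whole variable (column)
drop headroom

* Individual cells cannot be dropped; they can only be overwritten,
* here with -replace- and an -if- condition
replace mpg = 0 if rep78 == 3
```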

There is an apparent pattern to what you want to do. Your variable names increment by 3, and you also want the non-zeroes to be retained in groups of three within a variable. It's also true that the variables you are showing are all actually constants. I don't know if this is coincidental to the example you show, or if it is, in fact, the general pattern. I will assume that at least the incrementing by threes is a general description of the situation. If it is not, please post back with a fuller explanation of what you are looking for.

Code:

* Example generated by -dataex-. For more info, type help dataex clear input byte(v103 v106 v109 v112 v115 v118) 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 14 41 22 20 23 17 end forvalues i = 3(3)18 { local varnum = `i' + 100 replace v`varnum' = 0 if !inrange(_n, `=`i'-2', `i') }
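The loop above hard-codes six variables and the first 18 observations. If the same staircase pattern restarts further down the dataset, which is an assumption and not something the thread confirms, a sketch along the following lines might generalize it. The local macros vars, rows_per_var and block, and the helper variable pos, are illustrative names. The variable list would need to be extended to the full set of columns in the intended order.

```stata
* Assumption: every variable keeps 3 consecutive rows, and the pattern
* restarts after (number of variables x 3) observations.
local vars v103 v106 v109 v112 v115 v118
local rows_per_var = 3
local k : word count `vars'
local block = `k' * `rows_per_var'

* pos runs 1, 2, ..., block and then starts over
generate long pos = mod(_n - 1, `block') + 1

local j = 0
foreach v of local vars {
    local ++j
    * rows of each block that variable number j is allowed to keep
    local lo = (`j' - 1) * `rows_per_var' + 1
    local hi = `j' * `rows_per_var'
    replace `v' = 0 if !inrange(pos, `lo', `hi')
}

drop pos
```

With the six example variables, block is 18 and the result for the first 18 observations is the same as that of the loop above. Whether the real data follow this repeating layout is something only the original poster can confirm.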

In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

Finally, I am very curious why you want to do this. It is one of the oddest data management operations I have encountered, and I am unable to picture what purpose it serves.
Helen Liu

Join Date: Jun 2022

Posts: 2
#3

11 Jun 2022, 22:19

Hi Clyde,

Thank you very much for your response and my apologies for the wrong format of the example. It was my first time posting and I will keep that in mind for future posts!

While the code was able to run, it wasn't completely what I was looking for. I ended up having to do everything manually because there is a time crunch on this project. I also agree the data management could have been better. Once again, I really appreciate your help!

Best,
Helen