dropping variables 1-9 but keeping above 10

Ben Kail

Join Date: Dec 2019
Posts: 15

14 Jul 2023, 11:21

Hello Stata Friends-

I have a dataset with hundreds of variables that begin with the letter r followed by a number (1-18: representing waves) and a suffix, all in wide format. Here is an example of one variable:

Code:

clear all 

input r1var1 r2var1 r3var1 r4var1 r5var1 r6var1 r7var1 r8var1 r9var1 r10var1 r11var1 r12var1 r13var1 r14var1 r15var1 r16var1 r17var1 r18var1

612 941 297 980 175 24 730 854 55 474 597 185 25 170 23 720 649 335 174
537 819 752 629 954 200 905 117 553 467 384 838 150 458 693 99 609 184 710
125 597 738 303 617 696 809 375 466 27 612 171 15 112 96 952 579 393 136
764 323 851 344 180 590 687 153 308 300 317 443 965 222 543 533 188 503 260
400 964 291 255 244 396 667 857 872 509 852 457 681 929 925 17 849 164 291
348 173 408 410 885 960 756 606 593 854 534 668 460 310 959 912 328 370 423
547 842 516 287 209 158 423 816 888 456 955 137 758 500 349 919 181 118 24
403 552 470 708 130 696 948 830 552 669 754 93 103 381 129 45 428 795 302
267 384 124 587 442 384 553 174 486 484 321 434 988 427 894 491 296 634 689
550 717 792 580 806 412 498 186 507 781 353 635 364 341 283 864 112 300 263
749 791 612 311 520 330 533 542 919 183 512 224 174 155 429 109 455 199 204
346 208 60 388 598 191 18 4 240 329 133 56 841 397 77 773 712 462 943
207 457 200 539 720 783 508 460 264 400 492 946 540 590 844 400 526 759 495
743 947 191 417 738 794 214 98 78 643 169 780 380 174 865 663 545 724 898
537 511 923 286 593 985 626 206 797 129 584 91 271 680 424 520 943 665 966
168 498 101 536 395 365 608 820 905 87 396 538 124 872 778 771 509 370 849
832 704 235 352 342 667 437 412 910 601 27 408 893 994 881 750 40 608 593
776 632 610 930 177 973 27 700 889 697 598 428 479 894 29 327 191 224 755
end

Because of the sheer size of the dataset, I want to drop all of the waves that I'm not using (in this case waves 1 through 7), but want to do this before I convert it to a long data set.

My initial attempt was to use forvalues like this:

Code:

forvalues i=1/7 {
    drop r`i'*
}

This does a great job of dropping each instance of the variable between r1var and r7var, but it also drops all the variables between r10var and r18var because they also begin with "r1". In other words, all that remains are waves 8 and 9.
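The over-matching described above can be seen directly by listing what `r1*` expands to. This is only a minimal sketch on a hypothetical one-observation dataset that uses the same `r#var1` naming; the real data may differ.

```stata
* Hypothetical stub "var1" for waves 1 to 18, one observation.
clear
set obs 1
forvalues w = 1/18 {
    generate r`w'var1 = `w'
}

* In a varlist, * matches zero or more characters, so r1* also
* picks up r10var1 through r18var1.
ds r1*
display wordcount("`r(varlist)'")
```

Here `ds` lists r1var1 together with r10var1 through r18var1, so the count displayed is 10 rather than 1.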

With a small dataset like the one I created for this example, I could obviously just write it all out, but since I'm dealing with hundreds of variables, I'd really like to find a programming-based solution.

For what it's worth, I'm running Stata 15 on a Windows machine.

Thanks in advance for any assistance. And may you have a wonderful day.

Kind regards,
Ben

Rich Goldstein

Join Date: Mar 2014

Posts: 4491
#2

14 Jul 2023, 11:44

Code:

forval i=1/7 {
    drop r`i'var1
}
Nils Enevoldsen

Join Date: Oct 2014

Posts: 296
#3

14 Jul 2023, 15:15

I imagine that "var1" was an example, and that several variations exist. Assuming that every variable you want to drop follows the pattern "r" + "a digit 1 through 7" + "capital or lowercase letter" + "anything (or nothing)", then this should work:

Code:

foreach var of varlist * {
    if regexm("`var'","^r[1-7][A-Za-z]") drop `var'
}
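To see why this pattern spares waves 10 through 18, it may help to test it on a few names before dropping anything. The names below are hypothetical and simply follow the naming scheme in the question.

```stata
* Check the pattern against a few hypothetical variable names.
* regexm() returns 1 on a match and 0 otherwise.
local pattern "^r[1-7][A-Za-z]"
foreach name in r1var1 r7var1 r8var1 r10var1 r17var1 {
    display "`name': " regexm("`name'", "`pattern'")
}
```

Only r1var1 and r7var1 return 1. In r10var1 and r17var1 the character after the first digit is another digit, not a letter, so the match fails.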
Nick Cox

Join Date: Mar 2014

Posts: 35780
#4

15 Jul 2023, 02:03

Every time you drop a variable you reorganize the dataset. I have not done any speed comparisons, but if this were my problem I would separate the code into two steps: identifying a list of variables to be dropped, and then dropping them all at once. So,

Code:

foreach var of varlist * {
    if regexm("`var'", "^r[1-7][A-Za-z]") local todrop `todrop' `var'
}
display "`todrop'"
drop `todrop'
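The collect-then-drop approach can be tried end to end on toy data. The following is a sketch under assumptions: the stubs `inc` and `age` are hypothetical, and two small safeguards not in the code above are added (clearing `todrop` first so that a rerun in the same session does not reuse an old list, and skipping `drop` when the list is empty).

```stata
* Toy data: hypothetical stubs "inc" and "age" for waves 1 to 18.
clear
set obs 3
forvalues w = 1/18 {
    generate r`w'inc = `w'
    generate r`w'age = `w'
}

* Start from an empty list, then collect names for waves 1 to 7.
local todrop
foreach var of varlist r* {
    if regexm("`var'", "^r[1-7][A-Za-z]") local todrop `todrop' `var'
}

* Drop everything in one call, but only if something matched.
if "`todrop'" != "" drop `todrop'

* Waves 8 to 18 should remain: 11 waves x 2 stubs = 22 variables.
display c(k)
```

Of the 36 variables created, the 14 belonging to waves 1 through 7 are dropped, so `c(k)` shows 22.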

Last edited by Nick Cox; 15 Jul 2023, 02:05.
Ben Kail

Join Date: Dec 2019

Posts: 15
#5

17 Jul 2023, 08:04

Thank you all! This is exactly what I was looking for. I appreciate it and wish you all the best. Cheers