Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping wide to long with multiple levels of loops

    Hi all, I have a complex dataset with multiple levels of loops. Here are the details:

    A. Two individuals per row (e.g., b4_tribe_1, b4_tribe_2, b8b_leave_rsn_1, b8b_leave_rsn_2)
    B. Multi-select responses (e.g. b8b_leave_rsn_1_1 b8b_leave_rsn_2_1, b8b_leave_rsn_3_1 for the first person and , b8b_leave_rsn_2_2, b8b_leave_rsn_3_2 for the second person)
    C. Looped questions per individual and option (e.g., b34a_eco_act_1_1, b34a_eco_act_1_2, b34a_eco_act_2_1, b34a_eco_act_2_2)

    I want to clean the dataset so that for each row, I have one variable, so that my variables look like the following:

    b4_tribe
    b8b_leave_rsn
    b8b_leave_rsn_1_
    b8b_leave_rsn_2_
    b8b_leave_rsn_3_
    b22b_main_tasks__1 //first main tasks for each person
    b22b_main_tasks__2 //second main tasks for each person
    b22b_main_tasks__3 //third main tasks for each person

    The issue is that for variables in category C above, I would have to rename each of them so that the last number represents the person instead of the activity number. I have 7000+ similar variables, so this would be an extremely cumbersome process. Is there an alternative way to approach this? Below is the code I am currently using.

    rename b34a_eco_act_1_1 b34a_eco_act1_1
    rename b34a_eco_act_1_2 b34a_eco_act2_1
    rename b34a_eco_act_2_1 b34a_eco_act1_2
    rename b34a_eco_act_2_2 b34a_eco_act2_2

    * Reshape person-level data to long format
    reshape long ///
    a16_refugee_no_ a17_fayda_owned_ a18_fayda_no_ ///
    b7a_oth_native_ lbl_native_lang_ b12_sp_lang_ ///
    b12_sp_lang_1_ b12_sp_lang_2_ b12_sp_lang_3_ ///
    b12_sp_lang_4_ b12_sp_lang_5_ b34a_eco_act1_ b34a_eco_act2_, ///
    i(calc_suid) j(person)
    Last edited by Denat Ephrem Negatu; 25 Jul 2025, 10:52.

  • #2
    You actually don't need to rename the variables. The reshape command allows you to specify where the j (here, person) appears in the variable name, using the @ symbol. So you could do:
    Code:
    reshape long ///
    a16_refugee_no_ a17_fayda_owned_ a18_fayda_no_ ///
    b7a_oth_native_ lbl_native_lang_ b12_sp_lang_ ///
    b12_sp_lang_1_ b12_sp_lang_2_ b12_sp_lang_3_ ///
    b12_sp_lang_4_ b12_sp_lang_5_ b34a_eco_act_@_1 b34a_eco_act_@_2, ///
    i(calc_suid) j(person)
    without having renamed the b34a_* variables beforehand.

    Incidentally, if you ever do need to do this kind of renaming, you can do it more concisely as follows:
    Code:
    rename b34a_eco_act_(#)_(#) b34a_eco_act_(#)[2]_(#)[1]
    This does the same job as the four lines of rename in your code above.

    See rule #19 in help rename group

    Comment


    • #3
      Thank you, Hemanshu! This seems to be working. Is there a fast way to reshape when I have 3000+ variables if I don't want to list each variable for the same context?

      Comment


      • #4
        I think you'll have to look for regularities in the names of your variables and then list the equivalent stub names.

        For instance, if you had a local macro called reshape_vars that already has the list of all variables that need to be reshaped, and you just wanted to remove the last digit from their names, you might do this:

        Code:
        foreach var of local reshape_vars {
            local stub = substr("`var'", 1, length("`var'") - 1)
            local reshape_stubs `reshape_stubs' `stub'
        }
        
        local reshape_stubs: list uniq reshape_stubs
        and then follow it with your reshape command:
        Code:
        reshape long `reshape_stubs', i(calc_suid) j(person)

        Comment

        Working...
        X