  • Need help with multiple response question - to code into fixed categorical variables.

    For our research we (fatefully) allowed a multiple choice question for degree funding (how the individual funds their studies).
    Our options were: - Student finance - Family - Savings - Other - Scholarship - Bursary.

    As displayed here with label list:

    . label list funding
    funding:
    1 Bursaries, Savings, Scholarships
    2 Family
    3 Family, Bursaries, Scholarships
    4 Family, Other
    5 Family, Savings
    6 Family, Scholarships
    7 Family, Student Finance
    8 Family, Student Finance, Bursaries
    9 Family, Student Finance, Bursaries, Savings
    10 Family, Student Finance, Bursaries, Scholarships
    11 Family, Student Finance, Other
    12 Family, Student Finance, Savings
    13 Family, Student Finance, Savings, Other
    14 Scholarships
    15 Student Finance
    16 Student Finance, Bursaries
    17 Student Finance, Bursaries, Other
    18 Student Finance, Bursaries, Savings
    19 Student Finance, Bursaries, Savings, Other
    20 Student Finance, Bursaries, Scholarships
    21 Student Finance, Other
    22 Student Finance, Savings
    23 Student Finance, Savings, Other

    Given that people have clicked more than one option, it's very tricky to input into Stata. We have been advised to ignore "other", "savings" and "bursary", and generate new variables covering:
    - Student finance; anyone who clicked “student finance” as an option for funding, no matter what else they clicked (unless scholarship)
    - Family; people who have said “family” and NOT student finance
    - Scholarship; anyone who mentions scholarships, no matter anything else they have mentioned.

    Does anyone have any advice on how to generate a new variable for funding with just three categories: family, student finance and scholarships?

    P.S. This is a university project, so I just need it to work and give me some results I can use for the analysis.

    I really appreciate whatever suggestions you may give for me to try.

    I have also destringed the variable, hence the above layout.

  • #2
    Elisha:
    welcome to this forum.
    You may want to take a look at -help recode-.
    Kind regards,
    Carlo
    (Stata 18.0 SE)
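
    A minimal sketch of what recode could look like here, assuming funding carries the 23 numeric codes listed in #1 and applying the rules as stated there (Scholarship wins over everything, Student Finance wins over Family). The new variable name wanted_rc is hypothetical, and the codes are grouped by reading the label list, not taken from anyone's data.

    ```stata
    * recode uses the first rule that matches, so the three groups are kept disjoint
    * generate() leaves the original funding variable untouched
    recode funding ///
        (1 3 6 10 14 20 = 3 "Scholarship") ///
        (7/9 11/13 15/19 21/23 = 2 "Finance") ///
        (2 4 5 = 1 "Family"), ///
        generate(wanted_rc)
    ```

    Run this from a do-file, since /// line continuation does not work in the Command window.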



    • #3
      "Need help" is not necessary! It applies to any question.

      Detailed replies to this would have come faster with a data example.

      I made a little dataset out of your labels. Then

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input float funding str48 what
       1 "Bursaries, Savings, Scholarships"                
       2 "Family"                                          
       3 "Family, Bursaries, Scholarships"                
       4 "Family, Other"                                  
       5 "Family, Savings"                                
       6 "Family, Scholarships"                            
       7 "Family, Student Finance"                        
       8 "Family, Student Finance, Bursaries"              
       9 "Family, Student Finance, Bursaries, Savings"    
      10 "Family, Student Finance, Bursaries, Scholarships"
      11 "Family, Student Finance, Other"                  
      12 "Family, Student Finance, Savings"                
      13 "Family, Student Finance, Savings, Other"        
      14 "Scholarships"                                    
      15 "Student Finance"                                
      16 "Student Finance, Bursaries"                      
      17 "Student Finance, Bursaries, Other"              
      18 "Student Finance, Bursaries, Savings"            
      19 "Student Finance, Bursaries, Savings, Other"      
      20 "Student Finance, Bursaries, Scholarships"        
      21 "Student Finance, Other"                          
      22 "Student Finance, Savings"                        
      23 "Student Finance, Savings, Other"                
      end
      
      . levelsof funding if strpos(what, "Finance"), sep(,) local(Finance)
      7,8,9,10,11,12,13,15,16,17,18,19,20,21,22,23
      
      . levelsof funding if strpos(what, "Family") & !strpos(what, "Finance"), sep(,) local(Family)
      2,3,4,5,6
      
      . levelsof funding if strpos(what, "Scholarship"), sep(,) local(Scholarship)
      1,3,6,10,14,20
      That suggests code like this, which I can't test without a data example.

      Code:
      gen wanted = 1 if inlist(funding, `Family')
      replace wanted = 2 if inlist(funding, `Finance')
      replace wanted = 3 if inlist(funding, `Scholarship')
      label def wanted 1 Family 2 Finance 3 Scholarship
      label val wanted wanted 
      You can get the string variable what from funding using decode. Or just use the results directly, like this:

      Code:
      gen wanted = 1 if inlist(funding, 2,3,4,5,6)
      Last edited by Nick Cox; 09 Mar 2019, 02:49.
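
      A sketch of the same idea without macros, assuming funding has its value labels attached so that decode can rebuild the text. It creates a string copy called what (the name used in the example above) and fills wanted by searching that text. Nothing is dropped from the dataset in memory. Use it instead of the inlist() version, not in addition to it.

      ```stata
      * Text version of the labelled numeric variable (assumes value labels are attached)
      decode funding, generate(what)

      * strpos() returns 0 when the substring is absent, so it works as true/false
      generate byte wanted = 1 if strpos(what, "Family") & !strpos(what, "Finance")
      replace wanted = 2 if strpos(what, "Finance")
      * Scholarship goes last so that it overrides Family and Finance
      replace wanted = 3 if strpos(what, "Scholarship")

      label define wanted 1 "Family" 2 "Finance" 3 "Scholarship"
      label values wanted wanted
      ```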



      • #4
        Thank you! Sorry I never attached a data example (I don't know how to do this).

        To an extent this worked, and I appreciate you taking the time to help me. However, having done this it removes this from my overall data set which more or less means that the data is invalid and cannot be used if that makes sense.

        . tab funding

        How do you fund your maintenance costs? |
                  Please select all that apply. |      Freq.     Percent        Cum.
        ----------------------------------------+-----------------------------------
               Bursaries, Savings, Scholarships |          1        0.88        0.88
                                         Family |         43       37.72       38.60
                Family, Bursaries, Scholarships |          1        0.88       39.47
                                  Family, Other |          1        0.88       40.35
                                Family, Savings |          3        2.63       42.98
                           Family, Scholarships |          1        0.88       43.86
                        Family, Student Finance |         20       17.54       61.40
             Family, Student Finance, Bursaries |          2        1.75       63.16
        Family, Student Finance, Bursaries, Sav |          2        1.75       64.91
        Family, Student Finance, Bursaries, Sch |          1        0.88       65.79
                 Family, Student Finance, Other |          4        3.51       69.30
               Family, Student Finance, Savings |          5        4.39       73.68
        Family, Student Finance, Savings, Other |          2        1.75       75.44
                                   Scholarships |          1        0.88       76.32
                                Student Finance |          8        7.02       83.33
                     Student Finance, Bursaries |          7        6.14       89.47
              Student Finance, Bursaries, Other |          2        1.75       91.23
            Student Finance, Bursaries, Savings |          3        2.63       93.86
        Student Finance, Bursaries, Savings, Ot |          1        0.88       94.74
        Student Finance, Bursaries, Scholarship |          1        0.88       95.61
                         Student Finance, Other |          3        2.63       98.25
                       Student Finance, Savings |          1        0.88       99.12
                Student Finance, Savings, Other |          1        0.88      100.00
        ----------------------------------------+-----------------------------------
                                          Total |        114      100.00



        This corresponds with the numbers:

        1 "Bursaries, Savings, Scholarships"
        2 "Family"
        3 "Family, Bursaries, Scholarships"
        4 "Family, Other"
        5 "Family, Savings"
        6 "Family, Scholarships"
        7 "Family, Student Finance"
        8 "Family, Student Finance, Bursaries"
        9 "Family, Student Finance, Bursaries, Savings"
        10 "Family, Student Finance, Bursaries, Scholarships"
        11 "Family, Student Finance, Other"
        12 "Family, Student Finance, Savings"
        13 "Family, Student Finance, Savings, Other"
        14 "Scholarships"
        15 "Student Finance"
        16 "Student Finance, Bursaries"
        17 "Student Finance, Bursaries, Other"
        18 "Student Finance, Bursaries, Savings"
        19 "Student Finance, Bursaries, Savings, Other"
        20 "Student Finance, Bursaries, Scholarships"
        21 "Student Finance, Other"
        22 "Student Finance, Savings"
        23 "Student Finance, Savings, Other"



        Above is the variable that I hope to organise into the 3 subgroups for funding: Finance, Family and Scholarship.
        When I run the code:
        gen wanted = 1 if inlist(funding, `Family')

        It comes up with "invalid syntax", which I believe is to do with the term 'wanted'. I am not sure how to rephrase this so that I can sort the data into the categories while still being able to use this variable with the rest of the dataset to look at certain relationships etc.



        • #5
          If you don't know how to give data examples, then you have yet to read the FAQ Advice thoroughly, which you are asked to do every time you post.

          having done this it removes this from my overall data set which more or less means that the data is invalid and cannot be used if that makes sense
          Sorry, but you need to read #3 backwards in terms of what you have to do. It's written forwards to explain the principles leading up to advice on what you do. It is better that you understand the code as far as possible.

          I created a little dataset because you didn't provide a data example. You don't need to do that because you have the data.

          The local macros such as Family will only exist if you run levelsof in the way I illustrated, but I gave you another way to do it. (And you need a variable what, which as I explained you can get with decode.)
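
          To illustrate why the macro matters: if Family has not been defined in the same run, `Family' expands to nothing and Stata sees inlist(funding, ), which is a likely source of the error reported in #4. A sketch, assuming funding is labelled as in #1 and that these lines are run together from one do-file:

          ```stata
          * Text version of the labelled numeric variable (assumes value labels are attached)
          decode funding, generate(what)
          * levelsof stores the matching codes, comma-separated, in the local macro Family
          levelsof funding if strpos(what, "Family") & !strpos(what, "Finance"), sep(,) local(Family)
          * Show the macro contents; an empty line here means the macro is not defined
          display "`Family'"
          generate wanted = 1 if inlist(funding, `Family')
          ```

          A local macro defined in a do-file disappears when that run ends, so running the generate line on its own later leaves the macro empty.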

          To expand slightly on my last suggestion, this should work with your dataset in memory.

          Code:
          * Later replace statements override earlier ones, so Scholarship wins over Finance and Family
          gen wanted = 1 if inlist(funding, 2,3,4,5,6)
          replace wanted = 2 if inlist(funding, 7,8,9,10,11,12,13,15,16,17,18,19,20,21,22,23)
          replace wanted = 3 if inlist(funding, 1,3,6,10,14,20)
          label def wanted 1 Family 2 Finance 3 Scholarship
          label val wanted wanted
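
          A quick check after creating wanted, sketched under the assumption that both variables are in memory: the cross-tabulation shows which original category landed in which group, and assert stops with an error if any observation with a funding value was left unclassified.

          ```stata
          * Rows are the original 23 categories, columns the 3 new groups
          tabulate funding wanted, missing
          * Every non-missing funding value should have received a group
          assert !missing(wanted) if !missing(funding)
          ```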
