  • #16
    how can i define a panel data with multiple observations for the same year - student
    You can't. You could just declare -xtset St_ID- with no time variable, and that will enable you to do most panel estimations. But it will not allow you to deal with autoregressive structure.

    What keeps you from doing -xtset St_ID Ano- is that in each student-year combination you have multiple observations, distinguished by the variable Disciplina. I don't know if there are other variables in your real data set. I imagine there are, and I don't know what complications, if any, they will present for you. But if your data set consists of only the variables you have shown, you can do this:

    Code:
    by St_ID Ano, sort: gen _j = _n
    reshape wide Disciplina, i(St_ID Ano) j(_j)
    xtset St_ID Ano
    This will transform your data set into one where there is only one observation per student-year, and the various values of Disciplina spread out over multiple separate variables Disciplina1 Disciplina2, etc. If you have other variables like Disciplina which vary within St_ID Ano combinations, then you will have to add them to the varlist in the -reshape wide- command as well. Not knowing more about your data, however, I cannot be sure whether this layout will prove workable for you.
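
    As a hedged illustration of the steps above, the sketch below runs them on a small made-up data set (the six observations are assumptions, not the poster's data). It adds two checks that are not in the original code: -duplicates report- to confirm that St_ID and Ano do not uniquely identify observations beforehand, and -isid- to confirm that they do afterwards. It also sorts on Disciplina within each group so that the numbering in _j is reproducible from run to run.

    ```stata
    * Toy data (assumed): two subjects per student-year
    clear
    input St_ID Ano Disciplina
    116 2009 301
    116 2009 401
    116 2010 301
    116 2010 401
    140 2008 301
    140 2008 401
    end

    * Show how many observations share each St_ID-Ano pair
    duplicates report St_ID Ano

    * Number observations within each student-year, in Disciplina order
    by St_ID Ano (Disciplina), sort: gen _j = _n
    reshape wide Disciplina, i(St_ID Ano) j(_j)

    * -isid- exits with an error unless St_ID and Ano now identify observations
    isid St_ID Ano
    xtset St_ID Ano
    ```

    If -isid- fails in real data, some other variable probably still varies within student-year and needs to be added to the -reshape wide- varlist.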


    • #17
      Clyde, I need your help once again.

      I am trying to run a bivariate probit model, and in order to do that I need to split my dependent variable (transitioning) into two new ones, one for the first transition (grade 1 to 2) and the other for the second transition (grade 2 to 3). I am having difficulty with that, so I came to you once again.

      First I ran two separate probit models, one for each grade (serie 1 and 2):

      Code:
      probit transitioning i.genero ib1.raca i.zonares bolsa lnpib_pc i.turno transfer faltas alunos_turma biblio func_UE prof_idade  if serie ==1
      
      probit transitioning i.genero ib1.raca i.zonares bolsa lnpib_pc i.turno transfer  faltas alunos_turma biblio func_UE prof_idade if serie ==2
      And because the residuals from those equations might be correlated, I need to run a bivariate model. So I need to create these two new grade-specific variables.

      So I could have something like this:

      Code:
      biprobit transitioning_1 transitioning_2 i.genero ib1.raca i.zonares bolsa lnpib_pc i.turno transfer  faltas alunos_turma biblio func_UE prof_idade
      Thank you once again.

      Max


      • #18
        So I think you want:

        Code:
        gen transitioning_1 = transitioning & serie == 1
        gen transitioning_2 = transitioning & serie == 2
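
        One point to watch, offered as a hedged aside: Stata treats a missing value as true in a logical expression, so with the commands above an observation with missing transitioning and serie == 1 gets transitioning_1 = 1. If transitioning can be missing in the real data (an assumption; the thread does not say), a variant that keeps those observations missing might look like this, shown on made-up data:

        ```stata
        * Toy data (assumed): one transitioned, one not, one missing per grade
        clear
        input transitioning serie
        1 1
        0 1
        . 1
        1 2
        0 2
        . 2
        end

        * 1 if the student transitioned in that grade, 0 otherwise,
        * and missing wherever transitioning or serie is missing
        gen byte transitioning_1 = (transitioning == 1 & serie == 1) if !missing(transitioning, serie)
        gen byte transitioning_2 = (transitioning == 1 & serie == 2) if !missing(transitioning, serie)

        list
        ```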


        • #19
          Thank you!!! I ran the bivariate!!!


          • #20
            Max: Clyde is ever willing to help but a tacit principle in any forum I know is that answering a question in a thread doesn't commit anyone to offering indefinite support, even on that thread.

            It would be best to phrase any future questions as directed generally at anyone who might be able and willing to help, rather than putting even the gentlest of pressure on any individual.


            • #21
              Nick: I never thought of that, and I also didn't think about indefinite support. I used his name because he was the one replying to me.

              But no worries. I will be more general next time. Thank you for the advice.


              • #22
                To all Stata users,

                Once again I come to you for help.

                I have been struggling to generate a new variable that gives me, for each year (Ano) and school (UE), the ratio of the total number of St_IDs enrolled at that UE to the number of employees (emp_UE).

                For example, I need to count the number of St_IDs at UE == 5355 in year Ano == 2010 and divide it by func_UE in order to get the ratio for that specific year.

                The reason I need this new variable is that the absolute value of emp_UE is giving me results of strange magnitude in all my estimations.


                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input double(Ano St_ID UE serie Disciplina emp_UE)
                2010 116   5355 2 301 37
                2009 116   5355 1 301 37
                2011 116   5355 3 301 37
                2011 116   5355 3 401 37
                2010 116   5355 2 401 37
                2009 116   5355 1 401 37
                2010 140   6181 3 301 62
                2009 140   6181 2 301 62
                2008 140   6181 1 301 62
                2009 140   6181 2 401 62
                2008 140   6181 1 401 62
                2010 140   6181 3 401 62
                2009 604   5150 1 301 23
                2009 604   5150 1 401 23
                2012 612   5150 3 301 23
                2010 612   5150 1 301 23
                2011 612   5150 2 301 23
                2010 612   5150 1 401 23
                2011 612   5150 2 401 23
                2012 612   5150 3 401 23
                2010 680 102652 3 301 17
                2010 680 102652 3 401 17
                2011 744 102652 1 301 17
                2011 744 102652 1 401 17
                2009 752   2070 1 301 90
                2011 752   2070 2 301 90
                2012 752 102652 2 301 17
                2010 752   2070 1 301 90
                2010 752   2070 1 401 90
                2011 752   2070 2 401 90
                2012 752 102652 2 401 17
                2009 752   2070 1 401 90
                2010 779   3522 2 301 69
                2009 779   3522 1 301 69
                2009 779   3522 1 401 69
                2010 779   3522 2 401 69
                2010 795   2860 2 301 51
                2009 795   2453 1 301 70
                2011 795   2860 3 301 51
                2009 795   2453 1 401 70
                2011 795   2860 3 401 51
                2010 795   2860 2 401 51
                2009 817   3832 1 301 64
                2010 817   3832 2 301 64
                2011 817   3832 3 301 64
                2010 817   3832 2 401 64
                2011 817   3832 3 401 64
                2009 817   3832 1 401 64
                2010 833 102652 2 301 17
                2009 833 102652 1 301 17
                2011 833 102652 3 301 17
                2009 833 102652 1 401 17
                2010 833 102652 2 401 17
                2011 833 102652 3 401 17
                2010 850 102652 2 301 17
                2011 850 102652 3 301 17
                2009 850 102652 1 301 17
                2010 850 102652 2 401 17
                2011 850 102652 3 401 17
                2009 850 102652 1 401 17
                end
                If anyone could help me, I would appreciate it.

                Max



                • #23
                  This sounds like a new question as it doesn't relate to the thread title.

                  Some guesses are needed here. You refer to func_UE but there's no such variable in your example. I guess that you want to avoid multiple counting. Nothing is said in your question about the variables serie Disciplina which I am guessing wildly to be irrelevant (but if so why mention them).

                  This may not be what you want, but it shows some technique.

                  Code:
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input double(Ano St_ID UE serie Disciplina emp_UE)
                  2010 116   5355 2 301 37
                  2009 116   5355 1 301 37
                  2011 116   5355 3 301 37
                  2011 116   5355 3 401 37
                  2010 116   5355 2 401 37
                  2009 116   5355 1 401 37
                  2010 140   6181 3 301 62
                  2009 140   6181 2 301 62
                  2008 140   6181 1 301 62
                  2009 140   6181 2 401 62
                  2008 140   6181 1 401 62
                  2010 140   6181 3 401 62
                  2009 604   5150 1 301 23
                  2009 604   5150 1 401 23
                  2012 612   5150 3 301 23
                  2010 612   5150 1 301 23
                  2011 612   5150 2 301 23
                  2010 612   5150 1 401 23
                  2011 612   5150 2 401 23
                  2012 612   5150 3 401 23
                  2010 680 102652 3 301 17
                  2010 680 102652 3 401 17
                  2011 744 102652 1 301 17
                  2011 744 102652 1 401 17
                  2009 752   2070 1 301 90
                  2011 752   2070 2 301 90
                  2012 752 102652 2 301 17
                  2010 752   2070 1 301 90
                  2010 752   2070 1 401 90
                  2011 752   2070 2 401 90
                  2012 752 102652 2 401 17
                  2009 752   2070 1 401 90
                  2010 779   3522 2 301 69
                  2009 779   3522 1 301 69
                  2009 779   3522 1 401 69
                  2010 779   3522 2 401 69
                  2010 795   2860 2 301 51
                  2009 795   2453 1 301 70
                  2011 795   2860 3 301 51
                  2009 795   2453 1 401 70
                  2011 795   2860 3 401 51
                  2010 795   2860 2 401 51
                  2009 817   3832 1 301 64
                  2010 817   3832 2 301 64
                  2011 817   3832 3 301 64
                  2010 817   3832 2 401 64
                  2011 817   3832 3 401 64
                  2009 817   3832 1 401 64
                  2010 833 102652 2 301 17
                  2009 833 102652 1 301 17
                  2011 833 102652 3 301 17
                  2009 833 102652 1 401 17
                  2010 833 102652 2 401 17
                  2011 833 102652 3 401 17
                  2010 850 102652 2 301 17
                  2011 850 102652 3 301 17
                  2009 850 102652 1 301 17
                  2010 850 102652 2 401 17
                  2011 850 102652 3 401 17
                  2009 850 102652 1 401 17
                  end
                  
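                  * tag exactly one observation in each school-year (UE Ano) group
                  egen tag = tag(UE Ano) 
                  * num and den pick up St_ID and emp_UE from that tagged observation only
                  egen num = total(St_ID * tag), by(UE Ano) 
                  egen den = total(emp_UE * tag), by(UE Ano)
                  gen wanted = num/den 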
                  
                  tabdisp UE Ano, c(num den wanted)
                  Code:
                  ------------------------------------------------------------
                            |                       Ano                       
                         UE |     2008      2009      2010      2011      2012
                  ----------+-------------------------------------------------
                       2070 |                752       752       752          
                            |                 90        90        90          
                            |           8.355556  8.355556  8.355556          
                            | 
                       2453 |                795                              
                            |                 70                              
                            |           11.35714                              
                            | 
                       2860 |                          795       795          
                            |                           51        51          
                            |                     15.58823  15.58823          
                            | 
                       3522 |                779       779                    
                            |                 69        69                    
                            |           11.28986  11.28986                    
                            | 
                       3832 |                817       817       817          
                            |                 64        64        64          
                            |           12.76563  12.76563  12.76563          
                            | 
                       5150 |                604       612       612       612
                            |                 23        23        23        23
                            |           26.26087   26.6087   26.6087   26.6087
                            | 
                       5355 |                116       116       116          
                            |                 37        37        37          
                            |           3.135135  3.135135  3.135135          
                            | 
                       6181 |      140       140       140                    
                            |       62        62        62                    
                            | 2.258065  2.258065  2.258065                    
                            | 
                     102652 |                833       680       744       752
                            |                 17        17        17        17
                            |                 49        40  43.76471  44.23529
                  ------------------------------------------------------------



                  • #24
                    Let me try to explain better.

                    In school UE == 5355 we have 37 employees working there (emp_UE). What I would like to find out is how many students (St_ID) are enrolled at UE == 5355 in each of the years 2008, 2009, 2010, 2011 and 2012, so that I can generate the variable ratio_emp_st.

                    Looking at my sample, the number of students attending UE == 5355 is just one, so my ratio_emp_st will be 1/37, but my dataset is a lot bigger.

                    Just one more thing: Disciplina means "subject" in Portuguese. you are

                    Did I make myself clearer now? If not, please tell me, and I will try to explain better.

                    Max



                    • #25
                      I think this is what you want:

                      Code:
                      //    CALCULATE NUMBER OF DISTINCT STUDENTS IN EACH SCHOOL IN EACH YEAR
                      by UE Ano (St_ID), sort: gen student_count = sum(St_ID != St_ID[_n-1])
                      by UE Ano: replace student_count = student_count[_N]
                      
                      //    VERIFY EMPLOYEE COUNT IS CONSTANT ACROSS OBSERVATIONS WITHIN SCHOOL X Ano
                      by UE Ano (emp_UE), sort: assert emp_UE[1] == emp_UE[_N]
                      
                      //    CALCULATE RATIO
                      gen student_employee_ratio = student_count/emp_UE
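
                      An alternative sketch of the same count, not part of the code above: -egen-'s tag() function marks one observation per distinct school-year-student combination, and summing those marks within school and year gives the number of distinct students. The data below are made up for illustration (student 680 at UE 5150 is hypothetical), and the variable name first is an assumption.

                      ```stata
                      * Toy data (assumed): one student in 2009, two in 2010, two subjects each
                      clear
                      input Ano St_ID UE Disciplina emp_UE
                      2009 604 5150 301 23
                      2009 604 5150 401 23
                      2010 612 5150 301 23
                      2010 612 5150 401 23
                      2010 680 5150 301 23
                      2010 680 5150 401 23
                      end

                      * Mark exactly one observation per school-year-student combination
                      egen first = tag(UE Ano St_ID)

                      * Sum the marks within school and year: the number of distinct students
                      egen student_count = total(first), by(UE Ano)

                      gen student_employee_ratio = student_count/emp_UE
                      tabdisp UE Ano, c(student_count emp_UE student_employee_ratio)
                      ```

                      Both approaches count each student once per school and year regardless of how many Disciplina rows the student has; which one to use is largely a matter of taste.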

                      By the way, I agree with Nick that this new question is not related to the original thread. In the future please start a new topic when you raise a different question. People who come here in the future searching for answers to a similar question won't be able to find it unless it's on a thread with a related title.
