
  • #16
    All right, thanks! I tried something like that, but in the end Nick Cox proved to be a bit better than I was at writing Stata code, so I just used his tsspell. That doesn't help much with understanding the logic, of course, which your code above does. Great!



    • #17
      The logic behind tsspell (SSC) is spelled out in the paper you cited in post #3 in the thread. The small question, if there is one, of why the program is not mentioned in the paper is answered by the fact that the paper was already long enough, without several extra pages on tsspell.
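The spell logic referred to here can be sketched with built-in commands only. This is a minimal illustration, not the tsspell source: the toy data and the variable names begin, spell and seq are assumptions made for the example.

```stata
clear
input id year unempl
1 1999 0
1 2000 1
1 2001 1
1 2002 0
1 2003 1
2 1999 1
2 2000 0
end

isid id year, sort

* A spell begins when unempl is 1 and the previous observation
* in the same panel is not (or there is no previous observation)
by id: gen byte begin = unempl == 1 & (_n == 1 | unempl[_n-1] != 1)

* Number the spells within each panel; 0 marks observations outside any spell
by id: gen spell = cond(unempl == 1, sum(begin), 0)

* Position of each observation within its spell
bysort id spell (year): gen seq = cond(spell > 0, _n, 0)

sort id year
list, sepby(id) noobs
```

The running sum of the begin indicator is what turns "a spell starts here" into a spell identifier, and because everything runs under by id:, nothing is counted across panels.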



      • #18
        I've had this thread in the back of my head for a while because I worry when people start doing arithmetic with observation subscripts. This is usually not needed, in the same way that looping over observations is generally not needed. Here's an even simpler solution for the original problem:

        Code:
        clear
        input id year occ_code unempl
        1 1999 4 0
        1 2000 4 0
        1 2001 . 1
        1 2002 . 1
        1 2003 . 1
        1 2004 5 0
        1 2005 5 0
        1 2006 5 0
        1 2007 . 1
        2 1999 . 1
        2 2000 . 1
        2 2001 . 1
        2 2002 2 0
        2 2003 2 0
        2 2004 . 1
        2 2005 2 0
        2 2006 2 0
        2 2007 . 1
        3 1999 1 0
        3 2000 1 0
        3 2001 . 1
        3 2002 1 0
        3 2003 . 1
        3 2004 2 0
        3 2005 2 0
        3 2006 3 0
        3 2007 . 1
        4 1999 1 0
        4 2000 2 0
        4 2001 3 0
        4 2002 4 0
        4 2003 . 1
        4 2004 4 0
        4 2005 3 0
        4 2006 . 1
        4 2007 3 0
        5 1999 1 0
        5 2000 2 0
        5 2001 3 0
        5 2002 4 0
        5 2003 . 1
        5 2004 4 0
        5 2005 3 0
        5 2006 . 1
        5 2007 3 0
        6 2005 15 0
        6 2006 . 1
        6 2007 16 0
        end
        
        * verify assumptions about the data
        isid id year, sort
        
        * Tag observations that start a new job
        by id: gen newjob = unempl == 0 & unempl[_n-1] == 1
        
        * Discard unemployment obs
        drop if unempl
        
        * Note the previous occupation and reduce to new job observations
        by id: gen old_occ = occ_code[_n-1]
        keep if newjob
        
        list id year old_occ occ_code, sepby(id) noobs
        
        * Calculate the frequency for each transition
        collapse (count) occ_=year, by(old_occ occ_code)
        
        * Create a cross-tab
        list, sepby(occ_code) noobs
        reshape wide occ_, i(old_occ) j(occ_code)
        mvencode _all, mv(0)
        list



        • #19
          "I worry when people start doing arithmetic with observation subscripts."
          And why?



          • #20
            I worry in the sense that if I don't propose a more Stataish way, some people on Statalist will probably pick up on the idea and start looping over observations or doing observation arithmetic when there are far simpler ways to achieve the same results. Don't forget the OP's comment: "But damn, I had to strain my head in order to understand the idea of using hardbrackets within hard brackets!".



            • #21
              Hello Robert,

              I see your point. But just to make that clear: my "using hardbrackets within hard brackets" bit of syntax has nothing to do with looping over observations. Rather, it is a way to determine the x of _n-x when you have to reference a value in another record whose relative position is determined by a variable. Usually one refers to _n - #, but here I referred to _n-x, with x being the value of temp[_n-1].
              Your far simpler solution works when it is allowed to drop observations. My less Stataish way (?? whatever that means) is more general, as it allows you to "jump" over the records you dropped.
              But let's not start a competition on who has the better solution. There are often several solutions, and it is a matter of taste or of 'programming style' which one appeals more to someone.
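The nested subscript described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code posted earlier in the thread: the helper variable temp is assumed here to hold the running length of the current unemployment spell, and the toy data are a shortened version of the example data.

```stata
clear
input id year occ_code unempl
1 1999 4 0
1 2000 4 0
1 2001 . 1
1 2002 . 1
1 2003 . 1
1 2004 5 0
2 1999 . 1
2 2000 2 0
end

isid id year, sort

* temp (hypothetical helper): running length of the current unemployment spell
by id: gen temp = unempl
by id: replace temp = temp[_n-1] + 1 if unempl == 1 & _n > 1

* At each new job, jump back over the spell: temp[_n-1] is the spell length,
* so _n - temp[_n-1] - 1 points at the last employed observation before it
by id: gen old_occ = occ_code[_n - temp[_n-1] - 1] if unempl == 0 & unempl[_n-1] == 1

list, sepby(id) noobs
```

No observations are dropped: the inner subscript supplies the distance and the outer subscript does the jump. When the panel starts with unemployment, the computed position is 0 and Stata returns missing, so old_occ stays missing for that first job.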

              No offense meant! (I looked that one up on leo.org; no idea whether it really says what I intend to say.)
              Greetings, Klaudia



              • #22
                I understand that observation index arithmetic is not the same as looping, and it certainly doesn't carry the execution time penalties involved in looping over observations. But the approach still stems from a reflex (probably inherited from the use of other computer languages) to target an individual observation via its computed position. My point is that almost every time you think of a solution that involves index arithmetic, there is a simpler solution that uses basic Stata commands.

                Even though the end game in the example in this thread requires destroying the original data, you can still make the required frequency computations on the full sample without index arithmetic. Here's a reworked example:

                Code:
                clear
                input id year occ_code unempl
                1 1999 4 0
                1 2000 4 0
                1 2001 . 1
                1 2002 . 1
                1 2003 . 1
                1 2004 5 0
                1 2005 5 0
                1 2006 5 0
                1 2007 . 1
                2 1999 . 1
                2 2000 . 1
                2 2001 . 1
                2 2002 2 0
                2 2003 2 0
                2 2004 . 1
                2 2005 2 0
                2 2006 2 0
                2 2007 . 1
                3 1999 1 0
                3 2000 1 0
                3 2001 . 1
                3 2002 1 0
                3 2003 . 1
                3 2004 2 0
                3 2005 2 0
                3 2006 3 0
                3 2007 . 1
                4 1999 1 0
                4 2000 2 0
                4 2001 3 0
                4 2002 4 0
                4 2003 . 1
                4 2004 4 0
                4 2005 3 0
                4 2006 . 1
                4 2007 3 0
                5 1999 1 0
                5 2000 2 0
                5 2001 3 0
                5 2002 4 0
                5 2003 . 1
                5 2004 4 0
                5 2005 3 0
                5 2006 . 1
                5 2007 3 0
                6 2005 15 0
                6 2006 . 1
                6 2007 16 0
                end
                
                * verify assumptions about the data
                isid id year, sort
                
                * Tag observations that start a new job
                by id: gen newjob = unempl == 0 & unempl[_n-1] == 1
                
                * move unemployment obs out of the way
                sort unempl id year
                
                * Note the previous occupation among the employment observations
                by unempl id: gen old_occ = occ_code[_n-1]
                
                * Calculate the frequency for each transition on full data
                bysort old_occ occ_code: egen occ_ = total(newjob)
                
                * Build the cross-tab
                by old_occ occ_code: keep if _n == 1
                drop if occ_ == 0
                keep old_occ occ_code occ_
                reshape wide occ_, i(old_occ) j(occ_code)
                mvencode _all, mv(0)
                list



                • #23
                  Nick Cox, I am aware that the logic is spelled out (!) in your journal article, and I will return to it in order to grasp it fully. However, as Klaudia notes, fixing the problem with counting "across" panels didn't actually have any practical implication for the end result, so I chose to just use your .ado file without understanding the finer details of this particular logic, since I needed to move on. I do want to "get it", though, so I'll return to the article later. I want to know everything there is to know :D



                  • #24
                    Hi again, I just wanted to let you know the result of the help you gave me, from which I learned tremendously. Here is a map of the job-to-job mobility for *all* unemployed people in Denmark during 1996-2009. Most of you, except you, Klaudia, probably won't understand the labels for the different job types, but regardless... it's very pretty! And thanks again. (This is made in R with ggplot2, and is still a work in progress.)

                    https://www.dropbox.com/s/9chg5ztmqv...e.150.pdf?dl=0

                    and

                    https://www.dropbox.com/s/etr6di9c00...e.150.pdf?dl=0
