  • create dummy variables for every certain range of values of another variable

    Dear all,


    I want to create dummy variables for different brackets of my wage variable: say, a dummy for wages less than £10, a dummy for wages between 11 and 20, a dummy for wages between 21 and 30, and so on up to wages of, say, 5000. That is a large number of dummies to generate.

    I found the following thread, thinking my problem was the same: https://www.stata.com/statalist/arch.../msg00362.html and tried Nick Cox's solution from it, thinking that I could extend it to any number of required dummies. However, when I extended it to 4 categories it gave me an invalid syntax error. I tried the following:
    Code:
    bysort district: gen dummy = cond(inrange(fwage,0,10), 0, cond(inrange(fwage,11,20), 1,cond(inrange(fwage,21,30), 2, cond(inrange(fwage,31,40), 3, cond(inrange(fwage,41,50), 4 ))))) if (seq01==1|seq02==1|seq03==1|seq04==1|seq11==1|seq15==1)
    The if condition at the end of the code restricts the sample to people who are in the labour force. The different seq variables are questions about working, looking for work, etc.

    Apart from creating these dummies, I also want the employment count in each of the bins mentioned above: for example, how many people are working in the first bin (wage between 0 and 10), in the second bin (between 11 and 20), and so on.
    Can someone guide me on this?
    Thanking you all in anticipation.


    Zahid

  • #2

    Code:
    , 4
    needs to be

    Code:
    , 4, .
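    because the innermost cond() needs a third argument: the value to return when its condition is false (here missing). This isn’t a dummy variable in the usual sense but an ordered categorical variable. Once you have it, you can count the observations in each category with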

    Code:
    tab dummy
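
With brackets running up to 5000 as in #1, nesting cond() by hand gets long. A minimal sketch of an arithmetic alternative follows, on made-up data. It assumes the brackets are meant as 0-10, 11-20, 21-30 and so on in steps of 10, and that a wage such as 10.5 belongs in the second bracket (the inrange() version would leave it missing). The variable name wagebin is hypothetical; only fwage and district come from the thread.

```stata
* Toy data for illustration only (not from the thread)
clear
input district fwage
1 5
1 10
1 11
1 20
2 21
2 4999
2 .
end

* Bin 0 covers wages from 0 to 10, bin 1 covers wages above 10 up to 20,
* and so on in steps of 10. Wages that are missing or outside 0-5000
* are left missing.
gen wagebin = cond(fwage <= 10, 0, ceil(fwage/10) - 1) if inrange(fwage, 0, 5000)

tab wagebin, missing
```

The labour-force restriction from #1 could be added to the same if qualifier with &.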



    • #3
      Thank you very much, it works now. Yes, you are right, I wanted this kind of categorical variable. Is there any way other than
      Code:
      tab dummy
      that gives me the total for each of these categories, within each district, as a variable? I tried the following two:
      Code:
      bysort district dummy : egen totaldummy= total(dummy)
      and
      Code:
      bysort district dummy : egen totaldummy= sum(dummy)
      but these two add up the codes assigned to each category (e.g. 15 observations in category 0 sum to 0, 15 observations in category 2 sum to 30, 15 observations in category 3 sum to 45, and so on) rather than counting the number of observations in each category.



      • #4
        Why not

        Code:
        tab district dummy
        ?

        Also,

        Code:
        bysort district dummy: gen freq = _N
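
As a rough sketch on made-up data, the same idea can be restricted to observations that have a category and then listed once per district and category. The if !missing(dummy) restriction is an assumption, so that people outside the labour force (missing dummy) are not counted as a group of their own; the variable name pickone is hypothetical.

```stata
* Toy data for illustration only (not from the thread)
clear
input district dummy
1 0
1 0
1 1
2 1
2 1
2 .
end

* _N is the number of observations in each district-by-category group;
* observations with no category get a missing count
bysort district dummy: gen freq = _N if !missing(dummy)

* Flag one observation per group so each count can be listed once
egen byte pickone = tag(district dummy)
list district dummy freq if pickone
```

Because freq is an ordinary variable, it stays in the dataset for later analysis, unlike the table that tab prints.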



        • #5
          Thank you very much again, Sir. I am something of a beginner in Stata.
          Code:
          tab district dummy
          gives me the results in the output window, but I could not get them as a separate variable, and I wanted these totals as a variable for further analysis. Now
          Code:
          bysort district dummy: gen freq = _N
          works exactly as I wanted. Thanks a lot, Sir.
