  • Recoding variables >0 and <0

    Hi there,

    I'm looking into school standards and want to analyse schools which are oversubscribed and undersubscribed. I've made a variable to measure this with:

    gen capacity_pupils=pupils-capacity //so a school with 500 pupils but a capacity of 400 would get a score of +100 (and vice versa)//

    However I'm not at the moment interested with this result as a categorical variable rather than continuous (0 for undersubscribed and 1 for over). So I wanted to recode all the values of capacity_pupils so that >0 becomes 1 and <0 becomes 0, but Stata shouts at me when I try to do that! I've managed to split them apart with:

    gen oversub=capacity_pupils if capacity_pupils>0
    gen undersub=capacity_pupils if capacity_pupils<0

    But now I can't merge them together properly. Has anyone got any ideas of how to do this?

    Appreciate any help you can give
    Cheers
    Max




    [Attached image: Screenshot 2019-10-23 at 11.58.08.png]


  • #2
    Sorry, that should read: "However I'm at the moment only interested with this result as a categorical variable rather than continuous"



    • #3
      Please read the FAQ on how to present data and ask questions. I believe that the following will do what you want:
      Code:
      gen oversub=capacity_pupils>0
      replace oversub=. if capacity_pupils==.
      You don't say what you want when capacity_pupils is exactly 0, so I have just used part of your code; those observations will get a value of 0 in the new variable.
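
      As a sketch only, the two lines above can usually be folded into one by putting the missing-value check in an if qualifier. This assumes capacity_pupils is numeric; the byte storage type is an optional choice made here, not something from the original code.

      ```stata
      // 1 if oversubscribed, 0 if at or under capacity,
      // missing wherever capacity_pupils is missing
      gen byte oversub = capacity_pupils > 0 if !missing(capacity_pupils)
      ```

      Observations excluded by the if qualifier are left as missing in the new variable, which is the same result the replace line produces.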

      Comment


      • #4
        Your code and your intent are shouts for different teams here.


        Code:
        gen oversub=capacity_pupils if capacity_pupils>0
        gen undersub=capacity_pupils if capacity_pupils<0
        creates two new variables. If you want to put them together then you will just get back to the original (assuming that any zeros get put back too).

        A categorical variable is easier with say

        Code:
        gen wanted = sign(capacity_pupils)
        which maps negative values to -1, zeros to 0, positive values to 1.
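
        To see the mapping on a toy dataset, here is a minimal sketch. The four values and the label name sublbl are made up for illustration; they are not from the original data.

        ```stata
        clear
        input capacity_pupils
        -100
        0
        100
        .
        end

        // sign() returns -1, 0 or 1, and missing when its argument is missing
        gen wanted = sign(capacity_pupils)

        // hypothetical value labels to make tables easier to read
        label define sublbl -1 "undersubscribed" 0 "at capacity" 1 "oversubscribed"
        label values wanted sublbl

        tabulate wanted, missing
        ```

        Keeping three categories avoids having to decide silently which side the exact zeros belong to.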

        Although zeros aren't visible in your screenshot (NB FAQ Advice #12 explains that data examples are much better than screenshots) they aren't ruled out by your definition and your code is deficient in implying that they are impossible.



        • #5
          Thanks both! Yes, apologies for the bad formatting (I'm new here)

          Cheers for your help, that's fixed the problem
          Max



          • #6
            To add a general comment to Rich's suggestion: since missing values are treated as larger than any number, it is often a good idea to write a greater-than comparison as
            Code:
            gen oversub=(capacity_pupils>0) & (capacity_pupils<.)
            The same works if you want to use the condition in an if qualifier:
            Code:
            gen oversub=capacity_pupils if (capacity_pupils>0) & (capacity_pupils<.)
            Note that logical expressions on the right-hand side evaluate to 0 or 1, so you could also do
            Code:
            gen oversub=(capacity_pupils>0) * (capacity_pupils<.)
            The parentheses are often not needed, but I find they sometimes help me keep clear on the logic. Getting "true" for > comparisons in Stata when you have missing data is something you must watch for every time you use >.
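
            A small sketch of the pitfall, using made-up values and the hypothetical variable names naive and guarded:

            ```stata
            clear
            input capacity_pupils
            -100
            0
            100
            .
            end

            // unguarded comparison: the missing row is counted as oversubscribed
            gen naive = capacity_pupils > 0

            // guarded comparison: the missing row gets 0 instead
            gen guarded = (capacity_pupils > 0) & (capacity_pupils < .)

            list capacity_pupils naive guarded
            ```

            Note that the guarded version codes missing observations as 0 rather than leaving them missing, so whether that is what you want depends on the analysis.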
