  • Cluster multiple dummy variables

    Hi everyone,

    In my dataset there is a categorical variable "nationality" that takes on 36 values, depending on the country of origin of my observations. I therefore created 35 dummy variables, each taking the value 1 for one country and 0 otherwise.

    How do I run a probit regression in Stata without listing the coefficients of all these dummies? Should I cluster the (n-1) dummy variables and, if so, how can I do that?

    Thanks for any advice you might have on this!

  • #2
    Ludovica:
    - why create categorical variables yourself when -fvvarlist- can do it for you?
    - the second part of your query seems a bit foggy. Stata reports the coefficients for the predictors you plugged in. Besides, clustering on n-1 categorical variables is something I've never heard of.
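
    As a rough sketch of the -fvvarlist- route (assuming a binary outcome named y, which is hypothetical, and that nationality is stored as a numeric categorical variable; a string variable would first need -encode-):
    Code:
    * let Stata build the 35 indicators; the lowest value is the base by default
    probit y i.nationality
    * joint Wald test of all nationality indicators
    testparm i.nationality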

    As per the FAQ, you would be better off posting what you typed and what Stata gave you back via CODE delimiters and/or sharing an example/excerpt of your data via -dataex-. Thanks.
    Kind regards,
    Carlo
    (Stata 18.0 SE)



    • #3
      Hello! This question might not be perfectly related to this thread, but I do have a question about multiple dummy variables.


      I have 100 dummies (each dummy for a month; named 'dummy1' 'dummy2' ... 'dummy100'), and I need to multiply these dummies by another simple dummy (named 'treat') in order to create a new variable named 'mult'.

      I was wondering if there is a way to simplify this multiplication.
      That is, I do not want to type mult1 = treat*dummy1, then mult2 = treat*dummy2, and so on 100 times.
      I tried mult = treat * dummy*, but Stata reports invalid syntax.

      Could you please help?

      Thank you.



      • #4
        You should also do as Carlo suggests and read the output of help fvvarlist, then return to the categorical variable from which you created your 100 monthly dummies (suppose it is called "month") and include it in your model with syntax like
        Code:
        regress y i.month##i.treat
        rather than
        Code:
        regress y month1 month2 ... month100 treat mult1 mult2 ... mult100
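        As a sketch of how the two factor-variable operators relate, using the variable names above: # adds only the interaction terms, while ## adds the main effects as well, so the two commands below fit the same model.
        Code:
        regress y i.month i.treat i.month#i.treat
        regress y i.month##i.treat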



        • #5
          Katherine:
          take a look at -help foreach-.
          Kind regards,
          Carlo
          (Stata 18.0 SE)



          • #6
            Katherine:
            you may want to try something along the following lines:
            Code:
            foreach var of varlist dummy* {
                g `var'_X = `var'*treat
            }
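            If you need the names mult1 ... mult100 from #3 instead, a variant along these lines should work, assuming the dummies are numbered consecutively from 1 to 100:
            Code:
            forvalues i = 1/100 {
                generate mult`i' = treat * dummy`i'
            }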
            Kind regards,
            Carlo
            (Stata 18.0 SE)



            • #7
              Thank you everyone!

              I was wondering if I could also solve my problem in the following way

              *generate 100 dummy variables 'dummy'
              qui tab month, gen(dummy)

              *regress (without creating that additional variable 'mult'):
              reg y treat i.treat#i.dummy*

              It seems that Stata does not say anything about invalid syntax now.



              • #8
                *generate 100 dummy variables 'dummy'
                qui tab month, gen(dummy)

                *regress (without creating that additional variable 'mult'):
                reg y treat i.treat#i.dummy*
                In which case you could likely do the following.
                Code:
                *regress (without creating 100 dummy variables and 100 mult variables):
                reg y i.treat i.treat#i.month
                Do read the output of help factor variables for a better understanding of how to use them instead of creating dummy variables.



                • #9
                  Hello,

                  I have a very "beginner-like" question about creating dummies in Stata.


                  Suppose I have 3 treatment groups and one control group. I need to do a regression containing dummies for each of the three treatments.

                  So, I created a dummy for each of the treatment groups:
                  forvalues i = 1(1)3 {
                      gen treat_`i' = 1 if treat==`i'
                      replace treat_`i' = 0 if treat!=`i'
                  }


                  Now, my question.

                  Should I put all my three dummies into the regression? I.e.
                  reg y treat_1 treat_2 treat_3

                  Or, should I omit one of the dummies? I.e.
                  reg y treat_2 treat_3

                  Thank you!



                  • #10
                    Helen Brooks - You ask

                    Should I put all my three dummies into the regression? I.e.
                    reg y treat_1 treat_2 treat_3

                    Or, should I omit one of the dummies? I.e.
                    reg y treat_2 treat_3
                    Neither. You should read the output of help factor variables and section 11.4.3 of the Stata User's Guide PDF included with your Stata installation and accessible from Stata's Help menu. Your effort will be amply repaid with an understanding of how to use factor variables instead of creating dummy variables. Your code should be
                    Code:
                    reg y i.treat
                    You didn't tell us, but I am assuming your control group has treat==0, which seems likely.
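                    A sketch of how to control the reference category, under that same assumption (control group coded as treat==0):
                    Code:
                    * make the base (omitted) level explicit
                    reg y ib0.treat
                    * or keep all four indicators and drop the constant instead
                    reg y ibn.treat, noconstant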
                    Last edited by William Lisowski; 15 Feb 2019, 09:06.



                    • #11
                      Helen:
                      as an aside to William's excellent advice, please note that you should not include all the dummies as predictors: otherwise you will stumble upon the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)).
                      One of the reasons (among many others) why you should forget about creating categorical variables (and interactions) by hand and switch permanently to -fvvarlist- notation is that -fvvarlist- shelters you from the dummy trap.
                      Kind regards,
                      Carlo
                      (Stata 18.0 SE)



                      • #12
                        Thank you for your replies, Carlo and William. I should have studied factor variables more carefully...
