
  • Regression involving all dummy variables

    Hello
    I have a dataset consisting entirely of dummy variables, based on a social media experiment. I am trying to test the significance of three independent dummy variables on a dependent dummy variable.
    A follow request was sent to 300 users. Of these, 100 were also sent a guilt message (guiltmessage), and a publicly viewable comment was made on the pictures of another 100. The remaining 100 users were not subjected to either treatment and only received the request.
    I am trying to understand the impact of these two treatments on the variable response, which records whether the user followed back the requesting user.

    What would be the best methodological way to go about the regression?

    Thanks
    Smriti

  • #2
    Smriti:
    welcome to this forum.
    Left-hand side of your regression: If your dependent variable is binary (0/1), consider -logit- or -logistic-.
    Right-hand side of your regression: you can probably use a single categorical variable with three levels (one level for each of the methods used by the investigators); see -fvvarlist- for an efficient and effective way to create categorical variables and interactions.
    Kind regards,
    Carlo
    (Stata 19.0)
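
    A minimal sketch of this suggestion in Stata, for readers following along. It assumes response, guiltmessage and publiccomment are 0/1 variables in the dataset (the last two names appear later in this thread), that the three groups are mutually exclusive, and that treatment is a hypothetical name for the combined variable. The numeric codes are arbitrary.

    ```stata
    * Build one categorical variable from the separate 0/1 indicators.
    * "treatment" and the label name "trt" are hypothetical.
    generate byte treatment = 1                    // request only
    replace treatment = 3 if guiltmessage == 1     // request + guilt message
    replace treatment = 4 if publiccomment == 1    // request + public comment
    label define trt 1 "request" 3 "guilt message" 4 "public comment"
    label values treatment trt

    * Binary outcome, categorical predictor via factor-variable notation.
    logit response i.treatment        // lowest level (1) is the omitted base
    logit response i.treatment, or    // same model, reported as odds ratios

    * Predicted probability of a follow-back in each group.
    margins treatment
    ```

    Run this from a do-file so that the // comments are accepted.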



    • #3
      Hello Carlo

      Thank you so much for the fast response. I tried doing it but I cannot understand the interpretation of the first variable. (Screenshot attached)
      3 and 4 are guiltmessage and publiccomment respectively and 1 was request.
      Is it something to do with the fact that this is a complete dummy variable regression?

      [Screenshot: logit_exp.PNG]


      Without creating a categorical variable, the regression looks like this (screenshot attached) where at least one dependent variable gets omitted due to collinearity.
      [Screenshot: logitexp2.PNG]






      • #4
        Correction : *at least one independent variable



        • #5
          Smriti:
          regardless of the approach, your coefficients are exactly the same.
          As you can see, -fvvarlist- omits one level of the predictor by default in order to shelter you from the so-called dummy trap (https://en.wikipedia.org/wiki/Dummy_...le_(statistics)).
          However, when you inadvertently included all your "hand-made" categorical variables on the right-hand side of your regression, Stata took control and omitted one of them due to collinearity (another way to shelter you from the dummy trap pitfall).
          As far as the interpretation of your results is concerned, users who received a guilt message are more likely to send back a response.
          The _cons of your regression refers to the omitted level of your categorical variable (i.e., request), which seems less likely to cause a response.
          Kind regards,
          Carlo
          (Stata 19.0)
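
          The points above can be checked directly. This is only a sketch under assumptions: response, guiltmessage and publiccomment are taken to be 0/1 variables, the three groups are taken to be mutually exclusive, and treatment is a hypothetical categorical variable coded 1 = request only, 3 = guilt message, 4 = public comment, as described in #3.

          ```stata
          * Approach 1: factor-variable notation; level 1 (request only) is the base
          logit response i.treatment

          * Approach 2: hand-made dummies, leaving the request-only group out
          logit response guiltmessage publiccomment

          * _cons is the log odds of a follow-back in the omitted (request-only) group;
          * invlogit() converts it to a probability
          display invlogit(_b[_cons])

          * Probability of a follow-back in the guilt-message group (after Approach 2)
          display invlogit(_b[_cons] + _b[guiltmessage])

          * Pick a different base level if another comparison is more natural
          logit response ib3.treatment
          ```

          Under those assumptions the two approaches should report the same coefficients, because they fit the same model with the same group left out.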



          • #6
            thank you so much Carlo.
