  • Endogeneity due to measurement errors with limited dependent variable

    Dear all,

    For my research, I'm evaluating the effect of a proxy for ability (a standard IQ test score) on three limited dependent variables: a dichotomous variable, a discrete variable that ranges from 0 to 11, and a continuous variable that ranges from 0 to 1.

    Of course, this IQ test might not be an error-free measure of ability. That is why I would like to instrument it and see whether I get something interesting.

    However, since my left-hand-side variables are all limited dependent variables, I am not sure whether there is a framework analogous to linear IV (2SLS) that would suit my setting.

    I would be grateful if someone could offer some advice on this.

    Thanks!



    Last edited by Gaston Fernandez; 01 Apr 2020, 14:28.

  • #2
    You will increase your chances of a useful answer by following the FAQ on asking questions: provide Stata code in code delimiters, readable Stata output, and sample data using dataex.

    You may find this helpful. http://www.stata.com/meeting/germany...ukker_gsem.pdf


    • #3
      Thanks for your reply, Phil. I will look at what you suggested.

      Thanks for your advice as well.

      So, as I mentioned above, I have three dependent variables: a dichotomous variable (named garp), a discrete variable that goes from 0 to 11 (named vgarp), and a variable that ranges from 0 to 1 (named afriat). See below a summary of these variables.

      Code:
      sum garp vgarp afriat
      
          Variable |        Obs        Mean    Std. Dev.       Min        Max
      -------------+---------------------------------------------------------
              garp |        206     .538835    .4997039          0          1
             vgarp |        206    1.946602    2.741206          0         11
            afriat |        206     .949095    .1174611     .33333          1
      The goal is to analyze the effect (if any) of a proxy variable of cognitive abilities (i.e. a standard IQ test) on these three dependent variables. My independent variable, named crt, is defined as the total number of correct answers in the test. See below.

      Code:
      tab crt, m
      
        number of |
          correct |
          answers |      Freq.     Percent        Cum.
      ------------+-----------------------------------
                0 |         56       27.18       27.18
                1 |         40       19.42       46.60
                2 |         46       22.33       68.93
                3 |         64       31.07      100.00
      ------------+-----------------------------------
            Total |        206      100.00

      Since my dependent variables are all limited, I am wondering whether there is a method suited to this setting that would let me assess the potential bias from measurement error in crt.

      Any suggestions will help.

      Thanks!


      • #4
        Hi Gaston, as a starting point, you may consider -eprobit- for garp; and -eoprobit-, -eintreg-, or -ivpoisson- for vgarp, depending on the informational content of that variable. The last variable, afriat, seems to be a fractional outcome, and I do not know whether there is a Stata command that allows one to estimate a fractional response model with endogenous regressors; if there is none, you may consider writing a simple program that applies -fracreg- with -bootstrap- and the control function approach described in Jeff Wooldridge's Econometric Analysis of Cross Section and Panel Data (2nd ed.).
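
        A rough sketch of two of those routes, assuming a hypothetical instrument z1 (for example, a second, independently measured ability score) and hypothetical exogenous controls x1 and x2; none of these names come from the data described in this thread. crt is treated as continuous in the first stage, which is a simplification. The bootstrap wraps both steps of the control function so that the standard errors account for the estimated residual, and the coefficient on that residual gives an informal check of exogeneity. The fracreg coefficients are on the index scale, so partial effects would need a further step.

        Code:
        * z1, x1, x2 are hypothetical placeholder names
        * Binary outcome: probit with an endogenous covariate
        eprobit garp x1 x2, endogenous(crt = z1 x1 x2) vce(robust)

        * Fractional outcome: two-step control function, bootstrapped
        capture program drop cf_frac
        program define cf_frac, rclass
            tempvar vhat
            * Step 1: first-stage regression and its residual
            quietly regress crt z1 x1 x2
            quietly predict double `vhat', residuals
            * Step 2: fractional probit including the residual
            quietly fracreg probit afriat crt x1 x2 `vhat'
            return scalar b_crt = _b[crt]
            return scalar b_v = _b[`vhat']
        end
        bootstrap b_crt = r(b_crt) b_v = r(b_v), reps(500) seed(12345): cf_frac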

        P.S.: I just noticed that your endogenous regressor is better modelled as a count variable than as a continuous variable. That may complicate things a lot. I guess someone more familiar with this type of application can provide better suggestions.
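
        One possible way to respect the discreteness of crt within the ERM commands, as far as I understand them, is to declare it an ordinal endogenous covariate, so that its equation is an ordered probit and it enters the outcome equation through indicators for its levels. A sketch under that assumption, where z1 (instrument) and x1, x2 (controls) are hypothetical names not taken from this thread:

        Code:
        * crt (0-3) modelled with an ordered probit equation; z1, x1, x2 are placeholders
        eprobit garp x1 x2, endogenous(crt = z1 x1 x2, oprobit)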
        Last edited by Hong Il Yoo; 03 Apr 2020, 07:25.


        • #5
          Thanks for your answer, Hong. It is really helpful.

          About your comment:

          "-eoprobit-, -eintreg-, or -ivpoisson- for vgarp, depending on the informational content of that variable"

          Please look at the distribution of vgarp:

          Code:
                vgarp |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    0 |        111       53.88       53.88
                    1 |          5        2.43       56.31
                    2 |         31       15.05       71.36
                    3 |         11        5.34       76.70
                    4 |          7        3.40       80.10
                    5 |         18        8.74       88.83
                    6 |          8        3.88       92.72
                    7 |          3        1.46       94.17
                    8 |          2        0.97       95.15
                    9 |          5        2.43       97.57
                   10 |          2        0.97       98.54
                   11 |          3        1.46      100.00
          ------------+-----------------------------------
                Total |        206      100.00
          Thanks again.
          Last edited by Gaston Fernandez; 03 Apr 2020, 07:34.


          • #6
            I think it would be helpful if you could tell us more about what 0,1,2,3,..., 11 refer to. Is it a count of something (e.g. the number of correct answers in an exam)? Or an ordinal outcome (e.g. 0 for very unhappy, 1 for slightly unhappy, and so on)? Or an interval outcome (e.g. 0 for income in $0 to $500, 1 for income in $501-$1000, and so on)?


            • #7
              Thanks for your question.

              The variable vgarp is the number of violations of revealed preference axioms (i.e. GARP) that an individual commits when choosing between consumption alternatives. For example, vgarp = 0 means 0 violations, whereas vgarp = 11 means 11 violations. So it is simply a count of events (i.e. violations) for each individual in my sample.


              • #8
                Thanks. In that case, I think -ivpoisson- is more natural than -eoprobit- or -eintreg-. You're also likely to find a lot of useful information in previous threads on count data models with endogenous regressors.
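
                A minimal sketch of what that could look like, with a hypothetical instrument z1 and hypothetical controls x1 and x2 (placeholder names, not variables from this thread). The choice between the two estimators, and between additive and multiplicative errors, is left open here.

                Code:
                * z1, x1, x2 are hypothetical placeholder names
                * GMM estimator with multiplicative errors
                ivpoisson gmm vgarp x1 x2 (crt = z1), multiplicative
                * Control-function estimator; also reports a coefficient on the first-stage residual
                ivpoisson cfunction vgarp x1 x2 (crt = z1)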


                • #9
                  Thanks a lot for your suggestions.
