  • Regression output incomplete.

    Good day, everyone. I am running a regression separately by the variable "pov_status", which has 4 levels. My output only displays 3 levels. What am I doing incorrectly?

    Below is my code:

    eststo clear
    sort pov_status
    quietly by pov_status: eststo: quietly regress pvtng i.notworking_cvd##i.mcdexpand2020 ///
    i.notworking_non##i.mcdexpand2020 white_nh black_nh other_nh hispanic ///
    age_u20 age2026 age2730 age3135 age4145 age4650 age5155 age5660 age6164 age65pl ///
    lesshs hsged hsbelow socol college under25k income25_50k income75_100k income100pl ///
    childless child_1 child_2 child_3
    esttab using 2_digit.csv, label nodepvar nonumber cells("b(fmt(2))" "sd(par fmt(2))" ) replace


    Thanks.

  • #2
    It isn't possible to say based on what you show. I suspect that for one of the levels of pov_status there is a problem and the regression never actually happens. But just which it is and what goes wrong cannot be known when all of the output is suppressed by -quietly-.

    I suggest you re-run this, but remove all the -quietly-s. That way you will be able to see the process as it unfolds. Somewhere in all the output I expect you will find some error message that explains what has gone awry.
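
    One way to do that is sketched below: loop over the levels of pov_status and run each regression under -capture noisily-, so that all output, including any error message, is displayed and a failure in one level does not stop the others. This is only a sketch. It assumes pov_status is numeric, the local macro names (controls, levels, l) are made up for illustration, and the variable list is copied from #1. It displays results only and does not store them with -eststo-.

    ```stata
    * Control variables copied from the regression in #1
    local controls white_nh black_nh other_nh hispanic ///
        age_u20 age2026 age2730 age3135 age4145 age4650 age5155 age5660 age6164 age65pl ///
        lesshs hsged hsbelow socol college under25k income25_50k income75_100k income100pl ///
        childless child_1 child_2 child_3

    * Run the model once per level of pov_status, with all output visible
    * (assumes pov_status is numeric)
    levelsof pov_status, local(levels)
    foreach l of local levels {
        display as text _newline "pov_status == `l'"
        capture noisily regress pvtng i.notworking_cvd##i.mcdexpand2020 ///
            i.notworking_non##i.mcdexpand2020 `controls' if pov_status == `l'
        if _rc != 0 {
            display as error "regression failed for this level, return code " _rc
        }
    }
    ```

    Run it from a do-file in one go, since the local macros disappear once the run ends.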

    • #3
      Thanks, Clyde. I reran the regression after removing all the "quietly" prefixes. It ran well without control variables. Once I include my control variables, I get an error message for the 4th category stating either missing or insufficient observations. A tabulation of the variables shows that there are about 800,000 observations in the 4th category. How can I rectify this?

      Thanks

      • #4
        Tolulope:
        can you please share an excerpt/example of your dataset via -dataex- (as per FAQ)? Thanks.
        Kind regards,
        Carlo
        (Stata 19.0)

        • #5
          For more specific advice you will need to show example data, as Carlo has suggested in #4.

          In general terms, the likely operating principle here is that any observation that has a missing value for any of the variables mentioned in the regression will be omitted from the estimation sample. You have well over 30 variables in your model, so even with relatively low frequency, sporadic missingness among the variables, each of your observations has a substantial risk of being omitted. Apparently Stata found that all of the observations in the 4th category of pov_status contain a missing value somewhere (or at least enough of them that the remaining observations were too few to support the regression). You might want to do something like:
          Code:
          misstable summarize if pov_status == 4
          to get a sense of the frequency with which each variable has a missing value. You might, for example, find that there is one variable that is always missing when pov_status == 4. That would probably mean there is an error in the data management that created an incorrect data set. Or you might see relatively low frequencies of missing data in every variable--which might well be an unfixable problem short of getting better data.
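
          To see directly how many observations in each level of pov_status would survive into the estimation sample, something along these lines may also help. It is a sketch under assumptions: touse and modelvars are made-up names (touse must not already exist in the data), and the variable list is copied from the regression in #1.

          ```stata
          * All variables used in the regression in #1
          local modelvars pvtng notworking_cvd notworking_non mcdexpand2020 ///
              white_nh black_nh other_nh hispanic ///
              age_u20 age2026 age2730 age3135 age4145 age4650 age5155 age5660 age6164 age65pl ///
              lesshs hsged hsbelow socol college under25k income25_50k income75_100k income100pl ///
              childless child_1 child_2 child_3

          * touse starts at 1 and is set to 0 when any model variable is missing
          mark touse
          markout touse `modelvars'

          * Complete cases (touse == 1) by level of pov_status
          tabulate pov_status touse, missing
          ```

          If the touse == 1 column is empty or nearly empty for the 4th category, that matches the "insufficient observations" message.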
