  • regression not running

    The code below reports that there are no observations, but when I look at the Data Editor all the data are there. I'm confused about what is wrong.




    su Y1_smoke Y2_smoke Y3_smoke Y1_Male Y2_Male Y3_Male Y1_monthcigspend Y2_monthcigspend Y3_monthcigspend Y1_smokwsubs Y2_smokwsubs Y3_smokwsubs Y1_loosecigs Y2_loosecigs Y3_loosecigs Y1_DangerSmok Y2_DangerSmok Y3_DangerSmok Y1_TAX Y2_TAX Y3_TAX

    keep Y1_smoke Y2_smoke Y3_smoke Y1_Male Y2_Male Y3_Male Y1_monthcigspend Y2_monthcigspend Y3_monthcigspend Y1_smokwsubs Y2_smokwsubs Y3_smokwsubs Y1_loosecigs Y2_loosecigs Y3_loosecigs Y1_DangerSmok Y2_DangerSmok Y3_DangerSmok PID Y1_TAX Y2_TAX Y3_TAX FinalWgt Stratum PSU

    reshape long Y@_smoke Y@_Male Y@_monthcigspend Y@_smokwsubs Y@_loosecigs Y@_DangerSmok Y@_Tax, i(PID) j(year)

    *Run the Probit Regression with tax change variables as predictors
    probit Y_smoke i.Y1_TAX ii.Y2_TAX iii. Y3_TAX ///
    c.Y_Male c.Y_smokwsubs Y_monthcigspend c.Y_loosecigs c.Y_DangerSmok

  • #2
    Can you please explain (or correct)

    Code:
     ii.Y2_TAX iii. Y3_TAX
    noting also the embedded space?
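
    For comparison, here is a minimal sketch of valid factor-variable notation, using the -auto.dta- example data shipped with Stata rather than your variables (so the variable names below are only illustrative). The prefix -i.- marks a categorical predictor and -c.- a continuous one; there is no -ii.- or -iii.- prefix, and no space is allowed between the prefix and the variable name.

    Code:
     sysuse auto, clear
     * i. = categorical predictor, c. = continuous predictor
     regress price i.rep78 c.mpg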



    • #3
      Yes, I have made the changes, removing the "///", but the error still states that there are no observations available. The code runs fine during cleaning, but when I try to run the regression after reshaping, it reports "no observations", r(2000). Could this be a merging problem?



      • #4
        Gusieppe (not Giuseppe?):
        have you already ruled out a missing values problem?
        Kind regards,
        Carlo
        (Stata 19.0)



        • #5
          I haven't. If there is a missing values problem, could I use an egen command to substitute the missing values with the mean? (Sorry, I'm relatively new to Stata.)



          • #6
            Gusieppe:
            you can use two commands to check for missing values in your dataset:
            Code:
            . use "C:\Program Files\Stata17\ado\base\a\auto.dta"
            (1978 automobile data)
            
            . sum price-foreign
            
                Variable |        Obs        Mean    Std. dev.       Min        Max
            -------------+---------------------------------------------------------
                   price |         74    6165.257    2949.496       3291      15906
                     mpg |         74     21.2973    5.785503         12         41
                   rep78 |         69    3.405797    .9899323          1          5
                headroom |         74    2.993243    .8459948        1.5          5
                   trunk |         74    13.75676    4.277404          5         23
            -------------+---------------------------------------------------------
                  weight |         74    3019.459    777.1936       1760       4840
                  length |         74    187.9324    22.26634        142        233
                    turn |         74    39.64865    4.399354         31         51
            displacement |         74    197.2973    91.83722         79        425
              gear_ratio |         74    3.014865    .4562871       2.19       3.89
            -------------+---------------------------------------------------------
                 foreign |         74    .2972973    .4601885          0          1
            
            .
            
            . codebook rep78
            
            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            rep78                                                                                                                                                         Repair record 1978
            --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
            
                              Type: Numeric (int)
            
                             Range: [1,5]                         Units: 1
                     Unique values: 5                         Missing .: 5/74
            
                        Tabulation: Freq.  Value
                                        2  1
                                        8  2
                                       30  3
                                       18  4
                                       11  5
                                        5  .
            
            .
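
            As a complementary sketch (again on -auto.dta-, not on your data), -misstable- reports missing values for all variables at once, which may be quicker than running -codebook- variable by variable:

            Code:
            sysuse auto, clear
            * one row per variable that has missing values, with counts
            misstable summarize
            * which combinations of variables are missing together
            misstable patterns

            In -auto.dta- only -rep78- should show up, with the same 5 missing values reported by -codebook- above.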
            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              Type: Numeric (float)
              Label: Y3_smoke_lab

              Range: [1,2]                      Units: 1
              Unique values: 2                  Missing .: 58,476/85,104

              Tabulation: Freq.   Numeric  Label
                          10,095        1  1. experimented
                          16,533        2  2. Did Not
                          58,476        .
              The 58,476 missing values were duplicates: once I had reshaped the data to long format, it duplicated the observations. The PID goes up to 28,577, but the non-missing values account for only 26,628 (85,104 - 58,476).



              • #8
                Gusieppe:
                my previous example refers to -rep78- only, as it is the only variable with missing values in the -auto.dta- dataset.
                However, you may have missing values in other variables too.
                As you know, Stata applies casewise (listwise) deletion: any observation with a missing value in any variable of the model is excluded from -e(sample)-.
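
                A minimal sketch of how to see this, assuming the -auto.dta- example data and a purely illustrative model (the variable -nmiss- is a name made up for this example):

                Code:
                sysuse auto, clear
                * number of missing values per observation across the model variables
                egen nmiss = rowmiss(price mpg rep78)
                * observations that are complete on all three variables
                count if nmiss == 0
                * the estimation sample after casewise deletion
                regress price mpg rep78
                count if e(sample)

                Both counts should come to 69 here, as -rep78- is the only one of these variables with missing values (5 out of 74).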
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Thank you for your reply, Carlo. I found that to be the case with a couple of the variables, but how should I fix this?



                  • #10
                    Gusieppe:
                    this is the main issue, that also relates to casewise deletion.
                    If you have, say, 10 ids, 10 variables, with one missing value for each id in one of the ten variables, you end up with no observations at all.
                    Conversely, if you have, say, only 1 id with missing values in all 10 variables, your -e(sample)- retains 9 out of 10 ids.
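
                    The two scenarios can be reproduced with a toy dataset. This is only a sketch: the variables -id-, -x1- to -x10- and -nmiss- are invented for the illustration.

                    Code:
                    * Scenario 1: each id is missing on a different variable
                    clear
                    set obs 10
                    generate id = _n
                    forvalues j = 1/10 {
                        generate x`j' = id + `j'
                        replace x`j' = . if id == `j'
                    }
                    egen nmiss = rowmiss(x1-x10)
                    * no observation is complete on all ten variables
                    count if nmiss == 0

                    * Scenario 2: a single id is missing on every variable
                    clear
                    set obs 10
                    generate id = _n
                    forvalues j = 1/10 {
                        generate x`j' = id + `j'
                        replace x`j' = . if id == 1
                    }
                    egen nmiss = rowmiss(x1-x10)
                    * nine observations are complete
                    count if nmiss == 0

                    The total number of missing values is the same in both scenarios (10), but the number of usable observations is 0 in the first and 9 in the second.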
                    That said, if you cannot retrieve the missing data from the original source(s), the only way is to deal with missing values as suggested by tons of literature on this topic.
                    If you're unfamiliar with this topic, taking a look at the -mi- entries in the Stata .pdf manual is the first step to take.
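
                    Just to show what the -mi- workflow looks like, here is a minimal sketch on -auto.dta-. The imputation method (-pmm-), the predictors, the number of imputations and the seed are arbitrary choices made for the illustration, not a recommendation for your data:

                    Code:
                    sysuse auto, clear
                    * declare the data as mi data and flag the variable to be imputed
                    mi set mlong
                    mi register imputed rep78
                    * create 5 imputed datasets by predictive mean matching
                    mi impute pmm rep78 price mpg weight, add(5) knn(5) rseed(12345)
                    * fit the model on each imputed dataset and combine the results
                    mi estimate: regress price mpg rep78

                    Whether multiple imputation is appropriate depends on why the values are missing in the first place, which is what the literature mentioned above discusses.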
                    Kind regards,
                    Carlo
                    (Stata 19.0)
