  • Increase the number of observations

    Good morning!

    I have a complicated question. I've been trying to solve it for two weeks.


    I have a file with a lot of observations (43 543).

    For my task I run the following regressions:




    //regress employed
    regress employed GDP_10K life_expectancy unemployed_rate age_55_60 age_60_65 age_65_70 not_alone Tenure civil_servant ph003_ mh002_

    est store reg1


    //regress unemployed
    regress unemployed GDP_10K life_expectancy unemployed_rate age_55_60 age_60_65 age_65_70 not_alone Tenure civil_servant ph003_ mh002_


    est store reg2


    //regress retired
    regress retired GDP_10K life_expectancy unemployed_rate age_55_60 age_60_65 age_65_70 not_alone Tenure civil_servant ph003_ mh002_


    est store reg3

    esttab reg1 reg2 reg3, b se wide

    The important parts are in bold; the rest is just information about my dataset and the regress command. After running this, I get a table with the results and the number of observations. The number of observations is too low, so my results are poor, and I need to increase it.



    What do you usually do in this situation?
    What should I do?
    What should I check or change?



    I have tried to describe my situation in detail, but if you need more information, just ask.

    Thank you

  • #2
    My guess is that you have missing values on at least some of the variables in your regression. What you should do about it depends on why the values are missing and what type of missingness you have. (Please read and follow the FAQ, which has lots of advice both on how to ask questions and on how to include information that will help people give you an answer.)



    • #3
      Thank you for your reply.

      I've checked for missing values, and almost none of my variables have any. Only the variable Tenure has missing values, but if I change them to 0 I get too many observations (all 43 543).

      Do you have any idea?

      Thank you, you are kind!



      • #4
        Steve:
        Stata applies listwise deletion by default to observations with missing value(s) in one or more variables.
        Hence, all the observations that have -Tenure-==. will be ruled out from -regress-.
        Kind regards,
        Carlo
        (Stata 19.0)
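To make the listwise deletion described above concrete, here is a minimal sketch on made-up toy data. The variable names `employed`, `GDP_10K` and `Tenure` come from this thread; the values, and the helper variables `insample` and `Tenure0`, are assumptions for illustration only. It shows how to count the missing values, how to see which rows `regress` actually used, and why recoding missing to 0 brings every row back (while changing what the variable means).

```stata
* Toy data: values are invented, only the variable names come from the thread
clear
input employed GDP_10K Tenure
1 3.15 .
0 2.00 35
0 1.10 41
0 2.10 .
1 2.80 12
1 3.40 8
0 1.70 .
1 2.30 20
end

* How many rows have Tenure missing, and which variables have missing values?
count if missing(Tenure)
misstable summarize

* regress drops every row with a missing value in any model variable
regress employed GDP_10K Tenure
display "N used with missing Tenure left as is: " e(N)

* Flag the rows that entered the estimation sample
generate byte insample = e(sample)
tabulate insample

* Recoding missing to 0 (hypothetical Tenure0) restores all rows,
* but treats "unknown tenure" as "zero tenure"
generate Tenure0 = cond(missing(Tenure), 0, Tenure)
regress employed GDP_10K Tenure0
display "N used after recoding missing to 0: " e(N)
```

On real data, `tabulate insample` against other variables (for example, employment status or survey wave) can suggest who is being dropped and why.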



        • #5
          That was helpful, thank you.


          The task includes these 12 variables (employed GDP_10K life_expectancy unemployed_rate age_55_60 age_60_65 age_65_70 not_alone Tenure civil_servant ph003_ mh002_). Eleven of them contain no missing values; only Tenure does. But if I replace the missing values in Tenure with 0, I have too many observations (I know this because I was told in advance how many observations I should have).

          What can you suggest?

          Thank you



          • #6
            Steve:
            I fail to get what you mean by "too many observations".
            If you actually replace missing values with zero, you should have a complete (although probably unreliable) dataset.
            As per FAQ, an example would help enormously (see -dataex-). Thanks.

            Kind regards,
            Carlo
            (Stata 19.0)



            • #7
              My job is to reproduce an existing study. That's why I know I need 16,000 observations (or at least about 16,000).

              My example:

              employed  unemployed  retired  GDP_10K  life_expectancy  unemployed_rate  age_55_60  age_60_65  age_65_70  not_alone  Tenure  civil_servant  ph003_  mh002
              1         0           0        3.15     70               3.00             1          0          0          0          .       0              1       1
              0         1           0        2.00     75               3.50             0          1          0          1          35      1              0       1
              0         1           0        1.10     60               5.50             1          0          0          0          41      0              0       0
              0         0           1        2.10     88               2.20             0          0          1          1          .       1              1       0
              0 means FALSE and 1 means TRUE. For example:
              In the first row, the observation is employed -> is between 55 and 60 years old -> lives with somebody (not alone) -> has no information on tenure (missing value) -> was not a civil servant, and so on.


              Only the variable Tenure has missing values. If the missing values stay, the number of observations is 8 350. If I replace the missing values with 0, the number of observations is 43 543.

              I cannot add more variables (I am reproducing an existing study, so I cannot change them).

              So my question:
              What can you suggest? How can I increase the number of observations?


              Thank you



              • #8
                Steve:
                are you dealing with a panel or a cross-sectional dataset?
                Kind regards,
                Carlo
                (Stata 19.0)



                • #9
                  Originally posted by Carlo Lazzaro
                  Steve:
                  are you dealing with a panel or a cross-sectional dataset?
                  I have a panel.



                  • #10
                    Steve:
                    that's the reason why the number of observations increases when you replace missing with zero.
                    Kind regards,
                    Carlo
                    (Stata 19.0)
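Since the data are a panel, it may help to compare person-wave rows with distinct individuals and to see how the non-missing Tenure rows are spread across waves. The sketch below is only an illustration on made-up data: the identifiers `id` and `wave` and the helper `tag_id` are hypothetical names (the thread does not say what the panel variables are called), and whether any of these counts matches the roughly 16,000 observations of the original study is something only the study's own sample description can confirm.

```stata
* Toy panel: id, wave and all values are hypothetical
clear
input id wave employed Tenure
1 1 1 .
1 2 1 10
2 1 0 35
2 2 0 .
3 1 0 41
3 2 1 .
end

* Declare the panel structure and look at participation patterns
xtset id wave
xtdescribe

* Person-wave rows versus distinct individuals
count
egen byte tag_id = tag(id)
count if tag_id

* Rows with observed Tenure, by wave
tabulate wave if !missing(Tenure)
```

If the original study restricted its sample (for example, to particular waves or to respondents with a work history), applying the same restriction with `if` or `keep if` before `regress` would be the place to look for the difference in sample size.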
