
  • egen command is only generating missings

    Hello,

    I am using egen in Stata 16.1.

    In my previous data analysis the code below worked well.

    The code looked something like this:

    egen Faktor1_specific = mean(Faktor1) if vote ==1

    The dummy variable vote and the variable Faktor1 both contain values and seem to be correct.

    The problem is that egen generates only missing values for Faktor1_specific.
    I do get values when I display vote and Faktor1, but only missing values for Faktor1_specific.


    Thanks for any help!
    Last edited by Daniel Koelzer; 13 Jul 2020, 06:27.

  • #2
    egen Faktor1_specific = mean(Faktor1) if vote ==1
    The -if- qualifier excludes observations for which vote is not equal to 1. Therefore, missing values are generated for these observations. What is your expectation?
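
    A minimal sketch of this behaviour, using the auto dataset shipped with Stata rather than the poster's data (the variable name mean_price_foreign is only illustrative):

    Code:
    * Load the example dataset that ships with Stata
    sysuse auto, clear
    * The mean is computed over foreign cars only and stored only for them
    egen mean_price_foreign = mean(price) if foreign == 1
    * Observations excluded by -if- (domestic cars) are left missing
    count if missing(mean_price_foreign)
    * Observations that satisfy -if- all carry the same single value
    tabulate mean_price_foreign if foreign == 1

    So a report of missing values is expected whenever some observations fail the condition. It only signals a problem if the observations that satisfy the condition are missing too.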



    • #3
      Hi Daniel, welcome to the Statalist. Since you are new here, you would be well advised to spend time to read the FAQ. One section to focus on is how to provide a reproducible data example, and to show the exact code used and output produced by Stata.

      Since there is no data example, I can only speculate. Plausible causes for all missing values are that the condition -if vote==1- is never satisfied, or that the variable -Faktor1- is always missing. Check your data, and try leaving off the -if- condition to see whether anything is calculated.
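
      These checks could look something like the sketch below. The four observations are made up for illustration and are not taken from the poster's data. They are constructed so that vote == 1 never coincides with a non-missing Faktor1.

      Code:
      * Toy data (hypothetical): Faktor1 is missing wherever vote == 1
      clear
      input byte vote float Faktor1
      1 .
      1 .
      0 2.5
      0 3.5
      end
      * Is the condition ever satisfied?
      count if vote == 1
      * Does the variable have any non-missing values at all?
      count if !missing(Faktor1)
      * Do the two ever overlap? If this count is 0, the mean cannot be computed
      count if vote == 1 & !missing(Faktor1)
      * With no overlap, egen returns missing for every observation
      egen Faktor1_specific = mean(Faktor1) if vote == 1

      If the third count is zero, all missing values from egen are the expected result rather than a bug.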



      • #4
        Thanks for the advice, Leonardo, and sorry for my rushed first post!

        I have a data set with questions about general elections. After recoding the vote variable (n11ba), I later run into problems when using egen.

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input int n11ba
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        .f
        end
        label values n11ba n11ba
        label def n11ba .f "nicht in Auswahlgesamtheit", modify
        "nicht in Auswahlgesamtheit" is also the message I see in the graph that I generate later on.

        Code:
        recode n11ba (1=1) (4=2) (5=3) (6=4) (7=5) (322=6) (801=7), gen(zweitstimme)
            label define Partei 1 "CDU/CSU" ///
                                2 "SPD"     ///
                                3 "FDP"        ///
                                4 "Grüne"    ///
                                5 "Linke"    ///
                                6 "AfD"        ///
                                7 "Sonstige", replace
            label val zweitstimme Partei
            label var zweitstimme "Zweitstimme BTW 2017"
            tab zweitstimme
        The previous part works when I use "tab" or "fre", but when I try to calculate the mean position for a given vote, only missing values are generated.

        Code:
        egen Faktor_ökon_cdu = mean(Faktor_ökon) if zweitstimme == 1
        (4291 missing values generated)

        Sorry if I presented my dataset poorly. I am still learning.


        Thanks in advance
        Last edited by Daniel Koelzer; 13 Jul 2020, 10:53.



        • #5
          Leonardo Guizzetti had excellent suggestions about what to try in #3. I can't see that you have tried any of them.

          Your example says nothing about Faktor_ökon. Nor does your recode do anything about missing values in n11ba -- which are all we can see in the data example.

          Please show us the results of

          Code:
            
           tab Faktor_ökon if zweitstimme == 1  
           su Faktor_ökon if zweitstimme == 1


          Non-missing values will be expected for the mean you ask for if (and only if) there are some non-missing values in the variable you feed to it satisfying your condition
          Code:
            
           if zweitstimme == 1


          A different issue is that the variable you ask for can contain at most one distinct non-missing value, so what do you want to do with it?
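
          If the aim turns out to be one mean per group, a single variable built with the by() option may be more convenient than one variable per condition. The sketch below uses made-up data. The names Faktor_oekon and mean_oekon are hypothetical stand-ins, and the values are not from the thread.

          Code:
          * Toy data (hypothetical): group 3 has no non-missing values
          clear
          input byte zweitstimme float Faktor_oekon
          1 0.5
          1 1.5
          2 -1.0
          2 -2.0
          3 .
          end
          * One variable holding the group mean for every observation
          egen mean_oekon = mean(Faktor_oekon), by(zweitstimme)
          * The same means as a table, with the number of non-missing values
          tabstat Faktor_oekon, by(zweitstimme) statistics(mean n)
          list, sepby(zweitstimme)

          A group whose values are all missing still gets a missing mean, so this does not remove the need to check the data first.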



          • #6
            Nick Cox, thanks for your advice.

            Following Leonardo Guizzetti's suggestion, I tried
            Code:
            egen Faktor_ökon_cdu = mean(Faktor_ökon)
            but nothing was calculated. Actually, Stata did not do anything at all.


            Your example says nothing about Faktor_ökon. Nor does your recode do anything about missing values in n11ba -- which are all we can see in the data example.
            Nick Cox, sorry, I forgot to mention that the missing values in the data set are handled by a foreach loop:
            Code:
            foreach var of varlist _all {
                capture confirm numeric var `var'
                if !_rc {
                    mvdecode `var', mv(-71=.p \-72=.o \-81=.n \-82=.m \-83=.l ///
                             \-84=.k \-85=.j \-86=.i \-92=.h \-93=.g ///
                             \-94=.f \-95=.e \-96=.d \-97=.c \-98=.b ///
                             \-99=.a)
                            
                    if "`:value label `var''" == "" {
                        continue
                    }
                    label define `:value label `var'' ///
                        .a"keine Angabe" ///
                        .b"weiss nicht" ///
                        .c"trifft nicht zu" ///
                        .d"Split" ///
                        .e"nicht teilgenommen" ///
                        .f"nicht in Auswahlgesamtheit" ///
                        .g"Interview abgebrochen" ///
                        .h"Fehler in Daten" ///
                        .i"nicht wahlberechtigt" ///
                        .j"nicht waehlen" ///
                        .k"keine Erst-/Zweitstimme abgegeben" ///
                        .l"ungueltig waehlen" ///
                        .m"keine andere Partei waehlen" ///
                        .n"noch nicht entschieden" ///
                        .o"nicht einzuschaetzen" ///
                        .p"nicht bekannt", modify
                }
                else {
                    replace `var'=".a keine Angabe" if `var'=="-99 keine Angabe"
                    replace `var'=".c trifft nicht zu" if `var'=="-97 trifft nicht zu"
                    replace `var'=".e nicht teilgenommen" if `var'=="-95 nicht teilgenommen"
                    replace `var'=".g Interview abgebrochen" if `var'=="-93 Interview abgebrochen"
                    }
            }
            The results for
            Code:
             tab Faktor_ökon if zweitstimme == 1
            = no observations


            Code:
            su Faktor_ökon if zweitstimme == 1
            
            =
            Variable       Obs    Mean    Std. Dev.    Min    Max

            Faktor_ökon      0


            n11ba has the following values
            [Attached image: n11ba.png]



            A different issue is that the variable you ask for can contain at most one distinct non-missing value, so what you want to do with it?
            I am analysing the two-dimensional political space. Therefore I wanted to have the mean positions of cultural and economic (Faktor_ökon) voting behaviour depending on the party voted for. Later on I want to run a multibinary regression to identify factors that influence voting behaviour.

            Thanks in advance!



            Last edited by Daniel Koelzer; 13 Jul 2020, 13:44.



            • #7
              The mean of Faktor_ökon is not defined if all its values are missing, as the summarize and tabulate commands both confirm. What's puzzling is why you are surprised at that. As far as I can gather from the loop in #6 you recoded various numeric codes to various missing codes, which could make perfect sense, except that there is evidently no scope to calculate a mean, even over arbitrary numeric codes.
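
              A small sketch of what such a recode does to a mean, with made-up values and only two of the codes from the loop in #6 (the variable name score is hypothetical):

              Code:
              * Toy data (hypothetical): two negative codes and two valid answers
              clear
              input int score
              -99
              -94
              3
              5
              end
              * Turn the negative codes into extended missing values
              mvdecode score, mv(-99=.a \ -94=.f)
              * Extended missing values count as missing
              count if missing(score)
              * and are left out of the mean, which now uses the two valid answers only
              summarize score

              If every value of a variable had been one of the recoded codes, nothing would be left to average, and the mean would be missing.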



              • #8
                Nick Cox, I ran a principal-component factor (pcf) analysis at the beginning and predicted the two factor scores based on 6 superissue questions. I have recoded all questions and scaled them uniformly from 0 to 10.

                Code:
                factor freieWirtschaft_std Umverteilung_std SteuernvsLeistungen Assimilation_std ///
                           UnterstützungSchulden_std Liberal_std, pcf
                
                rotate, promax
                
                predict Faktor_kult Faktor_ökon
                The reason I am so surprised, and obviously do not understand my problem at all, is that the egen command does work with other variables such as education (Bildung) or class membership (Klasse).

                Code:
                    * Mittlere Positionen nach Bildung
                        * ökonomische Dimension
                        egen Faktor_ökon_Bildung0=mean(Faktor_ökon) if Bildung==0
                        egen Faktor_ökon_Bildung1=mean(Faktor_ökon) if Bildung==1
                        egen Faktor_ökon_Bildung2=mean(Faktor_ökon) if Bildung==2
                        * kulturelle Dimension
                        egen Faktor_kult_Bildung0=mean(Faktor_kult) if Bildung==0
                        egen Faktor_kult_Bildung1=mean(Faktor_kult) if Bildung==1
                        egen Faktor_kult_Bildung2=mean(Faktor_kult) if Bildung==2
                There the new variables do have values, but when I use the vote variable (n11ba) it does not work at all.

                What can be a solution for it?



                • #9
                  Sorry, but I can't add helpfully to previous replies. You seem to be jumping back and forth between Faktor_ökon and n11ba in your question.
