  • #31
    It is indeed interesting! Gotcha2 explains it:

    One feature of preserve that catches many users by surprise is that, if you preserve data in a do file, the data is automatically and silently restored when the do file finishes even if no restore command has been reached (including when the do file crashes!)
    Adding the line

    Code:
    tab enquetecible
    within the do-file will clear up the mystery.
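
    A minimal sketch of that behaviour, assuming Stata's shipped auto dataset (74 observations, 22 of them foreign cars) rather than the data in this thread. Run it as a do-file, not line by line:

    Code:
    sysuse auto, clear
    preserve
    * Work on a subsample; the full data is held in the preserved copy
    keep if foreign == 1
    count
    * Back to the full data. Even without this line, Stata would restore
    * the preserved copy automatically when the do-file ends
    restore
    count
    The first count should report 22 and the second 74. To keep the subsample after the do-file ends, cancel the preserve with restore, not.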



    • #32
      I've tried but I am still getting the same 95 observations for all the variables I try to tabulate.



      • #33
        Add the following two commands before restore at the end of the do-file and copy and paste the output as I do below. Make sure the last statement that you see is "end of do-file".

        Code:
        tab Educ_1
        tab enquetecible
        My output:

        Code:
        . tab Educ_1
        
         Ecoles: organisée & |
                 bien gérées |      Freq.     Percent        Cum.
        ---------------------+-----------------------------------
         En  total désaccord |          9        9.47        9.47
               En  désaccord |         35       36.84       46.32
                      Neutre |         19       20.00       66.32
                    D'accord |         28       29.47       95.79
        Tout à fait d’accord |          4        4.21      100.00
        ---------------------+-----------------------------------
                       Total |         95      100.00
        
        . tab enquetecible
        
                        Enquêté cîble |      Freq.     Percent        Cum.
        ------------------------------+-----------------------------------
                               Ménage |      1,258       92.98       92.98
        Ménage et Unité de production |         95        7.02      100.00
        ------------------------------+-----------------------------------
                                Total |      1,353      100.00
        
        . *********************************************************************
        . /*
        >
        > restore
        > *************************************
        > ** MODULE CIVISME FISCAL **
        > *************************************
        >
        
        end of do-file



        • #34
          Dear Andrew, thank you for the response. The added part is already in my do-file, which has another part that I did not include in the one I uploaded here. However, the output above clearly shows that although tab Educ_1 should report 1,353 observations, it reports only the 95 that make up a small part of the units that responded to the questionnaire. That is where my real problem lies. I have no clue what could be causing the rest of the observations to be excluded.



          • #35
            In fact, if I try
            Code:
             keep if inlist(enquetecible,1)
            or
            Code:
             keep if inlist(enquetecible, 2)
            all the variables lose their observations, even those that have no missing observations!



            • #36
              I suggest that you look at the responses for the variable "Educ_1". While it is true that you have 1353 observations for the variable "enquetecible", you have only 95 responses for the former.

              Code:
              . tab Educ_1 if enquetecible==3
              
                                 Educ_1 |      Freq.     Percent        Cum.
              --------------------------+-----------------------------------
                              D'accord  |         28       29.47       29.47
                  En  total
              désaccord   |          9        9.47       38.95
                        En  désaccord   |         35       36.84       75.79
                                 Neutre |         19       20.00       95.79
                   Tout à fait d’accord |          4        4.21      100.00
              --------------------------+-----------------------------------
                                  Total |         95      100.00
              
              . tab Educ_1 if enquetecible==1
              no observations
              
              .



              • #37
                "I suggest that you look at the responses for the variable "Educ_1". While it is true that you have 1353 observations for the variable "enquetecible", you have only 95 responses for the former."
                No, in the dataset that I attached in post #21, even "Educ_1" has more than 1,200 non-missing observations. That is what makes this situation so difficult to understand.



                • #38
                  We have different datasets then. Importing the dataset and without doing anything else, this is what I get. From my side, I see nothing inconsistent.

                  Code:
                  . import excel "C:\Users\709554\Desktop\BD_all1_versions_25.01.2020.xlsx", sheet("perception_qlty") firstrow cl
                  > ear
                  (549 vars, 1,682 obs)
                  
                  . tab Educ_1
                  
                                     Educ_1 |      Freq.     Percent        Cum.
                  --------------------------+-----------------------------------
                                  D'accord  |         28       29.47       29.47
                      En  total
                  désaccord   |          9        9.47       38.95
                            En  désaccord   |         35       36.84       75.79
                                     Neutre |         19       20.00       95.79
                       Tout à fait d’accord |          4        4.21      100.00
                  --------------------------+-----------------------------------
                                      Total |         95      100.00
                  EDIT: Maybe the confusion is caused by how you define an observation. Yes, the dataset has 1682 observations, but when you tabulate, you get the frequency of non-missing observations for a given variable. The above shows that Educ_1 has 95 non-missing observations.
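
                  A small illustration of that point, using Stata's shipped auto dataset as a stand-in (an assumption; it is not the data from this thread). There, rep78 is missing for 5 of the 74 cars:

                  Code:
                  sysuse auto, clear
                  * Default: only non-missing values are tabulated
                  tabulate rep78
                  * The missing option adds a row for missing values
                  tabulate rep78, missing
                  * Count the missing values directly
                  count if missing(rep78)
                  The first table should total 69, the second 74, and the count should be 5. Running the same three commands on Educ_1 would show how many of the 1,682 rows are empty for that variable.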
                  Last edited by Andrew Musau; 17 Mar 2020, 14:37.



                  • #39
                    Dear Andrew,
                    Many thanks for everything. There were duplicated variables in the dataset, and I did not realize that when I wrote my do-file. So while I was referring to the duplicates of those variables and their numbers of observations, Stata was using the original ones, which were restricted to just a subsample. Now that I am aware of this issue, I think everything is fine. I've just obtained this:

                    Code:
                    . tab contact12m_Mairie, nolab
                    
                    contact12m_ |
                         Mairie |      Freq.     Percent        Cum.
                    ------------+-----------------------------------
                              0 |      1,488       94.30       94.30
                              1 |         59        3.74       98.04
                              3 |         31        1.96      100.00
                    ------------+-----------------------------------
                          Total |      1,578      100.00

                    Many thanks.



                    • #40
                      I also have a similar question. I want to convert a non-numeric variable named iid1 to numeric. For example, one of the values of iid1 is "Q1V1110002110101". I have also tried encode [for instance, encode iid1, gen(iidnew)], but Stata cannot run it and reports "too many values". Can anyone suggest how to fix this? I need a numeric variable to run a nested logit model.



                      • #41
                        #40 You should not try to destring a variable with values like "Q1V1110002110101". It does not qualify as having numeric content presented as string.

                        As encode fails given too many distinct values, and you need a numeric identifier, try

                        Code:
                        egen long numid = group(iid1)
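
                        As a rough sketch of what that does, here is a toy dataset. The first value is the one quoted in #40; the one ending in 102 is made up for illustration:

                        Code:
                        clear
                        input str16 iid1
                        "Q1V1110002110101"
                        "Q1V1110002110101"
                        "Q1V1110002110102"
                        end
                        * One integer per distinct string, numbered in sort order
                        egen long numid = group(iid1)
                        list
                        The two identical strings should both get numid 1 and the third row numid 2. The long storage type leaves room for far more distinct values than encode allows.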
                        Last edited by Nick Cox; 25 Jan 2025, 01:22.



                        • #42
                          Since this thread started, Clyde Schechter and I wrote https://journals.sagepub.com/doi/epd...867X1801800413 as an attempt to bring together the most common details you might need to know in this territory.



                          • #43
                            Hi Reetamarghya Dey! That looks a lot like a string from the PLFS. I think it's encoding something along the lines of:

                            Quarter: Q1
                            Visit: V1
                            Sector: 1 (Rural)
                            State: 10 (Bihar)
                            District: 0021 (Khagaria)
                            Region: 101 (Northern)
                            Stratum: 1

                            (I don't understand why district appears to be four digits instead of two digits, but maybe it's been through a processing step?)

                            This is an unusual set of characteristics to group together into a single variable. Most of the time you'd be better served by importing these bytes as seven different variables. Perhaps if you let us know what you intended to do with this variable we'd be better able to advise you how to deal with it. For example, maybe what you really want is a variable that takes a unique value for each stratum?
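
                            If the layout guessed above is right (that is an assumption, and the character positions below simply follow from it), splitting the string could look roughly like this. The new variable names are only illustrative:

                            Code:
                            clear
                            input str16 iid1
                            "Q1V1110002110101"
                            end
                            * Positions follow the guessed layout; check them against the survey documentation
                            gen str2 quarter = substr(iid1, 1, 2)
                            gen str2 visit   = substr(iid1, 3, 2)
                            gen sector       = real(substr(iid1, 5, 1))
                            gen state        = real(substr(iid1, 6, 2))
                            gen district     = real(substr(iid1, 8, 4))
                            gen region       = real(substr(iid1, 12, 3))
                            gen stratum      = real(substr(iid1, 15, 2))
                            list
                            For the example value this should give quarter Q1, visit V1, sector 1, state 10, district 21, region 101 and stratum 1. A numeric group identifier can then be built from whichever pieces matter, for example with egen's group() function.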

