  • Unwanted names in a variable

    Dear Statalisters,

    I copied a dataset directly from Access and pasted it into Excel. The dataset has 16763 observations on individual patients and includes a variable "province", defined as "the state where a sample originated". When I tabulated this variable to get the frequency of submissions by state, some unwanted values (numbers, and names in CAPS) appeared in the output. I checked the Excel file and found no such values under "province"; however, the names in CAPS do appear under the variable "district", which is linked to each province, and the numbers seem to be dates of sample collection. How can I correct this problem?
         Province |      Freq.     Percent        Cum.
    --------------+-----------------------------------
                0 |          3        0.02        0.02
                1 |          3        0.02        0.04
               10 |          4        0.02        0.06
               12 |          1        0.01        0.07
                2 |          7        0.04        0.11
                4 |          3        0.02        0.13
            42965 |          1        0.01        0.13
            42971 |          1        0.01        0.14
                5 |          3        0.02        0.15
                6 |          2        0.01        0.17
                9 |          1        0.01        0.17
             Abia |        212        1.26        1.43
          Adamawa |        601        3.58        5.01
        Akwa Ibom |        343        2.04        7.05
          Anambra |        228        1.36        8.41
           Bauchi |        623        3.71       12.12
          Bayelsa |        157        0.93       13.06
            Benue |        436        2.60       15.65
            Borno |        740        4.41       20.06
      Cross River |        269        1.60       21.66
            Delta |        325        1.93       23.59
           Ebonyi |        205        1.22       24.81
              Edo |        522        3.11       27.92
            Ekiti |        382        2.27       30.20
            Enugu |        375        2.23       32.43
       FCT, Abuja |        498        2.96       35.39
            Gombe |        430        2.56       37.95
              Imo |        381        2.27       40.22
           Jigawa |        865        5.15       45.37
           Kaduna |        567        3.38       48.75
             Kano |      1,412        8.41       57.15
          Katsina |        817        4.86       62.02
            Kebbi |        802        4.77       66.79
             Kogi |        259        1.54       68.33
            Kwara |        152        0.90       69.24
    LAYIN ASIBITI |          1        0.01       69.24
            Lagos |        440        2.62       71.86
         Nasarawa |        336        2.00       73.86
            Niger |        349        2.08       75.94
             Ogun |        384        2.29       78.23
             Ondo |        374        2.23       80.45
             Osun |        280        1.67       82.12
              Oyo |        331        1.97       84.09
          Plateau |        582        3.46       87.56
           Rivers |        402        2.39       89.95
           Sokoto |        481        2.86       92.81
          TERMANA |          1        0.01       92.82
        TSWANKYAM |          1        0.01       92.83
           Taraba |        363        2.16       94.99
         UMUALIKA |          1        0.01       94.99
           WUKARI |          1        0.01       95.00
             Yobe |        492        2.93       97.93
          Zamfara |        348        2.07      100.00
    --------------+-----------------------------------
            Total |     16,797      100.00
    Last edited by Aminu Shittu; 08 May 2018, 22:16.

  • #2
    It is impossible to answer this question without seeing an example of the data that produced the problem. Please post back showing example data, and please make sure that the example you show does result in the problem you are having. Please use the -dataex- command to do this. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
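
    As a rough illustration of the request above, a -dataex- call might look like the sketch below. It assumes the pasted dataset is already in memory and that the variables are named Province and district as described in #1; the real names and capitalization may differ, and the -real()- filter assumes Province is a string variable.

    ```stata
    * Needed only if dataex is not already installed (see the version note above)
    ssc install dataex

    * Show the two variables involved for the first 20 observations
    dataex Province district in 1/20

    * Or show only observations where Province holds a purely numeric value
    * (assumes Province is a string variable)
    dataex Province district if !missing(real(Province))
    ```

    The second form is usually more helpful here, because it keeps the example small while making sure the posted data actually reproduce the problem.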





    • #3
      Just a side note, after Clyde's advice.

      You may wish to run -codebook Province- and check the output.

      That said, there are other issues to resolve before you can be confident that the data set is not messy.
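
      A minimal sketch of that check, using a few toy values copied from the table in #1 in place of the real file. The variable name Province and the helper variable suspect are assumptions for illustration only.

      ```stata
      * Toy data standing in for the real file (values taken from the table in #1)
      clear
      input str20 Province
      "Abia"
      "42965"
      "WUKARI"
      "FCT, Abuja"
      "10"
      end

      * Summarize the variable: storage type, unique values, examples
      codebook Province

      * Flag entries that are purely numeric or written entirely in upper case
      generate byte suspect = !missing(real(Province)) | Province == upper(Province)
      list Province if suspect
      ```

      In the toy data this flags "42965", "WUKARI" and "10", and leaves "Abia" and "FCT, Abuja" alone.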

      For example, you said the sample has 16763 observations. Since the total frequency in your table is 16,797, there are 34 "unexpected" observations. Counting the unwanted entries in the table gives 29 numeric values plus 5 names in CAPS, which accounts for all 34.
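
      The arithmetic can be checked directly with -display-, using only the figures from the table in #1:

      ```stata
      * Extra observations: table total minus the reported sample size
      display 16797 - 16763                               // 34

      * Frequencies of the numeric entries (0, 1, 10, 12, 2, 4, 42965, 42971, 5, 6, 9)
      display 3 + 3 + 4 + 1 + 7 + 3 + 1 + 1 + 3 + 2 + 1   // 29

      * Adding the five names in CAPS, one observation each
      display 29 + 5                                      // 34
      ```
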
      Best regards,

      Marcos



      • #4
        Hi Clyde and Marcos,

        It works now that I have used -import excel- instead.
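
        For anyone reading later, the import might look roughly like the sketch below. The file name and sheet name are hypothetical placeholders, and -firstrow- assumes the first row of the sheet holds the variable names.

        ```stata
        * Hypothetical file and sheet names; replace with your own
        import excel using "samples.xlsx", sheet("Sheet1") firstrow clear

        * Confirm the number of observations, then re-check the variable
        count
        tabulate Province
        ```

        The thread does not establish why pasting the data produced the stray values, so this only shows the route that worked, not a diagnosis.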

        Thank you,

        Aminu.
