  • Multiply entries of observations

    Hi,

    Please see below. How would I multiply each observation by the number given in the variable "number"?

    For example, in the first row there are 142410 people who are in "Full-time employment", from "England", with "All" kinds of qualifications. However, Stata counts this row as 1 observation instead of 142410. How would I correct this? Would I need to multiply these rows by the number stated in the "number" column, so that I generate 142410 rows of "Full-time employment" "England" "All"?

    Many thanks for any advice.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str53 activity str20 domicile str46 levelofqualificationobtained long number
    "Full-time employment" "England"              "All"              142410
    "Full-time employment" "Non-European Union"   "All"               30835
    "Full-time employment" "Northern Ireland"     "All"                6705
    "Full-time employment" "Not known"            "All"                  10
    "Full-time employment" "Other European Union" "All"               13180
    "Full-time employment" "Other UK"             "All"                 450
    "Full-time employment" "Scotland"             "All"               13350
    "Full-time employment" "Total"                "All"              214660
    "Full-time employment" "Total UK"             "All"              170630
    "Full-time employment" "Total non-UK"         "All"               44015
    "Full-time employment" "Wales"                "All"                7720
    "Full-time employment" "England"              "All"              153275
    "Full-time employment" "Non-European Union"   "All"               32530
    "Full-time employment" "Northern Ireland"     "All"                7135
    "Full-time employment" "Not known"            "All"                  10
    "Full-time employment" "Other European Union" "All"               14340
    "Full-time employment" "Other UK"             "All"                 490
    "Full-time employment" "Scotland"             "All"               14425
    "Full-time employment" "Total"                "All"              230595
    "Full-time employment" "Total UK"             "All"              183715
    "Full-time employment" "Total non-UK"         "All"               46870
    "Full-time employment" "Wales"                "All"                8390
    "Full-time employment" "England"              "All"              116625
    "Full-time employment" "Non-European Union"   "All"               29430
    "Full-time employment" "Northern Ireland"     "All"                5070
    "Full-time employment" "Other European Union" "All"               12025
    "Full-time employment" "Other UK"             "All"                 385
    "Full-time employment" "Scotland"             "All"               11150
    "Full-time employment" "Total"                "All"              180850
    "Full-time employment" "Total UK"             "All"              139390
    "Full-time employment" "Total non-UK"         "All"               41455
    "Full-time employment" "Wales"                "All"                6160
    "Full-time employment" "England"              "All"              126910
    "Full-time employment" "Non-European Union"   "All"               31075
    "Full-time employment" "Northern Ireland"     "All"                5470
    "Full-time employment" "Other European Union" "All"               13150
    "Full-time employment" "Other UK"             "All"                 425
    "Full-time employment" "Scotland"             "All"               12190
    "Full-time employment" "Total"                "All"              196025
    "Full-time employment" "Total UK"             "All"              151795
    "Full-time employment" "Total non-UK"         "All"               44225
    "Full-time employment" "Wales"                "All"                6805
    "Full-time employment" "England"              "All"               25785
    "Full-time employment" "Non-European Union"   "All"                1405
    "Full-time employment" "Northern Ireland"     "All"                1630
    "Full-time employment" "Not known"            "All"                  10
    "Full-time employment" "Other European Union" "All"                1155
    "Full-time employment" "Other UK"             "All"                  60
    "Full-time employment" "Scotland"             "All"                2200
    "Full-time employment" "Total"                "All"               33810
    "Full-time employment" "Total UK"             "All"               31240
    "Full-time employment" "Total non-UK"         "All"                2560
    "Full-time employment" "Wales"                "All"                1560
    "Full-time employment" "England"              "All"               26365
    "Full-time employment" "Non-European Union"   "All"                1455
    "Full-time employment" "Northern Ireland"     "All"                1665
    "Full-time employment" "Not known"            "All"                  10
    "Full-time employment" "Other European Union" "All"                1190
    "Full-time employment" "Other UK"             "All"                  60
    "Full-time employment" "Scotland"             "All"                2240
    "Full-time employment" "Total"                "All"               34570
    "Full-time employment" "Total UK"             "All"               31920
    "Full-time employment" "Total non-UK"         "All"                2645
    "Full-time employment" "Wales"                "All"                1585
    "Full-time employment" "England"              "All postgraduate"  46175
    "Full-time employment" "Non-European Union"   "All postgraduate"  25445
    "Full-time employment" "Northern Ireland"     "All postgraduate"   1750
    "Full-time employment" "Not known"            "All postgraduate"     10
    "Full-time employment" "Other European Union" "All postgraduate"   8365
    "Full-time employment" "Other UK"             "All postgraduate"    120
    "Full-time employment" "Scotland"             "All postgraduate"   4720
    "Full-time employment" "Total"                "All postgraduate"  89090
    "Full-time employment" "Total UK"             "All postgraduate"  55270
    "Full-time employment" "Total non-UK"         "All postgraduate"  33810
    "Full-time employment" "Wales"                "All postgraduate"   2505
    "Full-time employment" "England"              "All postgraduate"  47405
    "Full-time employment" "Non-European Union"   "All postgraduate"  26400
    "Full-time employment" "Northern Ireland"     "All postgraduate"   1790
    "Full-time employment" "Not known"            "All postgraduate"     10
    "Full-time employment" "Other European Union" "All postgraduate"   8720
    "Full-time employment" "Other UK"             "All postgraduate"    125
    "Full-time employment" "Scotland"             "All postgraduate"   4820
    "Full-time employment" "Total"                "All postgraduate"  91850
    "Full-time employment" "Total UK"             "All postgraduate"  56720
    "Full-time employment" "Total non-UK"         "All postgraduate"  35120
    "Full-time employment" "Wales"                "All postgraduate"   2575
    "Full-time employment" "England"              "All postgraduate"  29665
    "Full-time employment" "Non-European Union"   "All postgraduate"  24155
    "Full-time employment" "Northern Ireland"     "All postgraduate"    985
    "Full-time employment" "Other European Union" "All postgraduate"   7350
    "Full-time employment" "Other UK"             "All postgraduate"     75
    "Full-time employment" "Scotland"             "All postgraduate"   3415
    "Full-time employment" "Total"                "All postgraduate"  67325
    "Full-time employment" "Total UK"             "All postgraduate"  35815
    "Full-time employment" "Total non-UK"         "All postgraduate"  31510
    "Full-time employment" "Wales"                "All postgraduate"   1675
    "Full-time employment" "England"              "All postgraduate"  30645
    "Full-time employment" "Non-European Union"   "All postgraduate"  25075
    "Full-time employment" "Northern Ireland"     "All postgraduate"   1020
    "Full-time employment" "Other European Union" "All postgraduate"   7680
    end

  • #2
    However, this row of criteria is counted as 1 observation by Stata instead of 142410.
    Without knowing the Stata command you used that did not give you the results you expect, it is difficult to advise.

    Would I need to multiply these rows by the number stated in the "number" column? So that I generate 142410 rows of "Full-time employment" "England" "All"?
    Certainly not.

    Read the output of
    Code:
    help weight
    It seems likely you will want to use your variable number as a frequency weight.
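
    As a minimal sketch of what a frequency weight does, assuming a toy dataset built from three rows of the example in #1 (domicile and number only), each row is counted number times rather than once:

    ```stata
    * Toy data: three rows copied from the example in #1
    clear
    input str20 domicile long number
    "England"  142410
    "Scotland"  13350
    "Wales"      7720
    end
    * With the weight, each row contributes its value of number to the frequencies
    tabulate domicile [fweight = number]
    ```

    Running the same tabulate without the weight would report a frequency of 1 per row, which is the behaviour described in #1.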



    • #3
      Hi William,

      Thank you for your help!

      It seems likely you will want to use your variable number as a frequency weight.
      This is indeed what I was trying to do.
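      tab modeofformerstudy [fweight= number]

          Mode of |
           former |
            study |      Freq.     Percent        Cum.
      ------------+-----------------------------------
              All | 16,028,690       50.00       50.00
        Full-time | 13,567,260       42.32       92.32
        Part-time |  2,461,085        7.68      100.00
      ------------+-----------------------------------
            Total | 32,057,035      100.00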

      My follow-up question: I would like to run a regression on two or more variables, all weighted by the frequency in "number", but Stata replies that a variable is invalid.

      reg levelofqualificationobtained [fweight= number] domicile [fweight= number]
      invalid 'domicile'
      r(198);


      Many thanks!
      Lynn






      • #4
        As the help file for -regress- shows, only one weight is allowed, and it is specified once, after the full list of variables in the model; so you probably want
        Code:
        reg levelofqualificationobtained domicile [fweight= number]
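
        To see the placement of the weight with numeric variables (the variables in #1 are strings), here is a small sketch using the auto dataset that ships with Stata. The choice of rep78 as the weight is purely an assumption for illustration: it is an integer variable, not a real frequency count.

        ```stata
        * Illustration with Stata's shipped auto data, not the data from this thread
        sysuse auto, clear
        * One weight only, written once after the whole variable list
        * rep78 merely stands in for an integer frequency variable
        regress price mpg weight [fweight = rep78]
        ```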



        • #5
          Another possibility is the -expand- command, which actually creates duplicate observations. However, if weights will work, that's a preferable approach.
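
          A minimal sketch of -expand-, assuming a toy dataset made of two of the smaller rows from #1 so that the result stays tiny:

          ```stata
          * Toy data: two small rows copied from the example in #1
          clear
          input str20 domicile long number
          "Other UK"  450
          "Not known"  10
          end
          * Replace each row by number copies of itself
          expand number
          * Reports the new number of observations (450 + 10 = 460)
          count
          ```

          On the full data this would create tens of millions of rows, which is why weights are the lighter option when the command supports them.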



          • #6
            I think weighting is preferable, in this case, to expanding the dataset to the 32,057,035 observations implied by post #3.

            In any event, though, every observation that is a sub-total of counts held in other observations should be omitted from your tabulations, your models, and any other analysis. In your data, I suspect that observations where one of these variables has a value containing the word "Total" or "All" should be dropped, but you need to look at your data carefully to see if there are more subtle possibilities.

            I would create a new dataset using the drop command (one or more times) to eliminate the unwanted observations from the current dataset, and then use the new dataset for analysis.
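
            A rough sketch of that approach, under stated assumptions: the toy data below reuse three rows from #1 plus one made-up "First degree" row (its value and count are hypothetical, added only so that something survives the drops), and the output file name is also hypothetical.

            ```stata
            * Toy data: the "First degree" row and its count are made up for illustration
            clear
            input str20 domicile str46 levelofqualificationobtained long number
            "England" "All"              142410
            "Total"   "All"              214660
            "England" "All postgraduate"  46175
            "England" "First degree"       1000
            end
            * Inspect the distinct values before dropping anything
            tabulate domicile
            tabulate levelofqualificationobtained
            * Drop sub-total rows (strpos() is case-sensitive)
            drop if strpos(domicile, "Total") > 0
            drop if strpos(levelofqualificationobtained, "All") > 0
            * Save under a new, hypothetical file name so the original data stay intact
            save "graduates_no_totals.dta", replace
            ```

            Matching on substrings is blunt, so the tabulate output is worth reading before trusting the drops.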



            • #7
              Thanks Rich, Mike, and William!

              I have generated new variables and replaced the categories with numeric codes, excluding those categories called "Total" or "All", which has solved the problem. Many thanks for everyone's help!
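
              For anyone reading later, one possible way to get numeric codes from a string variable is -encode-. This is only a sketch on toy rows from #1, not necessarily the exact steps used here, and the name domicile_id is made up for the example:

              ```stata
              * Toy data: three rows copied from the example in #1
              clear
              input str20 domicile long number
              "England"  142410
              "Scotland"  13350
              "Wales"      7720
              end
              * encode maps each distinct string to an integer code
              * and keeps the original text as value labels
              encode domicile, generate(domicile_id)
              * nolabel shows the underlying integer codes
              list domicile domicile_id number, nolabel
              tabulate domicile_id [fweight = number]
              ```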
