  • Grouping & labeling values of a numerical variable properly

    I have a list of SIC codes (2-digit numbers stored in a variable I have named svcInd) that I need to label as follows:

    10-14 - "Mining"
    20-39 - "Manufacturing"
    40-48 - "Utilities"
    60-67 - "Finance"
    other 2 digit numbers - "Other"

    I have tabulated all 2-digit SIC codes; here is the result:

    tab svcInd

         svcInd |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              2 |          4        0.09        0.09
              9 |          4        0.09        0.18
             10 |        185        4.17        4.35
             12 |         20        0.45        4.80
             13 |        100        2.25        7.05
             14 |          7        0.16        7.21
             15 |         69        1.55        8.77
             16 |         93        2.10       10.86
             17 |          2        0.05       10.91
             20 |        193        4.35       15.25
             21 |         19        0.43       15.68
             22 |         12        0.27       15.95
             23 |         17        0.38       16.34
             24 |         22        0.50       16.83
             25 |          6        0.14       16.97
             26 |         73        1.64       18.61
             27 |         21        0.47       19.09
             28 |        403        9.08       28.17
             29 |        158        3.56       31.73
             30 |         45        1.01       32.74
             31 |          5        0.11       32.85
             32 |         80        1.80       34.66
             33 |        138        3.11       37.76
             34 |         12        0.27       38.04
             35 |        193        4.35       42.38
             36 |        237        5.34       47.72
             37 |        163        3.67       51.40
             38 |         62        1.40       52.79
             39 |         15        0.34       53.13
             40 |         10        0.23       53.36
             41 |          3        0.07       53.42
             42 |         26        0.59       54.01
             44 |         37        0.83       54.84
             45 |         71        1.60       56.44
             47 |         37        0.83       57.28
             48 |        227        5.11       62.39
             49 |        324        7.30       69.69
             50 |         43        0.97       70.66
             51 |         20        0.45       71.11
             52 |         15        0.34       71.45
             53 |         29        0.65       72.10
             54 |         29        0.65       72.76
             55 |         10        0.23       72.98
             56 |         16        0.36       73.34
             57 |          5        0.11       73.46
             58 |         22        0.50       73.95
             59 |         21        0.47       74.43
             60 |        379        8.54       82.97
             61 |         43        0.97       83.93
             62 |         69        1.55       85.49
             63 |        114        2.57       88.06
             64 |          3        0.07       88.13
             65 |        104        2.34       90.47
             67 |         99        2.23       92.70
             70 |         23        0.52       93.22
             72 |          1        0.02       93.24
             73 |        149        3.36       96.60
             75 |          1        0.02       96.62
             78 |         11        0.25       96.87
             79 |         29        0.65       97.52
             80 |          3        0.07       97.59
             82 |          1        0.02       97.61
             83 |          1        0.02       97.63
             87 |         43        0.97       98.60
             99 |         62        1.40      100.00
    ------------+-----------------------------------
          Total |      4,438      100.00

    I tried labeling it (do-file attached: "To create 2 digit SVC industrial classification.do"), but the results are odd: the codes belonging to each group ("Other", "Mining", "Manufacturing", ...) are not combined into a single row when I run the tab command after labeling.

    As seen above, I have a total of 4438 observations in my dataset. What is the best way to go about this kind of labeling? I would be grateful for some help.

    Here is an example using -dataex-

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float svcInd
     2
     2
     2
     2
     9
     9
     9
     9
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    10
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    12
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    13
    14
    14
    14
    14
    14
    14
    14
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    15
    16
    16
    16
    16
    16
    16
    16
    16
    16
    16
    16
    end
    label values svcInd svc

  • #2
    One solution is to recode your original svcInd variable into a new variable, svcInd_g, and tabulate that variable instead:

    Code:
    recode svcInd (10/14 = 1 "Mining")        ///
                  (20/39 = 2 "Manufacturing") ///
                  (40/48 = 3 "Utilities")     ///
                  (60/67 = 4 "Finance")       ///
                  ( 1/99 = 5 "Other"), gen(svcInd_g)
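
    The attached do-file is not shown in the thread, so the following is only a guess at what went wrong: a value label attached with label values changes the text displayed for each code, but tabulate still counts every distinct underlying number separately, so several rows can carry the same label without being merged. Recoding into a new variable collapses the codes themselves. A minimal sketch on hypothetical toy data (not the poster's dataset) that checks the grouping, relying on recode applying the first rule that matches so that the final 1/99 rule only catches codes not claimed earlier:

    ```stata
    * Hypothetical toy data: a few codes inside and outside each range
    clear
    input float svcInd
     2
    10
    14
    15
    20
    39
    40
    48
    49
    60
    67
    99
    end

    * Same rules as above; the first matching rule wins,
    * so 1/99 acts as a catch-all for the remaining codes
    recode svcInd (10/14 = 1 "Mining")        ///
                  (20/39 = 2 "Manufacturing") ///
                  (40/48 = 3 "Utilities")     ///
                  (60/67 = 4 "Finance")       ///
                  ( 1/99 = 5 "Other"), gen(svcInd_g)

    * Cross-tabulate original codes against groups to inspect the mapping
    tabulate svcInd svcInd_g

    * Stop with an error if any code landed in the wrong group
    assert svcInd_g == 1 if inrange(svcInd, 10, 14)
    assert svcInd_g == 5 if inlist(svcInd, 2, 15, 49, 99)
    ```

    Note that codes outside 1/99 and missing values are left unchanged by these rules, so it may be worth checking for them before relying on the grouped variable.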


    • #3
      Dear Prof. Enzmann,

      Thank you so much for the Recode idea. It worked perfectly!

      Sincerely,
      Sunita Rao
