  • Grouping based on the first two digits of a code

    Hello Statalists,

    I have two variables: products ('artigo') and economic activity groups ('CAE'). I want to classify economic activities (CAE) based on the first two digits of the product codes (artigo), which have 8 digits.
    For instance, if CAE = 132 and the first two digits of artigo are 55, the category should be 'Manufacture of textiles'.
    I would appreciate it if anyone could have a look at that.


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(artigo CAE)
     743 145
    3742 228
    8968 150
    7786 203
    8570 150
    7799 136
    8210 137
    7353 111
    8806 148
    4761 148
    8784 148
    8205 155
    8795 136
    8781 136
    8782 136
    8790 136
    8757 136
    5328 139
    8970 212
    4265 148
    7915 228
    7783 228
    7906 228
    7829 228
    7888 228
    8844 228
    7645 228
    7803 228
    7874 228
    8461 228
    8100 228
    7806 228
    8035 228
    5465 143
    8908 143
    3389 143
    3316 143
    5682 143
    3406 143
    6343 134
    3095 134
    4137 134
    5326 134
    3539 134
    7948 134
    3456 134
    8282 134
    3136 134
    3100 134
    5034 143
    4965 150
    3072 255
    8195 124
    5647 124
    5555 124
    6314 124
    5328 124
    5328 255
    3915 148
    7833 148
    8757 148
    6913 148
    7613 148
    7711 148
    6219 148
    6684 148
    7768 148
    8794 148
    8784 148
    6906 148
    6295 148
    7922 148
    7390 148
    8188 136
    1519 136
    1506 136
    1511 136
    1597 136
    1484 136
    4128 136
    4150 136
    8623 136
    7853 152
    7807 149
    8708 149
    8712 149
    8696 149
    7806 149
    8700 149
    8721 149
    7806 149
    8714 149
    8702 149
    3723 149
    8097 149
    8097 149
    8704 149
    3732 149
    7801 149
    8092 149
    end
    label values artigo artigo
    label def artigo 743 "07019090", modify
    label def artigo 1484 "19023090", modify
    label def artigo 1506 "19053199", modify
    label def artigo 1511 "19053299", modify
    label def artigo 1519 "19059060", modify
    label def artigo 1597 "20081919", modify
    label def artigo 3072 "33059000", modify
    label def artigo 3095 "34029090", modify
    label def artigo 3100 "34039900", modify
    label def artigo 3136 "35069900", modify
    label def artigo 3316 "38249045", modify
    label def artigo 3389 "39073000", modify
    label def artigo 3406 "39095090", modify
    label def artigo 3456 "39191019", modify
    label def artigo 3539 "39269097", modify
    label def artigo 3723 "42022290", modify
    label def artigo 3732 "42029215", modify
    label def artigo 3742 "42033000", modify
    label def artigo 3915 "44182080", modify
    label def artigo 4128 "48201010", modify
    label def artigo 4137 "48211010", modify
    label def artigo 4150 "48239040", modify
    label def artigo 4265 "52041900", modify
    label def artigo 4761 "57050030", modify
    label def artigo 4965 "61051000", modify
    label def artigo 5034 "61142000", modify
    label def artigo 5326 "63079099", modify
    label def artigo 5328 "63090000", modify
    label def artigo 5465 "68051000", modify
    label def artigo 5555 "69120050", modify
    label def artigo 5647 "70133799", modify
    label def artigo 5682 "70193900", modify
    label def artigo 6219 "73130000", modify
    label def artigo 6295 "73211110", modify
    label def artigo 6314 "73239490", modify
    label def artigo 6343 "73269098", modify
    label def artigo 6684 "83021000", modify
    label def artigo 6906 "84181080", modify
    label def artigo 6913 "84183020", modify
    label def artigo 7353 "84659200", modify
    label def artigo 7390 "84678100", modify
    label def artigo 7613 "85021120", modify
    label def artigo 7645 "85044055", modify
    label def artigo 7711 "85098000", modify
    label def artigo 7768 "85165000", modify
    label def artigo 7783 "85171200", modify
    label def artigo 7786 "85176200", modify
    label def artigo 7799 "85182200", modify
    label def artigo 7801 "85182995", modify
    label def artigo 7803 "85183095", modify
    label def artigo 7806 "85184089", modify
    label def artigo 7807 "85185000", modify
    label def artigo 7829 "85198919", modify
    label def artigo 7833 "85219000", modify
    label def artigo 7853 "85234059", modify
    label def artigo 7874 "85258019", modify
    label def artigo 7888 "85272120", modify
    label def artigo 7906 "85284991", modify
    label def artigo 7915 "85287111", modify
    label def artigo 7922 "85287233", modify
    label def artigo 7948 "85311095", modify
    label def artigo 8035 "85393900", modify
    label def artigo 8092 "85439000", modify
    label def artigo 8097 "85442000", modify
    label def artigo 8100 "85444290", modify
    label def artigo 8188 "87032390", modify
    label def artigo 8195 "87033290", modify
    label def artigo 8205 "87042139", modify
    label def artigo 8210 "87042299", modify
    label def artigo 8282 "87089997", modify
    label def artigo 8461 "90141000", modify
    label def artigo 8570 "90262080", modify
    label def artigo 8623 "90314990", modify
    label def artigo 8696 "92011010", modify
    label def artigo 8700 "92021010", modify
    label def artigo 8702 "92029030", modify
    label def artigo 8704 "92051000", modify
    label def artigo 8708 "92059090", modify
    label def artigo 8712 "92071050", modify
    label def artigo 8714 "92079010", modify
    label def artigo 8721 "92099400", modify
    label def artigo 8757 "94016100", modify
    label def artigo 8781 "94035000", modify
    label def artigo 8782 "94036010", modify
    label def artigo 8784 "94036090", modify
    label def artigo 8790 "94039090", modify
    label def artigo 8794 "94042910", modify
    label def artigo 8795 "94042990", modify
    label def artigo 8806 "94052019", modify
    label def artigo 8844 "95030075", modify
    label def artigo 8908 "96034010", modify
    label def artigo 8968 "97011000", modify
    label def artigo 8970 "97020000", modify
    label values CAE caerev3
    label def caerev3 111 "329", modify
    label def caerev3 124 "412", modify
    label def caerev3 134 "453", modify
    label def caerev3 136 "461", modify
    label def caerev3 137 "462", modify
    label def caerev3 139 "464", modify
    label def caerev3 143 "469", modify
    label def caerev3 145 "472", modify
    label def caerev3 148 "475", modify
    label def caerev3 149 "476", modify
    label def caerev3 150 "477", modify
    label def caerev3 152 "479", modify
    label def caerev3 155 "493", modify
    label def caerev3 203 "731", modify
    label def caerev3 212 "773", modify
    label def caerev3 228 "829", modify
    label def caerev3 255 "960", modify

  • #2
    It is probably unfortunate that you at some point chose to convert string variables to numeric variables by using the encode command rather than the destring command.

    The encode command is designed for assigning numerical codes to non-numeric strings like "France", "Germany", "United States". The output of help encode instructs us:

    Do not use encode if varname contains numbers that merely happen to be stored as strings; instead, use generate newvar = real(varname) or destring; see real() or [D] destring.
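
    A minimal sketch of the difference, using a made-up string variable (code_str, code_enc and code_num are hypothetical names, not from the data in #1):

    ```stata
    * Toy data: numeric codes that happen to be stored as strings
    clear
    input str8 code_str
    "07019090"
    "63090000"
    "19023090"
    end

    * encode numbers the distinct strings 1, 2, 3, ... in sorted order
    * and keeps the original text only as a value label
    encode code_str, generate(code_enc)

    * destring keeps the numeric value that the string represents
    destring code_str, generate(code_num)

    * nolabel shows the stored values rather than the value labels
    list code_str code_enc code_num, nolabel
    ```

    Here code_enc should hold small integers that say nothing about the code itself, while code_num holds the code as a number (the leading zero of "07019090" is dropped).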


    • #3
      Following on from #2, you should either revert to the original variables (before encoding), or do the following to recover them:

      Code:
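      * Recover the original codes, which encode kept only as value labels
      decode CAE, gen(cae_orig)
      decode artigo, gen(artigo_orig)
      * Convert the recovered strings to numbers
      destring cae_orig artigo_orig, replace
      * Replace the encoded variables; this leaves cae and artigo
      drop CAE artigo
      rename *_orig *

      Now you can work with the original categories rather easily, e.g.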

      Code:
      * floor(artigo/1000000) keeps the first two digits of the 8-digit code
      gen wanted = "Manufacture of textiles" if cae == 132 & floor(artigo/1000000) == 55
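
      If several categories are needed, it may be easier to store the two-digit prefix in its own variable first. The following is only a sketch on made-up rows: the three observations and the names chapter and chapter_str are assumptions for illustration, and only the pairing of cae 132 with prefix 55 comes from #1.

      ```stata
      * Toy rows standing in for the recovered numeric variables
      clear
      input long(cae artigo)
      132 55081010
      132  7019090
      475 94036090
      end

      * Numeric prefix: a code that began with "07" gives 7
      generate byte chapter = floor(artigo / 1000000)

      * String prefix that restores the leading zero, if it matters
      generate str2 chapter_str = substr(string(artigo, "%08.0f"), 1, 2)

      * Start empty, then fill in one category per rule
      generate str40 wanted = ""
      replace wanted = "Manufacture of textiles" if cae == 132 & chapter == 55

      list, noobs
      ```

      Further categories can then be added with one replace line per rule, each testing cae and chapter.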


      • #4
        Thank you so much, dear Kumar & William. It worked perfectly.
