Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Get all variables until break in pattern

    I have loads of unstructured data in the following rough format. I think I need to keep one variable at a time, reshape and then build it back together.

    I am wondering if this can be automated somehow? There is this pattern that the wide values have "v" + "digit" format and the break in the time series is shown by the series name as a variable.

    Note that the time length of the series is uneven, so the procedure would have to take that into account.

    Any ideas?


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input int id float(population v151 v152 v153 v154 v155 v156) str9 employees float(v157 v158 v159 v160 v161 v162 v163 v164)
       .  2000  2001  2002  2003  2004  2005  2006 "1997"     2007  2008  2009 2010  2011  2012  2013  2014
    1001    58    57    55    54    54    54    56 "37.547"     57    57    57   56    56    56    57    58
    1002   155   155   153   151   152   152   154 "102.161"   156   158   159  159   160   161   163   167
    1003   118   118   117   116   114   113   116 "80.390"    118   121   121  123   125   126   124   123
    1004    46    47    46    45    45    44    44 "31.786"     45    46    47   47    48    48    48    49
    1051    60    60    58    56    56    56    55 "35.371"     56    57    56   56    57    57    58    58
    1053    64    64    64    64    64    64    64 "38.497"     65    65    65   65    67    67    67    67
    1054    81    81    81    80    80    80    80 "46.960"     81    83    84   86    87    88    88    88
    1055    84    84    84    82    82    82    82 "52.379"     82    83    83   84    85    86    87    88
    1056   120   121   120   119   119   118   118 "74.724"    120   122   122  122   124   125   126   126
    1057    44    43    43    42    42    42    42 "23.376"     43    43    44   43    43    42    42    43
    1058   110   112   112   109   109   109   110 "63.769"    111   113   113  112   113   114   114   114
    1059    77    76    76    75    75    74    74 "42.472"     76    78    78   78    79    79    80    81
    1060   113   114   113   112   113   112   113 "74.758"    115   116   115  114   116   117   118   119
    1061    58    58    57    56    56    55    55 "34.051"     55    55    55   55    56    56    56    55
    1062    95    96    96    96    97    98    99 "62.772"    101   103   106  106   106   107   107   109
    2000 1.051 1.058 1.051 1.039 1.042 1.051 1.062 "732.322" 1.087 1.115 1.131 1.14 1.157 1.178 1.193 1.202
    3101   151   150   146   145   146   145   146 "107.796"   149   150   150  150   152   155   158   161
    3102    58    58    58    56    56    56    56 "46.728"     56    56    56   55    55    56    56    57
    3103    96    99   100   107   107   105   104 "73.363"    105   108   109  110   117   125   129   129
    3151    51    50    51    52    53    53    52 "30.670"     52    53    54   54    54    55    56    57
    3153    68    67    66    64    64    63    63 "48.270"     63    62    62   62    62    62    62    62
    3154    31    31    29    29    30    29    29 "21.015"     29    29    29   29    29    30    30    29
    3155    61    60    59    59    59    58    58 "42.806"     59    60    60   60    60    60    60    60
    3157    46    45    43    44    44    43    43 "28.290"     43    43    43   43    44    44    44    44
    3158    35    35    35    35    36    36    36 "22.069"     36    36    36   36    36    36    36    36
    3159   161   161   164   161   162   160   160 "116.634"   163   165   165  167   168   170   170   172
    3241   599   594   612   591   598   599   604 "427.236"   611   619   625  627   634   639   647   657
    3251    87    86    84    88    87    86    87 "50.474"     88    90    91   91    94    96    96    97
    3252    75    75    75    72    72    71    70 "50.438"     70    70    70   70    70    70    69    70
    3254   128   126   122   122   124   122   122 "84.871"    122   122   122  122   123   124   123   123
    3255    34    34    33    31    31    31    30 "23.075"     30    30    30   30    30    30    30    30
    3256    54    53    50    51    51    51    51 "32.081"     51    51    51   52    53    54    55    56
    3257    61    61    61    60    60    60    62 "40.289"     61    62    62   61    62    62    62    63
    3351    77    76    74    75    76    75    75 "48.361"     75    76    76   76    77    79    79    81
    3352    68    68    68    68    68    67    67 "39.691"     68    68    69   69    70    71    71    71
    3353    73    73    74    74    75    76    78 "42.352"     80    83    84   84    86    87    89    89
    3354    20    20    19    19    20    20    20 "12.858"     20    20    20   20    20    20    20    20
    3355    70    70    70    71    70    71    71 "44.466"     74    76    77   77    78    79    80    80
    3356    36    36    37    35    36    36    36 "19.843"     37    37    37   38    38    39    39    38
    3357    73    72    72    72    72    71    71 "41.552"     74    75    77   78    78    80    81    81
    3358    66    65    67    65    66    65    65 "39.419"     67    67    67   67    68    70    70    70
    3359    77    77    76    77    78    78    79 "48.229"     80    82    83   83    86    87    88    88
    3360    41    40    41    40    40    39    39 "26.074"     40    40    40   39    40    41    41    41
    3361    56    56    57    57    57    56    56 "37.841"     57    58    58   59    60    60    60    61
    3401    33    32    32    31    31    31    30 "18.840"     31    31    31   31    31    31    29    29
    3402    37    38    36    36    35    35    35 "25.845"     36    36    37   38    39    41    41    42
    3403   100   101   102   102   101   101   103 "62.778"    104   106   107  109   110   111   112   113
    3404   118   117   112   114   113   112   113 "78.059"    115   117   117  117   119   122   123   124
    3405    48    46    45    45    45    45    45 "26.280"     44    44    44   45    45    46    46    45
    3451    48    48    48    48    49    50    50 "30.139"     52    53    53   54    55    56    56    57
    3452    69    69    69    70    70    69    70 "40.691"     72    73    75   76    78    79    80    81
    3453    62    63    65    65    67    68    69 "38.917"     72    75    76   77    79    82    83    84
    3454   143   144   144   144   146   147   151 "86.152"    158   162   163  166   170   175   177   178
    3455    42    42    41    41    41    40    39 "24.009"     39    40    40   40    41    41    41    41
    3456    59    59    58    58    59    59    60 "34.131"     62    63    63   64    66    67    67    68
    3457    58    59    59    60    60    60    62 "33.534"     65    66    67   68    68    69    70    70
    3458    43    42    43    43    44    44    45 "24.872"     46    47    48   49    50    51    51    52
    3459   139   140   141   141   142   141   143 "90.747"    147   151   151  154   158   160   162   163
    3460    67    67    69    70    72    73    75 "40.631"     78    80    82   84    87    89    91    91
    3461    37    37    36    36    36    36    36 "24.830"     37    37    38   38    38    38    39    39
    3462    23    22    22    22    22    22    23 "12.814"     23    23    24   24    24    24    24    24
    4011   334   335   336   334   335   334   337 "237.371"   342   342   340  340   346   351   353   355
    4012    60    60    58    57    58    58    60 "45.244"     62    64    64   64    63    64    63    63
    5111   459   462   459   454   454   458   462 "333.553"   472   490   494  493   497   502   509   511
    5112   223   218   214   216   218   216   217 "157.511"   222   226   225  222   222   224   225   226
    5113   314   315   310   306   304   302   304 "217.237"   312   314   316  318   321   322   322   325
    5114   124   123   122   121   121   119   119 "87.811"    120   120   118  117   118   118   118   119
    5116   124   124   123   120   121   120   121 "83.503"    123   125   124  124   126   127   129   130
    5117    80    79    79    78    78    78    78 "59.144"     79    80    81   81    82    83    82    81
    5119    93    92    92    91    91    90    88 "59.315"     89    90    90   90    92    93    93    92
    5120    65    64    63    62    62    61    61 "48.410"     61    62    60   59    60    61    59    59
    5122    74    73    72    71    71    70    71 "49.793"     72    72    69   69    70    71    71    72
    5124   181   177   172   168   165   164   163 "129.764"   164   166   163  163   165   165   166   169
    5154   128   126   127   128   131   130   130 "72.752"    131   132   132  133   136   137   140   141
    5158   237   237   235   232   236   236   238 "165.183"   241   244   241  241   241   243   244   244
    5162   186   185   185   183   185   187   188 "123.564"   191   192   192  190   193   194   194   196
    5166   125   125   125   124   126   126   123 "80.576"    125   126   124  124   126   125   125   127
    5170   176   176   177   176   181   182   182 "113.940"   185   187   188  188   191   194   195   195
    5314   214   213   211   212   213   214   217 "143.212"   222   232   234  232   233   235   236   237
    5315   630   637   646   629   629   630   640 "432.112"   653   669   674  676   691   702   717   728
    5316    78    78    78    76    78    78    77 "67.227"     77    77    78   80    80    81    82    82
    5334   274   274   275   272   273   271   270 "174.489"   274   279   276  279   283   285   287   290
    5358   108   108   107   107   108   109   110 "70.090"    110   111   111  111   113   116   116   117
    5362   174   174   174   172   174   173   175 "110.002"   178   181   182  185   188   191   191   189
    5366    73    73    72    73    73    73    74 "44.186"     75    76    76   77    79    79    80    81
    5370    88    88    89    90    92    91    91 "48.792"     92    93    94   94    96    98   100   101
    5374   129   130   129   128   129   129   130 "82.929"    132   134   131  131   134   136   137   138
    5378   102   102   103   103   103   103   104 "62.668"    105   107   106  106   107   108   109   111
    5382   206   208   208   208   215   214   213 "118.574"   216   222   223  224   223   226   228   231
    5512    47    46    47    46    46    47    47 "31.306"     48    48    48   48    48    48    47    46
    5513   111   109   108   107   107   107   107 "82.036"    108   108   111  113   114   115   115   115
    5515   183   183   187   184   189   187   190 "118.065"   194   198   199  202   207   210   213   217
    5554   179   178   176   175   176   175   176 "106.481"   180   184   184  187   192   197   200   203
    5558    89    88    88    87    88    88    89 "49.307"     91    91    91   91    92    93    94    96
    5562   245   235   236   237   238   237   236 "160.441"   237   238   237  236   238   238   237   237
    5566   190   190   192   191   192   192   195 "114.230"   200   203   204  204   208   211   212   215
    5570   130   124   123   123   122   120   121 "80.363"    123   126   125  124   125   127   127   128
    5711   182   182   181   178   179   180   181 "127.342"   183   185   185  185   189   192   195   198
    5754   185   184   183   182   183   185   187 "126.681"   192   196   195  194   199   204   206   209
    end

  • #2
    I have here some code to get the first variables until the break and then I could with these variables reshape.

    But this would only work once, how to loop this over the different parts?

    Code:
    **** Try to automate this a bit
    ds id, not
    tokenize `r(varlist)'                                
    local label : variable label `1'
    local start: subinstr local 2 "v" "", all
    local rename = `start' - 1
    rename `1' v`rename'
    
    local vars id
    local stop = 0
    
    foreach v in `r(varlist)' {
        if `stop' == 0 {
            local name "`v'"
            local len : strlen local name        
            local first = substr("`name'", 1, 1)    
            if "`first'"=="v" & `len'==4 {
                local vars `vars' `v'
                disp "`vars'"
            }
            else {
                local stop = 1
                disp "stopping"
            }    
        }
    }
    
    disp "`vars'"

    Comment

    Working...
    X