  • Destring a variable

    Dear Stata Users,

    First, I want to create a year and month from "v1". I use the code below:

    Code:
    g year = substr(v1, 1, 4)
    g month = substr(v1, 5, 2)
    However, when I try to destring "year" and "month", I get the following error:

    year contains nonnumeric characters; no replace
    Can you please help me destring "year" and "month"? Below is a sample of my data.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str32 v1 float(me1 me2 me3 me4)
    "192512" 5.82 15.61 50.01    1319
    "192601" 5.91  15.9 50.92 1331.71
    "192602" 5.55 14.62 46.63 1366.39
    "192603"    5  13.2 44.56 1322.46
    "192604" 5.03 13.75    45 1350.21
    "192605"  5.1 13.27  45.6 1382.58
    "192606" 5.17 13.76    48 1510.46
    "192607" 5.28  13.8 49.09 1521.25
    "192608" 5.38 14.52 47.43 1561.71
    "192609" 5.15  13.8 47.07 1576.54
    "192610" 4.94 13.13 45.96 1581.94
    "192611"  5.1 13.98 45.47 1608.91
    "192612" 5.28 14.38 49.83 1595.12
    "192701" 5.69 14.94 50.92 1628.38
    "192702" 6.12 15.85 50.87 1690.91
    "192703" 6.12 15.14 50.05 1768.07
    "192704" 5.96  14.7 49.95 1729.49
    "192705" 6.17  15.4 50.86 1841.78
    "192706" 6.19 14.66 48.62 1774.64
    "192707" 6.51 15.12 53.55 1925.16
    "192708" 6.49 15.13 54.31  2140.2
    "192709" 6.72 16.32  58.4  2314.2
    "192710" 6.37 16.24 56.94 2231.55
    "192711" 6.84 17.01 58.27 2257.65
    "192712" 7.15 18.84 60.27  2401.2
    "192801" 7.26 18.13 60.97 2335.95
    "192802" 6.95 17.96  59.5 2381.62
    "192803" 7.49 20.02 62.81  3253.8
    "192804"  7.7  20.9 65.93 3292.95
    "192805" 7.97 21.11 65.99  3358.2
    "192806"  7.1 19.97 62.29    3306
    "192807"  6.9 20.36 62.27 3347.32
    "192808" 7.19 21.59 67.02 3536.55
    "192809"  7.5  22.8 69.53 3769.27
    "192810" 7.93  22.4 70.07 3806.25
    "192811" 8.69 24.89 76.04  3680.1
    "192812" 8.92 26.06 83.62 3545.25
    "192901" 9.04 26.85 85.08 3637.69
    "192902"    9 27.69 83.31 3643.12
    "192903" 8.75 25.95 83.07  3697.5
    "192904" 8.69 26.04 81.13 3675.75
    "192905" 7.81 23.81 77.34    3045
    "192906" 8.37 25.62 83.76  3262.5
    "192907" 8.61 25.82 82.68 3516.82
    "192908" 8.64 26.43 88.06 3926.02
    "192909" 8.14  24.7 80.61 3860.29
    "192910" 6.44 20.55 70.46 3244.02
    "192911" 5.39 16.25 57.75  2946.5
    "192912"  5.2 15.75 58.65 2939.89
    "193001" 5.68 16.95 65.17 2958.06
    "193002" 5.97 18.03 64.92 3179.38
    "193003" 6.25 18.85 64.83 3488.23
    "193004" 6.33    18 60.68 3329.68
    "193005" 5.86 17.28 60.24  3588.5
    "193006" 4.44  14.4 50.54  3257.7
    "193007" 4.58 14.96 52.29 3284.78
    "193008" 4.21 14.08 51.69 3809.98
    "193009" 3.68 12.07 42.64 3580.54
    "193010" 3.36 10.84 38.11 3439.35
    "193011" 3.23 10.61 37.93 3315.81
    "193012"  2.6  9.04 33.44 3152.55
    "193101" 3.21 10.11 37.44 3306.98
    "193102" 3.59 11.32 39.51 3507.74
    "193103"  3.3 10.42 37.25 3403.53
    "193104" 2.73  8.75 33.18 3277.39
    "193105" 2.34  7.49 27.16 2971.05
    "193106" 2.81  9.05 33.74 3234.59
    "193107" 2.59  8.05 30.66  3058.9
    "193108" 2.37  7.99 30.71 3088.18
    "193109" 1.52   5.4 20.37 2353.86
    "193110" 1.75  5.49 22.91 2482.26
    "193111"  1.5   5.1 19.86 2329.08
    "193112" 1.12  4.06 16.42 2178.32
    "193201" 1.25  4.27 16.34 2087.36
    "193202" 1.25  4.24 16.67 2353.24
    "193203"  1.1  3.64 15.04 2057.04
    "193204"  .87  3.04 12.78 1826.15
    "193205"  .65  2.26  8.86 1644.24
    "193206"  .66  2.29  9.35 1434.33
    "193207"  .94  3.03  12.3 1674.56
    "193208" 1.58  4.78 18.26 2136.34
    "193209" 1.47  4.38 17.54 2096.69
    "193210" 1.21  3.66 15.06 1933.44
    "193211" 1.06  3.41  14.1 1912.45
    "193212"   .9  3.11 13.85 1943.18
    "193301"  .97   3.3 15.07 1950.18
    "193302"  .76  2.73 12.45 1810.21
    "193303"  .87  2.87 13.42 1651.59
    "193304" 1.21  4.25 20.71  1866.2
    "193305"  2.1  6.26 25.76 2190.45
    "193306" 2.56  7.58 28.71 2377.07
    "193307" 2.42   6.7 26.06  2258.1
    "193308" 2.48  7.81 31.17 2365.41
    "193309" 2.09  6.43 26.24 2244.11
    "193310" 1.77  5.73 23.72 2085.48
    "193311" 1.86   5.9  25.3 2213.78
    "193312" 1.84  6.21 26.67 2083.15
    "193401" 2.62  7.73 31.81 2202.12
    "193402" 2.81  7.86 30.55 2248.77
    "193403" 2.69   7.7 31.26 2237.11
    end

  • #2
    destring works on your example data, so the problem must lie elsewhere in your dataset. Try this:
    Code:
    list v1 if missing(real(substr(v1, 1, 4)))
    to show you what observations have values of v1 containing non-numeric characters in positions 1-4.
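
    Once the offending observations are identified, the conversion could be sketched roughly as follows. This is only a sketch under assumptions: it assumes valid values of v1 are exactly six digits in YYYYMM form, and the variable name mdate is made up for illustration. Note that the third argument of substr() is a length, not an end position, so the month is taken as two characters starting at position 5.

    ```stata
    * Sketch only: assumes valid v1 values look like "192512" (YYYYMM)
    gen year  = substr(v1, 1, 4)
    gen month = substr(v1, 5, 2)

    * List values of v1 that are not exactly six digits
    * (blanks, letters, stray spaces, header text, and so on)
    list v1 if !regexm(v1, "^[0-9][0-9][0-9][0-9][0-9][0-9]$")

    * After inspecting those rows, convert to numeric.
    * -force- turns anything nonnumeric into missing, so use it
    * only once you know what is being discarded.
    destring year month, replace force

    * Optional: build a Stata monthly date (mdate is a hypothetical name)
    gen mdate = ym(year, month)
    format mdate %tm
    ```

    Whether force is appropriate depends on what the listing shows; if the bad values are fixable (for example, leading or trailing spaces), cleaning v1 first is probably safer than coercing to missing.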
