
  • Losing data when force-appending multiple CSV files

    Hello All,

    I am importing 49 CSV files and then appending them into one file (see the original code below, which forces the append and ignores numeric/string mismatches). The append does not run unless I force Stata to ignore variable type mismatches between string and numeric; see the long list of notes from the append further down. Unfortunately this loses data, because values are discarded whenever a variable flips between string and numeric from one file to the next. Can I instead force all variables to be read as strings, complete the append, and then convert everything that needs to be numeric back to numeric?

    I tried adding the option
    stringcols(all)
    to the import, but I received the error message shown below. I probably did not put it in the right place. Any advice on this?

    Best wishes,
    Patrick

    Code:
     filelist, dir("/Users/Patrizio/Google Drive/Weather_SedentaryBehaviour/Weather_data") pat("*.csv") save("csv_datasets.dta")
    Number of files found = 49
    file csv_datasets.dta saved
    
    .          use "csv_datasets.dta", clear
    
    .          local obs = _N
    
    .          forvalues i=1/`obs' {
      2.            use "csv_datasets.dta" in `i', clear
      3.            local f = dirname + "/" + filename
      4.            insheet using "`f'", stringcols(all) clear
      5.            gen source = "`f'"
      6.            tempfile save`i'
      7.            save "`save`i''"
      8.          }
    option stringcols() not allowed
    r(198);
    
    end of do-file
    
    r(198);


    Original code, with the option to force the append despite numeric/string mismatches
    Code:
    ***************************************
    cd "/Users/Patrizio/Google Drive/Weather_SedentaryBehaviour/Weather_data"
     filelist, dir("/Users/Patrizio/Google Drive/Weather_SedentaryBehaviour/Weather_data") pat("*.csv") save("csv_datasets.dta")
             use "csv_datasets.dta", clear
             local obs = _N
             forvalues i=1/`obs' {
               use "csv_datasets.dta" in `i', clear
               local f = dirname + "/" + filename
               insheet using "`f'", clear
               gen source = "`f'"
               tempfile save`i'
               save "`save`i''"
             }
    
             use "`save1'", clear
             forvalues i=2/`obs' {
               append using "`save`i''", force
             }


    Long list of ignored numeric/string mismatches
    Code:
    (note: variable source was str94, now str101 to accommodate using data's values)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndcm was byte, now int to accommodate using data's values)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was int in the using data, but will be str3 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable stationname was str10, now str11 to accommodate using data's values)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable stationname was str11, now str30 to accommodate using data's values)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was int in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was int in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable maxtempflag was str1 in the using data, but will be byte now)
    (note: variable mintempflag was str1 in the using data, but will be byte now)
    (note: variable meantempflag was str1 in the using data, but will be byte now)
    (note: variable heatdegdaysflag was str1 in the using data, but will be byte now)
    (note: variable cooldegdaysflag was str1 in the using data, but will be byte now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable snowongrndflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable totalrainflag was byte in the using data, but will be str1 now)
    (note: variable totalsnowflag was byte in the using data, but will be str1 now)
    (note: variable totalprecipflag was byte in the using data, but will be str1 now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable dataquality was str3 in the using data, but will be byte now)
    (note: variable dirofmaxgustflag was byte in the using data, but will be str1 now)
    (note: variable spdofmaxgustkmh was byte in the using data, but will be str3 now)
    (note: variable spdofmaxgustflag was byte in the using data, but will be str1 now)
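
To make the data loss described above concrete, here is a minimal sketch with two made-up datasets (the variable name flag and its values are hypothetical, not taken from the weather files). When a variable is numeric in the data in memory and string in the file being appended, the force option lets the append run, but the mismatched values from the appended file come in as missing.

```stata
* Master data: flag is numeric (byte)
clear
input byte flag
1
2
end
tempfile first
save "`first'"

* Second file: flag is a string, as happens when a CSV column holds letters
clear
input str1 flag
"E"
"T"
end
tempfile second
save "`second'"

* Forcing the append keeps the master's type and drops the string values
use "`first'", clear
append using "`second'", force
list
count if missing(flag)   // displays 2: both values from the second file are gone
```

The same thing happens in the other direction: if the variable is string in memory and numeric in the appended file, the numeric values are the ones that are lost.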

  • #2
    The help file for insheet carries this note:

    insheet has been superseded by import delimited. insheet continues to work but, as of Stata 13, is no
    longer an official part of Stata. This is the original help file, which we will no longer update, so
    some links may no longer work.

    So use import delimited instead; there the option is -stringcols(_all)-, not -stringcols(all)-.
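
As a small illustration of that option, the sketch below writes a tiny CSV to a temporary file and reads it back with every column as a string. The column names and values are invented for the example, and varnames(1) and delimiters(",") are spelled out only to keep the demo unambiguous.

```stata
* Write a three-column CSV to a temporary file (hypothetical contents)
tempfile csv
tempname fh
file open `fh' using "`csv'", write text replace
file write `fh' "date,maxtemp,maxtempflag" _n
file write `fh' "2015-01-01,-3.5,E" _n
file write `fh' "2015-01-02,1.0," _n
file close `fh'

* Read every column as a string, whatever it contains
import delimited using "`csv'", delimiters(",") varnames(1) stringcols(_all) clear
describe
```

Because every file is then read with the same storage class for every variable, the append no longer hits string/numeric conflicts.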



    • #3
      Hello Scott,

      Thank you for drawing my attention to this note.

      Here is my modified code, which worked.

      Code:
       cd "/Users/Patrizio/Google Drive/Weather_SedentaryBehaviour/Weather_data"
       filelist, dir("/Users/Patrizio/Google Drive/Weather_SedentaryBehaviour/Weather_data") pat("*.csv") save("csv_datasets.dta")
               use "csv_datasets.dta", clear
               local obs = _N
               forvalues i=1/`obs' {
                 use "csv_datasets.dta" in `i', clear
                 local f = dirname + "/" + filename
                 import delimited using "`f'", stringcols(_all) clear
                 gen source = "`f'"
                 tempfile save`i'
                 save "`save`i''"
               }
      
               use "`save1'", clear
               forvalues i=2/`obs' {
                 append using "`save`i''", force
               }
      Best wishes,
      Patrick
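
For the last step asked about in the first post (turning the columns that should be numeric back into numbers), one possible follow-up is destring. This is only a sketch on made-up data: the variable names and values are hypothetical, and whether a blanket destring suits the real weather files depends on what the flag columns contain. Note that once everything is a string, the append in this sketch runs without force.

```stata
* Two hypothetical files that were both imported with stringcols(_all)
clear
input str4 maxtemp str1 maxtempflag
"-3.5" "E"
"1.0"  ""
end
tempfile a
save "`a'"

clear
input str4 maxtemp str1 maxtempflag
"12.3" ""
"7.8"  "T"
end
tempfile b
save "`b'"

* All variables are strings in both files, so no type conflict arises
use "`a'", clear
append using "`b'"

* Convert every variable that holds only numbers; the others stay as strings
destring, replace
describe
```

Here maxtemp becomes numeric, while maxtempflag stays a string because it contains letters; destring reports such variables rather than altering them.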
