  • Importing .GRD data into Stata

    Dear all

    I have a simple question but cannot seem to get it to work.

    I have downloaded gridded temperature data for 1980 to 2015 from the India Meteorological Department. Maximum and minimum temperatures are in separate files, one per year, so I have 72 files. The data come in both .grd and .txt formats.

    How should I import these data into Stata so that I end up with one file that contains the date, maximum temperature, minimum temperature, latitude and longitude as separate variables in long format?
    This is what the data in the .txt file look like:

    Code:
      DAILY MINIMUM TEMPARATURE  
    DTMTYEAR  LAT.  67.5  68.0  68.5  69.0  69.5  70.0  70.5  71.0  71.5  72.0  72.5  73.0  73.5  74.0  74.5  75.0  75.5  76.0  76.5  77.0  77.5  78.0  78.5  79.0  79.5  80.0  80.5  81.0  81.5  82.0  82.5  83.0  83.5  84.0  84.5  85.0  85.5  86.0  86.5  87.0  87.5  88.0  88.5  89.0  89.5  90.0  90.5  91.0  91.5  92.0  92.5  93.0  93.5  94.0  94.5  95.0  95.5  96.0  96.5  97.0  97.5
    01011981   7.5 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90 99.90
    (Sorry if I have shown it incorrectly; I copied it from the .txt file.)
    Last edited by Jaweriah Abdullah; 08 Jan 2020, 07:08.

  • #2
    This can get you started. I have made some assumptions based on your data example, so if these are incorrect, you can use the code and adjust accordingly.
    I assume that the columns after latitude are temperature readings (the second data row) taken at intervals of longitude (the first data row). I'm not sure why this particular year has 99.90 values for all observations, but that is not a problem for importing. The data are then reshaped into long format, and a sample is shown following the code.

    Code:
    clear *
    cls
    version 16.0
    
    import delim using dat.grd, asdouble delim(" ", collapse) rowr(2:) varname(2) case(lower) stringcol(1)
    
    gen long date = date(dtmtyear, "DMY")
    format date %tdCY-m-D
    drop dtmtyear
    
    foreach v of varlist v* {
      local vname = subinstr(string(`: var lab `v'', "%02.1f"), ".", "", .)
      rename `v' v`vname'
    }
    
    reshape long v , i(date lat) j(lon)
    rename v temp
    
    recast double lon
    replace lon = lon / 10
    Code:
               date   lat    lon   temp  
        1981-Jan-01   7.5   67.5   99.9  
        1981-Jan-01   7.5     68   99.9  
        1981-Jan-01   7.5   68.5   99.9  
        1981-Jan-01   7.5     69   99.9  
        1981-Jan-01   7.5   69.5   99.9  
        1981-Jan-01   7.5     70   99.9  
        1981-Jan-01   7.5   70.5   99.9  
        1981-Jan-01   7.5     71   99.9  
        1981-Jan-01   7.5   71.5   99.9



    • #3
      Leonardo Guizzetti did essentially what I would have done, to which I can only add that 99.9 is likely to be a code for missing. A direct replacement such as

      Code:
      replace temp = . if temp == 99.9
      may not work because of precision problems, but

      Code:
      replace temp = . if temp > 99.8 
      should work fine.
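
      A small illustration of the precision point, as a sketch on made-up data (the variable names temp_f and temp_d are not from this thread). Whether the exact comparison fails depends on the storage type: it fails when the variable is a float, and it should succeed when the variable was read with asdouble, as in #2. The threshold test is safe in either case.

      ```stata
      * Illustration only: this two-row dataset is made up for the demo
      clear
      set obs 2
      generate float temp_f = 99.9 in 1
      replace temp_f = 25.3 in 2
      generate double temp_d = 99.9 in 1
      replace temp_d = 25.3 in 2

      count if temp_f == 99.9          // float variable vs. double literal: no match
      count if temp_f == float(99.9)   // literal rounded to float first: match
      count if temp_d == 99.9          // double variable vs. double literal: match
      count if temp_f > 99.8           // threshold test: match whatever the type
      ```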



      • #4
        Thank you very much, Leonardo Guizzetti and Nick Cox, for the code.

        I was able to run the code up to this point:

        Code:
        clear *
        cls
        
        import delim using MAXT1980.txt, asdouble delim(" ", collapse) rowr(2:) varname(2) case(lower) stringcol(1)
        
        gen long date = date(dtmtyear, "DMY")
        format date %tdCY-m-D
        drop dtmtyear
        
        foreach v of varlist v* {
          local vname = subinstr(string(`: var lab `v'', "%02.1f"), ".", "", .)
          rename `v' v`vname'
        }
        but then, on running reshape, I got the message

        Code:
         You specified i(date lat) and j(lon). In the current wide form, variable date lat should uniquely identify the observations.
        I checked and found that the column lat contained the value "LAT." in several rows. So I dropped those rows and ran the rest of the code.
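
        Dropping those rows might look like the sketch below. It assumes that lat was read as a string variable because of the repeated header rows, and that date has already been generated as in the code above.

        ```stata
        * Sketch, assuming lat is a string variable at this point
        count if lat == "LAT."     // how many header rows were read as data
        drop if lat == "LAT."
        isid date lat              // stops with an error if date and lat still do not identify rows
        ```

        If isid passes, reshape long with i(date lat) should no longer complain about non-unique observations.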

        Code:
        reshape long v , i(date lat) j(lon)
        rename v temp
        
        recast double lon
        replace lon = lon / 10
        replace temp = . if temp == 99.9
        replace temp = . if temp > 99.8
        This is my final result
        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long date str5 lat double(lon temp)
        7305 "10.0" 67.5     .
        7305 "10.0"   68     .
        7305 "10.0" 68.5     .
        7305 "10.0"   69     .
        7305 "10.0" 69.5     .
        7305 "10.0"   70     .
        7305 "10.0" 70.5     .
        7305 "10.0"   71     .
        7305 "10.0" 71.5     .
        7305 "10.0"   72     .
        7305 "10.0" 72.5     .
        7305 "10.0"   73     .
        7305 "10.0" 73.5     .
        7305 "10.0"   74     .
        7305 "10.0" 74.5     .
        7305 "10.0"   75     .
        7305 "10.0" 75.5     .
        7305 "10.0"   76 30.81
        7305 "10.0" 76.5 29.97
        7305 "10.0"   77 30.28
        7305 "10.0" 77.5 30.07
        7305 "10.0"   78 29.33
        7305 "10.0" 78.5 30.51
        7305 "10.0"   79 29.88
        7305 "10.0" 79.5     .
        7305 "10.0"   80     .
        7305 "10.0" 80.5     .
        7305 "10.0"   81     .
        7305 "10.0" 81.5     .
        7305 "10.0"   82     .
        7305 "10.0" 82.5     .
        7305 "10.0"   83     .
        7305 "10.0" 83.5     .
        7305 "10.0"   84     .
        7305 "10.0" 84.5     .
        7305 "10.0"   85     .
        7305 "10.0" 85.5     .
        7305 "10.0"   86     .
        7305 "10.0" 86.5     .
        7305 "10.0"   87     .
        7305 "10.0" 87.5     .
        7305 "10.0"   88     .
        7305 "10.0" 88.5     .
        7305 "10.0"   89     .
        7305 "10.0" 89.5     .
        7305 "10.0"   90     .
        7305 "10.0" 90.5     .
        7305 "10.0"   91     .
        7305 "10.0" 91.5     .
        7305 "10.0"   92     .
        7305 "10.0" 92.5     .
        7305 "10.0"   93     .
        7305 "10.0" 93.5     .
        7305 "10.0"   94     .
        7305 "10.0" 94.5     .
        7305 "10.0"   95     .
        7305 "10.0" 95.5     .
        7305 "10.0"   96     .
        7305 "10.0" 96.5     .
        7305 "10.0"   97     .
        7305 "10.0" 97.5     .
        7305 "10.5" 67.5     .
        7305 "10.5"   68     .
        7305 "10.5" 68.5     .
        7305 "10.5"   69     .
        7305 "10.5" 69.5     .
        7305 "10.5"   70     .
        7305 "10.5" 70.5     .
        7305 "10.5"   71     .
        7305 "10.5" 71.5     .
        7305 "10.5"   72     .
        7305 "10.5" 72.5     .
        7305 "10.5"   73     .
        7305 "10.5" 73.5     .
        7305 "10.5"   74     .
        7305 "10.5" 74.5     .
        7305 "10.5"   75     .
        7305 "10.5" 75.5     .
        7305 "10.5"   76 30.04
        7305 "10.5" 76.5 28.66
        7305 "10.5"   77 28.78
        7305 "10.5" 77.5 28.89
        7305 "10.5"   78 29.29
        7305 "10.5" 78.5 30.51
        7305 "10.5"   79 29.67
        7305 "10.5" 79.5 29.58
        7305 "10.5"   80     .
        7305 "10.5" 80.5     .
        7305 "10.5"   81     .
        7305 "10.5" 81.5     .
        7305 "10.5"   82     .
        7305 "10.5" 82.5     .
        7305 "10.5"   83     .
        7305 "10.5" 83.5     .
        7305 "10.5"   84     .
        7305 "10.5" 84.5     .
        7305 "10.5"   85     .
        7305 "10.5" 85.5     .
        7305 "10.5"   86     .
        7305 "10.5" 86.5     .
        end
        format %tdCY-m-D date
        I suppose the repeated "LAT." values were repeated header rows?

        As I have 72 files (one per year for the maximum temperature and one per year for the minimum temperature), is there a way to run the code on all of them and then combine the results?


        Last edited by Jaweriah Abdullah; 08 Jan 2020, 10:16.



        • #5
          Jaweriah Abdullah , could you please check the link you've posted in msg #1 above?
          I am getting:
          [Image: broken_link.png]


          Please post a link that leads directly to the *.GRD data files, or confirm whether these data are openly accessible.

          Thank you, Sergiy



          • #6
            Dear Sergiy Radyakin, the temperature data are from the India Meteorological Department. However, I did not download my data from the website; they were initially bought from the Pune office and then updated.
            I have heard that the data files are now in the public domain on their website (annual data in separate files), so I believe they should be available there at present (https://mausam.imd.gov.in/).
            The link was unintentional: I copied the name from my worksheet, where I had kept its online link. Sorry, I should have clarified that.



            • #7
              Dear Leonardo Guizzetti and Nick Cox,

              I managed to import and append the files using the code you posted and this thread (https://www.statalist.org/forums/for...multiple-files). I was hoping you could look at it and point out anything that is out of place, as I moved some of the lines around.
              I ran the following code for the maximum temperature files and saved the result.
              Code:
              *** LOOP TO IMPORT AND APPEND MULTIPLE FILES ***
              clear
              // get a list of all the relevant files in your directory
              cd "C:\Users\pc5\Desktop\Grid_Data\Grid_Data\0.5 deg temp\0.5 deg temp\0.5_MaxT"
              local flist: dir "." files "*.txt"
              // import each file in the list and save it to a tempfile
              local nfile 0
              foreach fname of local flist {
                  local ++nfile
                  import delimited "`fname'", asdouble delim(" ", collapse) rowr(2:) varname(2) case(lower) stringcol(1) clear
                  // no date is built inside the loop: the local macro dtmtyear was never
                  // defined, so the earlier date() call here only produced missing values
                  tempfile temp`nfile'
                  save "`temp`nfile''"
              }
              // append all of the nfile files
              clear
              forval i = 1/`nfile' {
                append using "`temp`i''"
              }
              // drop the repeated header rows, then build the daily date
              drop if lat == "LAT."
              gen long ddate = date(dtmtyear, "DMY")
              format ddate %tdCY-m-D
              drop dtmtyear
              // rename each v* variable after the longitude held in its variable label (67.5 becomes v675)
              foreach v of varlist v* {
                local vname = subinstr(string(`: var lab `v'', "%02.1f"), ".", "", .)
                rename `v' v`vname'
              }
              reshape long v , i(ddate lat) j(lon)
              rename v temp
              recast double lon
              replace lon = lon / 10
              // 99.9 is the missing-value code; the threshold avoids an exact comparison
              replace temp = . if temp > 99.8
              g day=day(ddate)
              g month=month(ddate)
              g year=year(ddate)
              destring lat, replace
              order ddate day month year lat lon temp
              rename temp MAX_TEMP
              save "C:\Users\pc5\Desktop\Grid_Data\MAX_TEMP.dta"
              I ran the same code for the minimum temperature files and saved the result. Then I merged the two files using
              Code:
              merge 1:1  day month year lat lon
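
              Written out in full, that merge might look like the sketch below. The file name MIN_TEMP.dta is an assumption (only MAX_TEMP.dta is named above), and the sketch assumes both files hold one row per day, month, year, lat and lon.

              ```stata
              * Sketch: MIN_TEMP.dta is a hypothetical name for the saved minimum temperature file
              use "C:\Users\pc5\Desktop\Grid_Data\MAX_TEMP.dta", clear
              merge 1:1 day month year lat lon using "C:\Users\pc5\Desktop\Grid_Data\MIN_TEMP.dta"
              tabulate _merge      // check how many grid cells matched in both files
              ```

              Tabulating _merge shows whether every grid cell and day appears in both files before the variable is dropped.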
              This is what my final file looks like
              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input long ddate float(DAY MONTH YEAR) double(LAT LON MAX_TEMP MIN_TEMP)
              17871 5 12 2008 24 75.5 31.42 15.08
              17871 5 12 2008 24   76 31.28 14.44
              17871 5 12 2008 24 76.5 31.32 14.15
              17871 5 12 2008 24   77 31.42  13.8
              17871 5 12 2008 24 77.5 30.58 13.56
              17871 5 12 2008 24   78 30.38 13.67
              17871 5 12 2008 24 78.5 30.47 13.27
              17871 5 12 2008 24   79 29.89  12.8
              17871 5 12 2008 24 79.5 29.72  12.8
              17871 5 12 2008 24   80 29.58 12.06
              17871 5 12 2008 24 80.5  28.8 11.07
              17871 5 12 2008 24   81 28.16 11.33
              17871 5 12 2008 24 81.5 27.78 10.97
              17871 5 12 2008 24   82 27.17 10.93
              17871 5 12 2008 24 82.5 27.79 10.87
              17871 5 12 2008 24   83 27.39 10.77
              17871 5 12 2008 24 83.5 27.25 10.82
              17871 5 12 2008 24   84 26.86 11.01
              17871 5 12 2008 24 84.5 26.52 11.55
              17871 5 12 2008 24   85 27.26 12.38
              17871 5 12 2008 24 85.5 27.26 13.23
              17871 5 12 2008 24   86 27.33 13.64
              17871 5 12 2008 24 86.5 27.49 14.53
              17871 5 12 2008 24   87 27.73 15.09
              17871 5 12 2008 24 87.5 28.02  15.5
              17871 5 12 2008 24   88 28.24 15.98
              17871 5 12 2008 24 88.5 28.29 16.41
              end
              format %tdCY-m-D ddate
              Once again, thanks (also to Mike Lacy, whose posts I read!).
              For the next step with my temperature data I could use Nick Cox's advice, but I will start a separate thread rather than post it here.
              Last edited by Jaweriah Abdullah; 09 Jan 2020, 01:26.



              • #8
                One problem remains: the code does not work with the .grd files. How can I adapt it to the .grd files?
                By "does not work" I mean that when I run the code above on the .grd files, substituting .grd for .txt, I get the following:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input byte(aaaûîaª7èaëçalèaÿsèaðêaëatêarêak aìahðajmìa v10 v11)
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                . . . .
                end
                How can I adapt it to .grd files?
