  • Dataset Format

    Hey everyone!

    I was wondering if anyone could shed light on the ".data" file format? I'm trying to follow an epidemiology textbook, but the supplemental data files on the book website are ".data" files that I can't figure out how to open.

    Any help would be appreciated!

  • #2
    Joshua:
    you may be interested in the following link: https://extension.nirsoft.net/data.
    Please also note that the Stata extension for dataset files is .dta.
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Is the web site freely accessible? If so and you post a link here, perhaps someone will be able to figure it out.

      If the authors don't say anything about what program is needed to read the data, it could be as simple as being a plain text or CSV dataset with a non-standard extension.



      • #4
        In reply to Carlo Lazzaro (#2):
        Thank you! I will check this out!



        • #5
          In reply to William Lisowski (#3):
          That makes sense! Here's the link: https://global.oup.com/us/companion....9780199755967/



          • #6
            In reply to Joshua Tobias (#1):
            Here's the book website with the supplemental files: https://global.oup.com/us/companion....9780199755967/



            • #7
              I followed the link to "Stata Data Sets" and downloaded stata_data.zip. I uncompressed it and indeed saw a collection of files with the .data extension. I can open each of them in a plain text editor so they indeed are simple text files with an inappropriate extension.

              One of them is named gene.data. From Stata's File menu I chose Import > Text data and these are the results. I suspect the import delimited command generated by the dialog will work on each of these simple text files.
              Code:
              . type "gene.data", lines(5)
              French 0.21 0.06 0.06 0.67 0.00 0.43 0.01 0.14 0.01 0.02 0.39 0.55 0.45
              Czech 0.25 0.04 0.14 0.57 0.01 0.42 0.01 0.15 0.00 0.01 0.40 0.53 0.47
              German 0.22 0.06 0.08 0.64 0.02 0.38 0.03 0.12 0.01 0.03 0.41 0.55 0.45
              Basque 0.19 0.04 0.02 0.75 0.00 0.38 0.01 0.07 0.00 0.01 0.53 0.54 0.46
              Chinese 0.18 0.00 0.15 0.67 0.00 0.74 0.00 0.19 0.00 0.03 0.04 0.62 0.38
              .
              . import delimited "gene.data", delimiter(space)
              (encoding automatically selected: ISO-8859-1)
              (14 vars, 26 obs)
              
              . describe
              
              Contains data
               Observations:            26                  
                  Variables:            14                  
              ------------------------------------------------------------------------------------------------
              Variable      Storage   Display    Value
                  name         type    format    label      Variable label
              ------------------------------------------------------------------------------------------------
              v1              str14   %14s                  
              v2              float   %9.0g                
              v3              float   %9.0g                
              v4              float   %9.0g                
              v5              float   %9.0g                
              v6              float   %9.0g                
              v7              float   %9.0g                
              v8              float   %9.0g                
              v9              float   %9.0g                
              v10             float   %9.0g                
              v11             float   %9.0g                
              v12             float   %9.0g                
              v13             float   %9.0g                
              v14             float   %9.0g                
              ------------------------------------------------------------------------------------------------
              Sorted by:
                   Note: Dataset has changed since last saved.
              
              . list in 1/5, clean
              
                          v1    v2    v3    v4    v5    v6    v7    v8    v9   v10   v11   v12   v13   v14  
                1.    French   .21   .06   .06   .67     0   .43   .01   .14   .01   .02   .39   .55   .45  
                2.     Czech   .25   .04   .14   .57   .01   .42   .01   .15     0   .01    .4   .53   .47  
                3.    German   .22   .06   .08   .64   .02   .38   .03   .12   .01   .03   .41   .55   .45  
                4.    Basque   .19   .04   .02   .75     0   .38   .01   .07     0   .01   .53   .54   .46  
                5.   Chinese   .18     0   .15   .67     0   .74     0   .19     0   .03   .04   .62   .38  
              
              .
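
              Since the same import delimited command is expected to work on each file, a loop along the following lines might convert the whole folder in one pass. This is only a minimal sketch that has not been run against the actual files: it assumes the working directory is the unzipped stata_data folder and that every .data file is space-delimited like gene.data. The local macro names (files, f, stub) are illustrative.

              ```stata
              * Sketch only: assumes the current directory is the unzipped folder
              * and that every .data file is space-delimited like gene.data.
              local files : dir "." files "*.data", respectcase
              foreach f of local files {
                  * Same import command as above, with clear so each file replaces the last
                  import delimited "`f'", delimiter(space) clear
                  * Drop the trailing ".data" (5 characters) to get the base name
                  local stub = substr("`f'", 1, strlen("`f'") - 5)
                  * Save as a Stata dataset; save appends the .dta extension
                  save "`stub'", replace
              }
              ```

              If some of the files turn out to have a header row or a different delimiter, the import delimited options would need adjusting for those files, so it is worth checking each one with type first.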
              Last edited by William Lisowski; 25 Jul 2021, 11:22.



              • #8
                In reply to William Lisowski (#7):
                Thank you! I will try this out!
