  • Cross tabulation for 4 variables with 900 observations

    Hi all,

    I have been trying to work this out from the Stata manual entry on cross-tabulation, but the output is lengthy (9 pages) when I use the command . table result visit eweid udderhalf . Another command, . table result visit eweid udderhalf , returns `too many variables specified`. Below is a description of my dataset and what I would like to do:

    I have a dataset in which milk samples were collected from each half (left and right) of the udder of sheep at 2 sampling visits (1 and 2) and cultured for bacterial isolation. Each sheep's unique ID (e.g., 111675) appears 4 times in the dataset. The bacteriology result contains 6 different pathogens plus negative, and there are also empty cells where no milk was found in an udder half. Udder half and bacteriology result are stored as string variables. I want to produce a cross-tabulation to investigate whether the bacteriology result of a particular udder half of a particular animal had the same status at the first and second visits or changed in between. Are there any further statistics that I can run to answer my question?

    Best regards,

    Aminu.

  • #2
    Sorry, the first command was . table result visit eweid. Apologies for the confusion.



    • #3
      I have some ideas, but lacking data to test on, I'm not certain if they will meet your needs. At heart, I don't think crosstabulation is what you want here, but without better understanding your data, I cannot give more specific recommendations. I suspect this is true for other readers as well.

      Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. See especially sections 9-12 on how to best pose your question. It would be helpful to post a small example with all the data for just a few of the sheep. In particular, please read FAQ #12 and use dataex and CODE delimiters when posting to Statalist.



      • #4
        I am not sure whether I understand your problem correctly. If there are two visits (1, 2) per sheep (= eweid) per udderhalf (1, 2) and "result" can have any value, the following example might illustrate a solution you are looking for:
        Code:
        clear
        input eweid visit udderhalf result
          111675 1 1 3
          111675 1 2 6
          111675 2 1 3
          111675 2 2 5
          111676 1 1 2
          111676 1 2 4
          111676 2 1 2
          111676 2 2 4
          111677 1 1 1
          111677 1 2 5
          111677 2 1 2
          111677 2 2 5
        end
        sort eweid udderhalf visit
        * Show first 12 cases:
        list if _n <= 12, sepby(eweid)
        
        * Per udderhalf convert long data to wide data and save in temporary file:
        preserve
          tempfile udder1
          keep if udderhalf == 1
          reshape wide result, i(eweid) j(visit)
          save `udder1'
        restore
        preserve
          tempfile udder2
          keep if udderhalf == 2
          reshape wide result, i(eweid) j(visit)
          save `udder2'
        restore
        
        * Append both temporary files and sort by eweid and udderhalf:
        clear
        append using `udder1' `udder2'
        sort eweid udderhalf
        
        * Show first 6 cases:
        list if _n <= 6, sepby(eweid)
        
        * --------------------------------------------------
        * List cases with changes in result between visits:
        list eweid udderhalf result1 result2 if result1 != result2, sepby(eweid)
        
        * Frequency of result1 and result2 if result changes between visits:
        tab1 result1 result2 if result2 != result1
        
        * Cross-tabulation of results if result differs between visits:
        tab2 result2 result1 if result2 != result1
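        
        * Possible extension (a sketch under assumptions, not tested on the real data):
        * it assumes the wide data built above and uses a hypothetical variable name,
        * changed. It shows the share of udder halves whose result changed between
        * visits, by udder half, leaving out halves with a missing result at either visit:
        generate byte changed = result1 != result2 if !missing(result1, result2)
        tabulate udderhalf changed, row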
        Last edited by Dirk Enzmann; 15 Sep 2016, 11:00.



        • #5
          Hi Dirk Enzmann,

          Thanks for your guidance. I tried to follow it, but at some points I received error messages. I am not sure whether I have done something wrong; please advise further. See the example below.


          . *(4 variables, 912 observations pasted into data editor)

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . sort eweid udderhalf visit

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . list if _n <= 12, sepby(eweid)

               +-----------------------------------+
               | visit   eweid   udderh~f   result |
               |-----------------------------------|
            1. |     1     159          1        5 |
            2. |     2     159          1        . |
            3. |     1     159          2        . |
            4. |     2     159          2        1 |
               |-----------------------------------|
            5. |     1    1539          1        1 |
            6. |     2    1539          1        1 |
            7. |     1    1539          2        . |
            8. |     2    1539          2        . |
               |-----------------------------------|
            9. |     1    6807          1        . |
           10. |     2    6807          1        . |
           11. |     1    6807          2        2 |
           12. |     2    6807          2        7 |
               +-----------------------------------+

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . preserve

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . tempfile udder1

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . keep if udderhalf == 1
          (456 observations deleted)

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . reshape wide result, i(eweid) j(visit)
          (note: j = 1 2)

          Data                               long   ->   wide
          -----------------------------------------------------------------------------
          Number of obs.                      456   ->     228
          Number of variables                   4   ->       4
          j variable (2 values)             visit   ->   (dropped)
          xij variables:
                                           result   ->   result1 result2
          -----------------------------------------------------------------------------

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder1'
          invalid file specification
          r(198);

          end of do-file

          r(198);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder1'
          invalid file specification
          r(198);

          end of do-file

          r(198);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder1''
          file '.dta saved

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . restore
          nothing to restore
          r(622);

          end of do-file

          r(622);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . preserve

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . tempfile udder2

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . keep if udderhalf == 2
          (228 observations deleted)

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . reshape wide result, i(eweid) j(visit)
          variable visit not found
          Data are already wide.
          r(111);

          end of do-file

          r(111);

          . clear

          . edit

          . *(4 variables, 912 observations pasted into data editor)

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . sort eweid udderhalf visit

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . list if _n <= 12, sepby(eweid)

               +-----------------------------------+
               | visit   eweid   udderh~f   result |
               |-----------------------------------|
            1. |     1     159          1        5 |
            2. |     2     159          1        . |
            3. |     1     159          2        . |
            4. |     2     159          2        1 |
               |-----------------------------------|
            5. |     1    1539          1        1 |
            6. |     2    1539          1        1 |
            7. |     1    1539          2        . |
            8. |     2    1539          2        . |
               |-----------------------------------|
            9. |     1    6807          1        . |
           10. |     2    6807          1        . |
           11. |     1    6807          2        2 |
           12. |     2    6807          2        7 |
               +-----------------------------------+

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . preserve

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . tempfile udder1

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . keep if udderhalf == 1
          (456 observations deleted)

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . reshape wide result, i(eweid) j(visit)
          (note: j = 1 2)

          Data                               long   ->   wide
          -----------------------------------------------------------------------------
          Number of obs.                      456   ->     228
          Number of variables                   4   ->       4
          j variable (2 values)             visit   ->   (dropped)
          xij variables:
                                           result   ->   result1 result2
          -----------------------------------------------------------------------------

          .
          end of do-file

          . edit

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder1''
          file '.dta already exists
          r(602);

          end of do-file

          r(602);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . restore
          nothing to restore
          r(622);

          end of do-file

          r(622);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . preserve

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . tempfile udder2

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . keep if udderhalf == 2
          (228 observations deleted)

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . reshape wide result, i(eweid) j(visit)
          variable visit not found
          Data are already wide.
          r(111);

          end of do-file

          r(111);

          . clear

          . edit

          . *(4 variables, 912 observations pasted into data editor)

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . preserve

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . tempfile udder2

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . keep if udderhalf == 2
          (456 observations deleted)

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . reshape wide result, i(eweid) j(visit)
          (note: j = 1 2)

          Data                               long   ->   wide
          -----------------------------------------------------------------------------
          Number of obs.                      456   ->     228
          Number of variables                   4   ->       4
          j variable (2 values)             visit   ->   (dropped)
          xij variables:
                                           result   ->   result1 result2
          -----------------------------------------------------------------------------

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder2'
          invalid file specification
          r(198);

          end of do-file

          r(198);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder2''
          file '.dta already exists
          r(602);

          end of do-file

          r(602);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . drop save `udder2''
          variable save not found
          r(111);

          end of do-file

          r(111);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . drop `udder2''
          ' invalid name
          r(198);

          end of do-file

          r(198);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . drop `udder2'
          varlist or in range required
          r(100);

          end of do-file

          r(100);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . drop udder2
          variable udder2 not found
          r(111);

          end of do-file

          r(111);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . drop udder2.dta
          factor variables and time-series operators not allowed
          r(101);

          end of do-file

          r(101);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . save `udder2''
          file '.dta already exists
          r(602);

          end of do-file

          r(602);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . restore
          nothing to restore
          r(622);

          end of do-file

          r(622);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . clear

          .
          end of do-file

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . append using `udder1' `udder2'
          invalid file specification
          r(198);

          end of do-file

          r(198);

          . do "C:\Users\Shitt001\AppData\Local\Temp\STD01000000.tmp"

          . sort eweid udderhalf
          no variables defined
          r(111);

          end of do-file

          r(111);



          • #6
            Your immediate problem is at this point:
            Code:
            . save `udder1'
            invalid file specification
            r(198);
            The problem is that you are running the commands one line at a time from the do-file editor. That will not work with any program that, like Dirk's, uses local macros such as udder1 and udder2. Each line you run is executed in its own separate (temporary) do-file, and a local macro exists only within the do-file that defines it, so the local macro defined by
            Code:
            tempfile udder1
            immediately vanishes, and then
            Code:
            save `udder1'
            is interpreted, since udder1 is now undefined, as
            Code:
            save
            which is an error, since no file is specified.

            Paste Dirk's code into the do-file editor and then, without selecting any individual lines, run the entire do-file at once.
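
            As a rough sketch only: assuming the data are in long form with numeric eweid, visit, udderhalf and result, as in the listing in #5, both udder halves could be reshaped in one step by putting udderhalf into i(). No temporary file or local macro is involved, so each line also survives being run on its own.
            Code:
            * One row per ewe and udder half, with result1 (visit 1) and result2 (visit 2):
            reshape wide result, i(eweid udderhalf) j(visit)
            * Full visit-1 by visit-2 table; the diagonal holds unchanged results:
            tabulate result1 result2, missing
            * List udder halves whose result differs between visits
            * (a result missing at only one visit is listed as a difference):
            list eweid udderhalf result1 result2 if result1 != result2, sepby(eweid)
            This would replace the preserve/tempfile/append steps rather than fix them; Dirk's code remains correct when run as a whole.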
