Hi all,
I have a dataset with three variables: region, agecat, and tempjan, where tempjan holds one value for each combination of region and agecat. How can I generate a matrix that contains the values of tempjan, arranged as in a two-way contingency table?
I am asking because I want to use plotmatrix to visualize my data, and that command requires a matrix as input. My data are not really a contingency table: the third variable, tempjan, is not a frequency or percentage of any other variable, so a plain two-way tabulate does not help here. As the output below shows, I tried tabulate with the summarize() option, but it leaves me with two problems: I do not know how to save the results as a matrix, and I do not know how to get rid of the "Total" row and column on the sides of the table.
Here is an example of the data:
Code:
* Example generated by -dataex-. To install: ssc install dataex
clear
input int region float(agecat tempjan)
1 1  23.16087
1 2  28.99518
1 3 31.462856
2 1 18.091358
2 2  25.40435
2 3  29.77333
3 1  38.58561
3 2  50.13235
3 3  64.27907
4 1  40.23812
4 2  54.53425
4 3  61.50435
end
label values region region
label def region 1 "NE", modify
label def region 2 "N Cntrl", modify
label def region 3 "South", modify
label def region 4 "West", modify
label values agecat agecat
label def agecat 1 "19-29", modify
label def agecat 2 "30-34", modify
label def agecat 3 "35+", modify
The matrix should contain the inner part of the table below, that is, the twelve cell values without the "Total" row and column:
Code:
tabulate region agecat, summarize(tempjan) means

       Means of Average January temperature

    Census |             agecat
    Region |     19-29      30-34        35+ |     Total
-----------+---------------------------------+----------
        NE |  23.16087  28.995181  31.462857 | 27.885366
   N Cntrl | 18.091358  25.404348  29.773333 | 21.694366
     South | 38.585612  50.132353  64.279069 |   46.1456
      West | 40.238125  54.534247  61.504348 | 46.225391
-----------+---------------------------------+----------
     Total | 31.159172  38.398101  47.122137 | 35.748952
So my questions are:
1. How do I transform the data into a contingency-table-like layout?
2. How do I save that layout as a matrix?
3. How do I use plotmatrix to plot this matrix? I am not sure whether plotmatrix is the right command here. How can I make it recognize that the rows and columns correspond to the categories of region and agecat?
4. Is it possible to plot multiple matrices and combine them into one graph using plotmatrix (something like addplot)? I do not see such an option in its help file, and I would really appreciate any pointers.
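For steps 1 and 2, one possible approach is sketched below using only official commands (collapse, reshape, mkmat). It assumes the example data above are in memory. The matrix name T and the row and column names are illustrative choices, not anything plotmatrix is known to require.

```stata
* Sketch only: assumes the example data above are in memory.
preserve
keep region agecat tempjan

* Harmless when there is already one observation per cell;
* needed if the real data have several observations per cell.
collapse (mean) tempjan, by(region agecat)

* One row per region, one column per agecat level:
* creates tempjan1, tempjan2, tempjan3.
reshape wide tempjan, i(region) j(agecat)

* Store the 4 x 3 block of values; no "Total" row or column is created.
mkmat tempjan1 tempjan2 tempjan3, matrix(T)

* Illustrative names, chosen to be legal Stata names.
matrix rownames T = NE N_Cntrl South West
matrix colnames T = age19_29 age30_34 age35plus

matrix list T
restore
```

Because matrices are stored separately from the dataset, T is still available after restore. It could then presumably be handed to plotmatrix through its matrix option; whether plotmatrix picks up the row and column names as axis labels is something to check against its help file.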
I also considered tabplot. But I feel it takes up too much space on a page if I want to show several tables together: the many bars take up room, whereas plotmatrix conveys the same information with shaded colors. Also, as I understand it, tabplot is based on cross-tabulation, which my data are not: it assumes you want to show fractions or percents within each category defined by two given variables, whereas I want to show the original values of a third variable. If plotmatrix does not work, I can fall back on tabplot, but then I do not know how to transform my data into a cross-tabulation-like format that tabplot can use. The values of tempjan are not integers, so I cannot expand the data.
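A possible way around the expand problem, assuming tabplot (from SSC) accepts importance weights and a showval option as its help file appears to document, is to pass tempjan as a weight so that bar heights reflect its values directly. This is a sketch to verify against the installed version, not a confirmed recipe.

```stata
* Sketch only: assumes tabplot is installed (ssc install tabplot)
* and that the example data above are in memory.
* Assumption: iweights and showval are supported as described in -help tabplot-.
tabplot region agecat [iweight = tempjan], showval
```

With one observation per cell, each bar would then be proportional to the value of tempjan in that cell, with no need for integer values.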
Thank you!