Merging two datasets

Giovanni Palmioli

Join Date: Apr 2016
Posts: 28

#16

07 Sep 2016, 12:46

Originally posted by Robert Picard View Post

Here's my guess at what is wanted:

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 ID1 str1 ID2 str6 TIME float(Y1 Y2 Y3)
"A1" "B" "2010Q1" 1 2 3
"A1" "B" "2010Q2" 2 4 6
"A1" "B" "2010Q3" 3 6 9
"A1" "B" "2010Q4" 4 8 12
"A1" "B" "2015Q2" 5 10 15
"A1" "C" "2010Q1" 6 12 18
"A1" "C" "2010Q2" 7 14 21
"A1" "C" "2010Q3" 8 16 24
"A1" "C" "2010Q4" 9 18 27
"A1" "C" "2015Q2" 10 20 30
"A2" "B" "2010Q1" 11 22 33
"A2" "B" "2010Q2" 12 24 36
"A2" "B" "2010Q3" 13 26 39
"A2" "B" "2010Q4" 14 28 42
"A2" "B" "2015Q2" 15 30 45
"A2" "C" "2010Q1" 16 32 48
"A2" "C" "2010Q2" 17 34 51
"A2" "C" "2010Q3" 18 36 54
"A2" "C" "2010Q4" 19 38 57
"A2" "C" "2015Q2" 20 40 60
end
save "statalist_data1.dta", replace

* save a list of unique pairs of identifiers in dataset1
bysort ID1 ID2: keep if _n == 1
keep ID1 ID2
save "statalist_data1b.dta", replace
list, sepby(ID1)

* Example generated by -dataex-. To install: ssc install dataex
clear
input str2 ID1 str6 TIME float(Z1 Z2 Z3)
"A1" "1999Q4" -1 -2 -3
"A1" "2000Q1" -2 -4 -6
"A1" "2000Q2" -3 -6 -9
"A1" "2016Q2" -4 -8 -12
"A2" "1999Q4" -5 -10 -15
"A2" "2000Q1" -6 -12 -18
"A2" "2000Q2" -7 -14 -21
"A2" "2016Q2" -8 -16 -24
end
save "statalist_data2.dta", replace

* create pairwise combinations using unique ID1 ID2 codes
joinby ID1 using "statalist_data1b.dta"
list, sepby(ID1)

* match observations by ids and time code
merge 1:1 ID1 ID2 TIME using "statalist_data1.dta"

sort ID1 ID2 TIME
order ID1 ID2 TIME Y1 Y2 Y3 Z1 Z2 Z3
list, sepby(ID1)

Thanks Robert for your input. I tried that and got an error message saying that ID1 ID2 TIME do not uniquely identify the observations in the master dataset,

Comment

Clyde Schechter

Join Date: Apr 2014

Posts: 30177
#17

07 Sep 2016, 12:54

Well, it would take me way too long to try to import the data into Stata from the format you have shown, so I can't try running this to be sure I"m right, but to my eye this looks like:

Code:

use dataset1, clear merge 1:1 id1 year quarter using dataset2
Comment

Giovanni Palmioli

Join Date: Apr 2016
Posts: 28

#18

08 Sep 2016, 02:05

Here it is.

I tried

Code:

merge 1:1 id1 year quarter using "dataset2.dta"

as suggested but got the error message "variables id1 year quarter do not uniquely identify observations in the master data"

I also tried

Code:

merge m:1 id1 time year quarter using "dataset1"

but only the observations with the common time frame got matched. Thus, 2012 and 2016 don't appear in the merged dataset.

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 str3 id2n int id2 float(year quarter) double(y1 y2 y3)
1005 "AAA" 7005 2013 1   .    .     .
1005 "AAA" 7005 2013 2 189 1082 19212
1005 "AAA" 7005 2013 3   .    .     .
1005 "AAA" 7005 2013 4 697 1010 63602
1005 "AAA" 7005 2014 1   .    .     .
1005 "AAA" 7005 2014 2 219  200 50025
1005 "AAA" 7005 2014 3  84   56   100
1005 "AAA" 7005 2014 4  58   97  5623
1005 "AAA" 7005 2015 1   .    .     .
1005 "AAA" 7005 2015 2 984   31 11638
1005 "BBB" 8005 2013 1 114  328   203
1005 "BBB" 8005 2013 2 721  163   323
1005 "BBB" 8005 2013 3 447  723    12
1005 "BBB" 8005 2013 4 549  239   220
1005 "BBB" 8005 2014 1 193   29   388
1005 "BBB" 8005 2014 2 308   75   189
1005 "BBB" 8005 2014 3  82  588   414
1005 "BBB" 8005 2014 4 176   61   147
1005 "BBB" 8005 2015 1 170  809   581
1005 "BBB" 8005 2015 2 301  207   409
end
save "dataset1.dta"

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 float(year quarter) double(z1 z2 z3)
1005 2012 1  96  49  25
1005 2012 2  92   9 158
1005 2012 3  87   3  94
1005 2012 4  47 129 201
1005 2013 1   .   .   .
1005 2013 2   7 146 221
1005 2013 3   .   .   .
1005 2013 4  23  69  92
1005 2014 1   1  57  38
1005 2014 2   .   .   .
1005 2014 3 143  35  97
1005 2014 4 136  69 108
1005 2015 1  81  40 203
1005 2015 2  67  56  39
1005 2015 3 187 161  98
1005 2015 4   1  34 204
1005 2016 1  51 107 143
1005 2016 2  23  45  33
end
save "dataset2.dta"

Last edited by Giovanni Palmioli; 08 Sep 2016, 02:23.

Comment

Giovanni Palmioli

Join Date: Apr 2016

Posts: 28
#19

08 Sep 2016, 02:09

Originally posted by Clyde Schechter View Post

Well, it would take me way too long to try to import the data into Stata from the format you have shown, so I can't try running this to be sure I"m right, but to my eye this looks like:

Code:

use dataset1, clear merge 1:1 id1 year quarter using dataset2

Clyde,

I tried that but got the error message "variables id1 year quarter do not uniquely identify observations in the master data"
Please see the post above for the dataex import

Last edited by Giovanni Palmioli; 08 Sep 2016, 02:15.
Comment

Jesse Wursten

Join Date: Jan 2016
Posts: 915

#20

08 Sep 2016, 02:38

Code:

use dataset1.dta, clear
merge m:1 id1 year quarter using dataset2.dta
expand 2 if missing(id2), gen(newobs)
replace id2 = 7005 if missing(id2) & newobs == 0
replace id2 = 8005 if missing(id2) & newobs == 1

This seems to work?

Comment

Oded Mcdossi

Join Date: Jun 2014
Posts: 577

#21

08 Sep 2016, 03:00

I guess you have more than 2 values in id2, so this seems to handle that with different approach:

Code:

use data1, clear
levelsof id2, local (levels)
    foreach i of local levels {
        append using data2, gen(ID2_`i')
        bys id1: replace id2n=id2n[_n-1] if mi(id2n) & ID2_`i'==1
        bys id1: replace id2=id2[_n-1] if mi(id2) & ID2_`i'==1
    drop ID2_`i'
    drop if mi(id2)
}    
collapse (max) y1-y3 z1-z3, by(id1 id2 id2n year quarter)

Comment

Giovanni Palmioli

Join Date: Apr 2016

Posts: 28
#22

08 Sep 2016, 04:10

Originally posted by Oded Mcdossi View Post

I guess you have more than 2 values in id2, so this seems to handle that with different approach:

Code:

use data1, clear levelsof id2, local (levels) foreach i of local levels { append using data2, gen(ID2_`i') bys id1: replace id2n=id2n[_n-1] if mi(id2n) & ID2_`i'==1 bys id1: replace id2=id2[_n-1] if mi(id2) & ID2_`i'==1 drop ID2_`i' drop if mi(id2) } collapse (max) y1-y3 z1-z3, by(id1 id2 id2n year quarter)

Thank you Oded. Indeed I have hundreds of values in id2.
I just tried your code and noticed that some data (from dataset 2) are left out....do you know why?
Comment
Giovanni Palmioli

Join Date: Apr 2016

Posts: 28
#23

08 Sep 2016, 04:12

Originally posted by Jesse Wursten View Post

Code:

use dataset1.dta, clear merge m:1 id1 year quarter using dataset2.dta expand 2 if missing(id2), gen(newobs) replace id2 = 7005 if missing(id2) & newobs == 0 replace id2 = 8005 if missing(id2) & newobs == 1

This seems to work?

Thanks Jesse, but I have hundreds of values in id2 so this solution is not practical for my data...
Comment
Oded Mcdossi

Join Date: Jun 2014

Posts: 577
#24

08 Sep 2016, 04:30

I haven't seen data2 except the example you post here, so I can't say much more than just guess that this is a result of ID1 values in data2 that do not appear at data1 and that is fine. The append approach uses all the data in dataset2 and then drop the non-relevant observations if they didn't match to id1 in dataset1. I recommend to simplify the code and try it only on those problematic id's.
Comment
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#25

08 Sep 2016, 07:07

Originally posted by Giovanni Palmioli View Post

Thanks Jesse, but I have hundreds of values in id2 so this solution is not practical for my data...

Are they the same for each id1?
Comment
Giovanni Palmioli

Join Date: Apr 2016

Posts: 28
#26

08 Sep 2016, 07:24

Originally posted by Jesse Wursten View Post

Are they the same for each id1?

Dataset 1 has hundreds id1 values while dataset 2 has a subset of those. I hope this answered your question.
Comment
Jesse Wursten

Join Date: Jan 2016

Posts: 915
#27

08 Sep 2016, 07:29

Originally posted by Giovanni Palmioli View Post

Dataset 1 has hundreds id1 values while dataset 2 has a subset of those. I hope this answered your question.

I mean, your 2nd dataset only has id1. Your first dataset has id1 and id2. If I understood correctly, you want the values from dataset two to match to id1 for each id2 value. E.g. in the example for each id1, you want the data from dataset two to show for both id2 7005 and 8005.

This is not an issue for the time periods where both datasets have observations, as merge m:1 takes care of that. But for those only present in dataset two, you need to do this "manually". But we can't help you how to do this, if you don't tell us the relation between id1 and id2. Does each id1 have the same id2 values? Or does it differ per id1?
Comment
Giovanni Palmioli

Join Date: Apr 2016

Posts: 28
#28

08 Sep 2016, 07:51

Originally posted by Jesse Wursten View Post

I mean, your 2nd dataset only has id1. Your first dataset has id1 and id2. If I understood correctly, you want the values from dataset two to match to id1 for each id2 value. E.g. in the example for each id1, you want the data from dataset two to show for both id2 7005 and 8005.

This is not an issue for the time periods where both datasets have observations, as merge m:1 takes care of that. But for those only present in dataset two, you need to do this "manually". But we can't help you how to do this, if you don't tell us the relation between id1 and id2. Does each id1 have the same id2 values? Or does it differ per id1?

Variable id1 takes many values in both dataset 1 and 2. Similarly, id2 takes many values in dataset1. I only showed the data with one id1 value (ie. 1005) and 2 id2 values (ie 7005 and 8005).
Comment

Andrew Musau

Join Date: Oct 2014
Posts: 10291

#29

11 Sep 2016, 12:41

With a few tweaks to Robert's code in #14, you can get what you want in #15. In future as per the FAQs, always present data that is representative of your problem, in a way that is accessible (using dataex) and at the start of the thread.

Code:

 * Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 str3 id2n int id2 float(year quarter) double(y1 y2 y3)
1005 "AAA" 7005 2013 1   .    .     .
1005 "AAA" 7005 2013 2 189 1082 19212
1005 "AAA" 7005 2013 3   .    .     .
1005 "AAA" 7005 2013 4 697 1010 63602
1005 "AAA" 7005 2014 1   .    .     .
1005 "AAA" 7005 2014 2 219  200 50025
1005 "AAA" 7005 2014 3  84   56   100
1005 "AAA" 7005 2014 4  58   97  5623
1005 "AAA" 7005 2015 1   .    .     .
1005 "AAA" 7005 2015 2 984   31 11638
1005 "BBB" 8005 2013 1 114  328   203
1005 "BBB" 8005 2013 2 721  163   323
1005 "BBB" 8005 2013 3 447  723    12
1005 "BBB" 8005 2013 4 549  239   220
1005 "BBB" 8005 2014 1 193   29   388
1005 "BBB" 8005 2014 2 308   75   189
1005 "BBB" 8005 2014 3  82  588   414
1005 "BBB" 8005 2014 4 176   61   147
1005 "BBB" 8005 2015 1 170  809   581
1005 "BBB" 8005 2015 2 301  207   409
end
save "dataset1.dta"

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 float(year quarter) double(z1 z2 z3)
1005 2012 1  96  49  25
1005 2012 2  92   9 158
1005 2012 3  87   3  94
1005 2012 4  47 129 201
1005 2013 1   .   .   .
1005 2013 2   7 146 221
1005 2013 3   .   .   .
1005 2013 4  23  69  92
1005 2014 1   1  57  38
1005 2014 2   .   .   .
1005 2014 3 143  35  97
1005 2014 4 136  69 108
1005 2015 1  81  40 203
1005 2015 2  67  56  39
1005 2015 3 187 161  98
1005 2015 4   1  34 204
1005 2016 1  51 107 143
1005 2016 2  23  45  33
end
save "dataset2.dta"

Code:

use dataset1, clear
bysort id1 id2 id2n: keep if _n == 1
keep id1 id2 id2n
save "statalist_data1b.dta", replace
list, sepby(id1)
use dataset2
joinby id1 using "statalist_data1b.dta"
list, sepby(id1)
merge m:m id1 id2 id2n year using "dataset1"
drop _merge
order id1 id2n id2 year quarter y1 y2 y3 z1 z2 z3
sort id1 id2 id2n year quarter
l, sepby(id1)

Code:

. l, sepby(id1)

     +----------------------------------------------------------------------------+
     |  id1   id2n    id2   year   quarter    y1     y2      y3    z1    z2    z3 |
     |----------------------------------------------------------------------------|
  1. | 1005    AAA   7005   2012         1     .      .       .    96    49    25 |
  2. | 1005    AAA   7005   2012         2     .      .       .    92     9   158 |
  3. | 1005    AAA   7005   2012         3     .      .       .    87     3    94 |
  4. | 1005    AAA   7005   2012         4     .      .       .    47   129   201 |
  5. | 1005    AAA   7005   2013         1     .      .       .     .     .     . |
  6. | 1005    AAA   7005   2013         2   697   1010   63602     7   146   221 |
  7. | 1005    AAA   7005   2013         3     .      .       .     .     .     . |
  8. | 1005    AAA   7005   2013         4   189   1082   19212    23    69    92 |
  9. | 1005    AAA   7005   2014         1    58     97    5623     1    57    38 |
 10. | 1005    AAA   7005   2014         2     .      .       .     .     .     . |
 11. | 1005    AAA   7005   2014         3   219    200   50025   143    35    97 |
 12. | 1005    AAA   7005   2014         4    84     56     100   136    69   108 |
 13. | 1005    AAA   7005   2015         1     .      .       .    81    40   203 |
 14. | 1005    AAA   7005   2015         2   984     31   11638    67    56    39 |
 15. | 1005    AAA   7005   2015         3   984     31   11638   187   161    98 |
 16. | 1005    AAA   7005   2015         4   984     31   11638     1    34   204 |
 17. | 1005    AAA   7005   2016         1     .      .       .    51   107   143 |
 18. | 1005    AAA   7005   2016         2     .      .       .    23    45    33 |
 19. | 1005    BBB   8005   2012         1     .      .       .    96    49    25 |
 20. | 1005    BBB   8005   2012         2     .      .       .    92     9   158 |
 21. | 1005    BBB   8005   2012         3     .      .       .    87     3    94 |
 22. | 1005    BBB   8005   2012         4     .      .       .    47   129   201 |
 23. | 1005    BBB   8005   2013         1   549    239     220     .     .     . |
 24. | 1005    BBB   8005   2013         2   447    723      12     7   146   221 |
 25. | 1005    BBB   8005   2013         3   721    163     323     .     .     . |
 26. | 1005    BBB   8005   2013         4   114    328     203    23    69    92 |
 27. | 1005    BBB   8005   2014         1    82    588     414     1    57    38 |
 28. | 1005    BBB   8005   2014         2   308     75     189     .     .     . |
 29. | 1005    BBB   8005   2014         3   176     61     147   143    35    97 |
 30. | 1005    BBB   8005   2014         4   193     29     388   136    69   108 |
 31. | 1005    BBB   8005   2015         1   301    207     409    81    40   203 |
 32. | 1005    BBB   8005   2015         2   301    207     409    67    56    39 |
 33. | 1005    BBB   8005   2015         3   170    809     581   187   161    98 |
 34. | 1005    BBB   8005   2015         4   301    207     409     1    34   204 |
 35. | 1005    BBB   8005   2016         1     .      .       .    51   107   143 |
 36. | 1005    BBB   8005   2016         2     .      .       .    23    45    33 |
     +----------------------------------------------------------------------------+

.

Last edited by Andrew Musau; 11 Sep 2016, 12:43.

Comment

Robert Picard

Join Date: Mar 2014
Posts: 1536

#30

11 Sep 2016, 13:56

I have pleaded many times on this list that StataCorp make the m:m merge option illegal. It is never appropriate and #29 is not an example of the elusive exception to the rule. Note that the results in #29 do not match what's showed in #15 (look at 2013q2 and 2013q4 for id1==1005).

Giovanni changed variable type for the year and quarter variable in #15 and claimed in #16 that

I tried that and got an error message saying that ID1 ID2 TIME do not uniquely identify the observations in the master dataset

My guess is that he did not adapt my code correctly. Leveraging Andrew's efforts, here's my solution, adapted to the data from #15. I have added code that checks for duplicates in both dataset.

Code:

 * Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 str3 id2n int id2 float(year quarter) double(y1 y2 y3)
1005 "AAA" 7005 2013 1   .    .     .
1005 "AAA" 7005 2013 2 189 1082 19212
1005 "AAA" 7005 2013 3   .    .     .
1005 "AAA" 7005 2013 4 697 1010 63602
1005 "AAA" 7005 2014 1   .    .     .
1005 "AAA" 7005 2014 2 219  200 50025
1005 "AAA" 7005 2014 3  84   56   100
1005 "AAA" 7005 2014 4  58   97  5623
1005 "AAA" 7005 2015 1   .    .     .
1005 "AAA" 7005 2015 2 984   31 11638
1005 "BBB" 8005 2013 1 114  328   203
1005 "BBB" 8005 2013 2 721  163   323
1005 "BBB" 8005 2013 3 447  723    12
1005 "BBB" 8005 2013 4 549  239   220
1005 "BBB" 8005 2014 1 193   29   388
1005 "BBB" 8005 2014 2 308   75   189
1005 "BBB" 8005 2014 3  82  588   414
1005 "BBB" 8005 2014 4 176   61   147
1005 "BBB" 8005 2015 1 170  809   581
1005 "BBB" 8005 2015 2 301  207   409
end
isid id1 id2 year quarter, sort
save "statalist_data1.dta", replace

* save a list of identifiers dyads, I assume id2n is a numeric encoding of id2
bysort id1 id2 id2n: keep if _n == 1
keep id1 id2 id2n
save "statalist_data1b.dta", replace
list, sepby(id1)

* Example generated by -dataex-. To install: ssc install dataex
clear
input int id1 float(year quarter) double(z1 z2 z3)
1005 2012 1  96  49  25
1005 2012 2  92   9 158
1005 2012 3  87   3  94
1005 2012 4  47 129 201
1005 2013 1   .   .   .
1005 2013 2   7 146 221
1005 2013 3   .   .   .
1005 2013 4  23  69  92
1005 2014 1   1  57  38
1005 2014 2   .   .   .
1005 2014 3 143  35  97
1005 2014 4 136  69 108
1005 2015 1  81  40 203
1005 2015 2  67  56  39
1005 2015 3 187 161  98
1005 2015 4   1  34 204
1005 2016 1  51 107 143
1005 2016 2  23  45  33
end
isid id1 year quarter, sort

* create pairwise combinations using unique id1 id2 codes
joinby id1 using "statalist_data1b.dta"
list, sepby(id1)

* match observations by ids and time code
merge 1:1 id1 id2 year quarter using "statalist_data1.dta"

sort id1 id2 year quarter
order id1 id2n id2 year quarter y1 y2 y3 z1 z2 z3
list, sepby(id1)

and the results

Code:

. list, sepby(id1)

     +----------------------------------------------------------------------------------------------+
     |  id1   id2n    id2   year   quarter    y1     y2      y3    z1    z2    z3            _merge |
     |----------------------------------------------------------------------------------------------|
  1. | 1005    AAA   7005   2012         1     .      .       .    96    49    25   master only (1) |
  2. | 1005    AAA   7005   2012         2     .      .       .    92     9   158   master only (1) |
  3. | 1005    AAA   7005   2012         3     .      .       .    87     3    94   master only (1) |
  4. | 1005    AAA   7005   2012         4     .      .       .    47   129   201   master only (1) |
  5. | 1005    AAA   7005   2013         1     .      .       .     .     .     .       matched (3) |
  6. | 1005    AAA   7005   2013         2   189   1082   19212     7   146   221       matched (3) |
  7. | 1005    AAA   7005   2013         3     .      .       .     .     .     .       matched (3) |
  8. | 1005    AAA   7005   2013         4   697   1010   63602    23    69    92       matched (3) |
  9. | 1005    AAA   7005   2014         1     .      .       .     1    57    38       matched (3) |
 10. | 1005    AAA   7005   2014         2   219    200   50025     .     .     .       matched (3) |
 11. | 1005    AAA   7005   2014         3    84     56     100   143    35    97       matched (3) |
 12. | 1005    AAA   7005   2014         4    58     97    5623   136    69   108       matched (3) |
 13. | 1005    AAA   7005   2015         1     .      .       .    81    40   203       matched (3) |
 14. | 1005    AAA   7005   2015         2   984     31   11638    67    56    39       matched (3) |
 15. | 1005    AAA   7005   2015         3     .      .       .   187   161    98   master only (1) |
 16. | 1005    AAA   7005   2015         4     .      .       .     1    34   204   master only (1) |
 17. | 1005    AAA   7005   2016         1     .      .       .    51   107   143   master only (1) |
 18. | 1005    AAA   7005   2016         2     .      .       .    23    45    33   master only (1) |
 19. | 1005    BBB   8005   2012         1     .      .       .    96    49    25   master only (1) |
 20. | 1005    BBB   8005   2012         2     .      .       .    92     9   158   master only (1) |
 21. | 1005    BBB   8005   2012         3     .      .       .    87     3    94   master only (1) |
 22. | 1005    BBB   8005   2012         4     .      .       .    47   129   201   master only (1) |
 23. | 1005    BBB   8005   2013         1   114    328     203     .     .     .       matched (3) |
 24. | 1005    BBB   8005   2013         2   721    163     323     7   146   221       matched (3) |
 25. | 1005    BBB   8005   2013         3   447    723      12     .     .     .       matched (3) |
 26. | 1005    BBB   8005   2013         4   549    239     220    23    69    92       matched (3) |
 27. | 1005    BBB   8005   2014         1   193     29     388     1    57    38       matched (3) |
 28. | 1005    BBB   8005   2014         2   308     75     189     .     .     .       matched (3) |
 29. | 1005    BBB   8005   2014         3    82    588     414   143    35    97       matched (3) |
 30. | 1005    BBB   8005   2014         4   176     61     147   136    69   108       matched (3) |
 31. | 1005    BBB   8005   2015         1   170    809     581    81    40   203       matched (3) |
 32. | 1005    BBB   8005   2015         2   301    207     409    67    56    39       matched (3) |
 33. | 1005    BBB   8005   2015         3     .      .       .   187   161    98   master only (1) |
 34. | 1005    BBB   8005   2015         4     .      .       .     1    34   204   master only (1) |
 35. | 1005    BBB   8005   2016         1     .      .       .    51   107   143   master only (1) |
 36. | 1005    BBB   8005   2016         2     .      .       .    23    45    33   master only (1) |
     +----------------------------------------------------------------------------------------------+

Giovanni, please do not post any more data in the format you seem to favor. Use dataex, it's very easy to install. All you need to do is to type in Stata's command window:

Code:

ssc install dataex

Announcement

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment

Comment