Hello,
In a dataset like this:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input double id byte var1 double var2
 1 1  -3348.331566996959
 1 2 -12258.347805444035
 2 1 -3385.7240619345557
 2 2   -11301.4957054189
 3 1  -3082.825885040739
 3 2 -12374.821660737736
 4 1 -3309.0077761199655
 4 2 -11923.645354958966
 5 1 -2363.4950444452707
 5 2 -11739.878389288227
 6 1  -2870.765075241693
 6 2 -12223.491101587631
 7 1  -3440.455118031651
 7 2 -11586.180210576631
 8 1  -2879.530331466697
 8 2 -12411.194472623254
 9 1 -3663.8540427039675
 9 2 -12311.093745624179
10 1 -1943.8513780855646
10 2 -11848.575169763335
11 1  -2562.368252950966
11 2 -12486.757790837975
12 1  -3053.403217860106
12 2 -11279.761851692387
13 1 -2915.6880516384244
13 2  -11538.35238849248
14 1  -2721.273809145597
14 2 -10878.271406140717
15 1 -3130.6859456758925
15 2 -11355.943344553834
16 1 -2696.5362969727626
16 2 -11989.199934413056
17 1 -3729.6202625214105
17 2 -11662.125717279263
18 1 -2718.9885319780406
18 2 -12067.438722107863
19 1  -2710.313557039223
19 2  -10764.77844327837
20 1  -2230.475260946375
20 2 -11335.060265923423
21 1  -3902.974735628137
21 2 -12250.962858532945
22 1 -3303.7587423982905
22 2  -11591.06043152441
23 1 -3178.3873813100963
23 2 -12695.341629879618
24 1 -3810.8403785234577
24 2 -11663.350758506991
25 1 -2871.7974042681444
25 2 -12641.502334431827
26 1 -3206.1072144838154
26 2  -12607.27114065293
27 1 -2492.1329315288945
27 2 -11866.613732960985
28 1 -3195.9760630471433
28 2 -11831.669068914915
29 1 -3327.0103659714664
29 2 -12644.977265524569
30 1  -3022.603080068251
30 2 -10994.706326254727
31 1 -2978.0023660306656
31 2  -12235.86303415122
32 1  -2942.442443576155
32 2 -12473.511900603506
33 1  -2805.400236408546
33 2 -12077.562027808272
34 1  -3687.215808364619
34 2 -12197.331192554011
35 1  -3525.406469649003
35 2  -12288.98830282034
36 1 -3515.9871033834975
36 2 -12426.948061753421
37 1   -3843.07926629132
37 2 -11730.793600243547
38 1 -2653.4191486085533
38 2 -11112.137535825386
39 1 -2230.9316323818152
39 2 -11534.205230296784
40 1 -3349.0024872846225
40 2 -12436.152580542652
41 1   -3225.2432746388
41 2 -11796.185232441696
42 1   -3391.61736312115
42 2 -12801.819794989711
43 1 -2095.1826594466825
43 2 -11610.383493799716
44 1 -2315.8516760156344
44 2 -11761.863263104753
45 1  -2342.693801672127
45 2 -12201.238232981135
46 1 -3911.2950036704856
46 2 -12695.670118855902
47 1 -3689.6496802706606
47 2 -12636.210532571324
48 1  -2519.131784578982
48 2 -13130.544173341606
49 1 -2960.2125844698403
49 2 -11083.943158574133
50 1  -2793.171211062792
50 2 -12783.317898960162
end
I want to order by var2 and count how often an observation with var1 == 1 has a lower value than one with var1 == 2.
In this example dataset it is pretty clear that the answer is 100%, because if I sort by var2, I can easily check that the first 50 observations are var1 == 1 and the last 50 observations are var1 == 2.
However, I will be handling datasets with a huge number of var1 groups (var1 == 1, ..., n), and I will need to calculate the frequency of "dominance" for each group.
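For the two-group case, one reading of what I am after could be sketched like this (a sketch only, assuming each id has exactly one var2 value per var1 group and that the comparison is made within id; the variable lower below is something I made up for illustration):

Code:
* Sketch for the two-group case, assuming one var2 value per
* (id, var1) cell and a within-id comparison
preserve
keep id var1 var2
reshape wide var2, i(id) j(var1)
* after reshape, var21 and var22 hold var2 for var1 == 1 and var1 == 2
generate byte lower = var21 < var22
summarize lower, meanonly
display "share of ids where var1 == 1 is lower: " %5.3f r(mean)
restore

With many groups, the same reshape would produce var21 ... var2n, and the pairwise shares could presumably be computed in a loop over pairs of those variables, but I am not sure that scales well.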
I'm sorry if my problem is not clear. I'm not even sure how to state what I need.
Thank you in advance,
Rafael