Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Potential error in specifying loop using local variables

    Hello,

    I have a dataset where one variable is a locality area. The variable is called ICS22CD, these localities are coded as QKK QKM QKQ etc. There are some 40 or so localities, each with their own code. For each locality there is a cost amount: I need to multiply the appropriate population for each ICS22CD code by the cost amount.

    I wrote a loop to do this, I stored all the ICS22CD codes of interest in a local variable, and cost amounts in a different local variable.

    The idea is that, in the below loop, where the ICS22CD code is QKK, the relevant population number would be multiplied by 29789.1377931, and where the ICS22CD code is QKM it would be multiplied by 5986.25689 etc There are 40 cost amounts - though some are set to 0, corresponding to the 40 localities. I am aware that the calculations below are doing more than multiplying - but you can see below the steps I need to carry out. I know the steps in the actual calculations are right because I have done the calculations manually in Stata, but would prefer to do this by a loop.

    However, I get the error message 'QKK not found'. when I run the loop - QKK is one of the codes for ICS22CD, so I think I must have mis-specified something in the coding as I am still getting used to writing loops.

    Can anyone advise on what the issue might

    local icd QKK QKM
    local cost 29789.1377931 5986.25689
    foreach a of local icd {
    foreach c of local cost
    {
    sum(RegPats) if ICS22CD == `a'
    di r(sum)
    local pop_sum = r(sum)
    local mult = `c'/`pop_sum'
    replace PFI2 = RegPats * `mult' if ICS22CD == `a'

    }
    }
    Last edited by Daniel Sutcliffe; 29 Nov 2021, 10:36.

  • #2
    There is no data example here to allow testing, but some comments are possible. I think your code simplifies to

    Code:
    local icd QKK QKM
    local cost 29789.1377931 5986.25689
    
    foreach a of local icd {
        sum RegPats if ICS22CD == `a', meanonly 
        foreach c of local cost {
            replace PFI2 = RegPats * `c'/r(sum) if ICS22CD == `a'
        }
    }
    where the meanonly option is a token speed-up.

    I guess the answer to your question is that QKK is for you a literal string, and not the name of a variable or scalar, which is what Stata expects from your syntax. If that is so, you need "`a'" not `a' in your qualifiers.

    In return I have a question. Your second replace just overwrites the results of the first, so that I guess you don't mean what your code implies.

    From what I understand of your question, I wouldn't write loops here at all. Your icd and cost values can go in two variables in a separate dataset; then the problem is to merge that with the main dataset and do one multiplication. You could do something similar with frames.

    There would need to be a previous step to get the totals in one variable, something like

    Code:
    egen totalcost = total(RegPats), by(ICS22CD)


    See also https://www.stata.com/statalist/arch.../msg01258.html for some comments on the expression "local variables", which isn't standard in Stata.
    Last edited by Nick Cox; 29 Nov 2021, 10:55.

    Comment


    • #3
      Thank you, although I don't think the code gives me what I was looking for - although this is probably due to not explaining myself well.

      This is the code as to how I would do the calculations outside of the loop, for two of the localities:

      cap drop PFI2
      qui sum(RegPats) if ICS22CD == "QKK"
      di r(sum)
      local pop_sum = r(sum)
      local mult = 399339.287 /`pop_sum'
      gen PFI2 = RegPats * `mult' if ICS22CD == "QKK"


      qui sum(RegPats) if ICS22CD == "QUE"
      di r(sum)
      local pop_sum = r(sum)
      local mult = 5986.25689/`pop_sum'
      replace PFI2 = RegPats * `mult' if ICS22CD == "QUE"


      When I run the following I can see the results are as I expect i.e. I get back to the original quantum.

      tabstat PFI2 , s(sum) by( ICS22CD)

      Is it possible to put the codes above, into a loop, so that I can specify which value relates to each locality. In the above, 5986.25689 relates to QUE only and 399339.287 relates to QKK only.

      In the below data, PFI2 relates to the output from the above codes.


      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str3 ICS22CD str6 gp_code float RegPats double(PFI PFI2)
      "QKK" "G85030"  6376 .0062548007263284805   1248.89383118905
      "QKK" "G85041"  7554  .007410408514222921 1479.6336262236644
      "QKK" "G85023" 12960    .0127136476495008  2538.529493759424
      "QKK" "G85642"  8916   .00874651870701768 1746.4142720956038
      "QKK" "G85035"  9200     .009025120245016 1802.0425418662578
      "QKK" "G83039"  9068   .00889562938932664 1776.1871488742638
      "QKK" "G85127"  6522   .00639802546065156 1277.4914628317101
      "QKK" "G85085" 12252   .01201910578716696  2399.850567711456
      "QKK" "G84041"  3927 .0038523529567584605  769.1979415118255
      "QKK" "G83651"  6457  .006334261024137861 1264.7596405250465
      "QKK" "G83642" 10179  .009985510758045421 1993.8033732235476
      "QKK" "G85045"  8686  .008520890700892281 1701.3632085489473
      "QKK" "G85632"  8035  .007882265344424301 1573.8491112929762
      "QKK" "Y06345"  1216 .0011928854584716802  238.1830142292793
      "QKK" "G85091"  6087  .005971294231675261 1192.2861904717295
      "QKK" "G85061"  7781  .007633093546355381 1524.0970672023209
      "QKK" "G85032" 11920    .0116934166652816  2334.820336852804
      "QKK" "G84609"  3787 .0037150141704212604  741.7755550051651
      "QKK" "G85096"  9801  .009614696034934981 1919.7629296555644
      "QKK" "G85021"  4541   .00445468163398018  889.4646937624649
      "QKK" "G84604"  7037  .006903235996106261  1378.366670338354
      "QKK" "G85137"  9705  .009520520867160901 1900.9590074795688
      "QKK" "G85130" 10934  .010726159212935321  2141.688386170181
      "QKK" "G85662"  7009   .00687576823883882 1372.8821930370218
      "QKK" "G83026"  7528  .007384902739617441  1474.540897300999
      "QKK" "G84007"  8589   .00842573454178722 1682.3634121836183
      "QKK" "G85651"  8426   .00826583295483748 1650.4359193222922
      "QKK" "G84004" 10139   .00994627110480622 1985.9684056502163
      "QKK" "G85104" 10345  .010148355318988101 2026.3184886528736
      "QKK" "G85121"  6487  .006363690764067261 1270.6358662050452
      "QKK" "G85724" 22406   .02198009176193788  4388.757086201671
      "QKK" "G85696" 14584   .01430677757101232 2856.6291772366853
      "QKK" "G85105"  7354  .007214210248026921 1440.4587883570066
      "QKK" "G84621"  3928 .0038533339480894403  769.3938157011588
      "QKK" "G85073"  7954  .007802805046614921 1557.9833019569799
      "QKK" "G83030"  5921  .005808449670732581 1159.7710750424035
      "QKK" "G84040" 10757   .01055252374735186 2107.0186546581886
      "QKK" "G85120"  7044 .0069101029354231205  1379.737789663687
      "QKK" "G84023" 11115  .010903718643842701  2177.141614439506
      "QKK" "G85025" 11652  .011430510988578961 2282.3260541114823
      "QKK" "G84002" 12367  .012131919790229661  2422.376099484784
      "QKK" "G85698" 14088  .013820205870846241  2759.475579327374
      "QKK" "G84039"  3381 .0033167316900433804  662.2506341358497
      "QKK" "G84008"  5845  .005733894329578101 1144.8846366530736
      "QKK" "G85052" 10454   .01025528337406492 2047.6687752902021
      "QKK" "G83033"  5619  .005512190288776621 1100.6170698637504
      "QKK" "G85019"  4030    .0039533950638494  789.3729830131542
      "QKK" "G84015" 11261  .011046943378165782  2205.739246082166
      "QKK" "G84028"  6772   .00664327329339656 1326.4600101650324
      "QKK" "G85089" 10417   .01021898669481866 2040.4214302848704
      "QKK" "Y06113"   359   .00035217588782182  70.31883397065072
      "QKK" "G84033"  8024   .00787147443978352 1571.6944952103102
      "QKK" "G83630" 12638   .01239776844092524  2475.458004794105
      "QKK" "G84001" 13398   .01314332185247004  2624.322388687405
      "QKK" "G85716"  5627   .00552003821942446 1102.1840633784166
      "QKK" "G83025" 11348  .011132289623961041 2222.7803005541623
      "QKK" "G85003"  6786   .00665700717203028 1329.2022488156986
      "QKK" "G83046"  5542 .0054366539562911605 1085.5347572850872
      "QKK" "G85028" 11901  .011674777829992981  2331.098727255471
      "QKK" "G84032" 12898  .012652826186980042 2526.3852940207603
      "QKK" "G83066" 20202   .01981798686845796 3957.0503729111024
      "QKK" "G83057" 10264  .010068895021178722 2010.4526793168773
      "QKK" "G85027"  6069   .00595363638771762 1188.7604550637304
      "QKK" "G83680" 24347  .023884195935370063  4768.948887697585
      "QKK" "G85011" 20725  .020331045334560503 4059.4925739324126
      "QKK" "G84025"  8393   .00823346024091514 1643.9720710742938
      "QKK" "G84624"  2747 .0026947831862020604  538.0663980985446
      "QKK" "Y02811" 12633  .012392863484270341 2474.4786338474387
      "QKK" "G85006"  8902   .00873278482838396 1743.6720334449378
      "QKK" "G83052" 17480    .0171477284655304   3423.88082954589
      "QKK" "Y07020"   381   .00037375769710338  74.62806613598308
      "QKK" "G84011"  8702   .00853658656218796   1704.49719557828
      "QKK" "G85647"  6766  .006637387345410681 1325.2847650290328
      "QKK" "G85711"  9181   .00900648140972738 1798.3209322689254
      "QKK" "G83010" 14020  .013753498460339601 2746.1561344527104
      "QKK" "G85736"  3684   .00361397206333032  721.6005135038363
      "QKK" "G85706"  6038  .005923225656457241 1182.6883551943984
      "QKK" "G83006" 14941  .014656991476172181 2926.5562628286693
      "QKK" "G85102"  7083  .006948361597331341 1387.3768830476854
      "QKK" "G84003" 11098   .01088704179121604 2173.8117532208403
      "QKK" "G84029"  4569  .004482149391247621  894.9491710637969
      "QKK" "G84630"  3214   .00315290613776972  629.5396445171905
      "QKK" "G83024"  4964   .00486964096698472  972.3194758504461
      "QKK" "G85685"  8836  .008668039400539281 1730.7443369489408
      "QKK" "G85038" 13965    .0136995439371357 2735.3830540393797
      "QKK" "G84035"  7810 .0076615422949538005 1529.7774186929862
      "QKK" "G83015"  7701 .0075546142398769806  1508.427132055658
      "QKK" "G84010" 14633  .014354846146230341 2866.2270125140167
      "QKK" "G84019" 10925    .0107173302909565  2139.925518466181
      "QKK" "G85076" 11171   .01095865415837758 2188.1105690421705
      "QKK" "G83027" 18554   .01820131315500292  3634.249708889842
      "QKK" "G83058"  9255    .0090790747682199 1812.8156222795888
      "QKK" "G83673"  5061   .00496479712608978  991.3192722157751
      "QKK" "G83062"  9452   .00927233006042296 1851.4028375782466
      "QKK" "G85623" 11743  .011519781199698141 2300.1506053408116
      "QKK" "G83053" 18810    .0184524469357338  3684.393501359164
      "QKK" "G85138" 15110  .014822779011107802 2959.6590008259955
      "QKK" "G83047"  9989   .00979912240515922 1956.5872772502228
      "QKK" "G84018"  8359   .00820010653566182 1637.3123486369618
      "QKK" "G83021" 17368  .017037857436460642 3401.9429203405616
      end
      ------------------ copy up to and including the previous line ------------------

      Comment


      • #4
        Thanks for your data example. I don't understand everything yet, but I see no reason to change my advice that loops are just a distraction here.

        As I understand it you have a main dataset with variables like this

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input str3 ICS22CD float RegPats
        "QKK"  6376
        "QKK"  7554
        "QKK" 12960
        "QKK"  8916
        "QKK"  9200
        "QKK"  9068
        "QKK"  6522
        "QKK" 12252
        "QKK"  3927
        "QKK"  6457
        "QKK" 10179
        "QKK"  8686
        "QKK"  8035
        "QKK"  1216
        "QKK"  6087
        "QKK"  7781
        "QKK" 11920
        "QKK"  3787
        "QKK"  9801
        "QKK"  4541
        "QKK"  7037
        "QKK"  9705
        "QKK" 10934
        "QKK"  7009
        "QKK"  7528
        "QKK"  8589
        "QKK"  8426
        "QKK" 10139
        "QKK" 10345
        "QKK"  6487
        "QKK" 22406
        "QKK" 14584
        "QKK"  7354
        "QKK"  3928
        "QKK"  7954
        "QKK"  5921
        "QKK" 10757
        "QKK"  7044
        "QKK" 11115
        "QKK" 11652
        "QKK" 12367
        "QKK" 14088
        "QKK"  3381
        "QKK"  5845
        "QKK" 10454
        "QKK"  5619
        "QKK"  4030
        "QKK" 11261
        "QKK"  6772
        "QKK" 10417
        "QKK"   359
        "QKK"  8024
        "QKK" 12638
        "QKK" 13398
        "QKK"  5627
        "QKK" 11348
        "QKK"  6786
        "QKK"  5542
        "QKK" 11901
        "QKK" 12898
        "QKK" 20202
        "QKK" 10264
        "QKK"  6069
        "QKK" 24347
        "QKK" 20725
        "QKK"  8393
        "QKK"  2747
        "QKK" 12633
        "QKK"  8902
        "QKK" 17480
        "QKK"   381
        "QKK"  8702
        "QKK"  6766
        "QKK"  9181
        "QKK" 14020
        "QKK"  3684
        "QKK"  6038
        "QKK" 14941
        "QKK"  7083
        "QKK" 11098
        "QKK"  4569
        "QKK"  3214
        "QKK"  4964
        "QKK"  8836
        "QKK" 13965
        "QKK"  7810
        "QKK"  7701
        "QKK" 14633
        "QKK" 10925
        "QKK" 11171
        "QKK" 18554
        "QKK"  9255
        "QKK"  5061
        "QKK"  9452
        "QKK" 11743
        "QKK" 18810
        "QKK" 15110
        "QKK"  9989
        "QKK"  8359
        "QKK" 17368
        end
        
        save ds_original_data
        Then you have a list of constants, which should best go in a dataset. For your data example only QKK is present so there is just one multiplier, 399339.87. (Oddly in #1 it was the utterly different 29789.1377931.)

        Code:
         
        clear
        
        input str3 ICS22CD double cost
        QKK  399339.87
        end
        So, you should merge that with your main dataset, calculate a variable with totals. and you're away.

        Code:
        merge 1:m ICS22CD using ds_original_data
        egen double total = total(RegPats), by(ICS22CD)
        
        gen double wanted = RegPats * cost / total
        
        list
        As said, with your approach you'd just be typing (lots of?) constants into not just one local macro, but two, only to have the bother of taking them out again. You should want them as data in a form that is easily stored and checked.

        Comment


        • #5
          Thank you, that works well and is a much more elegant solution. The costs were different in the above examples as I made these up as I didn't want to use "real" numbers for confidentiality reasons.

          Comment


          • #6
            That makes sense.

            Comment

            Working...
            X