  • #31
    Wow, yes, apologies! I was so sure the code syntax was correct, which was a touch poor on my end. Yes, making that adjustment has solved things, thank you! I imagine this code (much like the brace pairing code) should work for a data set of any size in terms of firms and years?

    Also, purely as a learning exercise, I have tried to adapt the triplet code into a quad-matching code to see whether my logic and understanding held up. Unfortunately, no. The error I get is "double not allowed" after coding drop double shuffle in line 50 (all code below). I then tried with triple shuffle, in case the problem was that a triplet had already been formed, but that gave "variable triple was not found"; of course it had never been defined, I just figured it was worth a try in case it was the next stage up from a double shuffle, as it were. I am unsure where the logic in my code fails for quad matching. N.B. The fourth stock exchange code here is 37.

    // SEPARATE INTO FOUR DATA SETS
    preserve
    keep if StockExchangeCode == 11
    rename (GlobalCompanyKey AssetsTotal) =_11 // SUFFIX IN RENAME MUST MATCH STOCK EXCHANGE
    drop StockExchangeCode
    tempfile SE11 // DECLARATION IN TEMPFILE MUST MATCH SUBSEQUENT USE OF THE FILE
    save `SE11' // N.B. NO .dta

    restore, preserve
    keep if StockExchangeCode == 120
    rename (GlobalCompanyKey AssetsTotal) =_120
    drop StockExchangeCode
    tempfile SE120
    save `SE120'

    restore, preserve
    keep if StockExchangeCode == 90
    rename (GlobalCompanyKey AssetsTotal) =_90
    drop StockExchangeCode
    tempfile SE90
    save `SE90'

    restore
    keep if StockExchangeCode == 37
    rename (GlobalCompanyKey AssetsTotal) =_37
    drop StockExchangeCode
    tempfile SE37
    save `SE37'

    // COMBINE POTENTIAL MATCHES & SELECT BEST 4, BREAKING TIES AT RANDOM
    use `SE120', clear
    joinby NAICS DataYearFiscal using `SE11'
    gen double shuffle = runiform()
    gen delta = AssetsTotal_120/AssetsTotal_11
    keep if inrange(delta, 0.75, 1.25)
    replace delta = abs(log(delta))

    joinby NAICS DataYearFiscal using `SE90' // gives error when .dta
    drop shuffle
    gen double shuffle = runiform()
    gen delta1 = AssetsTotal_90/AssetsTotal_120
    gen delta2 = AssetsTotal_90/AssetsTotal_11
    keep if inrange(delta1, 0.75, 1.25) & inrange(delta2, 0.75, 1.25)
    replace delta1 = abs(log(delta1))
    replace delta2 = abs(log(delta2))
    gen delta3 = max(delta1, delta2)

    use `SE37', clear
    joinby NAICS DataYearFiscal using `SE90' // gives error when .dta
    drop double shuffle
    gen doubele shuffle = runiform()
    gen delta4 = AssetsTotal_37/AssetsTotal_90
    gen delta5 = AssetsTotal_37/AssetsTotal_120
    keep if inrange (delta3, 0.75, 1.25) & inrange(delta4, 0.75, 1.25) & inrange(delta5, 0.75, 1.25)
    replace delta4 = abs(log(delta4))
    replace delta5 = abs(log(delta5))
    gen delta6 = max(delta3, delta4, delta5)


    // MATCHING WITHOUT REPLACEMENT
    local allocation_ratio 1
    local current 1

    sort GlobalCompanyKey_120 DataYearFiscal (delta shuffle)
    while `current' < _N {
        local end_current = `current' + `allocation_ratio' - 1
        while GlobalCompanyKey_120[`end_current'] != GlobalCompanyKey_120[`current'] ///
            & DataYearFiscal[`end_current'] != DataYearFiscal[`current'] {
            local end_current = `end_current' - 1
        }
        // KEEP REQUIRED # OF MATCHES FOR THE CURRENT CASE
        drop if GlobalCompanyKey_120 == GlobalCompanyKey_120[`current'] & DataYearFiscal == DataYearFiscal[`current'] in `=`end_current'+1'/L
        // REMOVE THE SELECTED MATCHES FROM FURTHER CONSIDERATION
        forvalues i = 0/`=`allocation_ratio'-1' {
            drop if GlobalCompanyKey_37 == GlobalCompanyKey_37[`current'+`i'] & DataYearFiscal == DataYearFiscal[`current' + `i'] & _n > `end_current'
        }
        local current = `end_current' + 1
    }


    • #32
      The -drop- command does not care about the storage type of the variable(s) being dropped. So when you write -drop double shuffle-, it thinks you want to drop a variable named double. But, of course, no such variable exists, hence the error message you got. So it's just -drop shuffle-.
      By the way, watch out for the line after that: there you do need the word -double-, because a -float- does not have enough precision for the quantity of distinct random numbers you need to create. But you misspelled it as doubele, so fix that.
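
      To make the two points above concrete, here is a minimal sketch of how those two lines would read once corrected. It assumes only that a variable named shuffle already exists in memory, as it does at that point in the code in #31.

      ```stata
      drop shuffle                      // -drop- takes variable names only, no storage type
      gen double shuffle = runiform()   // the storage type belongs here, right after -gen-
      ```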

      Now, at a higher level of the logic, the code you show here tries to join all four stock exchanges together and only then select matches. That will almost certainly make your memory problem much worse. Also, even if it doesn't, the code you use for selecting the matches without replacement only checks for re-use on stock exchange 37, not on stock exchanges 11 and 90.

      Now, one could modify that code to do it this way. But it will be very cumbersome code, and hard to read and understand. The better way to do this is to iterate one stock exchange at a time.

      You can create the four data sets and store them as tempfiles. That's fine. Then get matched pairs from exchanges 11 and 120. Then join that with exchange 90 and select the matched triplets along the lines of #28. Once you have those triplets, you can join in exchange 37 and select from those to form quadruples. The code for selecting the fourth member of the quadruple from exchange 37 is exactly like that for selecting from exchange 90 to make the triples, except 90 is replaced by 37 everywhere in that "paragraph."
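
      The last step described above might look roughly like the following. This is only a sketch modeled on the exchange-90 block in #31, not the code from #28. It assumes the selected triplets are in memory with the _11, _120 and _90 suffixes, that the tempfile `SE37' still exists, and that the 0.75 to 1.25 asset-ratio band should hold against all three existing members. The names delta_37_* and delta_quad are made up for illustration.

      ```stata
      // ASSUMES: matched triplets are in memory; `SE37' was created as in #31
      joinby NAICS DataYearFiscal using `SE37'
      capture drop shuffle                  // may or may not survive from the earlier step
      gen double shuffle = runiform()       // fresh random tie-breaker

      // COMPARE THE EXCHANGE-37 CANDIDATE WITH EACH EXISTING MEMBER OF THE TRIPLET
      foreach x in 11 120 90 {
          gen delta_37_`x' = AssetsTotal_37/AssetsTotal_`x'
          keep if inrange(delta_37_`x', 0.75, 1.25)
          replace delta_37_`x' = abs(log(delta_37_`x'))
      }

      // WORST-CASE DISTANCE, USED TO RANK CANDIDATE FOURTH MEMBERS
      gen delta_quad = max(delta_37_11, delta_37_120, delta_37_90)
      ```

      From there, the idea would be to sort on whatever identifies each triplet, then DataYearFiscal, delta_quad and shuffle, and run the same without-replacement loop used for exchange 90, checking GlobalCompanyKey_37 for re-use.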


      • #33
        Yes, that seems to have worked, thank you for the guidance in getting there! That should be (I hope) the end of my queries regarding matching. Thank you for all your help!
