  • Listing pairs of matched

    Dear Statalist users,

    I would appreciate some help with the issue below:

    I am trying to list and compare numerous variables (more than 200) coming from different CRFs using loops.
    After some successful programming steps, I have the following (abridged) dataset (NOV stands for NOVember data and APR for APRil data):

    -------------------------------------------------------------------------------------------------------
                     storage   display    value
    variable name    type      format     label      variable label
    -------------------------------------------------------------------------------------------------------
    pid              int       %8.0g                 __SubjectKey
    crf              str12     %12s                  __FormOID
    priorvteAPR      byte      %8.0g                 PRIORVTE
    fmhisvteAPR      byte      %8.0g                 FMHISVTE
    hiscancAPR       byte      %8.0g                 HISCANC
    priorvteNOV      byte      %8.0g                 PRIORVTE
    fmhisvteNOV      byte      %8.0g                 FMHISVTE
    hiscancNOV       byte      %8.0g                 HISCANC

    All I want to do is the following:
    (1) list matched pairs when values are equal
    (2) list matched pairs when values are NOT equal

    By matched (or desirable) pairs, I mean priorvteAPR vs. priorvteNOV, etc., avoiding output such as priorvteAPR vs. fmhisvteNOV.

    The code I use for this:


    lookfor NOV
    local nov `r(varlist)' // must be executed immediately after lookfor

    lookfor APR
    local apr `r(varlist)' // must be executed immediately after lookfor

    // works for EQUAL values, but is not what I need (gives every possible pair)
    foreach n of local nov {
        foreach a of local apr {
            cap nois list `n' `a' if `n'==`a' & `n'==`a'!=. in 1/7, abbr(20) noobs clean
        }
    }

    // works for EQUAL values and is what I need (gives matched pairs only)
    foreach n of local nov {
        foreach a of local apr {
            cap nois list `n' `a' ///
                if `n'==`a' & `n'==`a'!=. & ///
                (substr("`n'", 2, length("`n'")-5)== ///
                substr("`a'", 2, length("`a'")-5)) in 1/12, abbr(22) noobs clean
        }
    }


    However, nothing I have tried for listing matched pairs whose values differ between NOVember and APRil has worked!
    Any ideas would be much appreciated.

    Thank you
    George




  • #2
    Each matched pair shares a common stub name, and that is what you should loop over. In case some variables end with "APR" or "NOV" but have no matching partner, the following example builds a list of the variable stubs that occur with both suffixes. You also seem to be concerned with missing values, so the code lists observations only when both variables are nonmissing:

    Code:
    * create a demonstration dataset
    clear
    set seed 321
    set obs 20
    gen pid = _n
    foreach v in priorvte fmhisvte hiscanc {
        gen `v'APR = 1
        gen `v'NOV = runiformint(0,1) if runiform() < .8
    }
    gen single_NOV = runiformint(0,1)
    list
    
    * identify common variable stubs for variables with "APR" "NOV" suffix pairs
    ds *APR
    local stubsAPR = subinstr("`r(varlist)' ", "APR ", " ", .)
    ds *NOV
    local stubsNOV = subinstr("`r(varlist)' ", "NOV ", " ", .)
    local stubs : list stubsAPR & stubsNOV
    dis "`stubs'"
    
    * list observations where values are the same 
    foreach v of local stubs {
        dis _n(3) " ---------- `v'APR == `v'NOV -------------"
        list pid `v'APR `v'NOV if `v'APR == `v'NOV & !mi(`v'APR,`v'NOV), abbr(20) noobs clean
    }
    
    * list observations where values differ
    foreach v of local stubs {
        dis _n(3) " ---------- `v'APR != `v'NOV -------------"
        list pid `v'APR `v'NOV if `v'APR != `v'NOV  & !mi(`v'APR,`v'NOV), abbr(20) noobs clean
    }
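
    With more than 200 variables, a per-pair summary may be easier to scan before looking at the listings. The following is only a sketch along the same lines as the loops above: it assumes the local macro `stubs' built earlier is still defined (same do-file or session), and the local names same, diff and miss are made up for illustration.

    ```stata
    * Sketch: count agreements, disagreements and incomplete pairs per stub.
    * Assumes the local macro stubs from the example above is still in scope.
    foreach v of local stubs {
        * both values nonmissing and equal
        quietly count if `v'APR == `v'NOV & !mi(`v'APR, `v'NOV)
        local same = r(N)
        * both values nonmissing and different
        quietly count if `v'APR != `v'NOV & !mi(`v'APR, `v'NOV)
        local diff = r(N)
        * at least one of the two values missing
        quietly count if mi(`v'APR, `v'NOV)
        local miss = r(N)
        display as text "`v': " as result `same' as text " equal, " ///
            as result `diff' as text " different, " ///
            as result `miss' as text " with a missing value"
    }
    ```

    The three counts for a stub should add up to the number of observations, which gives a quick check that no pair was skipped.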



    • #3
      Dear Robert,

      This is great! It works very well. Many many thanks!!

      George
