
  • Matching observations in a quasi-random process

    Dear Statalisters,

    I want to create a set of observations to test different models later on. Let's say I have three variables, y, x1 and x2, with 1000 observations each. The observations should be matched in a quasi-random process that follows a dependency structure between x1/x2 and y. This matching should have a stochastic component.
    For example, I divide the observations into quintiles based on their values and create a matching matrix with a negative dependency:
    matrix match = (0.025, 0.05, 0.1, 0.2, 0.625 \ ///
                    0.05, 0.1, 0.2, 0.45, 0.2 \ ///
                    0.1, 0.2, 0.4, 0.2, 0.1 \ ///
                    0.2, 0.45, 0.2, 0.1, 0.05 \ ///
                    0.625, 0.2, 0.1, 0.05, 0.025)

    As a result of this process I would have a data set where 62.5% of the values in quintile 1 of y are correctly matched with quintile 5 of x1, 20% are falsely matched with quintile 4, and so on. For quintile 3, the correct matching rate is only 40%.

    Is such a matching process possible, or are there other ways to accomplish this matching?
    Prior to this I have generated y, which follows a mixed distribution, and x1/x2, which are not normally distributed.


    Kind regards
    Steffen

  • #2
    You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Also, simplify your problem as much as possible - it is hard to follow your explanation.

    One way to sample in Stata is to generate a random number, sort on that random number, and then flag the first however many observations you need as the selected observations. I wonder if you could do this repeatedly to get your desired outcomes.
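
    A rough sketch of that idea, untested against your data and only one of several ways to do it. It assumes exactly 200 observations per quintile, so that 200 times each matrix entry is a whole number. The data-generating lines are placeholders for your own y and x1, and the names P, slot, pos, u and v are made up for illustration. Within each quintile the observations are put in random order, the y observations are cut into blocks whose sizes come from the matrix rows, and the two sides are then paired by merging on quintile and a random slot number.

    ```stata
    clear all
    set seed 2024

    * Matching matrix from post #1 (rows: quintile of y, columns: quintile of x1)
    matrix P = (0.025, 0.05, 0.1, 0.2, 0.625 \ ///
                0.05, 0.1, 0.2, 0.45, 0.2 \ ///
                0.1, 0.2, 0.4, 0.2, 0.1 \ ///
                0.2, 0.45, 0.2, 0.1, 0.05 \ ///
                0.625, 0.2, 0.1, 0.05, 0.025)

    * x side: placeholder non-normal data, quintiles, random slot within quintile
    set obs 1000
    generate x1 = rgamma(2, 1)
    xtile xq = x1, nquantiles(5)
    generate double u = runiform()
    bysort xq (u): generate slot = _n
    keep x1 xq slot
    tempfile xside
    save `xside'

    * y side: placeholder mixture data, quintiles, random position within quintile
    clear
    set obs 1000
    generate y = cond(runiform() < 0.5, rnormal(0, 1), rnormal(3, 1))
    xtile yq = y, nquantiles(5)
    generate double u = runiform()
    bysort yq (u): generate pos = _n

    * Cut each y quintile into blocks of size 200*P[i,j]; block j goes to x quintile j
    generate byte xq = .
    forvalues i = 1/5 {
        local cum = 0
        forvalues j = 1/5 {
            local lo = `cum'
            local cum = `cum' + round(200 * P[`i', `j'])
            replace xq = `j' if yq == `i' & pos > `lo' & pos <= `cum'
        }
    }

    * Random slot within the target x quintile, then pair the two sides
    generate double v = runiform()
    bysort xq (v): generate slot = _n
    merge 1:1 xq slot using `xside', assert(match) nogenerate

    * Row percentages should reproduce the matching matrix
    tabulate yq xq, row nofreq
    ```

    Note that here the cell counts are fixed by the matrix and only the choice of which observations land in each cell is random. If you want the counts themselves to vary, you could draw the target quintile separately for each observation, but then the number of y observations sent to an x quintile will generally not equal the number of x observations available there, so a one-to-one merge would no longer work as written.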
