  • How to measure statistical difference of two ratios in STATA?


    Hello, STATA experts!



    I need your help. I'm in the process of writing my research paper, but I'm stuck on the following issue.



    I've conducted an 'audit study' on racial discrimination in the job market. I created 4 identical profiles (for the sake of the example, let's call them A, B, C, and D) that differ only in the applicant's name. Then I applied to 1,000 jobs with each profile (so 4,000 applications in total).



    What I want to do is test whether the ratio A/A (my control group, the least discriminated against or not discriminated against at all, = 1) is significantly different from A/B, A/C and A/D (or vice versa). I need this in order to understand whether there is room for discrimination: if the difference between the ratios is significant, that should mean there is discrimination.

    I've attached a txt file with the data. Each application for each profile (A, B, C, D) takes one of 4 values:

    0 - means 'rejection'
    1 - means 'positive response'
    2 - means 'employer saw the application, but didn't respond'
    3 - means 'employer didn't see the application'
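
    As a first look at data coded this way, a cross-tabulation of outcome by profile shows the share of each response type per profile. A minimal sketch, assuming the data are in long form with one row per application; the variable names "profile" and "outcome", the label name "outcome_lbl" and the four toy rows are made up for illustration and are not the attached data:

    ```stata
    * Toy rows for illustration only (not the attached data).
    clear
    input str1 profile byte outcome
    "A" 1
    "A" 0
    "B" 2
    "B" 3
    end

    * Attach the four outcome codes described above as value labels.
    label define outcome_lbl 0 "rejection" 1 "positive response" 2 "seen, no response" 3 "not seen"
    label values outcome outcome_lbl

    * Row percentages: share of each outcome within each profile.
    tabulate profile outcome, row
    ```

    With the -row- option, each row of the table gives the percentage of that profile's applications falling into each outcome, so the positive-response shares can be compared side by side.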



    I just can't figure out how to build a model in Stata that would be correct for analyzing the data and drawing conclusions.



    Thank you in advance!
    Last edited by Mekhraly Shakhbazov; 22 Sep 2019, 16:31.

  • #2
    I don't know; maybe something along the following lines.

    .
    . version 16.0

    .
    . clear *

    .
    . import delimited data.txt, clear
    (4 vars, 1,000 obs)

    . rename * sco*

    . generate int row = _n

    . quietly reshape long sco, i(row) j(Job) string

    . encode Job, generate(job) label(Jobs)

    .
    . mlogit sco i.job, vce(cluster row) baseoutcome(0) nolog

    Multinomial logistic regression                 Number of obs     =      4,000
                                                    Wald chi2(9)      =     331.55
                                                    Prob > chi2       =     0.0000
    Log pseudolikelihood = -5174.3944               Pseudo R2         =     0.0081

                                    (Std. Err. adjusted for 1,000 clusters in row)
    ------------------------------------------------------------------------------
                 |               Robust
             sco |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    0            |  (base outcome)
    -------------+----------------------------------------------------------------
    1            |
             job |
              d  |  -.8724892   .0948318    -9.20   0.000    -1.058356   -.6866224
              g  |  -1.157657   .1222314    -9.47   0.000    -1.397227   -.9180884
              i  |  -.4195152   .0771645    -5.44   0.000    -.5707547   -.2682756
                 |
           _cons |   .4907782   .1109901     4.42   0.000     .2732416    .7083148
    -------------+----------------------------------------------------------------
    2            |
             job |
              d  |  -.5782829   .0555678   -10.41   0.000    -.6871938   -.4693719
              g  |  -.8539179   .0647556   -13.19   0.000    -.9808365   -.7269993
              i  |  -.3832549   .0463412    -8.27   0.000    -.4740819   -.2924279
                 |
           _cons |   1.288117    .098735    13.05   0.000       1.0946    1.481634
    -------------+----------------------------------------------------------------
    3            |
             job |
              d  |  -.2303471   .0630884    -3.65   0.000     -.353998   -.1066962
              g  |  -.6117984   .0654913    -9.34   0.000     -.740159   -.4834379
              i  |  -.1899257   .0500982    -3.79   0.000    -.2881164   -.0917349
                 |
           _cons |    .317759   .1149014     2.77   0.006     .0925564    .5429617
    ------------------------------------------------------------------------------

    . test [1]2.job [1]3.job [1]4.job

     ( 1)  [1]2.job = 0
     ( 2)  [1]3.job = 0
     ( 3)  [1]4.job = 0

               chi2(  3) =  115.76
             Prob > chi2 =    0.0000

    .
    . exit

    end of do-file


    .


    The coefficients are log odds ratios, and the ones that I've selected for the joint test compare the odds of a positive response (1) versus rejection (0, the base outcome) for each job against the "a" job. In any case, I don't think that you should be constructing ratios like the ones you describe in order to make the comparison.
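
    For readers who prefer the ratio scale, the outcome-1 coefficients in the table above can be exponentiated. A small sketch using the three values copied from that output (jobs d, g and i, each relative to the base job a, for a positive response versus the base outcome of rejection):

    ```stata
    * Exponentiate the outcome-1 coefficients copied from the mlogit table.
    * Job d versus job a:
    display exp(-.8724892)
    * Job g versus job a:
    display exp(-1.157657)
    * Job i versus job a:
    display exp(-.4195152)
    ```

    These come out at roughly 0.42, 0.31 and 0.66, all below 1, which matches the negative signs in the table.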



    • #3
      Dear Joseph Coveney,

      Thank you very much for your response!

      Could you please also write how to interpret these coefficients correctly?

      Thanks in advance!



      • #4
        Originally posted by Mekhraly Shakhbazov
        Could you please also write how to interpret these coefficients correctly?
        Actually, with more than two response options, the exponentiated coefficients are better described as relative-risk ratios than as odds ratios. I think your best bet is the user's manual entry for the -mlogit- command, which you can get to from the help file in Stata. It's laid out pretty well there.
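
        A sketch of how those ratios, and the predicted probabilities behind them, could be displayed, assuming the variables sco, job and row built in post #2 are still in memory. The -rrr- option and -margins- are standard Stata, but this particular combination is only a suggestion and is not taken from the manual entry mentioned above:

        ```stata
        * Refit the model from post #2, reporting exponentiated coefficients
        * (relative-risk ratios) instead of log-scale coefficients.
        mlogit sco i.job, vce(cluster row) baseoutcome(0) nolog rrr

        * Predicted probability of a positive response (sco == 1) for each job.
        margins job, predict(outcome(1))

        * Differences in that probability between each job and the reference job.
        margins r.job, predict(outcome(1))
        ```

        The -margins- results are on the probability scale, which is often easier to explain than a ratio defined relative to a base outcome.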



        • #5
          People trained in epidemiology or health sciences may find the discussion of relative-risk ratios in the -mlogit- help confusing. This short note I wrote on it may help.
          --
          Bruce Weaver
          Web: http://sites.google.com/a/lakeheadu.ca/bweaver/
          Version: Stata/MP 18.0 (Windows)



          • #6
            Any other solutions to my issue, please?



            • #7
              Originally posted by Mekhraly Shakhbazov
              Any other solutions to my issue, please?
              I think that you need to think about the research problem and its measurement more. For example, you don't even define what "A/A", "A/B" etc. are. It's not even clear that you can obtain anything that can be construed to measure "room for discrimination" from what your dataset records.



              • #8
                Originally posted by Joseph Coveney
                I think that you need to think about the research problem and its measurement more. For example, you don't even define what "A/A", "A/B" etc. are. It's not even clear that you can obtain anything that can be construed to measure "room for discrimination" from what your dataset records.
                Well, A/A is the ratio of the number of positive responses to job applications for person A (who is not discriminated against, because they belong to the majority, let's say white American) to itself, which obviously equals 1, thus no discrimination.

                Next is B/A or A/B (both are possible), which is the ratio of the number of positive responses for the second person (let's say African American), who is discriminated against, to the number of positive responses for person A. This will obviously be different from 1, and so on for the other people.

                So what I want to know is whether that difference is significant or not, which may tell us whether there is discrimination.

                I hope this is clear.

                If you have any other questions, please ask.
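
                One possible way to estimate the ratio described here directly, offered as a sketch rather than as anything proposed in this thread: because every profile was sent to the same 1,000 jobs, the ratio of positive-response counts equals the ratio of positive-response proportions, and a Poisson regression of a 0/1 indicator with cluster-robust standard errors reports such ratios with confidence intervals. It assumes the variables sco, job and row from post #2, and it treats codes 0, 2 and 3 alike as "not a positive response", which is a modelling choice. The variable name "positive" is made up here.

                ```stata
                * 1 if the employer responded positively, 0 for any other outcome code.
                generate byte positive = (sco == 1) if !missing(sco)

                * Poisson regression with standard errors clustered on job posting (row);
                * -irr- shows exponentiated coefficients, i.e. each profile's
                * positive-response proportion divided by that of the base profile.
                poisson positive i.job, vce(cluster row) irr

                * Joint Wald test that all three ratios equal 1.
                testparm i.job
                ```

                A ratio whose 95% confidence interval excludes 1 would indicate a difference in positive-response rates between that profile and the base one.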
