
  • Compute and store chi2

    Hello Stata users,

    Thank you for taking the time to address my concern (and sorry for my English). I've been looking for two hours and I don't know how to do this.

    I want to check, in my descriptive statistics, whether there is a difference between two samples (let's say users and non-users).

    Because my DV and IV are categorical, I cannot use the p-value for significance. I first used this expression: 2*ttail(e(df_r), abs(_b[VARIABLE1]/_se[VARIABLE1])) in a loop and obtained a matrix of descriptive statistics with significant differences, before I realized it wasn't appropriate.

    My question is: how do I compute the chi2 statistic in Stata instead of the p-value (2*ttail(e(df_r), abs(_b[VARIABLE1]/_se[VARIABLE1])))? How can I do it by hand, like the p-value calculation?
    Is there a relationship between the p-value and chi2, so that I can use the p-value I have to calculate chi2?


    Thank you very much for your help!

  • #2
    Let's start with your variables being categorical. That and your title imply to me an interest in chi-square tests. Where does the t distribution enter at all?



    • #3
      You seem to be restricting the term p-value to only refer to a Wald test in a linear regression. The term is actually much more general, e.g. you can get a p-value from a chi-squared test:

      Code:
      . sysuse nlsw88, clear
      (NLSW, 1988 extract)
      
      . tab race collgrad, chi2
      
                 |   college graduate
            race | not colle  college g |     Total
      -----------+----------------------+----------
           white |     1,217        420 |     1,637
           black |       480        103 |       583
           other |        17          9 |        26
      -----------+----------------------+----------
           Total |     1,714        532 |     2,246
      
                Pearson chi2(2) =  16.9189   Pr = 0.000
      
      . return list
      
      scalars:
                        r(N) =  2246
                        r(r) =  3
                        r(c) =  2
                     r(chi2) =  16.9189164522441
                        r(p) =  .0002118868343348
      The key question you first need to ask yourself is whether this test tests a hypothesis you care about. So what is the question you hope to answer with these tests? What are the variables? Does that correspond with what the chi-squared test does? I obviously don't know you, so I am having a hard time figuring out what you already know, and what not.
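
      On the question in #1 about whether chi2 and its p-value are related: they are linked through the chi-squared distribution with (rows - 1) × (columns - 1) degrees of freedom. As a minimal sketch, assuming the same nlsw88 example as above (the matrix name O and the local df are only illustrative), the p-value and the statistic can be converted into each other with chi2tail() and invchi2tail(), and the Pearson statistic can be rebuilt by hand from observed and expected counts:

      Code:
      sysuse nlsw88, clear
      quietly tabulate race collgrad, chi2 matcell(O)

      * degrees of freedom: (rows - 1) * (columns - 1)
      local df = (r(r) - 1) * (r(c) - 1)

      * p-value from the statistic, and the statistic back from the p-value
      display chi2tail(`df', r(chi2))
      display invchi2tail(`df', r(p))

      * Pearson chi2 by hand: sum over cells of (observed - expected)^2 / expected
      mata:
      O = st_matrix("O")
      E = rowsum(O) * colsum(O) / sum(O)
      sum((O - E):^2 :/ E)
      end

      The two display lines should reproduce r(p) and r(chi2), and the Mata expression should reproduce r(chi2). Note that going from a p-value back to chi2 only works if you also know the degrees of freedom, and it loses precision when the p-value has been rounded.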
      ---------------------------------
      Maarten L. Buis
      University of Konstanz
      Department of history and sociology
      box 40
      78457 Konstanz
      Germany
      http://www.maartenbuis.nl
      ---------------------------------



      • #4
        Thank you for taking the time to answer me.
        Excuse me, I'm sorry, I misspoke. I wasn't clear enough.
        Here is an example of the layout of the table I would like to build, with the last statistic (chi2) that I would like to have.
        I obtained this table at first, but I computed a t-statistic instead of chi2.

        Except that, in my case, all of my variables are dummies that take the value 1 or 0.

        I have seen in several references that the appropriate statistic is a chi2 rather than a t-value.

        My question is: is there a formula to compute the chi2 by hand, as for the t-value (in red)?


        Here is part of my code so that you can see it better (sorry if it's not in the appropriate format; maybe I should have used dataex):
        *************
        local xvars `othervars' `educvar' `incomevar'

        local z: word count `xvars'

        matrix T = J(`z', 6, .)
        matrix rownames T = `xvars'
        matrix colnames T = n users sdC non_users sdT pv



        foreach var in `xvars' {
        global `var'label: var label `var'

        cap drop sample_`var'

        reg `var' var1 [pw=wgt]
        g sample_`var'=(e(sample)==1)
        local pr = 2*ttail(e(df_r),abs(_b[var1]/_se[var1]))
        mat T[rownumb(T, "`var'"), colnumb(T,"pv")]=`pr'

        su `var' if sample_`var'==1, d
        mat T[rownumb(T, "`var'"), colnumb(T,"n")]=`r(N)'

        su `var' if sample_`var'==1 & var1==1
        mat T[rownumb(T, "`var'"), colnumb(T,"users")]=`r(mean)'
        mat T[rownumb(T, "`var'"), colnumb(T,"sdC")]=`r(sd)'

        su `var' if sample_`var'==1 & var1==0
        mat T[rownumb(T, "`var'"), colnumb(T,"non_users")]=`r(mean)'
        mat T[rownumb(T, "`var'"), colnumb(T,"sdT")]=`r(sd)'

        }

        *convert to dataset
        clear
        svmat T
        rename T1 N
        rename T2 users
        rename T3 sdC
        rename T4 non_users
        rename T5 sdT
        rename T6 pv

        gen Description=""
        gen var=""
        order Description

        local i=1
        foreach var in `xvars' {
        replace Description ="$`var'label" in `i'
        replace var="`var'" in `i'
        local i= `i' + 1
        }

        order var

        lab var var "Variables"
        lab var Description "Descriptions"
        lab var N "Observations"
        lab var users "Users"
        lab var sdC "SD"
        lab var non_users "Non-users"
        lab var sdT "SD"
        lab var pv "P-value"

        foreach var of varlist N {
        replace `var'=round(`var',1)
        format `var' %9.0f
        }

        foreach var of varlist users sdC non_users sdT {
        replace `var' = round(`var',0.01)
        format `var' %9.2f
        }

        foreach var of varlist pv {
        replace `var' = round(`var',0.001)
        format `var' %9.3f
        }


        gen stars=""
        replace stars="*" if pv<=0.10
        replace stars="**" if pv<=0.05
        replace stars="***" if pv<=0.01


        ******
        [Image: Example.png]

        Last edited by abonga manuel; 28 Jan 2020, 06:39.



        • #5
          To create your table, forget about regression. It is a descriptive table; there are no controls, and (at least at this stage of your argument) there should be no controls. All you are doing is describing your data. So, no regression. The chi-squared values you asked for are the chi-squared values I gave you in #3.
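
          As a rough sketch of how those values could be collected in a loop like the one in #4: the variables below are stand-ins from nlsw88 (married, collgrad and south play the role of your dummies, union plays the role of var1), and the matrix name C is hypothetical. Sampling weights are left out here because tabulate does not accept pweights; for weighted data a design-based test such as svy: tabulate would be the usual route.

          Code:
          sysuse nlsw88, clear
          local xvars married collgrad south   // stand-ins for the dummies in #4
          local group union                    // stand-in for var1 (users vs non-users)

          * one row per variable, columns for the statistic and its p-value
          local z : word count `xvars'
          matrix C = J(`z', 2, .)
          matrix rownames C = `xvars'
          matrix colnames C = chi2 p

          local i = 1
          foreach var of local xvars {
              * tabulate leaves the Pearson chi2 and its p-value in r()
              quietly tabulate `var' `group', chi2
              matrix C[`i', 1] = r(chi2)
              matrix C[`i', 2] = r(p)
              local ++i
          }
          matrix list C

          The resulting matrix can then be turned into a dataset with svmat, in the same way as the matrix T in #4.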
          ---------------------------------
          Maarten L. Buis
          University of Konstanz
          Department of history and sociology
          box 40
          78457 Konstanz
          Germany
          http://www.maartenbuis.nl
          ---------------------------------



          • #6
            I see, thank you very much for your help!
