  • Dividing variable into 2 equal groups

    Hello,
    I'm testing the effect of wcsrindex on wtob and I want to apply a difference-in-means test. So I have to divide the observations into 2 groups:

    Group 1 having wcsrindex >= median of wcsrindex
    Group 2 having wcsrindex < median of wcsrindex

    After that, I apply the t test using the Stata command:
    ttest wtob, by(group)

    The problem is that I get 2 groups that do not have an equal number of observations.

    Here is the Stata output:

    [Image: Capture d’écran 2018-05-29 à 2.12.38 AM.png]


    Can somebody help me generate 2 groups with an equal number of observations, where group 1 has wcsrindex >= the median of wcsrindex (0.7621875) and group 2 has wcsrindex < the median of wcsrindex (0.7621875)? Please!


  • #2
    First, how do you know that 0.7621875 is the median of wcsrindex? You don't show the code that helped you arrive at that.

    Assuming it is the median, there is a deeper problem, and what you want to do is usually not possible. If there are observations that have wcsrindex = 0.7621875, then when you use that cutpoint to define 2 groups, all of those observations must go into the same group, which implies that the two groups will not be of equal size. In short, the only way that a median split of this type results in two equal-sized groups is if the median is the mid-point between two different values in the center of the sort order. But if there are any observations that actually equal the median, then they all go in one group and that group will necessarily be larger than the other.
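
    The tie problem can be checked directly. Here is a minimal sketch, assuming Stata's bundled auto dataset as a stand-in for the poster's data (mpg plays the role of wcsrindex; the names med and group are illustrative only):

    ```stata
    * Stand-in data: mpg plays the role of wcsrindex (assumption, not the poster's data)
    sysuse auto, clear

    * Store the median in a scalar instead of typing a rounded value by hand
    quietly summarize mpg, detail
    scalar med = r(p50)

    * 1 = at or above the median, 0 = below; missing values stay missing
    generate byte group = (mpg >= scalar(med)) if !missing(mpg)

    * Compare the group sizes, then count the observations tied at the median
    tabulate group
    count if mpg == scalar(med)
    ```

    If the last count is greater than zero, the tied observations all fall into group 1, so the two groups cannot be the same size.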

    All of that said, this is not a good approach to testing the effect of wcsrindex on wtob in any case. Turning a continuous variable into a dichotomy just throws away information. Think of it this way. You are saying that an observation with wcsrindex = 0.7621875 is equivalent to another observation with wcsrindex = 1000 but is radically different from one with wcsrindex = 0.76. That's clearly nonsense.

    If you want to estimate the effect of wcsrindex on wtob, use a correlation or regression model.
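
    As a rough sketch of that suggestion, again assuming the bundled auto dataset as a stand-in (price for wtob, mpg for wcsrindex), the continuous predictor is used as it is:

    ```stata
    * Stand-in data (assumption): price plays the role of wtob, mpg of wcsrindex
    sysuse auto, clear

    * Pairwise correlation with its significance level
    pwcorr price mpg, sig

    * Linear regression of the outcome on the continuous predictor
    regress price mpg
    ```

    With the poster's variables, the last line would read regress wtob wcsrindex.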



    • #3
      .... but start with

      Code:
      scatter wtob wcsrindex
