  • #46
    Mystery solved. I made a mistake in assigning the score to each province after changing the category arrangement. Though the program is still running, I think that was where things went wrong. Sorry I couldn't work on this earlier, as I didn't have access to the data.



    • #47
      Glad you found your problem, and thanks for the update.



      • #48
        It is, in part, for this reason that such models allow the individual waves to "borrow strength" from each other.
        I have a question regarding the relationship between sample size and standard error. I understand that a larger sample size leads to a smaller standard error, because a larger sample is expected to represent reality more accurately. What I don't understand is whether the Central Limit Theorem rests on any assumptions. I guess I also don't understand why the mean of random observations, sampled infinitely many times, will come arbitrarily close to the population mean.
        Last edited by Meng Yu; 29 Dec 2021, 22:24.



        • #49
          Well, I'm not sure I actually "understand" it either. I mean, I do understand why the standard error goes to zero as sample size goes to infinity, and why the expected value of the sample mean, for a sample of any size, is the population mean. That's a fairly simple calculation. But the Central Limit Theorem is more profound than that: it tells us that the sampling distribution of the sample mean approaches a normal (Gaussian) distribution. I have seen two different proofs of the central limit theorem, and they are not intuitive--they do not really promote an understanding of why the theorem is true. They are based on technical calculations involving characteristic functions of distributions.

          If you want to get an intuitive feel for the central limit theorem in action, I think the best way to do it is to run some simulations. Set up a large data set to serve as a "population," with a variable drawn from some distribution--something that is highly non-normal. Then repeatedly draw random samples of the same size--start with a small size like 5--and plot a histogram of the sample means. Then do the same with sample size 10, and 25, and 50, and keep going until you get bored. You will see the distribution of the sample means become more and more normal in appearance, with its mean approaching the mean of the original population.
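Here is a minimal sketch of that simulation, assuming Python/NumPy for convenience (the same exercise is easy to do in Stata); the exponential "population," the sample sizes, and the number of replications are arbitrary choices for illustration:

```python
# Simulate the sampling distribution of the mean from a highly non-normal
# "population" (exponential), for increasing sample sizes.
import numpy as np

rng = np.random.default_rng(12345)
population = rng.exponential(scale=2.0, size=1_000_000)  # skewed, non-normal

for n in (5, 10, 25, 50, 200):
    # draw 5,000 samples of size n and record each sample's mean
    means = rng.choice(population, size=(5_000, n)).mean(axis=1)
    print(f"n={n:4d}  mean of sample means={means.mean():.3f}  "
          f"SD of sample means={means.std(ddof=1):.3f}")
    # a histogram of `means` (e.g., plt.hist(means, bins=50)) looks more and
    # more normal as n grows, centered on the population mean.
```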



          • #50
            why the standard error goes to zero as sample size goes to infinity
            Thank you for your reply. I guess it is because S.E. = S.D./√n.
            I have another question: why, when we calculate the population variance, is the denominator in σ² = Σ(xi - xbar)²/(n-1) equal to n-1, but when we calculate the variance of the sample, the denominator is n? I guess it is an arbitrary design. It is not meant to reflect the "real" population variance, but to indicate how many respondents it took to achieve the figure. Am I right?
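For what it's worth, the relationship S.E. = S.D./√n is easy to check by simulation. A minimal sketch, assuming Python/NumPy and an arbitrary gamma "population" chosen only for illustration:

```python
# Check that the standard deviation of sample means (the standard error)
# is approximately SD/sqrt(n).
import numpy as np

rng = np.random.default_rng(1)
population = rng.gamma(shape=2.0, scale=3.0, size=1_000_000)
sd = population.std()

for n in (10, 100, 1000):
    sample_means = rng.choice(population, size=(10_000, n)).mean(axis=1)
    print(f"n={n:5d}  empirical SE={sample_means.std(ddof=1):.4f}  "
          f"SD/sqrt(n)={sd / np.sqrt(n):.4f}")
```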
            But the Central Limit Theorem is more profound than that
            I actually believe the theorem, although I have never read the mathematical proof. I just think it needs some assumptions when we apply it to the human population to study their actions. The assumption might be that humans are socially constrained, or at least influenced, so that the majority of the human population behaves in the same way. That is why, when we draw samples from them, we may get some samples whose means deviate far from the population mean, but the majority will cluster around the population mean. That is why the sample means follow a normal distribution.



            • #51
              I have another question: why, when we calculate the population variance, is the denominator in σ² = Σ(xi - xbar)²/(n-1) equal to n-1, but when we calculate the variance of the sample, the denominator is n? I guess it is an arbitrary design.
              Actually it's not arbitrary. I don't want to work through the calculations here, because typing equations, especially with sums, exponents, and subscripts, is tedious and error prone. But suffice it to say that if you repeatedly drew samples of size n and averaged the variance estimates computed with n in the denominator, that average would not converge on the population variance: it comes in a bit too small, by a factor of (n-1)/n. The use of n-1 in the denominator is exactly what is needed for the average of the sample variances to converge on the population variance. That's where that comes from.
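The "comes in a bit too small" claim is easy to verify by simulation. A minimal sketch, assuming Python/NumPy; the normal "population," the sample size, and the number of replications are arbitrary illustration choices:

```python
# Average many variance estimates computed with n vs. n-1 in the denominator
# and compare them with the population variance.
import numpy as np

rng = np.random.default_rng(7)
population = rng.normal(loc=0.0, scale=5.0, size=1_000_000)
pop_var = population.var()              # population variance, about 25

n = 10
samples = rng.choice(population, size=(100_000, n))
var_n   = samples.var(axis=1, ddof=0)   # denominator n
var_nm1 = samples.var(axis=1, ddof=1)   # denominator n-1

print(f"population variance       : {pop_var:.3f}")
print(f"average, n denominator    : {var_n.mean():.3f}    (about (n-1)/n of the truth)")
print(f"average, n-1 denominator  : {var_nm1.mean():.3f}    (close to the population variance)")
```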

              I just think it needs some assumptions when we apply it to the human population to study their actions.
              Well, the assumption required for the Central Limit Theorem is that the population variance be finite. (So, for example, it does not apply if the population distribution is Student t with 1 df--that is, the Cauchy distribution--to give an easy example.) This is a purely statistical condition; it has nothing to do with whether the distribution in question is measured on humans or insects or stores, or is an abstract distribution. And the distribution of anything measurable about the human population automatically satisfies this finite-variance condition, because N is finite.
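To see what the finite-variance condition buys you, one can rerun the sample-mean simulation with a Cauchy population. A minimal sketch, assuming Python/NumPy, with arbitrary sample sizes:

```python
# Sample means from a Cauchy (Student t with 1 df) population do not
# concentrate as n grows: the mean of n standard Cauchy draws is itself
# standard Cauchy, so the CLT gives no help here (the variance is not finite).
import numpy as np

rng = np.random.default_rng(3)
for n in (10, 100, 10_000):
    means = rng.standard_cauchy(size=(5_000, n)).mean(axis=1)
    # with a finite-variance population, the spread of the sample means would
    # shrink like 1/sqrt(n); here the interquartile range stays roughly constant.
    q25, q75 = np.percentile(means, [25, 75])
    print(f"n={n:6d}  IQR of sample means = {q75 - q25:.2f}")
```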
              Last edited by Clyde Schechter; 30 Dec 2021, 18:06.



              • #52
                They are based on technical calculations involving characteristic functions of distributions.
                I could be wrong, but I guess the Central Limit Theorem can be proved because it is possible to calculate each respondent's number of choices through permutations and combinations.

                The use of n-1 in the denominator is exactly what is needed for the average of the sample variances to converge on the population variance.
                I sense this has something to do with calculus, but I don't remember the concepts of limits and convergence very well. I guess I just don't quite get why degrees of freedom are part of the story. Some books explain it as the number of xi that can take any value in calculating the S.D.: since the sample mean is set, only n-1 of the xi can take any value. But I don't understand why the sample mean is already determined.



                • #53
                  Some books explain it as the number of xi that can take any value in calculating the S.D.: since the sample mean is set, only n-1 of the xi can take any value. But I don't understand why the sample mean is already determined.
                  That's not quite right. It's not that the sample mean is set in advance. It's that the formula for the SD involves the sample mean as one of the inputs, and then you add up a bunch of (xi - xbar)² terms. Now, given that xbar, the sample mean, is by definition the sum of all the xi divided by n, there are not n independent (xi - xbar)² terms: only n-1 are independent, and the last one can be calculated directly from the others. So that's why the number of degrees of freedom of the sample standard deviation is n-1. That is not the reason that n-1 appears in the formula for the sample standard deviation, however. The appearance of n-1 in the formula has to do with the fact that the formula with n systematically underestimates the population variance, and the n-1 formula does not. There may be a connection between these two roles of n-1, but if there is, I've either forgotten it or am not aware of it. As I understand it, the degrees of freedom and the denominator of the standard deviation formula are separate issues that just both happen to involve n-1.
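The "only n-1 are independent" point can be seen with a tiny numeric example (Python, made-up numbers): the deviations from the sample mean always sum to zero, so the last deviation is pinned down by the other n-1.

```python
# The deviations (xi - xbar) sum to zero, so knowing n-1 of them
# determines the last one -- hence n-1 "free" terms.
x = [3.0, 7.0, 4.0, 10.0]                # arbitrary example values
xbar = sum(x) / len(x)                   # 6.0
deviations = [xi - xbar for xi in x]     # [-3.0, 1.0, -2.0, 4.0]

print(sum(deviations))                   # 0.0 (up to rounding)
print(-sum(deviations[:-1]))             # 4.0 -- equals deviations[-1]
```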



                  • #54
                    When you run a single model where wave interacts with all of the model variables, then the sample size is the complete sample for all three waves combined. It is, in part, for this reason that such models allow the individual waves to "borrow strength" from each other.
                    Sorry to bother you again, but I never really understood what you meant by "borrow strength." Did you mean making the p-value smaller? But you said it is not about statistical significance, so I am confused.
                    Thank you again for your patience.



                    • #55
                      The larger sample size makes the analysis more precise--that is, standard errors of estimates generally are smaller. Also, it permits the relationships among variables in a given wave to influence the estimates of the coefficients in other waves as well. This leads to a "regularization" of the results: extreme coefficients due to fluky data are less likely to arise, and outliers become less influential. So in both of these ways, the results are more credible. The effect on p-values could go in either direction. The standard errors will in general be smaller as a result of the larger sample size (all else being equal)--and that would lead to smaller p-values. But the coefficients themselves may also be smaller, particularly because "false positive" large effects get smoothed out a bit. That alone would make p-values smaller. The net impact on the p-value could be in either direction, or there could be no impact at all, depending on how these two factors balance each other in a particular situation.



                      • #56
                        Thank you. I don't understand why, if in both situations the p-value would become smaller, the net impact on the p-value could be in either direction.



                        • #57
                          Sorry, I misspoke. For the second situation, where I said "That alone would make p-values smaller," I should have said "That alone would make p-values larger."



                          • #58
                            So "false positive" large effects can lead to smaller p-value? Is it because large effects in theory bring the coefficient closer to the population parameter?



                            • #59
                              In any regression, your coefficient is an estimate of some real-world association between variables. In general, the coefficient is not equal to the actual strength of the real-world association--there is error in the estimation due to sampling variation in the data (not to mention systematic error due to measurement issues, biased sampling, model misspecification, or other systematic problems). With most of the familiar regressions, and with large enough samples, the estimation error in the coefficient has an approximately normal distribution.

                              The p-value derives from a test statistic, which is typically the coefficient divided by its standard error, or something closely related to that. The standard error also has sampling variability, typically more complex than just a normal distribution, and, again for the most common estimators, the sampling error of the standard error is independent of the sampling error of the coefficient. So, all else equal, those samples that happen to give larger coefficient estimates nevertheless have the same distribution of standard-error estimates as other samples. The larger coefficient therefore leads to a larger test statistic, which leads to a smaller p-value.
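As a concrete illustration of that last step, here is a minimal sketch, assuming SciPy and a normal approximation to the test statistic; the coefficient values and standard error are made up for the example:

```python
# With the standard error held fixed, a larger coefficient estimate gives a
# larger test statistic and therefore a smaller two-sided p-value.
from scipy import stats

se = 0.50                                # hypothetical standard error
for coef in (0.5, 1.0, 1.5, 2.0):        # hypothetical coefficient estimates
    z = coef / se
    p = 2 * stats.norm.sf(abs(z))        # two-sided p-value, normal approximation
    print(f"coef={coef:.1f}  z={z:.1f}  p={p:.4f}")
```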

                              Is it because large effects in theory bring the coefficient closer to the population parameter?
                              No, not at all! The discrepancy between the population parameter and the sample estimate depends on the systematic errors and the sampling variation. The systematic errors are a matter of study design and have nothing to do with the size of the population parameter, and the sampling error is just a function of population-level variation and sample size.
