Seeking help to analyse the association between a DV and continuous IVs that violate normality and homoscedasticity

Emerald Chang

Join Date: Sep 2017

Posts: 50
#1

02 Aug 2018, 21:11

Hello everyone,

I am having difficulty checking the normality of my variables by plotting histograms. Histograms seem a very subjective way to check normality: slightly right- or left-skewed data (not the obvious cases) can still look normally distributed, even when some outliers are visible on the right or left side of the histogram.

Anyway, I was hoping to examine the association between my dependent variable (continuous) and the other variables of interest (continuous) using linear regression, but I have just realised that both my DV and the continuous IVs previously included in my model all fail two tests: the skewness/kurtosis test for normality and the Breusch-Pagan / Cook-Weisberg test for heteroskedasticity. (I failed to spot the skewness by eyeballing the data.) However, when I used the command pnorm to check, the result seems fine, I would say.

Image of histogram of z_MI (dependent variable)

Image of pnorm of z_MI (dependent variable)

Here are my Stata outputs of Skewness/Kurtosis tests and Breusch-Pagan / Cook-Weisberg tests:

*Results of Breusch-Pagan / Cook-Weisberg test for heteroskedasticity are attached to this post below.

z_MI is my outcome. We decided to convert myo-inositol (MI) into a Z-score for the purpose of interpretation, as we have no information about the unit in which MI was measured in the lab. I tried transforming z_MI, but neither log z_MI nor log10 z_MI passed the tests. So, after spending the whole of yesterday googling, I came across some methods that might still enable me to examine the association between my DV and continuous IVs without using linear regression.

As mentioned earlier, neither the DV nor any of the continuous IVs included in my model meets the assumptions (normality and homoskedasticity) of linear regression. Some other variables of interest, which are categorical, were included in the model as well.

Here are the commands that I came across yesterday; I simply ran a univariate regression between these two variables:

(1) reg z_MI ogtt_2hour , robust (used Stata version 14.1)

(2) rreg z_MI ogtt_2hour (used Stata version 14.1)

May I know what the difference is between ", robust" and "rreg"? To me, they are both about robust regression. Thanks.

(3) npregress kernel z_MI ogtt_2hour, vce(bootstrap, reps(100) seed(123)) (used Stata version 15)

With regard to (3), I have no idea what standard numbers I should put for reps() and seed(), as I just followed the syntax from the Stata website blindly.

Other than the three commands I used above, are there any better suggestions for analysing data that happen to be non-normal? Personally, I don't like to transform data, as it makes interpretation harder.

Thank you for putting up with this long-winded post. I truly appreciate your time and effort in contributing to this discussion.

Many thanks,
Emerald

Last edited by Emerald Chang; 02 Aug 2018, 21:17.
Phil Bromiley

Join Date: Apr 2014

Posts: 4348
#2

03 Aug 2018, 11:36

You didn't get a quick answer. You'll increase your chances of a useful answer by following the FAQ on asking questions - provide Stata code in code delimiters, readable Stata output, and sample data using dataex. Also, don't post a pile of junk - shorten your posting to the absolute minimum necessary to explain your problem. Send your cute pictures to your friends or a cute picture site. You've posted 31 previous times on this forum - you should know how to post productively by now.

Regression does not assume normality of the IVs. In small samples, some of the statistical tests may need normally distributed errors (although I read an earlier posting on this forum by Silva as indicating this is not generally a problem).
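
To see why it is the errors, not the IVs, that are worth checking, here is a minimal Stata sketch on simulated data. Everything in it is an assumption made for illustration (the variable names x, y and resid, the sample size, the seed and the coefficients); it does not use the data from post #1. The predictor is drawn from a chi-squared distribution, so it is strongly right-skewed, while the errors are normal with constant variance.

```stata
* Simulated illustration only: hypothetical variables, not the data from post #1
clear
set seed 12345                              // arbitrary seed, only for reproducibility
set obs 500                                 // arbitrary sample size

generate double x = rchi2(2)                // right-skewed predictor
generate double y = 1 + 0.5*x + rnormal()   // normal, homoskedastic errors

regress y x
predict double resid, residuals             // residuals from the fitted model

sktest x resid                              // skewness/kurtosis test on predictor and residuals
estat hettest                               // Breusch-Pagan / Cook-Weisberg test after regress
pnorm resid                                 // normal probability plot of the residuals
```

In a run like this, sktest would be expected to flag x as non-normal, whereas the residuals come from a model built to satisfy the assumptions. Running the same checks on the residuals of the actual model, rather than on each variable separately, is the comparable step for real data.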