*STATA17*
For my thesis, I am trying to explain the effect of distance to a park on several dependent variables, one of which is stress in the past 4 weeks. The data are neighbourhood-level percentages (for example, in neighbourhood 1, "24% reported feeling a lot of stress in the past 4 weeks").
I only have 48 neighbourhoods and a wide range of control variables, which are also percentages (see below).
Using the pwcorr command on stress in the past 4 weeks and average distance to a park gives a significant negative association. I then used a scatterplot to check whether the relationship between the variables is linear, which in my opinion it is not (see image below).
pwcorr Having_stress_last_4_weeks Distance_to_park, star(0.05) obs
twoway (scatter Having_stress_last_4_weeks Distance_to_park) (lfit Having_stress_last_4_weeks Distance_to_park)
Because the scatterplot also shows an outlier, I decided to check the correlation with both Spearman's rho (spearman) and Kendall's tau (ktau) as well.
spearman Having_stress_last_4_weeks Distance_to_park, stats(rho p)
ktau Having_stress_last_4_weeks Distance_to_park, stats(taua taub p)
Both were statistically insignificant (see image below).
My question is what to do from here. Since the dependent variable is a percentage, it is unclear to me which model fits: is there a non-parametric regression that suits these data, even though both Spearman's rho and Kendall's tau are insignificant?
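To make the question concrete, here is a minimal sketch of one candidate for a percentage outcome: a fractional logit via fracreg. Note that this is a parametric (quasi-likelihood) model, not a non-parametric regression. It assumes the percentage is stored on a 0-100 scale, which is an assumption on my part; stress_share is a helper name invented for the sketch, and the data are simulated placeholders so that the code runs on its own.

```stata
* Simulated placeholder data so the sketch runs on its own; NOT the thesis data
clear
set seed 12345
set obs 48
generate Distance_to_park = runiform(100, 1500)
generate Having_stress_last_4_weeks = ///
    100 * invlogit(-1 + 0.0003 * Distance_to_park + rnormal(0, 0.3))

* fracreg needs the outcome in [0, 1]; assumes the percentage is stored as 0-100
generate stress_share = Having_stress_last_4_weeks / 100

* Fractional logit with robust standard errors
fracreg logit stress_share Distance_to_park, vce(robust)

* Average marginal effect of distance on the stressed share
margins, dydx(Distance_to_park)
```

With only 48 neighbourhoods, any such model can carry only a few covariates, and Stata's npregress kernel would be the fully non-parametric alternative, though it is likely to be very imprecise at this sample size.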
An additional problem is that most other variables are also percentages. "Male_gender" correlates almost perfectly with "Female_Gender", and to a certain extent the same holds for "Education_Level", which is three separate percentages per neighbourhood (low, average and high). Could someone give insight into whether including all of them works properly, or whether I should omit one of the genders / education levels?
I hope I have worded my question clearly; the same goes for the examples I have given below as PNG images.
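The collinearity issue can be sketched as follows. Shares that add up to 100 within a group are perfectly collinear with the constant, so the usual approach is to leave one category per group out as the reference. The education variable names (Education_low, Education_average, Education_high) are hypothetical, and the data are simulated placeholders only so that the code runs; plain OLS is used here just to show the omission, not as a recommended model.

```stata
* Simulated placeholder data so the sketch runs on its own; NOT the thesis data
clear
set seed 12345
set obs 48
generate Distance_to_park = runiform(100, 1500)
generate Having_stress_last_4_weeks = runiform(10, 40)
generate Male_gender = runiform(45, 55)
generate Female_Gender = 100 - Male_gender
generate Education_low = runiform(10, 40)                          // hypothetical name
generate Education_high = runiform(10, 40)                         // hypothetical name
generate Education_average = 100 - Education_low - Education_high  // hypothetical name

* The two gender shares are (near) perfectly negatively correlated
correlate Male_gender Female_Gender

* Omit one category per group (here female and average education);
* each remaining coefficient is read relative to the omitted category
regress Having_stress_last_4_weeks Distance_to_park Male_gender ///
    Education_low Education_high, vce(robust)
```

If all categories of a group are included, Stata drops one of them automatically because of collinearity, so choosing the reference category explicitly mainly makes the interpretation clearer.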
All variables: "Distance to park", "Percentage feeling stressed", "Percentage feeling lonely", "Percentage with low education", "Percentage with average education", "Percentage with high education", "Percentage of people working", "Income in absolute numbers", "Percentage male gender", "Percentage female gender", "Percentage age groups (3 in total)"



