So I now come to the last assignment for this week, assignment 8.
At first, I just used the 150 countries I have been working with so far. But then I thought that there is no reason not to use all of the data available, so I ran the Pearson correlation on the full Gapminder data set. As we learned in the lesson, the Pearson correlation is used to examine the linear relationship between two quantitative variables. I examined my four quantitative variables armedforcesrate, co2emissions, femaleemployrate and urbanrate with regard to the suiciderate:
As you can see, there is only one relationship that is possibly linear:
Among the 169 countries of the Gapminder data set (my sample), the correlation between the female employment rate (quantitative explanatory variable) and the suicideper100TH rate (quantitative response variable) was 0.15 (p=.0509). The p-value sits just above the usual .05 threshold, so the relationship is weak and only borderline significant. Squaring the coefficient (0.15 squared) suggests that only 2.25% of the variance in the suicide rate of the sample can be explained by the female employment rate.
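To make the arithmetic behind these numbers concrete, here is a minimal sketch of how a Pearson coefficient and its square can be computed by hand. It is not the code used for the analysis above: the helper `pearson_r` and the sample values are made up for illustration, and it does not compute a p-value (for that, something like `scipy.stats.pearsonr` would be the usual choice).

```python
import math

def pearson_r(xs, ys):
    """Return (r, n) for paired values, skipping pairs with a missing value.

    Hypothetical helper for illustration only.
    """
    # Keep only the countries where both variables are present.
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    # Sum of cross-products and sums of squares around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    return sxy / math.sqrt(sxx * syy), n

# Made-up values, NOT taken from the Gapminder data.
female_employ = [40.0, 55.5, None, 62.1, 48.3, 70.2]
suicide = [8.1, 12.4, 9.0, None, 7.5, 11.9]

r, n = pearson_r(female_employ, suicide)
r_squared = r ** 2  # share of variance explained by the linear relationship
print(f"n={n}, r={r:.3f}, r^2={r_squared:.3f}")

# The figure reported in the text: r = 0.15, so r^2 = 0.0225, i.e. 2.25%.
reported_share = 0.15 ** 2
print(f"{reported_share:.2%}")
```

Rows with a missing value in either variable are dropped before the calculation, which is one reason the number of countries in a correlation can be smaller than the number of rows in the full data set.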
So there are almost no linear relationships in my sample. A look at the scatter plots and the graphs in assignment 5 suggests that there might be non-linear relationships, but at this point I do not know how to explore them…
Also I am thinking of putting my variables into different categories as is explained in this blog:
But I think I shall hold off on this until the last week, when we are supposed to do our final assignment…
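For reference, putting a quantitative variable into categories could look roughly like the sketch below. The cut points, the labels and the sample values are placeholders invented for illustration; they are not taken from the blog mentioned above or from the Gapminder data, and `categorize` is a hypothetical helper.

```python
from bisect import bisect_right

def categorize(value, cut_points, labels):
    """Map a number to a label based on sorted cut points (hypothetical helper).

    With cut points [a, b] and labels [l0, l1, l2]:
    value < a -> l0, a <= value < b -> l1, value >= b -> l2.
    """
    if value is None:
        return None  # keep missing values as missing
    return labels[bisect_right(cut_points, value)]

# Assumed cut points and labels, chosen only for the example.
cut_points = [40.0, 60.0]
labels = ["low", "medium", "high"]

# Made-up urbanrate-like values, including one missing entry.
urban_rates = [25.3, 40.0, 58.9, 77.1, None]
categories = [categorize(v, cut_points, labels) for v in urban_rates]
print(categories)
```

Once a variable is categorical, it can no longer go into a Pearson correlation, so the choice of cut points would need to be justified by the data rather than picked arbitrarily as done here.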