To test the hypothesis that there is a positive correlation between height and weight, that is, "the taller a person, the more they weigh", I used scatter plots and cumulative frequency diagrams. The trends and the positive slope of the line of best fit support the hypothesis. For verification, I used two additional sets of data, one for people in the 7th grade and one for the 11th grade. Although more cluttered, the data in these two additional sets also support the hypothesis.
I could use the entire table, but it is so large that I could easily make mistakes. So, to begin, I chose to take a random sample of the data. I decided on samples of size 30 each for boys and girls because this is large enough for the sample to be reasonably representative of the population. At the same time, a sample of 30 is easy to work with and would help me avoid mistakes in calculations.
To ensure the randomness of the sample, I used the Excel function RAND() to assign each row a random number between 0 and 1. Then I sorted the rows by their random numbers and took the first 30 entries. The samples obtained are in the table below:
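The sampling procedure described above (give each row a random number, sort by it, keep the first 30) can also be sketched outside Excel. The Python below is only an illustration of the same idea, not the spreadsheet I actually used: the function name `draw_random_sample`, the `seed` parameter and the placeholder `population` records are all assumptions made for the example, and the records contain no real heights or weights.

```python
import random


def draw_random_sample(rows, sample_size=30, seed=None):
    """Return `sample_size` rows chosen by the sort-by-random-key method.

    Hypothetical helper: mirrors giving each spreadsheet row a RAND() value,
    sorting by that value, and keeping the first rows.
    """
    rng = random.Random(seed)  # a seed makes the draw repeatable (assumption, not in the original method)

    # Attach a random key in [0, 1) to every row, like a RAND() column.
    keyed_rows = [(rng.random(), row) for row in rows]

    # Sort by the random key only, so the original order no longer matters.
    keyed_rows.sort(key=lambda pair: pair[0])

    # Keep the first `sample_size` rows and drop the keys.
    return [row for _, row in keyed_rows[:sample_size]]


# Placeholder records standing in for rows of the data table (hypothetical IDs only).
population = [{"id": i} for i in range(1, 201)]

sample = draw_random_sample(population, sample_size=30, seed=1)
print(len(sample))
```

Because every row gets its own independent random key, each row has the same chance of ending up in the first 30 after sorting, which is what makes the sample random. The same function would be called once for the boys' rows and once for the girls' rows.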
But this is not enough. I cannot see any trends in this list, so I need to put the data in a chart that will tell me how many people fall into each group. ...