Text Statistics Research

This paper uses statistics to analyze word length in articles from two different newspapers. My hypothesis is that the average word length in both articles is approximately the same. My research should show whether this hypothesis holds.

Introduction

new drinking laws in the UK. The article from "The Sun" is titled "Sup all night". The article from "Daily Mail" is titled "Police braced for the great British binge" (see references for more detail).
The research consists of the following steps. First, I select a sample of 100 words from each article. I count the length of each word and the frequency of words of the same length, put the results into a summary table, and analyze the findings. Then I repeat the procedure for 200 words and 400 words. I split the analysis into these 3 consecutive steps in order to see any possible changes in my statistical indicators (such as mean, median, mode). On average, they should not change drastically for either article when moving to a larger sample. But they should become more accurate, as random differences should smooth out in larger samples.
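The steps above can be sketched in Python as follows. This is only an illustration under stated assumptions: the paper does not say how the sample words are chosen, so the sketch simply takes the first words of the text; the function names `word_lengths` and `summarize` are hypothetical, and the placeholder string stands in for the real article text.

```python
import re
from collections import Counter
from statistics import mean, median, multimode


def word_lengths(text, sample_size):
    """Letter counts of the first `sample_size` words in `text`.

    Assumption: the sample is the first N words. Apostrophes keep a word
    together ("doesn't" is one word) but are not counted as letters.
    """
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    return [sum(ch.isalpha() for ch in w) for w in words[:sample_size]]


def summarize(lengths):
    """Summary table (length -> frequency) plus mean, median and mode(s)."""
    return {
        "frequency": dict(sorted(Counter(lengths).items())),
        "mean": mean(lengths),
        "median": median(lengths),
        "modes": multimode(lengths),  # a list, since several lengths can tie
    }


# Placeholder text only; it is NOT taken from either newspaper article.
placeholder = "the cat sat on the mat and looked at a bird " * 50

for n in (100, 200, 400):
    stats = summarize(word_lengths(placeholder, n))
    print(n, round(stats["mean"], 2), stats["median"], stats["modes"])
```

Running the same two functions on each article for 100, 200 and 400 words gives the three summary tables that can then be compared.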
As noted above, for each sample size I calculate the mean, median and mode. The mean is the average word length in the sample, found by dividing the total number of letters in the sample by the total number of words. So it can be any decimal number, like 4.53. It doesn't describe the length of any actual word (as there are no words with 4.53 letters), but it gives a good estimate of how letters are distributed across the words.
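As a small worked example of this calculation, the mean can be read straight off a summary table of word lengths and frequencies. The table below is made up for illustration and is not data from either article; `mean_from_table` is a hypothetical helper name.

```python
def mean_from_table(table):
    """Mean word length = total letters / total words.

    `table` maps a word length to the number of words of that length.
    """
    total_letters = sum(length * count for length, count in table.items())
    total_words = sum(table.values())
    return total_letters / total_words


# Made-up table: one 2-letter word, two 4-letter words, one 5-letter word.
example_table = {2: 1, 4: 2, 5: 1}
print(mean_from_table(example_table))  # 15 letters / 4 words = 3.75
```

No word in this example has 3.75 letters, which is the same point made above about a mean such as 4.53.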
However, the mean can be somewhat misleading if the data distribution is skewed to the left or right. ...
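A minimal sketch of this effect, using made-up word lengths rather than data from the articles: a single very long word pulls the mean upward, while the median stays at the typical length.

```python
from statistics import mean, median

# Made-up sample: mostly short words plus one very long word (right skew).
skewed_lengths = [2, 3, 3, 3, 4, 4, 13]

skewed_mean = mean(skewed_lengths)      # pulled up by the 13-letter word
skewed_median = median(skewed_lengths)  # middle value of the sorted sample

print(round(skewed_mean, 2), skewed_median)
```

Here the mean ends up above the median even though most words in the sample have 3 or 4 letters.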