This paper studies two of the most common classes of statistical measures used in information technology research, namely measures of central tendency and measures of variability.
Statistics enables the researcher to view collected data in two ways. Descriptive statistics describes the shape of the data; frequency counts and distributions are forms of descriptive statistics that help with this. Descriptive statistics uses measures such as the mean, median, mode, correlation and covariance. The data may come from a sample or from an entire population, so one may, for example, compare a population mean with sample means. Inferential statistics attempts to fit a model to the collected data and to establish causality. It is also used to develop predictive models based on causality analysis. This paper explores mainly the simple concepts of descriptive statistics; inferential statistics is not covered. Statistical measures have no real existence: they are mental constructs that simply support an argument or hypothesis. While statistics helps to organize and summarize data, interpreting the data on the way to a hypothesis remains the primary task of the researcher.
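As a rough illustration of the descriptive measures named above, the following minimal Python sketch computes the mean, median, mode and standard deviation of a small data set using only the standard `statistics` module. The data values and the helper name `describe` are hypothetical and chosen purely for illustration; they do not come from any of the studies discussed in this paper.

```python
import statistics


def describe(data):
    """Return a few common descriptive measures for a list of numbers.

    Hypothetical helper for illustration only.
    """
    return {
        "mean": statistics.mean(data),
        "median": statistics.median(data),
        "mode": statistics.mode(data),
        # Treats the data as a sample drawn from a larger population (divides by n - 1).
        "sample_sd": statistics.stdev(data),
        # Treats the data as the entire population (divides by n).
        "population_sd": statistics.pstdev(data),
    }


# Made-up scores containing one inordinately high value.
scores = [12, 15, 15, 18, 20, 22, 250]
summary = describe(scores)

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

With these made-up scores the single extreme value pulls the mean well above the median, and the sample standard deviation comes out slightly larger than the population standard deviation because it divides by n - 1 rather than n.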
A comparative cost-of-ownership analysis of server operating systems made elaborate use of mean analysis and t-test significance (Cahner, 1997). A mean and standard deviation model, a multivariate model, a Markov process model and a time series model were used as statistical techniques in developing misuse detection systems (Christina, 1997). Statistical user profiles were used as part of a multilayered security system (Steve, 1999). A combination of arithmetic mean, median and standard deviation gave sufficient support for the conclusions drawn from survey results on operating systems (David, 1998).
A basic primer of descriptive statistics is necessary not only for understanding such concepts but also for pointing to their specific use on research data. "The most frequently used average is the Mean, which is the balance point in a distribution. Its computation is simple: just add up the scores and divide by the number of scores. Formally, the mean is the value around which the deviations sum to zero. This formal definition also explains why the mean is informally described as the balance point in a distribution: at the mean, the positive and negative deviations balance each other out. A major drawback of the mean is that it moves in the direction of extreme scores. If most values in two distributions are about the same size, but one distribution contains one or two inordinately high values, the mean of that distribution is pulled up sharply compared with the other. Such a distribution is skewed. For skewed distributions a different average is used, the Median, defined as the middle score. To get an approximate median, the scores are put in order from low to high and counted up to the middle score, which is taken as the median. The Mode is simply the score with the highest frequency. The mode is sometimes used in informal