Any data set involves two things: variables and frequencies. Variables are quantities that may change from one observation to the next, while frequencies describe how often each value occurs. Knowing how much the values vary is essential for controlling errors and differences in any analysis. In this blog, we will study measures of dispersion in data.

What is the Measure of Dispersion?

Every data set has some variability within its range. Variability is defined as how far apart the data points lie from each other and from the center of the distribution. It is also known as the scatter, spread, or dispersion of the data. Dispersion is the degree to which numerical data tend to spread about an average value. Measures of dispersion quantify this variability.

Like the central tendency of data, variability is essential for summarizing the characteristics of data. It helps in presenting the facts and figures of the data. The concept has been applied throughout the sciences, although physics and chemistry generally show less variability than medicine and biology.

Types of Variability

Three kinds of variability can occur in data: biological, real, and experimental. Here is a brief introduction to each.

As the name suggests, biological variability relates to the human body and medication. Under the same environment and test conditions, two individuals can respond differently. These variations may arise from differences in sex, class, weight, and so on.

Real variability is also known as variability within the limits. The variability is termed real when the difference between two readings or observations is more than the defined limits of the universe.

Experimental variability occurs during experiments.
It may arise from errors or differences in methods or procedures, or from other defects in technique. Experimental variability is further classified into three types: observer error, instrumental error, and sampling error.
Sampling error occurs when the sample is not truly representative of the population, so it does not allow us to draw valid conclusions.

Ways to Calculate Measures of Dispersion

Below are five ways of calculating measures of dispersion: range, quartile deviation, mean deviation, standard deviation, and variance.
Range is the difference between the highest and lowest values in the sample. The coefficient of range is the relative measure of the range. Mathematically, range R = H - L, where H is the highest value and L is the lowest value, and the coefficient of range is (H - L) / (H + L). As an example, consider the weekly production data of a fabric manufacturer.
Table 1

For the above data, the range is H - L = 13.2 - 8.3 = 4.9, and the coefficient of range is (H - L) / (H + L) = 4.9 / 21.5 = 0.228.

The advantage of range is that it is easy to calculate and understand. It is useful in statistical quality control and weather forecasting. The disadvantage is that it is unsuitable for thorough analysis, because it depends only on the extreme values of the sample.

Quartile deviation is closely related to the interquartile range. The interquartile range is the difference between the upper quartile and the lower quartile of a group of observations, and the quartile deviation (also called the semi-interquartile range) is half of that difference. For a group, the upper quartile is the value above which 25% of the observations lie, and the lower quartile is the value below which 25% of the observations lie. As an example, here is the class data of a school:
Table 2

For the given table, the upper quartile is 64 and the lower quartile is 60, so the interquartile range is 64 - 60 = 4. The quartile deviation, taken as half of this range, is 4 / 2 = 2.

The advantage of quartile deviation is that it is easy to calculate and is unaffected by extreme values. It is most useful when the observer is concerned only with the middle half of the group. The disadvantage is that it ignores the outer 50% of the observations and is not suitable for algebraic treatment.

Mean deviation is defined as the average of the absolute deviations of the values from a measure of central tendency, which can be the mean, median, or mode. The mean deviation is calculated by finding the central value, taking the absolute deviation of each observation from it, summing these deviations, and dividing by the number of observations.
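As a rough sketch, the mean deviation can be computed in Python as shown below. The function name `mean_deviation` and the sample observations are assumptions made for illustration; they are not the article's own data.

```python
from statistics import mean, median


def mean_deviation(values, center=mean):
    """Average absolute deviation of the values from a central value.

    `center` is any function returning a measure of central tendency,
    for example statistics.mean or statistics.median.
    """
    c = center(values)
    return sum(abs(x - c) for x in values) / len(values)


# Hypothetical observations (illustration only, not the article's table)
data = [15, 18, 21, 24, 27]

md_about_mean = mean_deviation(data)            # deviations taken from the mean
md_about_median = mean_deviation(data, median)  # deviations taken from the median

# Coefficient of mean deviation: mean deviation divided by the arithmetic mean
coefficient = md_about_mean / mean(data)

print(md_about_mean, md_about_median, round(coefficient, 3))
```

Passing the central-tendency function as a parameter keeps the sketch aligned with the definition, which allows the deviations to be taken from the mean, the median, or the mode.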
As an example, here is a demo data set:
Table 3

Mean deviation = sum of absolute differences between the mean and each observation / total number of observations = 36 / 6 = 6

The coefficient of mean deviation is the ratio of the mean deviation to the arithmetic mean: 6 / 21 = 0.286.

The advantage of mean deviation is that it can be calculated about any measure of central tendency, is less affected by extreme items than the standard deviation, and is based on every measurement rather than on estimation. The drawback is that it ignores the sign of each deviation, so it is not well suited to more precise or deeper analysis.

Standard deviation is defined as the square root of the arithmetic mean of the squared deviations of the observations from their mean. The calculation proceeds as follows: find the mean, subtract it from each observation, square each deviation, average the squared deviations, and take the square root.
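The calculation of standard deviation can be sketched in Python as follows. This is a minimal illustration that assumes the population form (dividing by the number of observations, as in the definition above) rather than the sample form, which divides by one less. The function name `standard_deviation` and the observations are hypothetical.

```python
import math
from statistics import mean, pstdev


def standard_deviation(values):
    """Population standard deviation: root of the mean squared deviation."""
    m = mean(values)
    # Square each deviation first, then average the squares
    squared_deviations = [(x - m) ** 2 for x in values]
    return math.sqrt(sum(squared_deviations) / len(values))


# Hypothetical observations (illustration only, not the article's table)
data = [15, 18, 21, 24, 27]
sd = standard_deviation(data)

# Cross-check against the standard library's population standard deviation
print(round(sd, 3), round(pstdev(data), 3))
```

Each deviation is squared before the squares are summed; squaring the total of the deviations instead would give a different, larger figure.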
For the Table 3 data, the standard deviation is computed here as the square root of (36^2 / 6) = 216, which is about 14.70. Note that, by the definition, each deviation should be squared before the deviations are summed; squaring the total deviation of 36 gives an upper bound rather than the exact value.

Standard deviation is widely used for summarizing large data sets. It helps in quantifying errors and differences within the data and in determining a suitable sample size.

Variance is defined as the square of the standard deviation. Variance is helpful for drawing statistical inferences. The coefficient of variation is used to compare the variability of one characteristic across two different groups. It is calculated as the standard deviation divided by the arithmetic mean of the observations.

For the Table 3 data, the variance is (14.70)^2, which is about 216.

Variance is useful for comparing data series that have the same units but different spreads, while the coefficient of variation, being unitless, helps in comparing series measured in different units. A disadvantage of variance is that it is expressed in squared units, which makes it harder to interpret than the standard deviation.

What is Correlation?

Another term related to variables and statistics is correlation. Correlation is a relationship between two variables in a study. In other words, when two variables in a set of observations are related such that a change in one is accompanied by a change in the other, the variables are said to be correlated. Correlation helps to study the linear association between two quantitative variables.

Types of Correlation

There are primarily three categories of correlation. Here is a brief introduction to each.

If an increase in one variable is accompanied by an increase in the other, the correlation is positive; height and weight are an example. When the variables change in opposite directions, the correlation is negative. When there is no relation between the two variables in a data set, it is called zero correlation.
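To make the distinction between positive, negative, and zero correlation concrete, here is a small Python sketch. It assumes Pearson's correlation coefficient as the measure, which the article does not name explicitly, and all of the data values are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of paired deviations from the means
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


# Hypothetical data (illustration only)
heights = [150, 160, 170, 180, 190]
weights = [50, 58, 66, 75, 85]

r_positive = pearson_r(heights, weights)        # both increase together
r_negative = pearson_r(heights, weights[::-1])  # one rises as the other falls
r_zero = pearson_r([1, 2, 3, 4, 5], [2, 1, 4, 1, 2])  # no linear relation

print(round(r_positive, 3), round(r_negative, 3), round(r_zero, 3))
```

The sign of the coefficient indicates the direction of the relationship, and a value near zero indicates no linear association between the two variables.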
When the value of one variable changes at a constant rate with the change in another variable, the correlation is linear. When this rate of change is not constant, the correlation is termed non-linear.

A study involving only two variables is called simple correlation. When the study of two variables is done keeping the other variables constant, it is called partial correlation. When the other variables are allowed to vary and are studied together, it is termed multiple correlation.

Methods of Measuring a Correlation

Correlation can be measured in the following ways:
Statistics is an important subject not only for theory classes but also for its practical applications. For example, statistics helps in the data management of larger firms and organizations. From calculating errors in manufacturing to maintaining raw material supplies and other stock, dispersion plays a major role in businesses and organizations.