Describe

To start, choose stats>Describe.

../_images/describe1.png
  • Stats Summary: Choose the column of data to get the statistical summary.

  • Percentiles: The input box is pre-filled with the default percentiles to calculate, and these can be changed. Enter each value as a percentage, so 100 means 100%. Separate the values with commas; spaces around the commas are optional.
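
As an illustration of the Percentiles input format, the following is a minimal Python sketch of how such a string could be read. The function name `parse_percentiles`, the skipping of empty entries, and the 0 to 100 range check are assumptions made for this example; the text above does not say how the software validates the input.

```python
def parse_percentiles(text):
    """Parse a comma-separated percentile string such as "25, 50,75"."""
    values = []
    for token in text.split(","):
        token = token.strip()  # spaces around a number are optional
        if not token:
            continue  # assumption: ignore empty entries, e.g. a trailing comma
        value = float(token)
        # Assumption: values are percentages and must lie between 0 and 100.
        if not 0 <= value <= 100:
            raise ValueError(f"percentile out of range 0-100: {value}")
        values.append(value)
    return values


print(parse_percentiles("5, 25,50 ,100"))  # [5.0, 25.0, 50.0, 100.0]
```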

The sample output is:

../_images/describe2.png

The statistical items in the top table are fixed by default, while the percentiles in the lower table depend on the values entered in the Percentiles box. The statistical items are explained below:

  • N: The total number of valid inputted numbers.

  • Mean: The average of all valid inputted numbers.

  • SD: The standard deviation of all valid inputs. The software computes both the standard deviation and the variance on the assumption that the data are a sample from a population, so the divisor is \(n - 1\) rather than \(n\). The formula for standard deviation is \(\sigma = \sqrt{\frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}}\), where \(\mu\) is the Mean and \(n\) is N.

  • SE Mean: The standard error of the mean estimates how much sample means vary around the true population mean. The formula is \(\text{SEM} = \frac{\sigma}{\sqrt{n}}\), where \(\sigma\) is the SD and \(n\) is N.

  • Variance: The variance of the inputted numbers, which is the square of the standard deviation. The formula for variance is \(\sigma^2 = \frac{\sum_{i=1}^{n} (x_i - \mu)^2}{n - 1}\). Equivalently, the standard deviation is the square root of the variance: \(\sigma = \sqrt{\sigma^2}\).

  • CoefVar: The coefficient of variation (CV) is the ratio of the standard deviation to the mean, expressed as a percentage. The formula for coefficient of variation is \(CV = \left(\frac{\sigma}{\mu}\right) \times 100\).

  • Sum: The total sum of all valid inputted numbers.

  • Min: The minimum of all valid inputted numbers.

  • Q1: Q1, or the first quartile, is the value below which 25% of the data falls. It marks the 25th percentile of the dataset.

  • Median: The median is a measure of central tendency that represents the middle value in a dataset when it is arranged in ascending order. It effectively divides the dataset into two equal halves, with 50% of the data points lying below and 50% above.

  • Q3: Q3, also known as the upper quartile, is the value that separates the highest 25% of data points from the lower 75% in an ordered dataset. It marks the 75th percentile of the data distribution.

  • Max: The maximum of all valid inputted numbers.

  • Range: The difference between Max and Min.

  • IQR: The Interquartile Range (IQR) is the difference between the third quartile (Q3) and the first quartile (Q1) of a dataset. It measures the spread of the middle 50% of the data.

  • Mode: In statistics, the mode is the value that appears most frequently in a dataset. For example, in the set {2, 3, 3, 4, 5, 5, 5, 6}, the mode is 5.

  • N for Mode: The number of times that Mode appears in the dataset.

  • Skewness: Skewness measures the extent to which a distribution deviates from symmetry around its mean.

  • Kurtosis: Kurtosis is a statistical measure that describes the shape of a probability distribution, specifically focusing on its “tailedness”.

  • Percentiles: A percentile is a value below which a certain percentage of observations fall in a dataset.
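
To make the definitions above concrete, here is a minimal Python sketch that computes most of the listed items with the standard library, using the set from the Mode example. It is not the software's implementation. The function name `describe` is hypothetical, the quartiles use the "exclusive" interpolation of `statistics.quantiles` (the software may use a different quartile method and give slightly different Q1 and Q3), and ties for the Mode are resolved by taking the first value to reach the highest count. Skewness, Kurtosis, and the user-defined percentiles are left out because the text gives no formula for them.

```python
import math
import statistics
from collections import Counter


def describe(data):
    """Return the summary statistics listed above for a list of numbers."""
    n = len(data)
    mean = statistics.fmean(data)
    variance = statistics.variance(data)  # sample variance, divides by n - 1
    sd = math.sqrt(variance)              # sample standard deviation
    # Assumption: "exclusive" quartile interpolation; the software may differ.
    q1, median, q3 = statistics.quantiles(data, n=4, method="exclusive")
    # Assumption: on ties, the first value to reach the top count is reported.
    mode, n_for_mode = Counter(data).most_common(1)[0]
    return {
        "N": n,
        "Mean": mean,
        "SD": sd,
        "SE Mean": sd / math.sqrt(n),
        "Variance": variance,
        "CoefVar": sd / mean * 100,  # expressed as a percentage
        "Sum": sum(data),
        "Min": min(data),
        "Q1": q1,
        "Median": median,
        "Q3": q3,
        "Max": max(data),
        "Range": max(data) - min(data),
        "IQR": q3 - q1,
        "Mode": mode,
        "N for Mode": n_for_mode,
    }


summary = describe([2, 3, 3, 4, 5, 5, 5, 6])
for name, value in summary.items():
    print(f"{name}: {value}")
```

For this set the sketch reports N = 8, Sum = 33, Mean = 4.125, Mode = 5, and N for Mode = 3. The variance is 12.875 / 7 because the divisor is n - 1, not n.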