Charts And Graphs
Stem and Leaf Plot
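The stem is used to group the scores, and each leaf indicates an individual score within its group. For two-digit scores, the stem is usually the tens digit and the leaf is the units digit. For example, the scores 21, 23 and 28 share the stem 2 and have the leaves 1, 3 and 8.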
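As a minimal sketch of the idea, the snippet below builds a stem and leaf plot in Python. The scores are made up for illustration, and the helper name stem_and_leaf is hypothetical. It assumes non-negative whole-number scores, with the tens as the stem and the units as the leaf.

```python
from collections import defaultdict


def stem_and_leaf(scores):
    """Group scores by stem (tens) and list their leaves (units)."""
    plot = defaultdict(list)
    for score in sorted(scores):
        stem, leaf = divmod(score, 10)  # 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    return dict(plot)


# Made-up scores, used only to illustrate the layout.
scores = [12, 15, 21, 23, 23, 28, 34, 37, 41]
plot = stem_and_leaf(scores)

for stem in sorted(plot):
    leaves = " ".join(str(leaf) for leaf in plot[stem])
    print(f"{stem} | {leaves}")
```

Each printed row is one group: the number left of the bar is the stem, and every digit to the right is one individual score.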
Histogram
A histogram is a graphical display of the frequency distribution of data. One of the easiest ways to construct a histogram is with pivot tables in Excel. A histogram tells me about the shape of the distribution.
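A histogram can also be constructed with the help of the 2^k rule: the number of classes k is chosen as the smallest whole number for which 2^k is greater than the number of observations n.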
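The 2^k rule can be sketched in a few lines of Python. This is only an illustration: the data is made up, the function name number_of_classes is hypothetical, and the rule is taken in its common textbook form (smallest k with 2^k greater than n). Equal-width classes from the minimum to the maximum are also an assumption.

```python
def number_of_classes(n):
    """Smallest k such that 2**k is greater than n (the 2^k rule)."""
    k = 1
    while 2 ** k <= n:
        k += 1
    return k


# Made-up observations, used only to show the mechanics.
data = [2, 3, 5, 7, 8, 9, 10, 11, 11, 12,
        12, 13, 14, 15, 15, 16, 18, 19, 21, 22]

k = number_of_classes(len(data))
low, high = min(data), max(data)
width = (high - low) / k  # equal-width classes (an assumption)

# Count how many observations fall into each class.
counts = [0] * k
for x in data:
    index = min(int((x - low) / width), k - 1)  # put the maximum in the last class
    counts[index] += 1

for i, count in enumerate(counts):
    start = low + i * width
    print(f"{start:5.1f} to {start + width:5.1f}: {'*' * count}")
```

The rows of asterisks are a rough text version of the histogram bars, one row per class.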
The shape of a histogram can be symmetric (normal), positively skewed, negatively skewed, or bimodal.
Symmetric (Normal): If the histogram is bell-shaped, I can say the data is approximately normally distributed. For a normal distribution, the mean is the best measure of central tendency.
Positively skewed: If the histogram has a tail toward the right, it is said to be skewed to the right. Positively skewed data has a few observations with very high values. Here, the mean is typically greater than the median, which is greater than the mode. For skewed data, the median is the best measure of central tendency.
Negatively skewed: If the histogram has a tail toward the left, it is said to be skewed to the left. Negatively skewed data has a few observations with very low values. Here, the mean is typically less than the median, which is less than the mode. For skewed data, the median is the best measure of central tendency.
Bimodal: Here two modes (peaks) can be observed.
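To see how mean, median and mode are ordered in skewed data, here is a small sketch using Python's statistics module. The values are made up and deliberately include one very high observation, so the sample has a long right tail.

```python
import statistics

# Made-up values with one very high observation (long right tail).
values = [20, 22, 22, 22, 25, 25, 27, 30, 35, 90]

mean = statistics.mean(values)
median = statistics.median(values)
mode = statistics.mode(values)

print("mean:", mean, "median:", median, "mode:", mode)

# In this positively skewed sample the high value pulls the mean upward,
# while the median and the mode stay with the bulk of the data.
if mean > median > mode:
    print("mean > median > mode: consistent with positive skew")
```

The single large value moves the mean far more than the median, which is why the median is preferred as the measure of central tendency for skewed data.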
Box Plot
A box plot indicates whether there are any outliers in the dataset. Any point beyond the whiskers, which conventionally reach at most 1.5 times the interquartile range from the box, is considered an outlier. The lower line of the box is the 1st quartile, the middle line is the median, and the upper line is the 3rd quartile.
A box plot is also an indicator of symmetry. It can tell us about the shape of the underlying distribution.
Normal Distribution: If the median line is close to the center of the box and the whisker lengths are about the same, then the sample is likely from a symmetric (normal) population.
Positively skewed: If the top whisker is much longer than the bottom whisker and the median line sits toward the bottom of the box, then the sample is from a population which is skewed to the right. Here, the mean is typically greater than the median, which is greater than the mode. For skewed data, the median is the best measure of central tendency.
Negatively skewed: If the bottom whisker is much longer than the top whisker and the median line sits toward the top of the box, then the sample is from a population which is skewed to the left. Here, the mean is typically less than the median, which is less than the mode. For skewed data, the median is the best measure of central tendency.
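The numbers behind a box plot can be sketched with the standard library. The data below is made up, and the outlier limits use the common 1.5 times IQR convention, which is an assumption here. Different tools also compute quartiles slightly differently; this sketch uses the "inclusive" method of statistics.quantiles.

```python
import statistics

# Made-up sample with one unusually large value.
data = [10, 12, 13, 14, 15, 16, 17, 18, 20, 45]

# Three cut points: 1st quartile, median, 3rd quartile.
q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# Conventional outlier limits (assumed): 1.5 * IQR beyond the box.
lower_limit = q1 - 1.5 * iqr
upper_limit = q3 + 1.5 * iqr
outliers = [x for x in data if x < lower_limit or x > upper_limit]

print("Q1:", q1, "median:", median, "Q3:", q3)
print("outliers:", outliers)

# Rough skew check: compare the two halves of the box around the median.
if (q3 - median) > (median - q1):
    print("median sits toward the bottom of the box")
elif (q3 - median) < (median - q1):
    print("median sits toward the top of the box")
else:
    print("median is centered in the box")
```

Points flagged as outliers are the ones a box plot would draw individually beyond the whiskers.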
PP Plot and QQ Plot
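A P-P plot indicates whether data follows a normal distribution. It plots the corresponding areas under the curve (the cumulative distribution function): the cumulative proportions of the sample against the cumulative probabilities of the normal distribution. If the points fall close to a straight diagonal line, the data is normally distributed. If they bend away from the line, for example in an S-shape, the data is not normally distributed.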
A Q-Q plot indicates whether data is skewed to the right or left. Here the actual values of X are plotted against the theoretical values of X under the normal distribution. The use of Q-Q plots is to compare the distribution of a sample to a theoretical distribution, here the standard normal distribution. Points along a straight line suggest normality, while a curved pattern suggests skew.
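The points for both plots can be computed with statistics.NormalDist from the standard library. This is a rough sketch under stated assumptions: the sample is made up, the helper names pp_points and qq_points are hypothetical, the normal distribution is fitted with the sample mean and standard deviation, and the plotting position (i - 0.5) / n is one of several conventions in use.

```python
from statistics import NormalDist, mean, stdev


def pp_points(sample):
    """(theoretical cumulative probability, empirical cumulative proportion)."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    return [(fitted.cdf(x), (i - 0.5) / n) for i, x in enumerate(xs, start=1)]


def qq_points(sample):
    """(theoretical normal quantile, actual sample value)."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    return [(fitted.inv_cdf((i - 0.5) / n), x) for i, x in enumerate(xs, start=1)]


# Made-up measurements, used only for illustration.
sample = [4.1, 4.8, 5.0, 5.2, 5.5, 5.9, 6.3, 7.0]

for theoretical, actual in qq_points(sample):
    print(f"theoretical {theoretical:6.3f}   actual {actual:6.3f}")
```

Plotting either list of pairs as a scatter gives the corresponding plot. The closer the points lie to the diagonal, the better the normal distribution describes the sample.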
Scatterplot
A scatterplot tells me the strength and direction of the linear relationship between two variables.
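The strength and direction seen in a scatterplot are commonly summarized by the Pearson correlation coefficient r. The sketch below computes it by hand. The paired values are made up, and the function name pearson_r is hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


# Made-up paired observations, e.g. hours studied and marks scored.
hours = [1, 2, 3, 4, 5]
marks = [52, 55, 61, 64, 70]

r = pearson_r(hours, marks)
print("r =", round(r, 3))
```

The sign of r gives the direction (positive means both variables rise together), and the closeness of r to +1 or -1 gives the strength of the linear relationship. Here r comes out close to +1, a strong positive relationship.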