PolarSPARC
Introduction to Matplotlib - Part 2
Bhaskar S | 09/24/2017
Hands-on Matplotlib
Bar Plot
Let us switch gears to explore some bar plots in Matplotlib.
To make a bar plot of the x and y values, use the bar() method as shown below:
plt.bar(x, y, color='b')
plt.title('Subject vs Grade of a Student', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.show()
The plot should look similar to the one shown in Figure.13 below:
The bar() method generates a simple bar graph. It is called with two parameters, both of type array (or list). The first parameter holds the x-axis positions of the bars; by default in Matplotlib 2.0 and later, each bar is centered on its x-axis value (pass align='edge' to anchor the left side of the bar at that value instead). The second parameter holds the y-axis values, which determine the height of each bar.
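How bar() positions each bar relative to its x-axis value can be checked directly on the rectangles it returns. The following is a minimal sketch, assuming made-up values for x and y (the real ones are defined earlier in this series) and the non-interactive Agg backend so that it runs without a display:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend, no window is opened
import matplotlib.pyplot as plt
import numpy as np

# Assumed sample data; not the values used elsewhere in this article
x = np.array([1, 2, 3, 4, 5])
y = np.array([85, 72, 90, 95, 78])

fig, (ax_center, ax_edge) = plt.subplots(1, 2)

# Default alignment (Matplotlib 2.0 and later): bars are centered on x
centered = ax_center.bar(x, y, color='b')

# align='edge' anchors the left side of each bar at x instead
edged = ax_edge.bar(x, y, color='b', align='edge')

# Each container holds one Rectangle per bar; get_x() is its left edge
print('centered: left edge =', centered[0].get_x(), 'width =', centered[0].get_width())
print('edge:     left edge =', edged[0].get_x(), 'width =', edged[0].get_width())
```

With the default bar width, the centered bar starts half a bar width to the left of its x-axis value, while the edge-aligned bar starts exactly at it.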
To display grades for two students in colors blue and cyan, execute the following methods as shown below:
plt.bar(x, y, color='b', label='Alice')
plt.bar(x, y2, color='c', label='Bob')
plt.title('Subject vs Grade for 2 Students', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.legend(loc=2)
plt.show()
The plot should look similar to the one shown in Figure.14 below:
By default, bar plots have no borders, and bars drawn at the same x-axis position overlap one another. They also use opaque colors, which means a bar drawn later hides the part of an earlier bar that lies behind it; if the later bar is the taller of the two, the earlier bar is hidden completely. This is evident from Figure.14 above, where the Math scores for Bob overshadow the Math scores for Alice.
To change the transparency of the bar plots so that both Alice's and Bob's scores are visible, execute the following methods as shown below:
plt.bar(x, y, color='b', label='Alice', alpha=0.8)
plt.bar(x, y2, color='c', label='Bob', alpha=0.8)
plt.title('Subject vs Grade for 2 Students', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.legend(loc=2)
plt.show()
The plot should look similar to the one shown in Figure.15 below:
The bar() method takes a parameter called alpha that controls the transparency of a bar plot. Its value ranges from 0.0 (fully transparent) to 1.0 (fully opaque).
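To see why both bars remain visible with alpha=0.8, it helps to work through the color arithmetic. The sketch below uses the standard "source over" blend, which is a simplified model of what the renderer does and not Matplotlib's actual code. The blend() helper is hypothetical, and a white plot background is assumed:

```python
def blend(foreground, background, alpha):
    """Blend two RGB colors (components in 0..1) using the given opacity."""
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, background))

BLUE = (0.0, 0.0, 1.0)    # RGB value of Matplotlib's 'b'
CYAN = (0.0, 0.75, 0.75)  # RGB value of Matplotlib's 'c'
WHITE = (1.0, 1.0, 1.0)   # assumed plot background

# Alice's bar drawn over the background
alice_only = blend(BLUE, WHITE, 0.8)

# Bob's bar drawn over Alice's bar where the two overlap
overlap = blend(CYAN, alice_only, 0.8)

print('Alice only:', alice_only)
print('Overlap:   ', overlap)
```

The overlap color is mostly cyan but keeps a trace of the blue underneath, which is what lets the hidden bar show through.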
To add a border around each bar in the plot, execute the following methods as shown below:
plt.bar(x, y, color='b', label='Alice', alpha=0.8, edgecolor='k')
plt.bar(x, y2, color='c', label='Bob', alpha=0.8, edgecolor='k')
plt.title('Subject vs Grade for 2 Students', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.legend(loc=2)
plt.show()
The plot should look similar to the one shown in Figure.16 below:
The bar() method takes a parameter called edgecolor that controls the color of the border around a bar plot.
To adjust the width of each bar in the plot, execute the following methods as shown below:
width = 0.3
plt.bar(x, y, color='b', width=width, label='Alice', alpha=0.8, edgecolor='k')
plt.bar(x, y2, color='c', width=width, label='Bob', alpha=0.8, edgecolor='k')
plt.title('Subject vs Grade for 2 Students', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.legend(loc=2)
plt.show()
The plot should look similar to the one shown in Figure.17 below:
The bar() method takes a parameter called width that controls the width of each bar (the default is 0.8).
To display corresponding bars from each dataset next to each other (side-by-side), execute the following methods as shown below:
width = 0.3
plt.bar(x, y, color='b', width=width, label='Alice', edgecolor='k')
plt.bar(x+width, y2, color='c', width=width, label='Bob', edgecolor='k')
plt.title('Subject vs Grade for 2 Students', color='#6b0eb2', fontsize='16', fontweight='bold')
plt.xlabel('Subjects', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xlim(xmin=0, xmax=6)
plt.ylim(ymin=60, ymax=100)
subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
plt.xticks(x+width/2, subjects, rotation=45)
plt.yticks(range(60, 100, 5))
plt.legend(loc=2)
plt.show()
The plot should look similar to the one shown in Figure.18 below:
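To display bars side by side, add the bar width to the x-axis values of the second bar plot, so that its bars start where the bars of the first plot end. Notice the adjustment to the xticks as well: shifting them by width/2 centers each label under its pair of bars. The expression x+width assumes that x is a NumPy array; a plain Python list does not support this arithmetic.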
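The same offset idea extends to more than two datasets. The following is a rough sketch under stated assumptions: the grades are made up, the third student (Carol) is hypothetical, and the Agg backend is used so that no window is opened:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

subjects = ['Biology', 'Chemistry', 'English', 'Math', 'Physics']
x = np.arange(1, len(subjects) + 1)  # assumed positions 1..5

# Assumed grades; Carol is a hypothetical third student
grades_by_student = {
    'Alice': [85, 72, 90, 95, 78],
    'Bob': [80, 88, 76, 97, 83],
    'Carol': [91, 79, 84, 70, 88],
}
colors = ['b', 'c', 'm']
width = 0.25

fig, ax = plt.subplots()
for i, (name, values) in enumerate(grades_by_student.items()):
    # Shift the i-th dataset to the right by i bar widths
    ax.bar(x + i * width, values, width=width, color=colors[i],
           label=name, edgecolor='k')

# Center each label under its group: the midpoint between the
# centers of the first and the last bar in the group
tick_positions = x + width * (len(grades_by_student) - 1) / 2
ax.set_xticks(tick_positions)
ax.set_xticklabels(subjects, rotation=45)
ax.set_ylim(60, 100)
ax.legend(loc=2)
```

For two datasets the tick offset reduces to width/2, which matches the adjustment used above.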
Box Plot
Next, shifting gears, let us explore some box plots in Matplotlib.
Let us generate a random sample of 100 grades (1 to 100). To do that, we first need to seed the random number generator by invoking the seed() method (for reproducibility) as shown below:
np.random.seed(50)
Now, to generate a random sample of 100 grades in the range 1 through 100 (with replacement), invoke the choice() method as shown below:
grades = np.random.choice(range(1, 101), 100, replace=True)
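The effect of seeding can be verified with a short experiment. In this sketch, sample_grades() is a hypothetical helper introduced only for illustration; it reseeds the generator and then draws the sample in the same way as above:

```python
import numpy as np

def sample_grades(seed, size=100):
    """Hypothetical helper: reseed the generator, then draw the grades."""
    np.random.seed(seed)
    return np.random.choice(range(1, 101), size, replace=True)

first = sample_grades(50)
second = sample_grades(50)   # same seed as the first draw
other = sample_grades(51)    # different seed

print('same seed, same sample:', np.array_equal(first, second))
print('different seed, same sample:', np.array_equal(first, other))
print('min =', first.min(), 'max =', first.max(), 'count =', len(first))
```

Reusing the seed reproduces the sample exactly, which is why the plots in this section should look the same from one run to the next.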
To make a box plot using the sample grades, use the boxplot() method as shown below:
plt.boxplot(grades)
plt.title('Grades of Students', color='#b2160e', fontsize='16', fontweight='bold')
plt.xlabel('Distribution', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xticks([])
plt.show()
The plot should look similar to the one shown in Figure.19 below:
The boxplot() method generates a simple box plot graph. It is called with a sample distribution of numerical data. The box plot is a way of displaying the distribution of the sample data based on the five-number summary: minimum, first quartile, median, third quartile, and maximum.
By default, the boxplot() method draws thin lines with no fill and a colored line for the median (orange in Matplotlib 2.0 and later, red in earlier versions). Note that a box plot uses a horizontal line at each end to display the caps (the minimum and maximum, excluding outliers), a box to represent the quartile range (the bottom of the box representing the first quartile, the top of the box representing the third quartile, and the line in between representing the median), and whiskers connecting the box to the caps. By default, points that lie more than 1.5 times the interquartile range beyond the box are treated as outliers and drawn as individual markers.
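The values behind the box, the median line, and the caps can be computed by hand and compared with what boxplot() actually draws. This is a minimal sketch, assuming the same seeded sample as above, the default whisker setting of 1.5 times the interquartile range, and the Agg backend so that no window is opened:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(50)
grades = np.random.choice(range(1, 101), 100, replace=True)

# Quartiles and interquartile range computed by hand
q1, median, q3 = np.percentile(grades, [25, 50, 75])
iqr = q3 - q1

# Whiskers stop at the most extreme data points within 1.5 * IQR of the box
lower_cap = grades[grades >= q1 - 1.5 * iqr].min()
upper_cap = grades[grades <= q3 + 1.5 * iqr].max()

fig, ax = plt.subplots()
parts = ax.boxplot(grades)  # returns a dictionary of the drawn artists

# For a vertical box plot, the y data of each line holds the drawn value
drawn_median = parts['medians'][0].get_ydata()[0]
drawn_lower_cap = parts['caps'][0].get_ydata()[0]
drawn_upper_cap = parts['caps'][1].get_ydata()[0]

print('median:', median, 'drawn:', drawn_median)
print('caps:  ', lower_cap, upper_cap, 'drawn:', drawn_lower_cap, drawn_upper_cap)
```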
To customize a box plot to use thicker lines and a specific fill color, use the boxplot() method as shown below:
plt.boxplot(grades, patch_artist=True,
            capprops=dict(color='r', linewidth=2),
            # edgecolor (rather than color) so that the outline color does not override facecolor
            boxprops=dict(facecolor='c', edgecolor='k', linewidth=2),
            medianprops=dict(color='r', linewidth=2),
            whiskerprops=dict(color='k', linewidth=2))
plt.title('Grades of Students', color='#b2160e', fontsize='16', fontweight='bold')
plt.xlabel('Distribution', fontsize='14', fontweight='bold')
plt.ylabel('Grades', fontsize='14', fontweight='bold')
plt.xticks([])
plt.show()
The plot should look similar to the one shown in Figure.20 below:
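The boxplot() method takes a few parameters. The first and most important parameter is patch_artist, which is set to True so that the box is drawn as a filled patch; without it, the box is drawn with plain lines and a fill color cannot be applied.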
The parameter capprops is used to customize the cap lines. It takes a dictionary of key-value pairs - we use it to customize the line color and thickness.
The parameter boxprops is used to customize the quartile box. It takes a dictionary of key-value pairs - we use it to customize the outline color, line thickness, and fill color.
The parameter medianprops is used to customize the median line within the quartile box. It takes a dictionary of key-value pairs - we use it to customize the line color and thickness.
The parameter whiskerprops is used to customize the whisker lines. It takes a dictionary of key-value pairs - we use it to customize the line color and thickness.
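As an alternative to the props dictionaries, the same styling can be applied after the plot is created, because boxplot() returns a dictionary of the artists it drew. The sketch below is one possible approach and not the method used in this article; it assumes the same seeded sample and the Agg backend:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt
from matplotlib.colors import to_rgba
import numpy as np

np.random.seed(50)
grades = np.random.choice(range(1, 101), 100, replace=True)

fig, ax = plt.subplots()
parts = ax.boxplot(grades, patch_artist=True)

# With patch_artist=True each box is a patch, so it has a face and an edge
for box in parts['boxes']:
    box.set_facecolor('c')
    box.set_edgecolor('k')
    box.set_linewidth(2)

# Medians, whiskers, and caps are plain lines
for median_line in parts['medians']:
    median_line.set_color('r')
    median_line.set_linewidth(2)
for line in parts['whiskers'] + parts['caps']:
    line.set_linewidth(2)

print(sorted(parts.keys()))
```

This form is handy when the styling depends on the data, for example when each box should get its own color.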
Histogram
Now, let us move on to explore some histogram plots in Matplotlib.
To make a histogram using the sample grades, use the hist() method as shown below:
plt.hist(grades)
plt.title('Grades of Students', color='#b2160e', fontsize='16', fontweight='bold')
plt.xlabel('Grades', fontsize='14', fontweight='bold')
plt.ylabel('Frequency', fontsize='14', fontweight='bold')
plt.show()
The plot should look similar to the one shown in Figure.21 below:
The hist() method generates a simple histogram graph. It is called with a sample of numerical data. The histogram is a way of displaying the frequency distribution of the sample data: the range of values is divided into a set of intervals (bins), and the height of each bar shows how many values fall into that bin.
By default, histograms have no borders and use 10 bins, as is evident from Figure.21 above.
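The binning can be inspected through the values that hist() returns: the counts, the bin edges, and the drawn bars. The following is a minimal sketch, assuming the same seeded sample as above and the Agg backend; it also cross-checks the counts against NumPy's histogram() function:

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np

np.random.seed(50)
grades = np.random.choice(range(1, 101), 100, replace=True)

fig, ax = plt.subplots()
counts, bin_edges, patches = ax.hist(grades)  # default: 10 equal-width bins

# The same binning computed without drawing anything
expected_counts, expected_edges = np.histogram(grades, bins=10)

# One row per bin: lower edge, upper edge, number of grades in the bin
for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print('%6.1f to %6.1f: %d' % (left, right, int(count)))
```

Ten bins need eleven edges, and the counts always add up to the sample size.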
To add a border around each frequency bar, change the fill color, and change the number of bins, execute the following methods as shown below:
plt.hist(grades, bins=20, facecolor='g', edgecolor='k')
plt.title('Grades of Students', color='#b2160e', fontsize='16', fontweight='bold')
plt.xlabel('Grades', fontsize='14', fontweight='bold')
plt.ylabel('Frequency', fontsize='14', fontweight='bold')
plt.show()
The plot should look similar to the one shown in Figure.22 below:
References