Data Visualisation
--
Part-3
Bar graphs are commonly used to represent data. Let's see how to plot attractive and insightful bar graphs using matplotlib.
1. Plot a simple bar graph with two arrays.
import matplotlib.pyplot as plt
plt.bar([0,1,2,3],[12,23,21,8])
You can see that the bars are centered on the x-coordinates 0, 1, 2 and 3.
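To check the centering numerically rather than by eye, a small sketch like the one below can inspect the rectangles that the bar call returns. It uses plt.subplots() and ax.bar instead of plt.bar, which is my own choice for the sketch and draws the same bars. The names centers and widths are made up for illustration.

```python
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
bars = ax.bar([0, 1, 2, 3], [12, 23, 21, 8])

# Each bar is a Rectangle; get_x() gives its left edge,
# so the centre is the left edge plus half the width.
centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
widths = [bar.get_width() for bar in bars]

print(centers)
print(widths)
```

The printed widths also show the default bar width, which matters once two sets of bars share the same axis.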
Now let's see how we can plot graphs to compare data from two years. Suppose you are given the grain production for 3 months of 2018 and 2019.
plt.bar([0,1,2],[34,46,23]) #Production in 2018
plt.bar([0,1,2],[23,45,43]) #Production in 2019
Oops! The data for 2018 and 2019 overlap. This happened because both sets of bars are centered on 0, 1 and 2.
So we will change the position on the x-axis for one of them. Let's use an array pos_x=np.array([0,1,2]) to define the x-axis positions for one year, and add 0.25 to it for the other.
import numpy as np
pos_x=np.array([0,1,2])
plt.bar(pos_x,[34,46,23]) #Production in 2018
plt.bar(pos_x+0.25,[23,45,43]) #Production in 2019
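It still overlaps, because each bar is 0.8 wide by default, which is more than the offset between the two sets. We can either reduce the width of the bars or increase the gap between the pos_x values. Let's try reducing the width to 0.5 and shifting the second set by the same amount.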
pos_x=np.array([0,1,2])
plt.bar(pos_x,[34,46,23], width=0.5) #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5) #Production in 2019
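The arithmetic behind this can be sketched with a tiny helper. Note that bars_overlap is a hypothetical function written for this post, not part of matplotlib. It assumes both bars are centered on their x positions and that 0.8 is the default width.

```python
def bars_overlap(offset, width_a=0.8, width_b=0.8):
    """Return True if two centred bars, `offset` apart, overlap."""
    # Two bars just touch when the distance between their centres
    # equals the sum of their half-widths.
    return abs(offset) < (width_a + width_b) / 2

print(bars_overlap(0.25))            # default widths, small shift
print(bars_overlap(0.5))             # default widths, bigger shift
print(bars_overlap(0.5, 0.5, 0.5))   # width 0.5 and shift 0.5
```

Only the last combination leaves the bars side by side without overlapping, which is the one used in the code above.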
Great, no overlap now! Still, it would look better with some space between the data for different months. Multiplying pos_x by 2 puts the groups two units apart.
pos_x=np.array([0,1,2])*2
plt.bar(pos_x,[34,46,23], width=0.5) #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5) #Production in 2019
Looks good now.
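If you ever need more than two years side by side, the same idea can be generalised. The sketch below is only one possible way to do it. The function grouped_positions and its parameters n_groups, n_series, width and gap are names I made up for illustration.

```python
import numpy as np

def grouped_positions(n_groups, n_series, width=0.5, gap=1.0):
    """Return bar centres with shape (n_series, n_groups)."""
    # Each group needs room for all its bars plus an empty gap.
    group_span = n_series * width + gap
    starts = np.arange(n_groups) * group_span   # first bar of each group
    offsets = np.arange(n_series) * width       # shift for each series
    return starts[None, :] + offsets[:, None]

pos = grouped_positions(n_groups=3, n_series=2)
print(pos)
```

With 3 groups, 2 series, a width of 0.5 and a gap of 1.0 this gives the same positions as pos_x and pos_x+0.5 above, so passing pos[0] and pos[1] to two plt.bar calls with width=0.5 should reproduce the same layout.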
2. Give Labels
We will now give a label to each set of bars and plot a legend.
pos_x=np.array([0,1,2])*2
plt.bar(pos_x,[34,46,23], width=0.5, label="2018") #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5, label="2019") #Production in 2019
plt.legend()
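plt.legend() also takes optional arguments. As a small sketch, the loc and title parameters below are standard matplotlib options, but the values "upper left" and "Year" are just my picks for this example.

```python
import numpy as np
import matplotlib.pyplot as plt

pos_x = np.array([0, 1, 2]) * 2
plt.bar(pos_x, [34, 46, 23], width=0.5, label="2018")
plt.bar(pos_x + 0.5, [23, 45, 43], width=0.5, label="2019")

# Without loc, matplotlib chooses a position for the legend itself.
legend = plt.legend(loc="upper left", title="Year")

# The legend entries come from the label= arguments above.
print([text.get_text() for text in legend.get_texts()])
```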
3. Put labels on x-axis
To make the plot easier to interpret, we will show the names of the months being compared. We do this using the tick_label parameter.
pos_x=np.array([0,1,2])*2
plt.bar(pos_x,[34,46,23], width=0.5, label="2018", tick_label=["Jan", "Feb", "Mar"]) #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5, label="2019") #Production in 2019
plt.legend()
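With tick_label, each month name sits under the 2018 bar, because the ticks are placed at pos_x. If you would rather have the name centered under each pair, one option (a sketch, not the only way) is to set the ticks yourself with plt.xticks. The variable names width and tick_pos are mine.

```python
import numpy as np
import matplotlib.pyplot as plt

width = 0.5
pos_x = np.array([0, 1, 2]) * 2
plt.bar(pos_x, [34, 46, 23], width=width, label="2018")
plt.bar(pos_x + width, [23, 45, 43], width=width, label="2019")

# The midpoint of each pair is half a bar width to the right of pos_x.
tick_pos = pos_x + width / 2
plt.xticks(tick_pos, ["Jan", "Feb", "Mar"])
plt.legend()
```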
4. Give a title to the plot and both the axes
pos_x=np.array([0,1,2])*2
plt.bar(pos_x,[34,46,23], width=0.5, label="2018", tick_label=["Jan", "Feb", "Mar"]) #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5, label="2019") #Production in 2019
plt.title("Wheat Production")
plt.xlabel("Months")
plt.ylabel("Production in Kgs")
plt.legend()
Now let's try changing the theme to make it look more attractive.
Just add this one line above your existing code.
plt.style.use("dark_background")
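plt.style.use changes the theme for every plot that follows. If you only want it for a single figure, a sketch like this one uses plt.style.context instead, and plt.style.available lists the style names your matplotlib installation ships with. The small bar chart inside is just placeholder data taken from the 2018 numbers.

```python
import matplotlib.pyplot as plt

# Check that the style name exists in this installation.
print("dark_background" in plt.style.available)

# The style applies only to figures created inside this block.
with plt.style.context("dark_background"):
    fig = plt.figure()
    ax = plt.gca()
    plt.bar([0, 1, 2], [34, 46, 23])
    plt.title("Wheat Production")
```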
5. Change the color of bars
plt.style.use("dark_background")
pos_x=np.array([0,1,2])*2
plt.bar(pos_x,[34,46,23], width=0.5, label="2018", tick_label=["Jan", "Feb", "Mar"], color="blue") #Production in 2018
plt.bar(pos_x+0.5,[23,45,43], width=0.5, label="2019", color="orange") #Production in 2019
plt.title("Wheat Production")
plt.xlabel("Months")
plt.ylabel("Production in Kgs")
plt.legend()
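The color parameter is not limited to plain names like "blue". As a rough sketch, matplotlib.colors.to_hex shows which hex code a color name stands for, and a hex string can be passed to color directly. The particular colors and the edgecolor value below are only examples I chose.

```python
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt

# Named colours, "tab:" palette colours and hex strings are all accepted.
hex_codes = {name: mcolors.to_hex(name) for name in ["blue", "orange", "tab:blue"]}
print(hex_codes)

# A hex string works anywhere a colour name does.
bars = plt.bar([0, 1, 2], [34, 46, 23], color="#ffa500", edgecolor="white")
```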