47 of 100: Stacked bar chart in matplotlib
At the beginning of the year I challenged myself to recreate all 100 visualizations from the 1 dataset, 100 visualizations project using Python and matplotlib, and I am sharing the code for every one of them.
Note: Data Viz Project is copyright Ferdio and available under a Creative Commons Attribution – Non Commercial – No Derivatives 4.0 International license. I asked Ferdio and they told me they used a design tool to create all the plots.
Collaborate
There are plenty of improvements that could be made to the code, so let me know in the comments what you change and I will update the post accordingly!
This is the original viz that we are trying to recreate in matplotlib:

Import the packages
We will need the following packages:
import matplotlib.pyplot as plt
import pandas as pd
Generate the data
We could go straight from NumPy arrays to matplotlib, but most data projects use pandas to transform the data, so I am using a pandas dataframe as the starting point.
color_dict = {
    (2022, "Norway"): "#9194A3", (2004, "Norway"): "#2B314D",
    (2022, "Denmark"): "#E2AFA5", (2004, "Denmark"): "#A54836",
    (2022, "Sweden"): "#C4D6F8", (2004, "Sweden"): "#5375D4",
}
xy_ticklabel_color = "#C8C9C9"
xlabel_color = "#101628"
grand_totals_color = "#101628"
grid_color = "#C8C9C9"
datalabels_color = "#2B314D"
data = {
    "year": [2004, 2022, 2004, 2022, 2004, 2022],
    "countries": ["Denmark", "Denmark", "Norway", "Norway", "Sweden", "Sweden"],
    "sites": [4, 10, 5, 8, 13, 15],
}
df = pd.DataFrame(data)
index | year | countries | sites |
---|---|---|---|
0 | 2004 | Denmark | 4 |
1 | 2022 | Denmark | 10 |
2 | 2004 | Norway | 5 |
3 | 2022 | Norway | 8 |
4 | 2004 | Sweden | 13 |
5 | 2022 | Sweden | 15 |
We need to create the subtotals for each country and the two-digit year labels, then sort the data and attach a color to every row.
# Two-digit year label, e.g. 2004 -> '04
df['year_lbl'] = "'" + df['year'].astype(str).str[-2:]
# Total number of sites per country across both years
df['sub_total'] = df.groupby('countries')['sites'].transform('sum')
# Custom sort: 2022 before 2004, then Denmark, Norway, Sweden
sort_order_dict = {"Denmark": 1, "Norway": 2, "Sweden": 3, 2022: 4, 2004: 5}
df = df.sort_values(by=['year', 'countries'], key=lambda x: x.map(sort_order_dict))
# Add the color based on the color dictionary, keyed by (year, country)
df['color'] = df.set_index(['year', 'countries']).index.map(color_dict.get)
index | year | countries | sites | year_lbl | sub_total | color |
---|---|---|---|---|---|---|
1 | 2022 | Denmark | 10 | '22 | 14 | #E2AFA5 |
3 | 2022 | Norway | 8 | '22 | 13 | #9194A3 |
5 | 2022 | Sweden | 15 | '22 | 28 | #C4D6F8 |
0 | 2004 | Denmark | 4 | '04 | 14 | #A54836 |
2 | 2004 | Norway | 5 | '04 | 13 | #2B314D |
4 | 2004 | Sweden | 13 | '04 | 28 | #5375D4 |
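The custom sort can look odd at first, because a single dictionary ranks both the years and the countries. It works because `sort_values` applies the `key` function to each column listed in `by` separately. Here is a minimal, self-contained sketch of just that step, using the same data as above, so you can check the resulting row order yourself:

```python
import pandas as pd

df = pd.DataFrame({
    "year": [2004, 2022, 2004, 2022, 2004, 2022],
    "countries": ["Denmark", "Denmark", "Norway", "Norway", "Sweden", "Sweden"],
    "sites": [4, 10, 5, 8, 13, 15],
})

# One dictionary ranks both columns: 2022 (rank 4) sorts before 2004 (rank 5),
# and the countries sort as Denmark, Norway, Sweden.
sort_order_dict = {"Denmark": 1, "Norway": 2, "Sweden": 3, 2022: 4, 2004: 5}

# The key is called once per column in `by` and receives that column as a Series
df = df.sort_values(by=["year", "countries"],
                    key=lambda col: col.map(sort_order_dict))

print(df.index.tolist())     # original row labels in their new order
print(df["sites"].tolist())  # 2022 values first, then 2004 values
```

The 2022 rows end up first, which matters later: the bars are drawn in this order.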
Define the variables
countries = df.countries.unique()
years = df.year.unique()
x = len(df.countries.unique())
codes = df.year_lbl
colors = df.color
Plot the chart
fig, ax = plt.subplots(figsize=(5, 5), facecolor="#FFFFFF")
fig.tight_layout(pad=3.0)

# Draw one set of bars per year. Both sets start at y=0 and 2022 comes first
# in `years`, so the shorter 2004 bars are drawn in front of the 2022 bars.
for year in years:
    y = df[df["year"] == year]["sites"].values
    ax.bar(countries, y, width=0.5)

# ax.patches holds the bars in drawing order, which matches the row order of df
for bar, color, code in zip(ax.patches, colors, codes):
    bar.set_facecolor(color)
    ax.text(
        bar.get_x() + bar.get_width() / 2,
        bar.get_y() + bar.get_height() - 0.2,  # just below the top of the bar
        code,
        ha="center", va="top",
        color="w", weight="light",
    )

ax.tick_params(axis='x', which='major', length=0, labelsize=14, colors=xy_ticklabel_color, pad=15)
ax.tick_params(axis='y', which='major', labelsize=14, colors=xy_ticklabel_color, pad=15)
ax.set_ylim(0, 16)
ax.set_axisbelow(True)  # keep any grid lines behind the bars
ax.spines[['top', 'left', 'right', 'bottom']].set_visible(False)
ax.axhline(y=0, xmin=0, xmax=1, color=xy_ticklabel_color)  # baseline
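The coloring loop pairs `ax.patches` with `colors` and `codes` purely by position, so it only works if the bars come back in the same order as the dataframe rows. The sketch below is a quick way to check that assumption. It hard-codes the site counts in the sorted order (the `sites_by_year` name is just for illustration) and uses the non-interactive Agg backend so it runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

countries = ["Denmark", "Norway", "Sweden"]
# Illustrative stand-in for the sorted dataframe: 2022 first, then 2004
sites_by_year = {2022: [10, 8, 15], 2004: [4, 5, 13]}

fig, ax = plt.subplots()
for year, sites in sites_by_year.items():
    ax.bar(countries, sites, width=0.5)

# Each ax.bar call appends its rectangles to ax.patches in drawing order
heights = [float(bar.get_height()) for bar in ax.patches]
bottoms = [float(bar.get_y()) for bar in ax.patches]

print(heights)  # the three 2022 bars, then the three 2004 bars
print(bottoms)  # every bar starts at the baseline
```

Because no `bottom` argument is passed, every bar starts at zero, so the 2004 bars sit in front of the taller 2022 bars rather than on top of them.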
The result:
