21 of 100: Semicircular stacked bar chart in matplotlib
At the beginning of the year I challenged myself to recreate all 100 visualizations from the 1 dataset, 100 visualizations project using Python and matplotlib, and I am sharing the code for all of them.
Note: Data Viz Project is copyright Ferdio and available under a Creative Commons Attribution – Non Commercial – No Derivatives 4.0 International license. I asked Ferdio, and they told me they used a design tool to create all the plots.
Collaborate
There are a ton of improvements that can be made to the code, so let me know in the comments about any improvements you make and I will update the post accordingly!
To be improved: to make the wedges of the two rings line up, I added white wedges in between. There might be a better way; I will explore it later.
This is the original viz that we are trying to recreate in matplotlib:

Import the packages
We will need the following packages:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
Generate the data
I hardcoded the data until I have time to automate it. Each country and year gets two rows: first a white spacer wedge (the difference in sites to the 2022 value, which creates the space between the colored wedges), then the actual number of sites.
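color_dict = {(2004, "Norway"): "#9194A3", (2022, "Norway"): "#2B314D",
              (2004, "Denmark"): "#E2AFA5", (2022, "Denmark"): "#A54836",
              (2004, "Sweden"): "#C4D6F8", (2022, "Sweden"): "#5375D4",
              }
xy_ticklabel_color, datalabels_color = '#101628', "#FFFFFF"
# grid_color is used for the separator lines below but is not defined in the original code;
# the light grey here is an assumed value
grid_color = "#C8C9C9"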
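data = {
    "year": [2004, 2004, 2022, 2022, 2004, 2004, 2022, 2022, 2004, 2004, 2022, 2022],
    "countries": ["Sweden", "Sweden", "Sweden", "Sweden", "Denmark", "Denmark", "Denmark", "Denmark", "Norway", "Norway", "Norway", "Norway"],
    "sites": [2, 13, 0, 15, 6, 4, 0, 10, 3, 5, 0, 8]
}
df = pd.DataFrame(data)
# look up each row's color from its (year, country) pair; the later steps rely on df.color
df['color'] = [color_dict[(y, c)] for y, c in zip(df.year, df.countries)]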
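The hardcoded `sites` list could probably be derived from the raw counts instead. Here is a minimal sketch, assuming each spacer is the gap between a country's count in that year and its largest count across both years. The names `raw` and `with_padding` are hypothetical and not part of the original code.

```python
import pandas as pd

# Raw site counts per country and year (the colored-wedge values from the post).
raw = pd.DataFrame({
    "year": [2004, 2022, 2004, 2022, 2004, 2022],
    "countries": ["Sweden", "Sweden", "Denmark", "Denmark", "Norway", "Norway"],
    "sites": [13, 15, 4, 10, 5, 8],
})


def with_padding(raw):
    """Return a spacer row followed by a value row for every (year, country)."""
    # Largest count per country, repeated on every row of that country.
    widest = raw.groupby("countries")["sites"].transform("max")
    rows = []
    for (year, country, sites), top in zip(raw.itertuples(index=False, name=None), widest):
        rows.append((year, country, top - sites))  # white spacer wedge
        rows.append((year, country, sites))        # colored wedge
    return pd.DataFrame(rows, columns=["year", "countries", "sites"])


padded = with_padding(raw)
print(padded["sites"].tolist())
```

With this input the `sites` column comes out in the same order as the hardcoded list, so `padded` could stand in for `pd.DataFrame(data)`.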
We need the subtotals for each year, so we use pandas groupby with transform. Then we sort the rows by year (2022 first) and, within each year, by a custom country order.
df['sub_total'] = df.groupby('year')['sites'].transform('sum')
sort_order_dict = {"Sweden": 1, "Denmark": 2, "Norway": 3}
# stable sorts: order by country first, then by year, so the rows end up grouped by year
df = df.sort_values(by=['countries'], key=lambda x: x.map(sort_order_dict), kind='stable')
df = df.sort_values('year', ascending=False, kind='stable').reset_index(drop=True)
index | year | countries | sites | color |
---|---|---|---|---|
0 | 2022 | Sweden | 0 | #5375D4 |
1 | 2022 | Sweden | 15 | #5375D4 |
2 | 2022 | Denmark | 0 | #A54836 |
3 | 2022 | Denmark | 10 | #A54836 |
4 | 2022 | Norway | 0 | #2B314D |
5 | 2022 | Norway | 8 | #2B314D |
6 | 2004 | Sweden | 2 | #C4D6F8 |
7 | 2004 | Sweden | 13 | #C4D6F8 |
8 | 2004 | Denmark | 6 | #E2AFA5 |
9 | 2004 | Denmark | 4 | #E2AFA5 |
10 | 2004 | Norway | 3 | #9194A3 |
11 | 2004 | Norway | 5 | #9194A3 |
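To see what the groupby/transform step and the keyed sort do, here is a small sketch on a toy frame. The `toy` and `order` names are made up for illustration, and `kind="stable"` is my own choice so that rows with the same country keep their relative order.

```python
import pandas as pd

toy = pd.DataFrame({
    "year": [2004, 2004, 2022],
    "countries": ["Norway", "Sweden", "Sweden"],
    "sites": [5, 13, 15],
})

# transform('sum') returns one value per row: the total of that row's year.
toy["sub_total"] = toy.groupby("year")["sites"].transform("sum")

# key= maps each country to a rank, so the sort follows a custom order
# instead of the alphabetical one.
order = {"Sweden": 1, "Denmark": 2, "Norway": 3}
toy = toy.sort_values("countries", key=lambda s: s.map(order), kind="stable")

print(toy)
```

Unlike `groupby(...).sum()`, `transform` keeps the original number of rows, which is why the result can be assigned straight back as a new column.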
Define the variables:
years = df.year.unique()
countries = df.countries.unique()
sites = df.sites
widths = [0, 0.25]  # subtracted from the pie radius: the first year is the outer ring
distances = [0.8, 0.65]
# insert white before each color so the spacer wedges are invisible
bar_colors = df.groupby(['year'], sort=False)['color'].unique()
bar_color = [np.insert(bc, range(0, 3), 'w') for bc in bar_colors]
# label colors: the colors of the first (outer) ring, one per country
country_colors = df.loc[df.year == years[0], 'color'].unique()
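The `np.insert` call is what interleaves the white spacer color with the real ones. A minimal sketch with the three 2022 colors (the `ring` and `interleaved` names are only for illustration):

```python
import numpy as np

# Colors of one ring, in wedge order (Sweden, Denmark, Norway).
ring = np.array(["#5375D4", "#A54836", "#2B314D"], dtype=object)

# Insert 'w' (white) before positions 0, 1 and 2 of the original array,
# so every colored wedge is preceded by an invisible spacer.
interleaved = np.insert(ring, range(0, 3), "w")

print(interleaved.tolist())
```

The result has six colors for seven wedges. As far as I can tell, `ax.pie` cycles through the color list, so the final blank wedge picks up the first entry, which is white.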
Plot the chart
fig, ax = plt.subplots(figsize=(6, 6), facecolor="#FFFFFF")
for year, color, width, distance in zip(years, bar_color, widths, distances):
    sites = df[df.year == year]['sites'].tolist()
    sites.append(sum(sites))  # blank wedge that fills the hidden half of the circle
    labels = df[df.year == year]['sites'].astype(str).tolist()
    # keep labels only for the colored wedges (odd positions)
    labels = ["" if i % 2 == 0 else elem for i, elem in enumerate(labels)]
    labels.append("")
    wedges, texts = ax.pie(sites, radius=1 - width, startangle=-90,
                           wedgeprops=dict(width=0.2),
                           colors=color)
    shift = 8  # degrees to pull each label back from the end of its wedge
    for lbl, wedge in zip(labels, wedges):
        angle_lbls = wedge.theta2 - shift
        r = wedge.r - wedge.width / 2
        # convert polar to cartesian
        x_lbl = r * np.cos(np.deg2rad(angle_lbls))
        y_lbl = r * np.sin(np.deg2rad(angle_lbls))
        ax.annotate(lbl, xy=(x_lbl, y_lbl), size=12, color=datalabels_color,
                    ha='center', va='center', weight='bold')
    # Add the year label at the top end of the ring
    w = wedges[0]
    r = w.r - w.width / 2
    ax.annotate(f"{year} ", xy=(0, r), ha="right", color=xy_ticklabel_color)
# Add the country labels and separator lines (uses the wedges of the last ring drawn)
label_spacing = 0.3
country_labels = {2: "Sweden", 4: "Denmark", 6: "Norway"}
for key, color in zip(country_labels, country_colors):
    w = wedges[key - 1]
    angle = w.theta2
    r = w.r + w.width + label_spacing
    x = r * np.cos(np.deg2rad(angle))
    y = r * np.sin(np.deg2rad(angle))
    ax.annotate(country_labels[key] + " ", xy=(x, y), size=10, color=color,
                ha='left', va='center')
    # separator line between countries, drawn along the same angle
    ax.annotate("", xy=(0.4 * x, 0.4 * y), xytext=(0.85 * x, 0.85 * y),
                arrowprops=dict(arrowstyle="-", color=grid_color, lw=0.5), va="center")
ax.axvline(0, 0.1, 0.7, color=grid_color, lw=0.5)
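The semicircle trick in the plotting code may be easier to see in isolation. This is a minimal sketch of a single ring, not the full chart: appending the sum of the values as one extra white wedge makes the real wedges span exactly half the circle. The `label_xy` helper is a hypothetical name for the polar-to-cartesian step, and the `Agg` backend is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

values = [15, 10, 8]
padded = values + [sum(values)]  # the last wedge hides the other half

fig, ax = plt.subplots(figsize=(4, 4))
wedges, _ = ax.pie(padded, radius=1, startangle=-90,
                   wedgeprops=dict(width=0.2),
                   colors=["#5375D4", "#A54836", "#2B314D", "w"])

# Angle covered by the visible wedges, in degrees.
span = wedges[2].theta2 - wedges[0].theta1


def label_xy(wedge, shift=8):
    """Point in the middle of the ring, `shift` degrees before the wedge ends."""
    angle = np.deg2rad(wedge.theta2 - shift)
    r = wedge.r - wedge.width / 2
    return r * np.cos(angle), r * np.sin(angle)


x, y = label_xy(wedges[2])
ax.annotate(str(values[2]), xy=(x, y), ha="center", va="center", color="w")
fig.savefig("semicircle_sketch.png")
```

Starting at -90 degrees means the visible half runs counterclockwise from the bottom of the circle to the top, which is why the year labels in the full chart sit at `(0, r)`.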
The result:

Could you also separate the Countries with a line between them as in the original viz you are recreating?
Yes, of course! I just forgot about that. Will do!
I added the lines now 🙂