
21 of 100: Semicircular stacked bar chart in matplotlib

At the beginning of the year I challenged myself to recreate all 100 visualizations from the 1 dataset, 100 visualizations project using Python and matplotlib, and I am sharing the code for all of them with you.

Note: Data Viz Project is copyright Ferdio and available under a Creative Commons Attribution – Non Commercial – No Derivatives 4.0 International license. I asked Ferdio, and they told me they used a design tool to create all the plots.

Collaborate

There are a ton of improvements that can be made to the code, so let me know in the comments about any improvements you make and I will update the post accordingly!

To be improved: to make the pies align, I added white wedges in between the colored ones. There might be a better way; I will explore it later.
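One possible alternative to the white spacer wedges, sketched here as an idea and not the approach this post uses, is to draw the rings as bars on a polar axis. Each bar can then be given its start angle directly, so no padding is needed. The inner radii (0.8 and 0.55) and the names `scale`, `slot_ends` and `rings` are illustrative assumptions; the site counts and colors are the ones used in this post.

```python
import matplotlib.pyplot as plt
import numpy as np

# Site counts in the order Sweden, Denmark, Norway (values from this post)
sites_2022 = np.array([15, 10, 8])
sites_2004 = np.array([13, 4, 5])
colors_2022 = ["#5375D4", "#A54836", "#2B314D"]
colors_2004 = ["#C4D6F8", "#E2AFA5", "#9194A3"]

# Map the 2022 total onto half a circle, starting at the bottom (-90 degrees)
scale = np.pi / sites_2022.sum()
slot_ends = -np.pi / 2 + np.cumsum(sites_2022) * scale

fig, ax = plt.subplots(figsize=(6, 6), subplot_kw=dict(projection="polar"))
rings = []
for sites, inner_radius, colors in [(sites_2022, 0.8, colors_2022),
                                    (sites_2004, 0.55, colors_2004)]:
    extents = sites * scale  # angular size of each bar, in radians
    # Each bar ends where its country slot ends, so both rings line up
    bars = ax.bar(slot_ends - extents, height=0.2, width=extents,
                  bottom=inner_radius, align="edge", color=colors)
    rings.append(bars)

ax.set_ylim(0, 1)
ax.set_axis_off()
```

Here `bottom` is the inner radius of a ring and `height` its thickness, while `width` is the angular extent. Labels would still have to be placed by hand, much like in the pie version.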

This is the original viz that we are trying to recreate in matplotlib:

Import the packages

We will need the following packages:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

Generate the data

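I hardcoded the data until I have time to automate it. Besides the actual number of sites, each country and year also needs a white spacer wedge in front of its colored wedge. The size of that spacer is the difference in sites between the two years (0 when no padding is needed).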

color_dict = {(2004,"Norway"): "#9194A3", (2022,"Norway"): "#2B314D",
              (2004,"Denmark"): "#E2AFA5", (2022,"Denmark"): "#A54836",
              (2004,"Sweden"): "#C4D6F8", (2022,"Sweden"): "#5375D4",
              }

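xy_ticklabel_color, datalabels_color = "#101628", "#FFFFFF"
grid_color = "#C8C9C9"  # guide-line color used further down; the exact value is an assumption (any light gray works)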

data = {
    "year": [2004, 2004, 2022, 2022, 2004, 2004, 2022, 2022, 2004, 2004, 2022, 2022],
    "countries": ["Sweden", "Sweden", "Sweden", "Sweden",
                  "Denmark", "Denmark", "Denmark", "Denmark",
                  "Norway", "Norway", "Norway", "Norway"],
    # each (country, year) pair holds two values: [white spacer, number of sites]
    "sites": [2, 13, 0, 15, 6, 4, 0, 10, 3, 5, 0, 8],
}

df = pd.DataFrame(data)
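The hardcoded `sites` list could probably be generated instead. The sketch below assumes the spacer rule is "largest count for that country across both years minus the count for that year", which reproduces the numbers above. The names `counts`, `gap` and `auto_df` are illustrative and not part of the original code.

```python
import pandas as pd

# Real site counts per country and year (the second value of each pair above)
counts = pd.DataFrame({
    "year": [2004, 2022, 2004, 2022, 2004, 2022],
    "countries": ["Sweden", "Sweden", "Denmark", "Denmark", "Norway", "Norway"],
    "sites": [13, 15, 4, 10, 5, 8],
})

# Assumed rule: spacer = largest count for the country minus the count for that year
counts["gap"] = counts.groupby("countries")["sites"].transform("max") - counts["sites"]

# Interleave spacer and count so each (country, year) contributes [gap, sites]
rows = []
for rec in counts.itertuples(index=False):
    rows.append((rec.year, rec.countries, rec.gap))
    rows.append((rec.year, rec.countries, rec.sites))

auto_df = pd.DataFrame(rows, columns=["year", "countries", "sites"])
```

With this input, `auto_df` has the same year, country and sites sequence as the hardcoded dictionary, so it could be swapped in for `df` before the grouping and sorting step.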

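We need the subtotals for each year, so we use pandas groupby. Then we look up the color of each row and sort the data by year (2022 first) and, within each year, by country in the order Sweden, Denmark, Norway.

df['sub_total'] = df.groupby('year')['sites'].transform('sum')
# look up the color of each (year, country) pair
df['color'] = [color_dict[key] for key in zip(df['year'], df['countries'])]
# stable sorts keep each white spacer row directly before its colored row
sort_order_dict = {"Sweden": 1, "Denmark": 2, "Norway": 3}
df = df.sort_values(by='countries', key=lambda x: x.map(sort_order_dict), kind='mergesort')
df = df.sort_values(by='year', ascending=False, kind='mergesort').reset_index(drop=True)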
    year countries  sites    color
0   2022    Sweden      0  #5375D4
1   2022    Sweden     15  #5375D4
2   2022   Denmark      0  #A54836
3   2022   Denmark     10  #A54836
4   2022    Norway      0  #2B314D
5   2022    Norway      8  #2B314D
6   2004    Sweden      2  #C4D6F8
7   2004    Sweden     13  #C4D6F8
8   2004   Denmark      6  #E2AFA5
9   2004   Denmark      4  #E2AFA5
10  2004    Norway      3  #9194A3
11  2004    Norway      5  #9194A3
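To see the two pandas idioms from this step in isolation, here is a small sketch on a toy dataframe. The `toy` and `rank` names are made up for the example, and `kind="mergesort"` is my own choice to guarantee a stable sort.

```python
import pandas as pd

toy = pd.DataFrame({
    "year": [2004, 2022, 2004, 2022],
    "countries": ["Norway", "Norway", "Sweden", "Sweden"],
    "sites": [5, 8, 13, 15],
})

# transform("sum") returns one value per row, so it can be assigned as a column
toy["sub_total"] = toy.groupby("year")["sites"].transform("sum")

# key= maps each country to a rank; a stable sort keeps ties in their existing order
rank = {"Sweden": 1, "Denmark": 2, "Norway": 3}
toy = toy.sort_values(by="countries", key=lambda s: s.map(rank), kind="mergesort")

print(toy)
```

Unlike `agg`, `transform` keeps the original row count, which is why the subtotal can sit next to every row of its year.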

Define the variables:

years = df.year.unique()
countries = df.countries.unique()
sites = df.sites
widths = [0, 0.25]  # how much each ring's radius shrinks: the first year is drawn outside
distances = [0.8, 0.65]

# insert a white color before each country color to hide the spacer wedges
bar_colors = df.groupby('year', sort=False)['color'].unique()
bar_color = [np.insert(bc, range(0, 3), 'w') for bc in bar_colors]
# drop the last six rows (2004) so only the three 2022 country colors remain
country_colors = df.color[:-6].unique()
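A quick look at what `np.insert` does with the colors of one ring may help. This is only a sketch: `ring_colors` stands in for one entry of `bar_colors`.

```python
import numpy as np

# The three country colors of one ring, as an object array like pandas' unique() returns
ring_colors = np.array(["#5375D4", "#A54836", "#2B314D"], dtype=object)

# Insert "w" (white) before positions 0, 1 and 2 of the original array
interleaved = np.insert(ring_colors, range(0, 3), "w")

print(interleaved.tolist())
# ['w', '#5375D4', 'w', '#A54836', 'w', '#2B314D']
```

The plot below passes seven values per ring but only these six colors. As far as I know, matplotlib cycles through the color list, so the seventh (hidden) wedge picks up the leading white again.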

Plot the chart

fig, ax = plt.subplots(figsize=(6, 6), facecolor="#FFFFFF")

for year, color, width in zip(years, bar_color, widths):
    sites = df[df.year == year]['sites'].tolist()
    labels = [str(s) for s in sites]
    # blank out the labels of the white spacer wedges (even positions)
    labels = ["" if i % 2 == 0 else elem for i, elem in enumerate(labels)]
    # append a wedge as large as all the others combined: the hidden 50% of the pie
    sites.append(sum(sites))
    labels.append("")
    wedges, texts = ax.pie(sites, radius=1 - width, startangle=-90,
                           wedgeprops=dict(width=0.2),
                           colors=color)

    shift = 8  # degrees to pull each label back from the end of its wedge
    for lbl, wedge in zip(labels, wedges):
        angle_lbls = wedge.theta2 - shift
        r = wedge.r - wedge.width / 2
        # convert polar to cartesian
        x_lbl = r * np.cos(np.deg2rad(angle_lbls))
        y_lbl = r * np.sin(np.deg2rad(angle_lbls))
        ax.annotate(lbl, xy=(x_lbl, y_lbl), size=12, color=datalabels_color,
                    ha='center', va='center', weight='bold')

    # add the year label at the top of the ring, just left of the vertical axis
    r = wedges[0].r - wedges[0].width / 2
    ax.annotate(f"{year}  ", xy=(0, r), ha="right", color=xy_ticklabel_color)

# Add the country labels
label_spacing = 0.3
country_labels = {2: "Sweden", 4: "Denmark", 6: "Norway"}
for key, color in zip(country_labels, country_colors):
    w = wedges[key - 1]  # colored wedge of this country in the last ring drawn
    angle = w.theta2
    r = w.r + w.width + label_spacing
    x = r * np.cos(np.deg2rad(angle))
    y = r * np.sin(np.deg2rad(angle))
    ax.annotate(country_labels[key] + " ", xy=(x, y), size=10, color=color,
                ha='left', va='center')
    # thin radial guide line at the end of the wedge, pointing toward the label
    ax.annotate("", xy=(0.4 * x, 0.4 * y), xytext=(0.85 * x, 0.85 * y),
                arrowprops=dict(arrowstyle="-", color=grid_color, lw=0.5), va="center")

# vertical line along the flat edge of the semicircle
ax.axvline(0, 0.1, 0.7, color=grid_color, lw=0.5)
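The half-pie trick and the label placement can be isolated in a much smaller sketch. It uses only the 2022 counts, and here I place each label at the middle of its wedge instead of near its end. The names `values` and `values_with_blank` are illustrative.

```python
import matplotlib.pyplot as plt
import numpy as np

values = [15, 10, 8]                        # Sweden, Denmark, Norway in 2022
values_with_blank = values + [sum(values)]  # the last wedge fills the hidden half

fig, ax = plt.subplots(figsize=(4, 4))
wedges, texts = ax.pie(values_with_blank, radius=1, startangle=-90,
                       wedgeprops=dict(width=0.2),
                       colors=["#5375D4", "#A54836", "#2B314D", "w"])

# Place each value at the mid-angle of its wedge, halfway across the ring
for value, wedge in zip(values, wedges):
    mid_angle = np.deg2rad((wedge.theta1 + wedge.theta2) / 2)
    r = wedge.r - wedge.width / 2
    ax.annotate(str(value), xy=(r * np.cos(mid_angle), r * np.sin(mid_angle)),
                ha="center", va="center", color="white", weight="bold")
```

Because the extra wedge equals the sum of the others, it always spans exactly 180 degrees. With `startangle=-90`, the visible wedges therefore run from the bottom to the top on the right-hand side.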

The result:
