Updated: 2023-11-21 23:03:46
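Reshape the dataframe with pandas.crosstab, then plot it with pandas.DataFrame.plot using kind='bar' and stacked=True.
This should not be implemented with plt.hist, because that route is more complicated; using the pandas plot method directly is easier.
ct.iloc[:, :-1] selects every column except the last one, 'tot', so that only the medal counts are plotted as bars.
Use matplotlib.pyplot.bar_label to add the annotations.
Versions: python 3.10, pandas 1.3.5, matplotlib 3.5.1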
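To make the reshaping step concrete, here is a minimal sketch on a small, made-up dataframe (the rows below are hypothetical and are not taken from Libro1.csv). It assumes the same two column names, countries and type, and shows what pd.crosstab returns and what ct.iloc[:, :-1] keeps.

```python
import pandas as pd

# Hypothetical toy data: one row per medal won (not the Libro1.csv data)
df = pd.DataFrame({
    "countries": ["USA", "USA", "USA", "China", "China", "Japan"],
    "type": ["gold", "silver", "gold", "gold", "bronze", "bronze"],
})

# Count how often each (country, medal type) pair occurs;
# the resulting columns are sorted alphabetically: bronze, gold, silver
ct = pd.crosstab(df.countries, df.type)

# Row totals, needed only to sort the countries
ct["tot"] = ct.sum(axis=1)
ct = ct.sort_values(by="tot", ascending=False)

# Everything except the last column ('tot') is what would be plotted
medals = ct.iloc[:, :-1]
print(medals)
```

Note that crosstab orders the medal columns alphabetically, which matters for anything that relies on column position, such as a color list.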
import pandas as pd
# load from github repo link
url = 'https://raw.githubusercontent.com/jpiedehierroa/files/main/Libro1.csv'
df = pd.read_csv(url)
# reshape the dataframe
ct = pd.crosstab(df.countries, df.type)
# total medals per country, which is necessary to sort the bars
ct['tot'] = ct.sum(axis=1)
# sort
ct = ct.sort_values(by='tot', ascending=False)
# display(ct)
type         bronze  gold  silver  tot
countries
USA              33    39      41  113
China            18    38      32   88
ROC              23    20      28   71
GB               22    22      21   65
Japan            17    27      14   58
Australia        22    17       7   46
Italy            20    10      10   40
Germany          16    10      11   37
Netherlands      14    10      12   36
France           11    10      12   33
# colors listed in the same order as the medal columns of ct: bronze, gold, silver
colors = ("#CD7F32", "gold", "silver")
cd = dict(zip(ct.columns, colors))
# plot the medals columns
title = 'Country Medal Count for Tokyo 2020'
ax = ct.iloc[:, :-1].plot(kind='bar', stacked=True, color=cd, title=title,
                          figsize=(12, 5), rot=0, width=1, ec='k')
# annotate each container with individual values
for c in ax.containers:
    ax.bar_label(c, label_type='center')
# annotate the top containers with the cumulative sum
ax.bar_label(ax.containers[2], padding=3)
# pad the spacing between the number and the edge of the figure
ax.margins(y=0.1)
The 'tot' column can optionally be used for custom labels, but as the plot shows, this is not necessary.
labels = ct.tot.tolist()
ax.bar_label(ax.containers[2], labels=labels, padding=3)
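The reason the 'tot' column is optional can be checked with a small sketch. It assumes a two-row table built by hand from the USA and China rows above, and it selects the non-interactive Agg backend so that it runs without a display. With the default label_type='edge', bar_label prints the end-point of each bar, which for the top container of a stacked plot is the cumulative height.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import pandas as pd

# Hand-built table with the same column layout as ct (USA and China rows only)
ct = pd.DataFrame(
    {"bronze": [33, 18], "gold": [39, 38], "silver": [41, 32]},
    index=["USA", "China"],
)
ct["tot"] = ct.sum(axis=1)

ax = ct.iloc[:, :-1].plot(kind="bar", stacked=True, rot=0)

# Default edge labels on the top container: the text is each bar's end-point
default_labels = ax.bar_label(ax.containers[2], padding=3)
default_text = [t.get_text() for t in default_labels]

# The same numbers taken explicitly from the 'tot' column
custom_text = [str(v) for v in ct.tot.tolist()]
print(default_text, custom_text)
```

Both lists should hold the same strings, which is why passing labels=ct.tot.tolist() changes nothing visible here.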