I have a dataframe in which every row is duplicated, so when I plot a graph it contains duplicate bars. I tried deleting the duplicates, but then my graph is plotted incorrectly. My CSV is here.
Dataframe common_users:
    used_at  common users  pair of websites
0      2014          1364  avito.ru and e1.ru
1      2014          1364  e1.ru and avito.ru
2      2014          1716  avito.ru and drom.ru
3      2014          1716  drom.ru and avito.ru
4      2014          1602  avito.ru and auto.ru
5      2014          1602  auto.ru and avito.ru
6      2014           299  avito.ru and avtomarket.ru
7      2014           299  avtomarket.ru and avito.ru
8      2014           579  avito.ru and am.ru
9      2014           579  am.ru and avito.ru
10     2014           602  avito.ru and irr.ru/cars
11     2014           602  irr.ru/cars and avito.ru
12     2014           424  avito.ru and cars.mail.ru/sale
13     2014           424  cars.mail.ru/sale and avito.ru
14     2014           634  e1.ru and drom.ru
15     2014           634  drom.ru and e1.ru
16     2014           475  e1.ru and auto.ru
17     2014           475  auto.ru and e1.ru
.....
You can see that each pair appears twice, with the website names reversed. I tried sorting by pair of websites, but I get a KeyError. This is the code I use:
import itertools
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("avito_trend.csv", parse_dates=[2])

def f(df):
    dfs = []
    for x in [list(x) for x in itertools.combinations(df['address'].unique(), 2)]:
        # IDs of users seen on each of the two sites
        c1 = df.loc[df['address'].isin([x[0]]), 'ID']
        c2 = df.loc[df['address'].isin([x[1]]), 'ID']
        c = pd.Series(list(set(c1).intersection(set(c2))))
        # add inverted intersection c2 vs c1
        c_invert = pd.Series(list(set(c2).intersection(set(c1))))
        dfs.append(pd.DataFrame({'common users': len(c), 'pair of websites': ' and '.join(x)}, index=[0]))
        # swap values in x
        x[1], x[0] = x[0], x[1]
        dfs.append(pd.DataFrame({'common users': len(c_invert), 'pair of websites': ' and '.join(x)}, index=[0]))
    return pd.concat(dfs)

common_users = df.groupby([df['used_at'].dt.year]).apply(f).reset_index(drop=True, level=1).reset_index()
graph_by_common_users = common_users.pivot(index='pair of websites', columns='used_at', values='common users')
# sort by column 2014
graph_by_common_users = graph_by_common_users.sort_values(2014, ascending=False)

ax = graph_by_common_users.plot(kind='barh', width=0.5, figsize=(10, 20))
for label in ax.get_xticklabels():
    label.set_rotation(25)

# one value label per bar; patches are ordered column by column
rects = ax.patches
labels = [int(round(graph_by_common_users.loc[i, y])) for y in graph_by_common_users.columns.tolist() for i in graph_by_common_users.index]
for rect, label in zip(rects, labels):
    ax.text(rect.get_width() + 3, rect.get_y() + rect.get_height(), label, fontsize=8)
plt.show()
My graph looks like this:
rects = ax1.patches
labels = ["%d" % i for i in time['time online'].round()]
for rect, label in zip(rects, labels):
    print(rect, label)
    height = rect.get_height()
    ax1.text(rect.get_x() + rect.get_width()/2, height + 5, label, ha='center', va='bottom')
I explained my problem in the question. - ldevyataykina, 20.03.2016
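For reference, one possible way to collapse the reversed pairs before pivoting is sketched below. It is only a sketch under stated assumptions: the small `common_users` frame is hypothetical sample data copied from the rows shown in the question (not the real CSV), and `canonical_pair` is a helper name introduced here for illustration. The idea is to put the two site names in each pair into alphabetical order, so that "e1.ru and avito.ru" and "avito.ru and e1.ru" become the same string, and then drop the duplicate rows.

```python
import pandas as pd

# Hypothetical sample mirroring the first rows shown in the question.
common_users = pd.DataFrame({
    "used_at": [2014] * 6,
    "common users": [1364, 1364, 1716, 1716, 1602, 1602],
    "pair of websites": [
        "avito.ru and e1.ru", "e1.ru and avito.ru",
        "avito.ru and drom.ru", "drom.ru and avito.ru",
        "avito.ru and auto.ru", "auto.ru and avito.ru",
    ],
})


def canonical_pair(pair):
    """Return the pair with its two site names in alphabetical order."""
    return " and ".join(sorted(pair.split(" and ")))


# Normalize every pair, then keep one row per (year, pair).
deduped = (
    common_users
    .assign(**{"pair of websites": common_users["pair of websites"].map(canonical_pair)})
    .drop_duplicates(subset=["used_at", "pair of websites"])
    .reset_index(drop=True)
)

# Same pivot and sort as in the question, now with one row per pair.
graph_by_common_users = deduped.pivot(
    index="pair of websites", columns="used_at", values="common users"
).sort_values(2014, ascending=False)

print(graph_by_common_users)
```

Since `itertools.combinations` already yields each unordered pair only once, simply not appending the swapped pair inside `f` should avoid the duplicates at the source. As for the KeyError: after `pivot`, 'pair of websites' is the index rather than a column, so if the sort was attempted on the pivoted frame, `sort_index()` may be what is needed instead of `sort_values('pair of websites')`.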