Updated: 2022-12-09 15:11:10
Use Series.str.extract to create a matching column that you can merge on, then bring the group over with a merge. Drop the 'GROUP' column that already exists in df2 before merging. I left the 'match' column in the result for clarity.
When a name contains several matching substrings, .str.extract keeps only the first match, so the merge uses that one. (Multiple matches can be handled with .str.extractall and a groupby that combines everything into, say, a list.)
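# Build one capture group that matches any name in df1, e.g. '(A|B|C|D)'
pat = '(' + '|'.join(df1['NAME']) + ')'
# Keep the first matching substring of each df2 name (expand=False returns a Series)
df2['match'] = df2['NAME'].str.extract(pat, expand=False)
# Drop the stale 'GROUP' column, then left-merge df1's groups on 'match'
df2 = df2.drop(columns='GROUP').merge(df1.rename(columns={'NAME': 'match'}), how='left')
print(df2)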
    NAME match GROUP
0     AA     A    A1
1    AAA     A    A1
2   AAAA     A    A1
3     BB     B    B1
4    BBB     B    B1
5   BBBB     B    B1
6     CC     C    C1
7    CCC     C    C1
8   CCCC     C    C1
9     DD     D    D1
10   DDD     D    D1
11  DDDD     D    D1
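
For the multiple-match case mentioned above, here is a minimal sketch of one way the .str.extractall plus groupby idea could look. The sample frames, the 'row' helper column, and the 'GROUPS' output column are all made up for illustration and are not from the original question.

```python
import pandas as pd

# Hypothetical sample data: 'AB' contains two names from df1, 'ZZ' contains none.
df1 = pd.DataFrame({'NAME': ['A', 'B', 'C'], 'GROUP': ['A1', 'B1', 'C1']})
df2 = pd.DataFrame({'NAME': ['AA', 'AB', 'CC', 'ZZ']})

pat = '(' + '|'.join(df1['NAME']) + ')'

# extractall returns one row per match, indexed by (original row, match number).
found = (
    df2['NAME'].str.extractall(pat)[0]
    .rename('match')
    .reset_index(level=1, drop=True)   # drop the match-number level
    .rename_axis('row')                # 'row' is an illustrative helper name
    .reset_index()
    .drop_duplicates()                 # 'AA' matches 'A' twice; keep it once
    .merge(df1.rename(columns={'NAME': 'match'}), on='match', how='left')
)

# Collect every matched group per original row into a list.
groups = found.groupby('row')['GROUP'].agg(list)

# Index alignment leaves NaN for rows with no match at all.
df2['GROUPS'] = groups
print(df2)
```

Rows without any match end up as NaN rather than an empty list, because the assignment aligns on the index; whether that is the right convention depends on what happens downstream.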