Updated: 2023-02-26 09:11:36
If I understand correctly, you just have to split the sentence into words, loop over each one, and check whether it starts or ends with the required characters, e.g.:
>>> sentence = "AASFG BBBSDC FEKGG SDFGF".split()
>>> [word for word in sentence if word.endswith("GF")]
['SDFGF']
The split() call could be replaced with nltk.tokenize.word_tokenize(text), where text is the raw sentence string.
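Since the check can apply to either end of a word, the two tests can be combined. The following is only a minimal sketch: the helper name `matching_words` and its `prefix`/`suffix` parameters are made up for illustration and are not part of the original answer.

```python
def matching_words(words, prefix=None, suffix=None):
    """Return the words that start with `prefix` or end with `suffix`.

    Hypothetical helper for illustration; either argument may be omitted.
    """
    matches = []
    for word in words:
        starts = prefix is not None and word.startswith(prefix)
        ends = suffix is not None and word.endswith(suffix)
        if starts or ends:
            matches.append(word)
    return matches


words = "AASFG BBBSDC FEKGG SDFGF".split()
print(matching_words(words, suffix="GF"))   # ['SDFGF']
print(matching_words(words, prefix="BB"))   # ['BBBSDC']
```

Passing both arguments keeps any word that satisfies either condition.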
Update, regarding the comment:
How can get word in-front of that and behind it
The enumerate function can be used to give each word a number, like this:
>>> print(list(enumerate(sentence)))
[(0, 'AASFG'), (1, 'BBBSDC'), (2, 'FEKGG'), (3, 'SDFGF')]
Then, if you run the same loop but keep the index:
>>> results = [(idx, word) for (idx, word) in enumerate(sentence) if word.endswith("GG")]
>>> print(results)
[(2, 'FEKGG')]
...you can use the index to get the next or previous item:
>>> for r in results:
...     r_idx = r[0]
...     print("Prev", sentence[r_idx-1])
...     print("Next", sentence[r_idx+1])
...
Prev BBBSDC
Next SDFGF
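You'd need to handle the case where the match is the very first or last word (if r_idx == 0, if r_idx == len(sentence) - 1). Otherwise sentence[r_idx-1] silently wraps around to the last word when r_idx is 0, and sentence[r_idx+1] raises an IndexError when the match is the last word.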
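One way to handle the first-word and last-word cases is to substitute None for a missing neighbour. This is a sketch under that assumption, not the only reasonable approach; the generator name `with_neighbours` is hypothetical. Note that the last valid index is len(words) - 1.

```python
def with_neighbours(words, suffix):
    """Yield (previous, word, next) for each word ending in `suffix`.

    Hypothetical helper; `previous` or `next` is None when the match
    is the first or last word.
    """
    last = len(words) - 1
    for idx, word in enumerate(words):
        if word.endswith(suffix):
            # Guard idx - 1: a negative index would wrap to the end of the list.
            prev_word = words[idx - 1] if idx > 0 else None
            # Guard idx + 1: indexing past the end would raise IndexError.
            next_word = words[idx + 1] if idx < last else None
            yield prev_word, word, next_word


words = "AASFG BBBSDC FEKGG SDFGF".split()
print(list(with_neighbours(words, "GG")))  # [('BBBSDC', 'FEKGG', 'SDFGF')]
print(list(with_neighbours(words, "GF")))  # [('FEKGG', 'SDFGF', None)]
print(list(with_neighbours(words, "FG")))  # [(None, 'AASFG', 'BBBSDC')]
```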