Updated: 2023-02-05 13:45:44
You can use NLTK to split the text into sentences and then capitalize each one:
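#!/usr/bin/env python3
import textwrap
from pprint import pprint

import nltk.data # $ pip install http://www.nltk.org/nltk3-alpha/nltk-3.0a3.tar.gz
# python -c "import nltk; nltk.download('punkt')"

# Load the pre-trained Punkt sentence tokenizer for English.
sent_tokenizer = nltk.data.load('tokenizers/punkt/english.pickle')

text = input('Enter a sentence/sentences please:')
print("\n" + textwrap.fill(text))  # echo the input, wrapped at 70 columns

# Split into sentences, then upper-case only the first character of each;
# str.capitalize() would also lowercase the rest of the sentence.
sentences = sent_tokenizer.tokenize(text)
sentences = [sent[:1].upper() + sent[1:] for sent in sentences]
pprint(sentences)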
Output:
Enter a sentence/sentences please:
a period might occur inside a sentence e.g., see! and the sentence may
end without the dot!
['A period might occur inside a sentence e.g., see!',
 'And the sentence may end without the dot!']
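
One thing to keep in mind when capitalizing the sentences: str.capitalize() upper-cases the first character but also lowercases everything after it, which matters as soon as the input contains acronyms or proper nouns. Below is a minimal sketch of the difference, using only the standard library. The helper name capitalize_first is hypothetical (it is not part of NLTK), and the sample string is made up for illustration.

```python
def capitalize_first(sentence: str) -> str:
    """Upper-case the first character and leave the rest untouched."""
    # Slicing with [:1] instead of [0] keeps this safe for empty strings.
    return sentence[:1].upper() + sentence[1:]


sample = "the NLTK tokenizer handles e.g. abbreviations."

# str.capitalize() lowercases the remainder, so "NLTK" becomes "nltk".
print(sample.capitalize())

# The helper only changes the first character, so "NLTK" is preserved.
print(capitalize_first(sample))
```

Which variant you want depends on the input: for text that is entirely lowercase, as in the example run above, both give the same result.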