且构网

How to check whether a sentence is correct (simple grammar checking in Python)?

Updated: 2023-01-07 11:50:00

How to check whether a sentence is valid in Python?

Examples:

I love *** - Correct
I *** love - Incorrect

Check out NLTK. It supports grammars that you can use to parse your sentence. You can define your own grammar, or use one that is provided, together with a context-free parser. If the sentence parses, it is grammatical according to that grammar; if not, it isn't. These grammars may not have the widest coverage (e.g., it might not know how to handle a word like ***), but this approach lets you say specifically what is valid or invalid in the grammar. Chapter 8 of the NLTK book covers parsing and should explain what you need to know.
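As a rough sketch of the grammar-based approach, the snippet below defines a tiny context-free grammar with `nltk.CFG.fromstring` and checks sentences with `nltk.ChartParser`. The grammar rules, the vocabulary, and the helper name `is_grammatical` are made up for illustration; a real grammar would need far more rules and words.

```python
import nltk

# Toy grammar: the rules and vocabulary are assumptions for illustration only.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> 'I' | 'Python' | Det N
VP -> V NP
Det -> 'the'
N -> 'parser'
V -> 'love' | 'use'
""")

parser = nltk.ChartParser(grammar)


def is_grammatical(sentence):
    """Return True if the toy grammar yields at least one parse tree."""
    tokens = sentence.split()  # naive whitespace tokenization
    try:
        # parse() returns an iterator over parse trees; one tree is enough.
        return next(parser.parse(tokens), None) is not None
    except ValueError:
        # NLTK raises ValueError when a token is not in the grammar's vocabulary.
        return False


for s in ["I love Python", "I Python love", "I love Java"]:
    print(s, "->", is_grammatical(s))
```

Note that a word missing from the grammar ("Java" here) makes the sentence count as invalid, which is the limited-coverage trade-off described above.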

An alternative would be to write a Python interface to a wide-coverage parser (like the Stanford parser or C&C). These are statistical parsers that can analyze sentences even if they haven't seen all the words or all the grammatical constructions before. The downside is that the parser will sometimes still return a parse for a sentence with bad grammar, because it uses the statistics to make the best guess possible.
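The "best guess" behaviour can be imitated on a very small scale with NLTK's own probabilistic tools. This is not the Stanford parser or C&C, only a toy stand-in: the grammar, its probabilities, and the helper name `best_parse` are invented for illustration, whereas real statistical parsers estimate their probabilities from large treebanks. The grammar deliberately allows an unusual word order with low probability, so an odd sentence still gets a parse.

```python
import nltk

# Toy probabilistic grammar; rules and probabilities are invented for illustration.
pcfg = nltk.PCFG.fromstring("""
S -> NP VP [1.0]
VP -> V NP [0.9] | NP V [0.1]
NP -> 'I' [0.5] | 'Python' [0.5]
V -> 'love' [1.0]
""")

viterbi = nltk.ViterbiParser(pcfg)


def best_parse(sentence):
    """Return the most probable parse tree, or None if there is none."""
    tokens = sentence.split()  # naive whitespace tokenization
    try:
        return next(viterbi.parse(tokens), None)
    except ValueError:
        # Raised when a token is not in the grammar's vocabulary.
        return None


for s in ["I love Python", "I Python love"]:
    tree = best_parse(s)
    # Both sentences parse; only the probability tells them apart.
    print(s, "->", tree.prob() if tree is not None else None)
```

Both sentences come back with a tree, so "did it parse?" is no longer a usable test. With a parser of this kind you would have to look at the probability (or some other score) and choose a cutoff yourself.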

So it really depends on what your goal is. If you want very precise control over what is considered grammatical, use a context-free parser with NLTK. If you want robustness and wide coverage, use a statistical parser.