Python regex for finding the contents of MediaWiki markup links

Updated: 2021-11-09 21:59:32

Here is an example:

import re

pattern = re.compile(r"\[\[([\w \|]+)\]\]")
text = "blah blah [[Alexander of Paris|poet named Alexander]] bldfkas"
results = pattern.findall(text)

# Each match looks like "target|title"; keep only the part before the "|".
output = []
for link in results:
    output.append(link.split("|")[0])

print(output)
# outputs ['Alexander of Paris']

Version 2 puts more into the regex, but as a result changes the output:

import re

pattern = re.compile(r"\[\[([\w ]+)(\|[\w ]+)?\]\]")
text = "[[a|b]] fdkjf [[c|d]] fjdsj [[efg]]"
results = pattern.findall(text)

# results is [('a', '|b'), ('c', '|d'), ('efg', '')]

print([link[0] for link in results])

# outputs ['a', 'c', 'efg']
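
If both the link target and its title are needed, named groups avoid the leading "|" that Version 2 leaves in the second tuple element. The following is only a sketch: the names LINK_RE, parse_links, target and label are made up for illustration, and falling back to the target when no title is given is an assumption about what the caller wants.

```python
import re

# Same character classes as Version 2, but the "|" sits outside the
# second capturing group so it is not part of the captured title.
LINK_RE = re.compile(r"\[\[(?P<target>[\w ]+)(?:\|(?P<label>[\w ]+))?\]\]")

def parse_links(text):
    """Return (target, label) pairs for every link found in text."""
    links = []
    for match in LINK_RE.finditer(text):
        target = match.group("target")
        # Assumption: a link without a title is labelled with its target.
        label = match.group("label") or target
        links.append((target, label))
    return links

pairs = parse_links("[[a|b]] fdkjf [[c|d]] fjdsj [[efg]]")
print(pairs)
# outputs [('a', 'b'), ('c', 'd'), ('efg', 'efg')]
```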

Version 3, for when you only want the link target without the title:

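import re

# (?:...) is a non-capturing group, so findall returns only the link target.
pattern = re.compile(r"\[\[([\w ]+)(?:\|[\w ]+)?\]\]")
text = "[[a|b]] fdkjf [[c|d]] fjdsj [[efg]]"
results = pattern.findall(text)

print(results)
# outputs ['a', 'c', 'efg']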
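
Note that [\w ]+ only matches word characters and spaces, so links whose target contains punctuation (parentheses, "+", "#" and so on) are skipped by all three versions. One possible variation, shown here as a sketch rather than a complete MediaWiki parser, is to accept any character other than "[", "]" and "|". The names LINK_RE and link_targets are hypothetical.

```python
import re

# Assumption: a target or title is any run of characters except
# '[', ']' and '|'. Nested links and templates are not handled.
LINK_RE = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]|]*)?\]\]")

def link_targets(text):
    """Return the target of every [[target]] or [[target|title]] link."""
    return [target.strip() for target in LINK_RE.findall(text)]

sample = "[[Paris (Texas)|Paris]] and [[C++]] and [[St. John's#History|history]]"
targets = link_targets(sample)
print(targets)
# outputs ['Paris (Texas)', 'C++', "St. John's#History"]
```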
