且构网

Sharing all things software development

How to capture the longest sequence in a group

Updated: 2023-01-10 09:53:48

Use:

import re

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
matches = re.findall(r'(?:AGATC)+', sequence)

# Pick the longest match, i.e. the longest run of consecutive AGATC repeats
longest = max(matches, key=len)

Explanation:

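Non-capturing group (?:AGATC)+

  • + Quantifier: matches the group between one and unlimited times, as many times as possible (greedy).
  • AGATC matches the characters AGATC literally (case sensitive).

The group is non-capturing because re.findall returns the contents of the group, not the whole match, when the pattern contains a capturing group.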
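To illustrate why the group needs to be non-capturing, here is a small sketch comparing the two forms with re.findall. The shortened input string is made up for this illustration and is not from the original answer.

```python
import re

# Shortened, hypothetical input: a run of two repeats, a gap, then a single repeat.
sequence = "AGATCAGATCTTAGATC"

# Capturing group: findall returns what the group captured,
# which is only the last repetition inside each match.
capturing = re.findall(r'(AGATC)+', sequence)

# Non-capturing group: findall returns each whole match.
non_capturing = re.findall(r'(?:AGATC)+', sequence)

print(capturing)      # ['AGATC', 'AGATC']
print(non_capturing)  # ['AGATCAGATC', 'AGATC']
```

With the capturing form, every entry has the same length, so max(matches, key=len) could no longer tell the runs apart.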

Result:

# print(matches)
['AGATCAGATC', 'AGATCAGATCAGATCAGATCAGATC']

# print(longest)
'AGATCAGATCAGATCAGATCAGATC'
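If the position or the repeat count of the longest run is also needed, one possible variation uses re.finditer instead of re.findall. This is only a sketch: the helper name longest_run and its unit parameter are hypothetical and not part of the original answer.

```python
import re

def longest_run(sequence, unit="AGATC"):
    """Return (start_index, repeat_count) of the longest run of `unit`, or None.

    Hypothetical helper for illustration only.
    """
    # re.escape keeps the unit literal even if it contains regex metacharacters.
    pattern = re.compile(f"(?:{re.escape(unit)})+")
    # default=None avoids the ValueError that max() raises on an empty iterable.
    best = max(
        pattern.finditer(sequence),
        key=lambda m: m.end() - m.start(),
        default=None,
    )
    if best is None:
        return None
    return best.start(), (best.end() - best.start()) // len(unit)

sequence = "AGATCAGATCTTTTTTCTAATGTCTAGGATATATCAGATCAGATCAGATCAGATCAGATC"
result = longest_run(sequence)
print(result)
```

Unlike max(matches, key=len), which raises ValueError when there are no matches, this sketch returns None for input that contains no AGATC at all.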

You can test the regex here.