0203Regular expression

Posted Aug 26, 2023

By gold-duo 1 min read

贪婪与非贪婪匹配p67

  
#1.贪婪匹配 *后加?
content='Hello 1234567 World This is a Regex Demo'
result1= re.match("^He.*(\d+).*Demo$",content)    #只匹配到数字7，.*尽可能匹配多的字符串只给\d+留下一个满足条件的7
result2= re.match("^He.*?(\d+).*Demo$",content)   #正常匹配 1234567
print("result1",r1.group(1))
print("result2",r2.group(1))

#2.非贪婪匹配.*?, 如果匹配的结果在字符串结尾，有可能匹配不到任何内容
content='http://weibo.com/comment/kEraCN'
r1=re.match('http.*?comment/(.*?)',content) #匹配不到任何内容
r2=re.match('http.*?comment/(.*)',content)  #正常匹配 kEraCN
print("result1",r1.group(1))
print("result2",r2.group(1))

修饰符 p69

修饰符	描述
re.I	大小写敏感
re.M	匹配多行。影响^和$
re.S	匹配包括换行符在内的所有自负

re的方法和其作用 p66-73

方法	描述	其他
match	从字符串的起始位置匹配表达式	如果开始位置匹配就返回结果，如果开始位置不匹配则返回None
search	扫描整个字符串，返回第一个匹配结果
findall	search的进化版返回所有匹配结果	finadall返回列表类型或者None
sub	正则表达式的replace	例子re.sub(“\d+”,”“,text)将 text 里的所有数字去除掉
compile	编译正则表达式	例子pat=re.compile(“\d{2}:\d{2}”) r=re.sub(pat,’‘,text)

python, 网络爬虫开发实战

This post is licensed under CC BY 4.0 by the author.

贪婪与非贪婪匹配p67

修饰符 p69

re的方法和其作用 p66-73

Trending Tags