
[Help a newbie] How to use a regular expression to extract the substring after string A and before string B (if B is present)

outman87 · 2022-01-07 10:14:39 +08:00 · 1559 views
    Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096
    Caused by: java.lang.ArrayIndexOutOfBoundsException
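    Given the two strings above, I want to extract (rather than merely match) the text after "Caused by: ", stopping at errorMsg if it is present. How should the regular expression be written?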

    import re

    str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'
    str2 = "Caused by: java.lang.ArrayIndexOutOfBoundsException"
    res1 = re.findall(r'Caused by: ((?!errorMsg=).)*', str1)
    res2 = re.findall(r'Caused by: (?:(?!errorMsg=).)*', str1)
    res3 = re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str1)
    print(res1)
    print(res2)
    print(res3)
    res4 = re.findall(r'Caused by: ((?!errorMsg=).)*', str2)
    res5 = re.findall(r'Caused by: (?:(?!errorMsg=).)*', str2)
    res6 = re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str2)
    print(res4)
    print(res5)
    print(res6)

    =================== RESTART: C:\Users\anonymous\Desktop\test.py ===================
    [' ']
    ['Caused by: errorCode=QUERYILLEGAL0001 ']
    ['errorCode=QUERYILLEGAL0001 ']
    ['n']
    ['Caused by: java.lang.ArrayIndexOutOfBoundsException']
    ['java.lang.ArrayIndexOutOfBoundsException']

    The three variants:
    re.findall(r'Caused by: ((?!errorMsg=).)*', str)
    re.findall(r'Caused by: (?:(?!errorMsg=).)*', str)
    re.findall(r'Caused by: ((?:(?!errorMsg=).)*)', str)

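    Only the odd-looking third variant does what I need. Why doesn't the first one give the expected result? Is there a more "elegant", more concise way to write this?

    Thanks!!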
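
    On the first variant: in Python's re module, a capturing group that is repeated keeps only the text of its last repetition, and re.findall returns the group's content instead of the whole match when the pattern has exactly one group. The sketch below reuses str1 from the post and inspects the match object directly; the variable names are illustrative only.

```python
import re

str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'

# Variant 1: the capturing group wraps a single character and is repeated,
# so once matching finishes it only holds the last character it consumed.
m1 = re.search(r'Caused by: ((?!errorMsg=).)*', str1)
whole_match = m1.group(0)  # the full matched text
last_char = m1.group(1)    # only the final repetition of the group

# Variant 3: the repetition sits inside the capturing group,
# so the group spans every character that was consumed.
m3 = re.search(r'Caused by: ((?:(?!errorMsg=).)*)', str1)
captured = m3.group(1)

print(repr(whole_match))  # 'Caused by: errorCode=QUERYILLEGAL0001 '
print(repr(last_char))    # ' '
print(repr(captured))     # 'errorCode=QUERYILLEGAL0001 '
```

    This agrees with the output above: variant 1 matches the right span, but findall reports only the group, which is the single space before errorMsg.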
    #1 · b1iy · 2022-01-07 10:26:01 +08:00
    A blind guess:
    regex = r"(?<=Caused\sby:)(\N+(?=errorMsg)|\N+)"
    #2 · b1iy · 2022-01-07 10:27:05 +08:00
    Copied the wrong one.

    regex = r"(?<=Caused\sby:)(.+(?=errorMsg)|.+)"
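
    A minimal sketch of how this pattern might be applied to the two strings from the post. The helper name extract_cause is hypothetical and not from the thread. Note two things about the pattern: the group keeps the space after "Caused by:" (and the one before errorMsg), so the sketch strips the result, and the greedy .+ stops at the last occurrence of errorMsg on the line, not the first.

```python
import re

regex = r"(?<=Caused\sby:)(.+(?=errorMsg)|.+)"

str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'
str2 = "Caused by: java.lang.ArrayIndexOutOfBoundsException"


def extract_cause(line):
    """Hypothetical helper: return the text after 'Caused by:', or None."""
    m = re.search(regex, line)
    if m is None:
        return None
    # The group includes the surrounding whitespace, so trim it here.
    return m.group(1).strip()


print(extract_cause(str1))  # errorCode=QUERYILLEGAL0001
print(extract_cause(str2))  # java.lang.ArrayIndexOutOfBoundsException
```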
    #3 · outman87 (OP) · 2022-01-07 10:57:56 +08:00
    @b1iy It works! Thank you. I have barely used lookaheads before, so I'll work through it on my own.
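
    For anyone else working through lookaheads, here is one possible shorter form, offered as a sketch and not as an answer given in the thread: a lazy .*? followed by a lookahead that accepts either errorMsg= or the end of the string. It assumes each input is a single line; PATTERN and cause_text are illustrative names.

```python
import re

# Lazy capture after "Caused by: ", ending just before optional whitespace
# plus "errorMsg=", or at the end of the string if errorMsg= never appears.
PATTERN = re.compile(r'Caused by: (.*?)(?=\s*errorMsg=|$)')

str1 = 'Caused by: errorCode=QUERYILLEGAL0001 errorMsg= codeMsg=QRYILLEGAL0096'
str2 = "Caused by: java.lang.ArrayIndexOutOfBoundsException"


def cause_text(line):
    """Illustrative helper: return the captured text, or None if no match."""
    m = PATTERN.search(line)
    return m.group(1) if m else None


print(cause_text(str1))  # errorCode=QUERYILLEGAL0001
print(cause_text(str2))  # java.lang.ArrayIndexOutOfBoundsException
```

    Because the quantifier is lazy, this stops at the first errorMsg= on the line, and the \s* inside the lookahead leaves the trailing space out of the captured group.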