Is there a library function in Python for collapsing a string?

https://stackoverflow.com/questions/1249786

12-09-2019

Question

Is there a cross-platform library function that collapses a multi-line string into a single-line string with no repeated spaces?

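I've come up with the snippet below, but I wonder whether there is a standard function I could simply import, perhaps even one that is optimized in C?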

def collapse(input):
    import re
    rn = re.compile(r'(\r\n)+')
    r = re.compile(r'\r+')
    n = re.compile(r'\n+')
    s = re.compile(r'\ +')
    return s.sub(' ',n.sub(' ',r.sub(' ',rn.sub(' ',input))))

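P.S. Thanks for the good comments. ' '.join(input.split()) seems to be the winner: in my case it runs about twice as fast as search-and-replace with a precompiled r'\s+' regular expression.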
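The timing claim in the P.S. is easy to check locally. Below is a minimal sketch using the standard timeit module. The sample text and the names collapse_split and collapse_regex are made up for illustration, and the actual ratio will depend on the input and the Python version.

```python
import re
import timeit

# Hypothetical sample input; real-world timings depend on the data.
SAMPLE = "This\r\nis        a test\twith a\n  mix of\ttabs,     newlines\n" * 100

WHITESPACE = re.compile(r'\s+')  # precompiled, as in the P.S. comparison


def collapse_split(text):
    # Split on runs of whitespace, then rejoin with single spaces.
    return ' '.join(text.split())


def collapse_regex(text):
    # Replace every run of whitespace with a single space.
    return WHITESPACE.sub(' ', text)


if __name__ == '__main__':
    for func in (collapse_split, collapse_regex):
        seconds = timeit.timeit(lambda: func(SAMPLE), number=1000)
        print('%s: %.3f s' % (func.__name__, seconds))
```

The two functions agree except at the edges of the string, where the regex version leaves a single space.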

Solution

The built-in string.split() method splits on runs of whitespace, so you can use it and then join the resulting list with spaces, like this:

' '.join(my_string.split())

Here is a complete test script:

TEST = """This
is        a test\twith a
  mix of\ttabs,     newlines and repeating
whitespace"""

print(' '.join(TEST.split()))
# Prints:
# This is a test with a mix of tabs, newlines and repeating whitespace
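Two details of split() do the work here: called with no argument, it treats any run of whitespace as a single separator, and it ignores leading and trailing whitespace. A short sketch, using a made-up sample string, contrasts this with split(' '):

```python
text = "  a \t b\n\nc  "

# No argument: any run of whitespace is one separator, edges are dropped.
no_arg = text.split()

# Explicit ' ': only the space character separates, so empty strings appear
# for consecutive spaces and at the edges, and tabs/newlines are kept.
with_space = text.split(' ')

print(no_arg)
print(with_space)
print(repr(' '.join(no_arg)))
```

This is why the argument-less form is the one to use for collapsing whitespace.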

Other answers

You had the right idea; you just needed to read the Python manual a little more carefully:

import re
somewhitespace = re.compile(r'\s+')
TEST = """This
is        a test\twith a
  mix of\ttabs,     newlines and repeating
whitespace"""

somewhitespace.sub(' ', TEST)

'This is a test with a mix of tabs, newlines and repeating whitespace'
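One difference from the split() approach is that the regular expression replaces leading and trailing whitespace with a single space rather than removing it. If that matters, a strip() call can be added, as in this sketch (the sample string is made up for illustration):

```python
import re

somewhitespace = re.compile(r'\s+')
padded = "\n  This   is\ta test \n"

collapsed = somewhitespace.sub(' ', padded)
print(repr(collapsed))          # one space remains at each edge
print(repr(collapsed.strip()))  # matches ' '.join(padded.split())
```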

multi_line.replace('\n', '')

will do the job. '\n' is the universal end-of-line character in Python.
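Note that replace('\n', '') behaves differently from the other approaches: it removes newlines without inserting a space, and it leaves '\r' characters and repeated spaces untouched. A small comparison on a made-up input:

```python
multi_line = "first  line\nsecond line\r\nthird"

# Deletes only '\n'; words on adjacent lines run together.
removed = multi_line.replace('\n', '')

# Collapses every run of whitespace, including '\r\n', to one space.
collapsed = ' '.join(multi_line.split())

print(repr(removed))
print(repr(collapsed))
```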

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow