How do I trim whitespace?

https://stackoverflow.com/questions/1185524

19-09-2019

Question

Is there a Python function that will trim whitespace (spaces and tabs) from a string?

Example: \t example string\t → example string

Solution

Whitespace on both sides:

s = "  \t a string example\t  "
s = s.strip()

Whitespace on the right side:

s = s.rstrip()

Whitespace on the left side:

s = s.lstrip()

As thedz points out, you can pass an argument to any of these functions to strip arbitrary characters, like this:

s = s.strip(' \t\n\r')

This will strip any space, \t, \n, or \r characters from both sides of the string (with lstrip or rstrip, from the left or right side only).
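
As a quick illustration of the point above, the sketch below uses a made-up sample string to show that the argument is treated as a set of characters to remove, not as a prefix or suffix to match:

```python
# The argument to strip()/lstrip()/rstrip() is a set of characters,
# not a substring. The sample string is invented for illustration.
padded = "xxyhello worldyxx"

both = padded.strip("xy")    # removes any run of 'x'/'y' from both ends
left = padded.lstrip("xy")   # left side only
right = padded.rstrip("xy")  # right side only

print(both)   # hello world
print(left)   # hello worldyxx
print(right)  # xxyhello world
```

Stripping stops at the first character on each side that is not in the set, so characters in the middle are never touched.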

The examples above only remove characters from the left and right ends of strings. If you also want to remove characters from the middle of a string, try re.sub:

import re
print(re.sub(r'\s+', '', s))

That should print:

astringexample
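
One detail worth knowing when writing such patterns: inside square brackets, + is a literal character, so a class like [\s+] matches plus signs as well as whitespace. A small sketch with an invented sample string:

```python
import re

# Sample string made up for illustration; it contains a literal '+'.
sample = " a + b\t= c\n"

no_ws = re.sub(r"\s+", "", sample)            # removes whitespace only
no_ws_or_plus = re.sub(r"[\s+]", "", sample)  # also removes the '+'

print(no_ws)          # a+b=c
print(no_ws_or_plus)  # ab=c
```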

Other answers

Python's trim method is called strip:

str.strip() #trim
str.lstrip() #ltrim
str.rstrip() #rtrim

For leading and trailing whitespace:

s = '   foo    \t   '
print(s.strip()) # prints "foo"

Otherwise, a regular expression works:

import re
pat = re.compile(r'\s+')
s = '  \t  foo   \t   bar \t  '
print(pat.sub('', s)) # prints "foobar"

You can also use a very simple, basic function, str.replace(), which works with both spaces and tabs:

>>> whitespaces = "   abcd ef gh ijkl       "
>>> tabs = "        abcde       fgh        ijkl"

>>> print(whitespaces.replace(" ", ""))
abcdefghijkl
>>> print(tabs.replace(" ", ""))
abcdefghijkl

Simple and easy.

#how to trim a multi line string or a file

s=""" line one
\tline two\t
line three """

# Line 1 starts with a space, line 2 starts and ends with a tab, line 3 ends with a space.

s1 = s.splitlines()
print(s1)
[' line one', '\tline two\t', 'line three ']

print([i.strip() for i in s1])
['line one', 'line two', 'line three']




# More details:

# We could also have used a for loop from the beginning:
for line in s.splitlines():
    line = line.strip()
    process(line)  # process() stands for whatever you do with each line

# We could also be reading a file line by line, e.g. my_file = open(filename), or with open(filename) as my_file:
for line in my_file:
    line = line.strip()
    process(line)

# Moot point: note that splitlines() removed the newline characters; we can keep them by passing True,
# although strip() will then remove them anyway.
s2 = s.splitlines(True)
print(s2)
[' line one\n', '\tline two\t\n', 'line three ']
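
The line-by-line approach above could be wrapped in a small helper. This is only a sketch: strip_lines is a hypothetical name, and io.StringIO stands in for a real file object here.

```python
import io

def strip_lines(lines):
    """Return each line with leading and trailing whitespace removed."""
    return [line.strip() for line in lines]

text = " line one\n\tline two\t\nline three "

# From a string: splitlines() already drops the newline characters.
from_string = strip_lines(text.splitlines())

# From a file-like object: iteration yields lines that still end in '\n',
# which strip() removes along with the other whitespace.
from_file = strip_lines(io.StringIO(text))

print(from_string)  # ['line one', 'line two', 'line three']
```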

No one has posted these regex solutions yet.

Matching:

>>> import re
>>> p=re.compile('\\s*(.*\\S)?\\s*')

>>> m=p.match('  \t blah ')
>>> m.group(1)
'blah'

>>> m=p.match('  \tbl ah  \t ')
>>> m.group(1)
'bl ah'

>>> m=p.match('  \t  ')
>>> print(m.group(1))
None

Searching (you have to handle the "only spaces" input case differently):

>>> p1=re.compile('\\S.*\\S')

>>> m=p1.search('  \tblah  \t ')
>>> m.group()
'blah'

>>> m=p1.search('  \tbl ah  \t ')
>>> m.group()
'bl ah'

>>> m=p1.search('  \t  ')
>>> m.group()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'group'

If you use re.sub instead, you may also remove inner whitespace, which could be undesirable.
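
The search variant can be made safe for whitespace-only input by checking for None before calling group(). The following is a sketch, not the answer's own code: regex_trim is a hypothetical helper, and its pattern differs slightly from the one above (an optional group plus re.DOTALL) so that single-character strings and embedded newlines are also handled.

```python
import re

# \S(?:.*\S)? matches from the first to the last non-whitespace character;
# the optional group lets a lone character such as 'a' match as well.
_TRIM = re.compile(r"\S(?:.*\S)?", re.DOTALL)

def regex_trim(text):
    """Trim leading/trailing whitespace; return '' if nothing else is left."""
    m = _TRIM.search(text)
    return m.group() if m else ""

print(repr(regex_trim("  \tbl ah  \t ")))  # 'bl ah'
print(repr(regex_trim("  \t  ")))          # ''
```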

Whitespace includes spaces, tabs, and CRLF. So an elegant one-liner string method we can use is translate. In Python 3, build the deletion table with str.maketrans:

' hello apple'.translate(str.maketrans('', '', ' \n\t\r'))

Or, if you want to be thorough:

import string
' hello  apple'.translate(str.maketrans('', '', string.whitespace))

(re.sub(' +', ' ',(my_str.replace('\n',' ')))).strip()

This will remove all the unwanted spaces and newline characters. Hope this helps.

import re
my_str = '   a     b \n c   '
formatted_str = (re.sub(' +', ' ',(my_str.replace('\n',' ')))).strip()

This will result in:

'   a     b \n c   ' will be changed to 'a b c'

    something = "\t  please_     \t remove_  all_    \n\n\n\nwhitespaces\n\t  "

    something = "".join(something.split())

Output: please_remove_all_whitespaces

If you are using Python 3: in your print statement, finish with sep="". That will remove the separating spaces.

Example:

txt="potatoes"
print("I love ",txt,"",sep="")

This will print: I love potatoes

Instead of: I love potatoes .

In your case, since you are trying to get rid of the \t, do sep="\t"
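
To see exactly what sep changes, the sketch below captures the output of print in memory (the io.StringIO buffers are just a convenience for inspection). Note that sep only controls what print puts between its arguments; it does not remove whitespace that is already inside the strings.

```python
import io

txt = "potatoes"

buf_default = io.StringIO()
buf_nosep = io.StringIO()

print("I love ", txt, "", file=buf_default)        # default sep=" "
print("I love ", txt, "", sep="", file=buf_nosep)  # no separator

default_out = buf_default.getvalue()
nosep_out = buf_nosep.getvalue()

print(repr(default_out))  # 'I love  potatoes \n'
print(repr(nosep_out))    # 'I love potatoes\n'
```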

Try translate:

>>> import string
>>> print('\t\r\n  hello \r\n world \t\r\n')

  hello 
 world  
>>> tr = str.maketrans(string.whitespace, ' '*len(string.whitespace))
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr)
'     hello    world    '
>>> '\t\r\n  hello \r\n world \t\r\n'.translate(tr).replace(' ', '')
'helloworld'

If you want to trim the whitespace off just the beginning and end of the string, you can do something like this:

some_string = "    Hello,    world!\n    "
new_string = some_string.strip()
# new_string is now "Hello,    world!"

This works a lot like Qt's QString::trimmed() method, in that it removes leading and trailing whitespace while leaving internal whitespace alone.

But if you'd like something like Qt's QString::simplified() method, which not only removes leading and trailing whitespace but also "squishes" all consecutive internal whitespace down to one space character, you can use a combination of .split() and " ".join, like this:

some_string = "\t    Hello,  \n\t  world!\n    "
new_string = " ".join(some_string.split())
# new_string is now "Hello, world!"

In this last example, each sequence of internal whitespace is replaced with a single space, while the whitespace at the start and end of the string is still trimmed off.

Generally, I use the following method:
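
The two behaviours can be put side by side as small helpers. This is a sketch only: the names trimmed and simplified are borrowed from the Qt methods mentioned above and are not part of Python's standard library.

```python
def trimmed(text):
    """Remove leading and trailing whitespace; keep internal whitespace."""
    return text.strip()

def simplified(text):
    """Trim both ends and collapse each internal whitespace run to one space."""
    # split() with no arguments splits on any run of whitespace and
    # discards empty strings at the ends.
    return " ".join(text.split())

some_string = "\t    Hello,  \n\t  world!\n    "
print(repr(trimmed(some_string)))     # 'Hello,  \n\t  world!'
print(repr(simplified(some_string)))  # 'Hello, world!'
```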

>>> myStr = "Hi\n Stack Over \r flow!"
>>> charList = [u"\u005Cn",u"\u005Cr",u"\u005Ct"]
>>> import re
>>> for i in charList:
...     myStr = re.sub(i, r"", myStr)
...
>>> myStr
'Hi Stack Over  flow!'

Note: this removes only "\n", "\r" and "\t". It does not remove extra spaces.

To remove whitespace from the middle of a string (note that this snippet is Perl, not Python):
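
The same result can be obtained with a single substitution by putting the three characters in one character class. A minimal sketch, reusing the sample string from above:

```python
import re

my_str = "Hi\n Stack Over \r flow!"

# One pass: delete newlines, carriage returns and tabs; spaces are kept.
cleaned = re.sub(r"[\n\r\t]", "", my_str)

print(repr(cleaned))  # 'Hi Stack Over  flow!'
```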

$p = "ATGCGAC ACGATCGACC";
$p =~ s/\s//g;
print $p;

Output: ATGCGACACGATCGACC

This will remove all whitespace and newlines from both the beginning and the end of a string:

>>> import re
>>> s = "  \n\t  \n   some \n text \n     "
>>> re.sub(r"^\s+|\s+$", "", s)
'some \n text'
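
For this particular input the anchored regex and the built-in str.strip() give the same result, which the sketch below checks on the sample string from above:

```python
import re

s = "  \n\t  \n   some \n text \n     "

via_regex = re.sub(r"^\s+|\s+$", "", s)  # anchored at start and end only
via_strip = s.strip()                    # built-in equivalent for this case

print(repr(via_regex))  # 'some \n text'
print(repr(via_strip))  # 'some \n text'
```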

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow