Question

I am trying to write a Python script that "cleans" numbered text read from a file, like this:

for i in range(1,10):
    number = 1
    cleanText = re.sub('number.','',line).strip() 
    number = number + 1
    print cleanText

An example file would be: 1. Hello, World 2. Hello earth

What I need to do here is remove the numbering and the dots along with leading blank spaces in one fell swoop. But how on earth can I first perform a simple variable expansion?

Thank you all in advance.


Solution 2

As others said, you should simply use a regular expression that matches any number, such as r"\d" or r"\d+". However, for learning purposes, here is the answer to what you did ask.
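
As a minimal sketch of that suggestion (written for Python 3, whereas the code in the question uses Python 2 print statements), the pattern below removes any run of digits followed by a literal dot and trailing whitespace. The name strip_numbering is a hypothetical helper, not something from the question. It assumes the text itself contains no other "digits then dot" sequences, such as 3.14, since those would be removed as well.

```python
import re

# One or more digits, a literal dot, then any whitespace that follows.
NUMBERING = re.compile(r'\d+\.\s*')


def strip_numbering(line):
    """Remove every 'N. ' marker from line and trim the ends."""
    return NUMBERING.sub('', line).strip()


print(strip_numbering('1. Hello, World 2. Hello earth'))
# prints: Hello, World Hello earth
```

Because \d+ matches a number of any length, no loop over 1..10 is needed.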

The closest useful equivalent of "variable expansion" is the string formatting operator:

cleanText = re.sub('%d.' % number, '', line).strip()

You could also use str(number) + '.' to achieve the same effect. There are several more problems with your code:

  • your loop is wrong; if you're iterating over range(1, 10), then you don't need to increment number manually.

  • you probably meant range(1, 11), since range stops before its upper bound.

  • . in regular expression syntax matches any single character; to match a literal dot you want \. instead.
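
The difference the last point makes can be seen in a small sketch. The sample line here is made up for illustration; re.escape is shown as one way to build the escaped pattern from a variable without writing the backslash by hand.

```python
import re

number = 1
line = '1. Hello, World 12 monkeys'  # made-up sample input

# Unescaped dot: '1.' means "a 1 followed by ANY character", so '12' goes too.
loose = re.sub('%d.' % number, '', line)

# Escaped dot: only the literal text '1.' is removed.
strict = re.sub(r'%d\.' % number, '', line)

# re.escape produces the same escaped pattern from plain text.
escaped = re.sub(re.escape(str(number) + '.'), '', line)

print(repr(loose))    # ' Hello, World  monkeys'
print(repr(strict))   # ' Hello, World 12 monkeys'
print(strict == escaped)  # True
```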

A cleaned-up version might look like this:

cleanText = line.strip()
for i in range(1, 11):
    # substitute the current number into the pattern; \s* also drops the spaces after the dot
    cleanText = re.sub(r'%d\.\s*' % i, '', cleanText)

Other tips

If your file format is guaranteed to be like you said:

1. Hello, World
2. Hello earth

You don't even need a regex; split and join are enough:

clean_line = ' '.join(line.split(' ')[1:]).lstrip()

>>> ' '.join("1. Hello, world".split(' ')[1:])
'Hello, world'
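
A slightly more defensive variant of the same idea, as a sketch: str.partition splits at the first space only, and the first token is dropped only when it really looks like a numbering marker. The function name drop_numbering is hypothetical, and the check assumes markers always have the form "digits followed by a dot".

```python
def drop_numbering(line):
    """Drop a leading 'N.' token from line, if there is one."""
    stripped = line.strip()
    # head is everything before the first space, rest is everything after it.
    head, sep, rest = stripped.partition(' ')
    if sep and head.endswith('.') and head[:-1].isdigit():
        return rest.lstrip()
    # No marker found: return the line unchanged apart from trimming.
    return stripped


print(drop_numbering('1. Hello, world'))
print(drop_numbering('Hello earth'))
```

Unlike the one-liner, this leaves unnumbered lines intact instead of discarding their first word.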

Or, if you still wanted to do substitution, this replace-based code may work:

number = 1
for line in file_handle:
  clean_line = line.replace("%d. " % number, "").lstrip()
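  number += 1

A regular expression can also capture everything after the numbering, so the loop needs no counter:

import re

# 'line' is the name of the input file used in this example
with open('line', 'r') as fp:
    for line in fp:
        # optional digits, a literal dot, then the rest of the line
        match = re.match(r'[0-9]*\.(.*)', line)
        if match:
            print(match.group(1).strip())
        else:
            print(line.rstrip('\n'))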
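
The counter-based replace loop can also drop its manual bookkeeping by letting enumerate supply the number. This is only a sketch: io.StringIO stands in for the real file handle (an assumption for illustration), and the prefix is removed only when the line actually starts with it.

```python
import io

# Stand-in for an open file; the contents mirror the example file format.
file_handle = io.StringIO(u"1. Hello, World\n2. Hello earth\n")

clean_lines = []
# enumerate(..., 1) yields (1, first line), (2, second line), ...
for number, line in enumerate(file_handle, 1):
    prefix = "%d. " % number
    # Only strip the prefix when the line really starts with it.
    if line.startswith(prefix):
        line = line[len(prefix):]
    clean_lines.append(line.strip())

print(clean_lines)
# prints: ['Hello, World', 'Hello earth']
```
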
Licensed under: CC-BY-SA with attribution