# open the input file and read it; the with block closes the file afterwards
with open('inputfile.txt') as f:
    lines = f.readlines()

output = []
for line in lines:
    line = line.rstrip('\n')
    prefix = ''  # continuation pieces start with '...'
    # keep splitting while the remaining text does not fit in 110 characters
    while len(prefix) + len(line) > 110:
        room = 110 - len(prefix) - 3  # leave space for the trailing '...'
        output.append(prefix + line[:room] + '...\n')
        line = line[room:]
        prefix = '...'
    output.append(prefix + line + '\n')

# open the output file and write it; the with block closes the file afterwards
with open('outputfile.txt', 'w') as f:
    for line in output:
        f.write(line)
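The splitting rule from the question (pieces of at most 110 characters, joined by '...') can also be sketched as a standalone function, which is easier to test than a loop tied to file handling. The name `split_line` and its `width` and `marker` parameters are assumptions made for this illustration and do not come from the original post.

```python
def split_line(line, width=110, marker='...'):
    """Split one line into pieces no longer than `width` characters.

    Every piece except the last ends with `marker`, and every piece
    except the first starts with it. (Hypothetical helper, not from the post.)
    """
    if width <= 2 * len(marker):
        raise ValueError('width must leave room for text between two markers')
    pieces = []
    prefix = ''
    while len(prefix) + len(line) > width:
        # room left for text once the prefix and trailing marker are counted
        room = width - len(prefix) - len(marker)
        pieces.append(prefix + line[:room] + marker)
        line = line[room:]
        prefix = marker
    pieces.append(prefix + line)
    return pieces


sample = 'x' * 250
for piece in split_line(sample):
    print(len(piece))  # prints 110, 110 and 42
```

Counting the markers as part of the width keeps every output line within the limit, including the middle pieces that carry '...' on both ends.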
Python splitting text into blocks of x characters
06-07-2023
Question
I'm using this code to parse a text file and format it so that every sentence is on a new line:
import re

# open the file to be formatted
with open('inputfile.txt', 'r') as infile:
    text = infile.read()

# put every sentence on a new line
pat = r'(?<!Dr)(?<!Esq)\. +(?=[A-Z])'
lines = re.sub(pat, '.\n', text)
print(lines)

# write the formatted text into a new txt file
with open('outputfile.txt', 'w') as outfile:
    outfile.write(lines)
But what I really need is to split the sentences after 110 characters. When a sentence on a line is longer than 110 characters, it should be split with '...' added at the end, and the next line should start with '...' followed by the rest of the split sentence, and so on.
Any suggestions on how to do that? I'm somewhat lost.
Solution
OTHER TIPS
I don't know what "lines" contains, but if it is not already a list with one string per line, you first need to split the text into such a list (for example with lines.splitlines()).
Once you have a list of those strings (lines), you can check how many characters each string has, and if it is more than 110, take the first 107 and put '...' at the end. Like this:
for i in range(len(lines)):
    string_line = lines[i]
    if len(string_line) > 110:
        # keep the first 107 characters and mark the cut with '...'
        new_string = "{0}...".format(string_line[:107])
        lines[i] = new_string
Explaining:
If you do this:
string = "Hello"
print(len(string))
the result will be: 5
print(string[:3])
the result will be: Hel
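The same slicing idea extends to cutting a whole string into blocks of a fixed size, which is what the title asks for. This is only a sketch; the helper name `chunks` is an assumption made for the example.

```python
def chunks(text, size):
    """Return consecutive slices of `text`, each at most `size` characters long."""
    # step through the string `size` characters at a time
    return [text[i:i + size] for i in range(0, len(text), size)]


print(chunks("Hello, world", 5))  # ['Hello', ', wor', 'ld']
```

Slicing past the end of a string is safe in Python, so the last block simply comes out shorter.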
You can't insert text into the middle of a file in place in Python; write the result to a temporary file and move it over the original instead. Something like this will do what you describe.
WARNING: make a backup of the file first, as the existing file will be replaced.
from shutil import move

insert = " #####blabla#### "
insert_position = 110
targetfile = "full/path/to/target_file"
tmpfile = "/full/path/to/tmp.txt"

# write the modified lines to a temporary file
with open(targetfile, "r") as myfile, open(tmpfile, "w") as output:
    for line in myfile:
        text = line.rstrip("\n")
        if len(text) > insert_position:
            # break the line after insert_position characters
            text = text[:insert_position] + insert + "\n" + text[insert_position:]
        output.write(text + "\n")

# replace the original file with the modified copy
move(tmpfile, targetfile)
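A variation on the same write-then-replace approach, sketched with the standard `tempfile` module and `os.replace` instead of a hard-coded temporary path. The function name `rewrite_file` and its `transform` parameter are assumptions for illustration, not part of the answer above.

```python
import os
import tempfile


def rewrite_file(path, transform):
    """Replace the file at `path` with `transform` applied to each of its lines."""
    directory = os.path.dirname(os.path.abspath(path))
    # create the temporary file next to the target so the final rename
    # stays on the same filesystem
    with tempfile.NamedTemporaryFile('w', dir=directory, delete=False) as tmp:
        with open(path) as source:
            for line in source:
                tmp.write(transform(line))
    # swap the finished copy into place, overwriting the original
    os.replace(tmp.name, path)


# hypothetical usage: upper-case every line of an existing file
# rewrite_file('inputfile.txt', str.upper)
```

The original file is only overwritten after the new content has been written completely, so an error part-way through leaves it untouched.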