Question

EDIT: This question is different from other "capitalize first letter" questions because it requires capitalization only between "[" and "]". Since the title was incomplete, I have edited it.

I have a text file in which I need to reformat the text.

I have tried looping over the lines and words with the file open in 'r+' mode, but have been unsuccessful.

Here is a sample:

Create Table Data(
    [SOME ID] int,
    [LAST NAME] varchar(30),
    [FIRST NAME] varchar(30),
    [TLA THING] smallint,
    [TLA THING REMARK] varchar(255)
)

I would like the first letter in each word between the [ ] to be capitalized. And as a bonus I'd love spaces between [ ] to be replaced with underscores.

Code I tried:

f = open('somescript.sql','r+')
for line in f:
    for word in line:
        word.capitalize()

I also tried f.write(word.capitalize()) instead of just word.capitalize. All results were equally tragic.

Solution

The way I would code this:

  1. load the whole content of your file
  2. use the re module (re.sub would help) to transform the parts that need to change
  3. overwrite the file with the transformed text

The implementation:

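import re

with open('somescript.sql') as f:  # load your file
    txt = f.read()
pattern = re.compile(r"\[(.*?)\]")  # non-greedy, so each [ ] pair is matched on its own
transform = lambda mo : mo.group(0).title().replace(" ", "_")
new_txt = pattern.sub(transform, txt)
with open('somescript.sql', 'w') as f:  # write new text
    f.write(new_txt)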
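
To see what the substitution does without touching a file, here is a minimal sketch that runs the same idea on an in-memory string. The names sample, BRACKETED and fix_brackets are made up for this illustration and are not part of the answer above.

```python
import re

# Sample taken from the question; kept in memory so nothing on disk is touched.
sample = """Create Table Data(
    [SOME ID] int,
    [LAST NAME] varchar(30)
)"""

# Non-greedy so that each [ ... ] pair is matched separately.
BRACKETED = re.compile(r"\[(.*?)\]")

def fix_brackets(text):
    """Title-case the text inside [ ] and replace its spaces with underscores."""
    return BRACKETED.sub(
        lambda mo: "[" + mo.group(1).title().replace(" ", "_") + "]", text
    )

result = fix_brackets(sample)
print(result)
```

Only the text between the brackets is changed; everything outside them, such as varchar(30), is left alone.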

OTHER TIPS

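You can try using the .title() method, as discussed here in a similar question. Also, make sure that you write the changes back to the file with f.write(). Just having the mode as r+ doesn't persist anything to the file for you. After f.read() the file position is at the end, so rewind with f.seek(0) before writing. Note that text.title() changes every word in the file, not only the ones between [ and ].

f = open('somescript.sql', 'r+')
text = f.read()
text = text.title()  # title-cases the whole file, not just the text in [ ]
f.seek(0)            # rewind, otherwise the write is appended after the old content
f.write(text)
f.truncate()         # drop any leftover bytes from the old content
f.close()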
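
If you want to stay with r+ mode but only change the text inside the brackets, something along these lines might work. This is only a sketch: rewrite_in_place and the throwaway temp file are assumptions made for the demo, not something from the question.

```python
import os
import re
import tempfile

def rewrite_in_place(path):
    """Read the file, change only the text inside [ ], and write it back."""
    with open(path, "r+") as f:
        text = f.read()
        text = re.sub(
            r"\[(.*?)\]",
            lambda m: "[" + m.group(1).title().replace(" ", "_") + "]",
            text,
        )
        f.seek(0)      # go back to the start before writing
        f.write(text)
        f.truncate()   # drop leftover bytes if the new text is shorter

# Try it on a throwaway file rather than on the real script.
fd, demo_path = tempfile.mkstemp(suffix=".sql")
with os.fdopen(fd, "w") as f:
    f.write("Create Table Data(\n    [SOME ID] int\n)\n")

rewrite_in_place(demo_path)
with open(demo_path) as f:
    demo_result = f.read()
os.remove(demo_path)
print(demo_result)
```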

You can open the existing file somescript.sql in read mode, read it line by line, and process each line: if there is a column name, capitalize the first letter of each word and replace every space with _. This can be done with a regular expression. Write the result to a temp file; afterwards you can delete the old file and rename the temp file to the old file's name.

script.py:

import os, re
with open("somescript.sql") as i: # open sql file for reading
  with open("temp", "w") as o: # temp file for writing
    for l in i: # read line by line
      c = re.match(r".*\[(?P<col_name>.*)\].*", l) # use re to find col_name
      if c: # if column name found
        c = c.group('col_name') # extract col name
        # col name titled and every space replaced by _
        o.write(l.replace('['+c+']', '['+c.title().replace(' ', '_')+']'))
      else:
        o.write(l)
os.remove("somescript.sql") # delete old file
os.rename("temp", "somescript.sql")  # rename file

I tested it as follows. I have two files:

answer$ ls
script.py  somescript.sql

The somescript.sql file is:

answer$ cat somescript.sql 
Create Table Data(
    [SOME ID] int,
    [LAST NAME] varchar(30),
    [FIRST NAME] varchar(30),
    [TLA THING] smallint,
    [TLA THING REMARK] varchar(255)
)

answer$ python script.py  # run script
answer$ cat somescript.sql
Create Table Data(
    [Some_Id] int,
    [Last_Name] varchar(30),
    [First_Name] varchar(30),
    [Tla_Thing] smallint,
    [Tla_Thing_Remark] varchar(255)
)

To explain: o.write(l.replace(c, c.title().replace(' ', '_'))) (brackets left out for brevity)

  1. o.write(x) writes the string x to the file.
  2. l.replace(c, c.title().replace(' ', '_')) replaces the first argument, c (the column name), with the second argument, c.title().replace(' ', '_'). The second argument is c in title case with every space replaced by _.
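
The two steps above can be tried on a single line. This is just an illustration; the variable names line, col, new_col and new_line are made up here.

```python
line = "    [TLA THING REMARK] varchar(255)\n"
col = "TLA THING REMARK"

# Title-case the column name, then swap spaces for underscores.
new_col = col.title().replace(" ", "_")

# Include the brackets in the search text so that the same words
# appearing elsewhere on the line are not touched.
new_line = line.replace("[" + col + "]", "[" + new_col + "]")
print(new_line, end="")
```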

Here is my solution to your problem. It uses regex to handle the actual replacements, but this could just as easily be done by writing your own parser.

Using this as my test input:

text = '''Create Table Data(
    [lower case id] int,
    [lower case last name] varchar(30),
    [lower case first name] varchar(30),
    [lower case tla thing] smallint,
    [lower case tla thing remark] varchar(255)
)
'''

The process is then simply to format each match that the regex finds.

def format_input(val):
    val = val.strip()
    val = val.split()
    new_val = ""
    for word in val:
        new_val += word[0].upper() + word[1:] + "_"
    return new_val[:-1]  # remove the trailing underscore


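import re

with open('mySQLfile.sql', 'r') as f:
    content = f.read()

# No flags are needed: the pattern has no ^ or $, and re.M passed as the fourth
# positional argument would be read as count, capping the number of replacements.
content = re.sub(r'\[(.*?)\]', lambda m: '[' + format_input(m.group(1)) + ']', content)

with open('mySQLfile.sql', 'w') as f:
    f.write(content)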
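
One thing to be aware of: format_input only upper-cases the first letter of each word and leaves the rest as it is, so input that is already upper case (like the sample in the question) stays upper case. The sketch below compares that behaviour with a variant based on str.capitalize(); format_keep_case and format_capitalize are hypothetical names used only for this comparison.

```python
def format_keep_case(val):
    # Same idea as format_input above: upper-case the first letter only.
    return "_".join(w[0].upper() + w[1:] for w in val.split())

def format_capitalize(val):
    # Hypothetical variant: capitalize() also lower-cases the rest of each word.
    return "_".join(w.capitalize() for w in val.split())

print(format_keep_case("lower case id"))
print(format_keep_case("SOME ID"))
print(format_capitalize("SOME ID"))
```

Which one you want depends on whether the rest of each word should be forced to lower case.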

And without the use of regex:

new_content = ""
buf = ""
in_tag = False
for i in content:
    if in_tag:
        buf += i
    else:
        new_content += i
    if i == '[':
        in_tag = True
    elif i == ']':
        in_tag = False
        new_content += format_input(buf)
        buf = ""
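
For comparison, here is a rough, self-contained version of the same scanning idea wrapped in a function, checked against the regex approach on a small input. The function names and the handling of an unclosed [ are my own assumptions, not part of the code above.

```python
import re

def format_input(val):
    # Same behaviour as format_input above, written with join.
    return "_".join(w[0].upper() + w[1:] for w in val.split())

def convert_without_regex(content):
    """Character-by-character scan; text between [ and ] is reformatted."""
    out = []
    buf = []
    in_tag = False
    for ch in content:
        if in_tag:
            if ch == "]":
                out.append(format_input("".join(buf)) + "]")
                buf = []
                in_tag = False
            else:
                buf.append(ch)
        else:
            out.append(ch)
            if ch == "[":
                in_tag = True
    # Assumption: an unclosed [ is copied through unchanged.
    out.append("".join(buf))
    return "".join(out)

def convert_with_regex(content):
    return re.sub(r"\[(.*?)\]", lambda m: "[" + format_input(m.group(1)) + "]", content)

text = "Create Table Data(\n    [lower case id] int,\n    [lower case last name] varchar(30)\n)\n"
print(convert_without_regex(text))
```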
Licensed under: CC-BY-SA with attribution