Question

I am attempting to write elements from a nested list to individual lines in a file, with each element separated by tab characters. Each of the nested lists is of the following form:

('A', 'B', 'C', 'D')

The final output should be of the form:

A    B    C    D
E    F    G    H
.    .    .    .
.    .    .    .

However, my output seems to have reproducible inconsistencies such that the output is of the general form:

A    B    C    D
E    F    G H
I    J    K L
M    N    O    P
.    .    .    .
.    .    .    .

I've inspected the lists before writing and they seem identical in form. The code I'm using to write is:

with open("letters.txt", 'w') as outfile:
    outfile.writelines('\t'.join(line) + '\n' for line in letter_list)

Importantly, if I replace '\t' with, for example, '|', the file is created without such inconsistencies. I know whitespace parsing can become an issue for certain file I/O operations, but I don't know how to troubleshoot it here.

Thanks for the time.

EDIT: Here is some actual input data (in nested-list form) and output:

IN

[('5', '+', '5752624-5752673', 'alt_region_8161'), ('1', '+', '621461-622139', 'alt_region_67'), ('1', '+', '453907-454063', 'alt_region_60'), ('1', '+', '539611-539815', 'alt_region_61'), ('4', '+', '14610049-14610103', 'alt_region_6893'), ('4', '+', '14610049-14610144', 'alt_region_6895'), ('4', '+', '14610049-14610144', 'alt_region_6897'), ('4', '+', '14610049-14610144', 'alt_region_6896')]

OUT

4   +   12816011-12816087   alt_region_6808
1   +   21214720-21214747   alt_region_2377
4   +   9489968-9490833 alt_region_7382
1   +   12121545-12126263   alt_region_650
4   +   9489968-9490811 alt_region_7381
4   +   12816011-12816087   alt_region_6807
1   +   2032338-2032740 alt_region_157
5   +   4695084-4695628 alt_region_9316
1   +   22294677-22295134   alt_region_2424
1   +   22294677-22295139   alt_region_2425
1   +   22294677-22295139   alt_region_2426
1   +   22294677-22295139   alt_region_2427
1   +   22294677-22295134   alt_region_2422
1   +   22294677-22295134   alt_region_2423
1   +   22294384-22295198   alt_region_2428
1   +   22294384-22295198   alt_region_2429
5   +   20845105-20845211   alt_region_9784
5   +   20845105-20845206   alt_region_9783
3   +   2651447-2651889 alt_region_5562

EDIT: Thanks to everyone who commented. Sorry if the question was poorly phrased. I appreciate the help in clarifying the issue (or, apparently, non-issue).


Solution 2

Some text editors display tabs this way. The contents of the file are correct; only the on-screen rendering differs. The width of a rendered tab depends on where it falls on the line, whereas | always occupies exactly one column, which is why you don't see the problem when you use |.
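One way to confirm this is to read the file back and look at the raw characters rather than the editor's rendering. The sketch below reuses the writing code from the question; the two sample rows and the temporary directory are assumptions made for illustration, not the asker's real data or path.

```python
import os
import tempfile

# Hypothetical sample rows standing in for letter_list from the question.
letter_list = [
    ('4', '+', '12816011-12816087', 'alt_region_6808'),
    ('4', '+', '9489968-9490833', 'alt_region_7382'),
]

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "letters.txt")

    # Same writing logic as in the question.
    with open(path, 'w') as outfile:
        outfile.writelines('\t'.join(line) + '\n' for line in letter_list)

    # Read the file back as raw text.
    with open(path) as infile:
        raw_lines = infile.read().splitlines()

# repr() makes every tab visible as \t instead of as blank space.
for raw in raw_lines:
    print(repr(raw))

# Each four-field row should contain exactly three tabs and no spaces.
tab_counts = [raw.count('\t') for raw in raw_lines]
space_counts = [raw.count(' ') for raw in raw_lines]
```

If every entry in tab_counts is 3 and every entry in space_counts is 0, the file is a well-formed tab-separated file and the uneven columns come only from how the viewer draws tabs.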

OTHER TIPS

There are no spaces (' ') in your output, only tabs ('\t').

>>> print(repr('1   +   21214720-21214747   alt_region_2377'))
'1\t+\t21214720-21214747\talt_region_2377'
  ^^ ^^                 ^^

Tabs are not equivalent to a fixed number of spaces (in most editors). Rather, a tab moves the character that follows it to the next tab stop, that is, the next column that is a multiple of x characters from the left margin. The value of x varies: it is most commonly 8, though it is 4 here on SO.

>>> for i in range(7):
    print('x'*i+'\tx')


    x
x   x
xx  x
xxx x
xxxx    x
xxxxx   x
xxxxxx  x
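The same effect can be reproduced on rows like those in the question with str.expandtabs, which replaces each tab with the spaces a viewer using that tab width would draw. This is only a sketch: the two rows are taken from the sample output above, and the tab widths 4 and 8 are assumed as typical settings.

```python
# Two rows from the question's output; they differ only in the length of the third field.
short = '4\t+\t9489968-9490833\talt_region_7382'
long_ = '4\t+\t12816011-12816087\talt_region_6808'

starts = {}
for tabsize in (4, 8):
    short_view = short.expandtabs(tabsize)
    long_view = long_.expandtabs(tabsize)
    print('tab width', tabsize)
    print(short_view)
    print(long_view)
    # Column at which the last field begins in each rendered row.
    starts[tabsize] = (short_view.index('alt'), long_view.index('alt'))
```

With a tab width of 4, the 15-character field ends one column before a tab stop, so its tab is drawn as a single space, while the 17-character field has just passed that stop and its tab is drawn as three spaces. That is exactly the "G H" pattern from the question, even though both rows contain the same three tab characters.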

If you want your output to appear aligned to the naked eye, you should use string formatting:

>>> for line in data:
    print('{:4} {:4} {:20} {:20}'.format(*line))


5    +    5752624-5752673      alt_region_8161     
1    +    621461-622139        alt_region_67       
1    +    453907-454063        alt_region_60       
1    +    539611-539815        alt_region_61       
4    +    14610049-14610103    alt_region_6893     
4    +    14610049-14610144    alt_region_6895     
4    +    14610049-14610144    alt_region_6897     
4    +    14610049-14610144    alt_region_6896   

Note, however, that this will not necessarily be readable by code that expects a tab-separated value file.
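If the file is meant to be consumed by other programs, one option is to keep the tab-separated file as the stored format and apply the padded layout only when printing for people. The sketch below uses the standard csv module with an in-memory buffer; the two data rows are copied from the question's input, and the variable names are illustrative.

```python
import csv
import io

data = [
    ('5', '+', '5752624-5752673', 'alt_region_8161'),
    ('1', '+', '621461-622139', 'alt_region_67'),
]

# Write tab-separated values for programs to consume.
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter='\t', lineterminator='\n')
writer.writerows(data)

# A tab-aware reader recovers exactly the original fields.
buffer.seek(0)
parsed = [tuple(row) for row in csv.reader(buffer, delimiter='\t')]

# The padded layout is for display only: it contains no tabs,
# so splitting it on tabs returns the whole line as a single field.
aligned = '{:4} {:4} {:20} {:20}'.format(*data[0])
aligned_fields = aligned.split('\t')

print(parsed == data)
print(len(aligned_fields))
```

Here parsed matches data field for field, while the padded line cannot be split back into four fields by a tab-based parser.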

Licensed under: CC-BY-SA with attribution