Question

I am trying to make some .bed files for genetic analysis. I am a Python beginner. The files I want to make should have 3 columns, tab separated: the first column is always the same (chromosome number), and the 2nd and 3rd columns are windows of size 200, starting at zero and ending at the end of the chromosome. E.g.:

chr20 0 200
chr20 200 400
chr20 400 600
chr20 600 800
...

I have the size of the chromosome, so at the moment I am trying to say 'while column 2 < (size of chrom), print line'. I have a skeleton of a script, but it is not quite working, due to my lack of experience. Here is what I have so far:

output = open('/homw/genotyping/wholegenome/Chr20.bed', 'rw') 

column2 = 0
column1 = 0
while column2 < 55268282:
    for line in output:
        column1 = column1 + 0
        column2 = column2 + 100

        print output >> "chr20" + '\t' + str(column1) + '\t' + str(column2)

If anyone can fix this simple script so that it does as I described, or write a better solution, that would be really appreciated. I considered making a script that could output all files for 20 chromosomes and chrX, but as I need to specify the size of each chromosome I think I'll have to do each file separately.

Thanks in advance!

Solution

How about this:

step = 200 # change values by this amount
with open('Chr20.bed', 'w') as outfp:
   for val in range(0, 1000, step):  #increment by step, max value 1000
      outfp.write('{0}\t{1:d}\t{2:d}\n'.format('chr20', val, val+step))

This gives tab-delimited output, as requested:

chr20   0   200
chr20   200 400
chr20   400 600
chr20   600 800
chr20   800 1000

Note: using with will automatically close the file for you when you are done, or an exception is encountered.

See the Python documentation for the .format() function for more information, in case you are curious.
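
The example above stops at 1000 for brevity. To cover the whole chromosome, the same loop can run up to the chromosome size, with the last window clipped so that it ends at the end of the chromosome. Here is a minimal sketch of that idea; the function names bed_windows and write_bed are made up for illustration, and the only chromosome size that comes from the question is 55268282 for chr20:

```python
def bed_windows(chrom, size, step=200):
    """Yield (chrom, start, end) tuples covering 0..size in windows of `step`."""
    for start in range(0, size, step):
        # Clip the final window so it never runs past the chromosome end.
        yield chrom, start, min(start + step, size)


def write_bed(path, chrom, size, step=200):
    """Write one tab-separated line per window to the file at `path`."""
    with open(path, 'w') as outfp:
        for name, start, end in bed_windows(chrom, size, step):
            outfp.write('{0}\t{1:d}\t{2:d}\n'.format(name, start, end))


# Sizes keyed by chromosome name. Only the chr20 value is taken from the
# question; any other entries would have to be filled in by hand.
chrom_sizes = {'chr20': 55268282}
```

Calling write_bed('Chr20.bed', 'chr20', chrom_sizes['chr20']) should then write the full file for chr20. Looping over chrom_sizes.items() and calling write_bed once per entry would produce one file per chromosome, so the sizes only need to be listed in one place rather than in a separate script for each file.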

OTHER TIPS

I suggest that you use the numpy.savetxt function to save the data to a text file:

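import numpy
# Window boundaries; the final partial window (55268200 to 55268282) is not written.
windows = numpy.arange(0, 55268282, 200)
numpy.savetxt('Chr20.bed', numpy.transpose((windows[:-1], windows[1:])), fmt='chr20\t%d\t%d')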
Licensed under: CC-BY-SA with attribution