Question

I'm currently trying to make the input file for a hydrologic model (HBV-light) compatible with external calibration software (PEST). HBV-light requires that it's input files be in XML format, while PEST can only read text files. My issue relates to writing a script that will automatically convert a parameter set written by PEST (in CSV format) to an XML file that can be read by HBV-light.

Here's a short example of a text file that can be written by PEST:

W,X,Y,Z
1,2,3,4

and this is how I'm attempting to organize the XML file:

<Parameters>
   <GroupA>
      <W>1</W>
      <X>2</X>
   </GroupA>
   <GroupB>
      <Y>3</Y>
      <Z>4</Z>
   </GroupB>
</Parameters>

I don't have very much programming experience whatsoever, but here is a python code that I wrote so far:

import csv

csvFile = 'myCSVfile.csv'
xmlFile = 'myXMLfile.xml'

csvData = csv.reader(open(csvFile))
xmlData = open(xmlFile, 'w')
xmlData.write('<?xml version="1.0" encoding="utf-8"?>' + "\n")
# there must be only one top-level tag
xmlData.write('<Catchment xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">' + "\n")
xmlData.write('<CatchmentParamters>' + "\n")
rowNum = 0
for row in csvData:
    if rowNum == 0:
        tags = row
        # replace spaces w/ underscores in tag names
        for i in range(0, 2):
            tags[i] = tags[i].replace(' ', '_')
    else: 
        for i in range(0, 2):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

    rowNum +=1

xmlData.write('</CatchmentParameters>' + "\n")
xmlData.write('<VegetationZone>' + "\n")
xmlData.write('<VegetationZoneParameters>' + "\n")
rowNum = 0
for row in csvData:
    if rowNum == 0:
        tags = row
        # replace spaces w/ underscores in tag names
        for i in range(3, 5):
            tags[i] = tags[i].replace(' ', '_')
    else: 
        for i in range(3, 5):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

    rowNum +=1

xmlData.write('</VegetationZoneParameters>' + "\n")
xmlData.write('</VegetationZone>' + "\n")
xmlData.write('</Catchment>' + "\n")
xmlData.close()

I can get the Group A (or CathmentParameters specifically) to be written, but the Group B section is NOT being written. Not sure what to do!

Was it helpful?

Solution

I think that the loop is wrong. Try if this works for you

#! /usr/bin/env python
# coding= utf-8

import csv

csvFile = 'myCSVfile.csv'
xmlFile = 'myXMLfile.xml'

csvData = csv.reader(open(csvFile))
xmlData = open(xmlFile, 'w')
xmlData.write('<?xml version="1.0" encoding="utf-8"?>' + "\n")
# there must be only one top-level tag
xmlData.write('<Catchment xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">' + "\n")
xmlData.write('<CatchmentParamters>' + "\n")
rowNum = 0


for row in csvData:
    if rowNum == 0:
        tags = row
        # replace spaces w/ underscores in tag names
        for i in range(0, 2):
            tags[i] = tags[i].replace(' ', '_')

    else: 
      for i in range(0, 2):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

      xmlData.write('</CatchmentParameters>' + "\n")
      xmlData.write('<VegetationZone>' + "\n")
      xmlData.write('<VegetationZoneParameters>' + "\n")

      for i in range(2, 4):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

      xmlData.write('</VegetationZoneParameters>' + "\n")
      xmlData.write('</VegetationZone>' + "\n")

    rowNum +=1

xmlData.write('</Catchment>' + "\n")
xmlData.close()

OTHER TIPS

I think the issue is in your range definition in the second part... range(3, 5) means elements 4 and 5, what you want is probably range(2,4) meaning elements 3 and 4.

The problem is that you iterate over the contents of the csv file twice - it appears that you need to "rewind" after your first loop. There is also a minor indexing issue, with the second range needing to be range(2,4) and not range(3,5) as was already pointed out.

I created a piece of code that appears to work. It can probably be improved upon by people who understand Python properly. Note - I added a couple of print statements to convince myself I understood what is happening. If you don't open the csvFile a second time (at "starting the second for loop"), then no rows get printed. That's your clue that this is the problem.

import csv

csvFile = 'myCSVfile.csv'
xmlFile = 'myXMLfile.xml'

csvData = csv.reader(open(csvFile))
xmlData = open(xmlFile, 'w')
xmlData.write('<?xml version="1.0" encoding="utf-8"?>' + "\n")
# there must be only one top-level tag
xmlData.write('<Catchment xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">' + "\n")
xmlData.write('<CatchmentParamters>' + "\n")
rowNum = 0
for row in csvData:
    print "row is ", row
    if rowNum == 0:
        tags = row
        # replace spaces w/ underscores in tag names
        for i in range(0, 2):
            tags[i] = tags[i].replace(' ', '_')
    else: 
        for i in range(0, 2):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

    rowNum +=1

xmlData.write('</CatchmentParameters>' + "\n")
xmlData.write('<VegetationZone>' + "\n")
xmlData.write('<VegetationZoneParameters>' + "\n")
rowNum = 0
print "starting the second for loop"
csvData = csv.reader(open(csvFile))
for row in csvData:
    print "row is now ", row
    if rowNum == 0:
        tags = row
        # replace spaces w/ underscores in tag names
        for i in range(2, 4):
            tags[i] = tags[i].replace(' ', '_')
    else: 
        for i in range(2, 4):
            xmlData.write('    ' + '<' + tags[i] + '>' \
                          + row[i] + '</' + tags[i] + '>' + "\n")

    rowNum +=1

xmlData.write('</VegetationZoneParameters>' + "\n")
xmlData.write('</VegetationZone>' + "\n")
xmlData.write('</Catchment>' + "\n")
xmlData.close()

Using the above with the little test file you had given resulted in the following XML file:

<?xml version="1.0" encoding="utf-8"?>
<Catchment xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema">
<CatchmentParamters>
    <W>1</W>
    <X>2</X>
</CatchmentParameters>
<VegetationZone>
<VegetationZoneParameters>
    <Y>3</Y>
    <Z>4</Z>
</VegetationZoneParameters>
</VegetationZone>
</Catchment>

Problem solved?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top