Python 2 - iterating through a CSV file and treating specific lines as dictionary separators

Question 1

Consider not using CSV

First of all, your overall approach to storing this data is probably not optimal. The less tabular your data looks, the less sense it makes to keep it in a CSV file (though your needs aren't too far outside what CSV can handle).

For example, it would be really easy to solve this problem using json:

import json

# First the data
data = dict(dict1=dict(key1="value1", key2="value2"),
            dict2=dict(key3="value3", key4="value4"))

# Convert and write (the with block closes the file for us)
with open("data.json", "w") as f:
    json.dump(data, f)

# Now read back
with open("data.json", "r") as f:
    data = json.load(f)
print data

Answering the question as written

However, if you are really set on this strategy, you can do something along the lines suggested by jonrsharpe. The csv module can't do all the work for you; you have to go through the file yourself and filter out (and split on) the "//" lines.

import re

def header_matcher(line):
    "Returns something truthy if the line looks like a dict separator"
    return re.match("//", line)


# Create some containers we can populate as we iterate
data = []
d = {}

# Open the file and walk through it line by line
with open("data.csv") as f:
    for line in f:
        if not header_matcher(line):
            # We have a non-header row, so we make a new entry in our draft dictionary
            key, val = line.strip().split(',')
            d[key] = val
        else:
            # We've hit a new header, so we should throw our draft dictionary in our data list
            if d:
                # ... but only if we actually have had data since the last header
                data.append(d)
                d = {}
# The very last chunk will need to be captured as well
if d:
    data.append(d)

# And we're done...
print data

This is quite a bit messier, and if there is any chance of needing to escape commas, it will get messier still. If you needed to, you could probably find a clever way of chunking up the file into generators that you read with CSV readers, but it won't be particularly clean/easy (I started an approach like this but it looked like pain...). This is all a testament to your approach likely being the wrong way to store this data.
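For what it's worth, here is a rough sketch of that chunking idea, using itertools.groupby to split the lines into runs and handing each run to csv.reader so that quoted commas are handled. The helper names (is_separator, read_chunks) and the sample lines are made up for illustration, and it assumes every data row has at least two columns; treat it as a starting point rather than a finished solution.

```python
import csv
from itertools import groupby


def is_separator(line):
    """True for lines that start with '//' (the dict separators)."""
    return line.startswith("//")


def read_chunks(lines):
    """Return one dict per run of consecutive non-separator lines."""
    result = []
    # groupby yields consecutive runs of lines sharing the same key,
    # so separator runs and data runs alternate.
    for is_sep, chunk in groupby(lines, key=is_separator):
        if is_sep:
            continue
        # csv.reader accepts any iterable of strings, so each run can be
        # parsed on its own; blank lines come back as empty rows and are dropped.
        rows = [row for row in csv.reader(chunk) if row]
        if rows:
            result.append(dict((row[0], row[1]) for row in rows))
    return result


# Hypothetical sample lines standing in for the contents of data.csv
sample = [
    "//dict1\n",
    "key1,value1\n",
    'key2,"value, with comma"\n',
    "//dict2\n",
    "key3,value3\n",
]
chunks = read_chunks(sample)
```

With a real file you would pass the open file object instead of the sample list, e.g. read_chunks(f) inside a with block.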

An alternative if you're set on CSV

Another way to go if you really want CSV but aren't stuck on the exact data format you specify: Add a column in the CSV file corresponding to the dictionary the data should go into. Imagine a file (data2.csv) that looks like this:

dict1,key1,value1
dict1,key2,value2
dict2,key3,value3
dict2,key4,value4

Now we can do something cleaner, like the following:

import csv

data = dict()
for chunk, key, val in csv.reader(open('data2.csv', 'rb')):
    try:
        # If we already have a dict for the given chunk id, this should add the key/value pair
        data[chunk][key] = val
    except KeyError:
        # Otherwise, we catch the exception and add a fresh dictionary with the key/value pair
        data[chunk] = {key: val}

print data

Much nicer...

The only good argument for doing something closer to what you have in mind over this is if there is LOTS of data, and space is a concern. But that is not very likely to be the case in most situations.

And pandas

Oh yes... one more possible solution is pandas. I haven't used it much yet, so I'm not as much help, but it provides a groupby method, which would let you group by the first column if you end up structuring the data as in the 3-column CSV approach.
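A minimal sketch of what that might look like, assuming pandas is installed. The column names ("chunk", "key", "val") are labels I made up, and the data is read from an in-memory string standing in for data2.csv so the snippet runs by itself:

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for the contents of data2.csv
raw = (u"dict1,key1,value1\n"
       u"dict1,key2,value2\n"
       u"dict2,key3,value3\n"
       u"dict2,key4,value4\n")

# The file has no header row, so column names are supplied here
# (the names are illustrative, not anything pandas requires).
df = pd.read_csv(io.StringIO(raw), header=None, names=["chunk", "key", "val"])

# Iterating over a groupby yields (group label, sub-DataFrame) pairs,
# one per distinct value in the "chunk" column.
data = {}
for chunk, group in df.groupby("chunk"):
    data[chunk] = dict(zip(group["key"], group["val"]))
```

For a real file you would pass the path (e.g. "data2.csv") to read_csv instead of the StringIO object.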

Question 2

I decided to use json instead

Reading this is easier for the program, and there's no need to filter text. The Python program below generates the data and stores it in an external .json file, which then serves as the database.

#! /usr/bin/python

import json

category1 = {"server name1":"ip address1","server name2":"ip address2"}

category2 = {"server name1":"ip address1","server name2":"ip address2"}

servers = { "category Alias1":category1,"category Alias2":category2}

# Write the data out (the with block closes the file for us)
with open("servers.json", "w") as f:
    json.dump(servers, f)

# Now read back
with open("servers.json", "r") as f:
    data = json.load(f)
print data

So the output is a dictionary whose keys are the categories and whose values are further dictionaries. Exactly what I wanted.
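Once the data is loaded, walking the nested structure is straightforward. Below is a small sketch; the find_ip helper and the sample values are hypothetical, and the dictionary is written inline (in the same shape as the loaded servers.json) so the snippet runs without the file:

```python
# Hypothetical data in the same shape as the dictionary loaded from servers.json
servers = {
    "category Alias1": {"server name1": "ip address1",
                        "server name2": "ip address2"},
    "category Alias2": {"server name3": "ip address3"},
}


def find_ip(servers, name):
    """Return (category, ip) for the first server called `name`, or None."""
    for category, hosts in servers.items():
        if name in hosts:
            return category, hosts[name]
    return None


# Flatten the nested dictionaries into one line of text per server
lines = []
for category in sorted(servers):
    for name in sorted(servers[category]):
        lines.append("%s: %s -> %s" % (category, name, servers[category][name]))
```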