Question

High-level picture of my program

  • purpose: parse an XML file and store its text in a set of similar Python objects
  • problem: every time I create a new Python object and append it to a list, the list seems to end up holding references to previously created objects instead of a new one.

Summary of what my intended structure should be:

list of applications that each contains a list of connections

app1: 
     connection1
     connection2
app2:
     connection3
     connection4
     connection5

So that's a summary of what it should do. Here is my main function:

def main(self):
    root = get_xml_root()
    root.get_applications()
    for application in root.applications:
        application.get_connections()           ## this is where the memory goes bad!!!
        for connection in application.connections:
            connection.do_something()

How I know there is a memory problem:

  • When I change one thing in one list of connections that belong to a particular application, the connections in another application will also change.
  • I printed out the memory locations for the connections and found that there are duplicate references (see memory prints)

Memory print outs

  • when I printed out the application locations I got the following (it's not pretty, but you can see that at least the addresses are different):

generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a07e8 - memory location = 22677480
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0758 - memory location = 22677336
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0830 - memory location = 22677552
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0878 - memory location = 22677624
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a08c0 - memory location = 22677696
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0908 - memory location = 22677768
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0950 - memory location = 22677840
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0998 - memory location = 22677912
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a09e0 - memory location = 22677984
generator_libraries.data_extraction.extraction.Application_XML instance at 0x15a0a28 - memory location = 22678056

  • when I printed out connection locations for 3 different applications I got the following (you can see the duplication among addresses):

    • app1:
        • memory location = 22721168
        • memory location = 22721240
        • memory location = 22721024
        • memory location = 22721600
        • memory location = 22721672

    • app2:
        • memory location = 22721240
        • memory location = 22721672
        • memory location = 22721600
        • memory location = 22721168
        • memory location = 22722104
        • memory location = 22722176
Conclusions from memory analysis

It seems that every time I create a new connection object and append it to my "connections" list, instead of creating a new object, it takes the memory reference from my previous objects.

A more detailed view of the problematic function's code

class Application_XML(XML_Element_Class):
    name = None
    connections=copy.deepcopy([])
    xml_element=None
    def get_connections(self):
        xml_connections = self.get_xml_children()
        for xml_connection in xml_connections:
            connection = None       ## reset the connection variable
            connection = Connection_XML(xml_connection)
            connection.unique_ID = connection_count
            self.connections.append(copy.deepcopy(connection))
            del connection      ## reset where its pointing to
            connection_count+=1
        self.log_debugging_info_on_connection_memory()   ### this is where I look at memory locations

A class that does the same thing... but works

class Root_XML(XML_Element_Class):
    applications = copy.deepcopy([])
    def get_applications(self):
        xml_applications = self.get_xml_children()
        for xml_application in xml_applications:
            self.applications.append(Application_XML(xml_application))
        self.log_application_memory_information()

If it is any help, here is the connection class:

class Connection_XML(XML_Element_Class):
    ### members
    name = None
    type = None
    ID = None
    max_size = None
    queue_size = None
    direction = None
    def do_something(self):
        pass

Final Words

I have tried nearly every trick in the book in terms of alternate ways of creating the objects and destroying them after I make them, but nothing has helped yet. I feel that there may be an essential Python memory concept behind the answer, but after all my searching online, nothing has shed any light on it.

Please, if you can help that would be awesome!!! Thanks :)


Solution

I don't think the problem has anything to do with the part you're looking at, but rather with the Connection_XML class:

class Connection_XML(XML_Element_Class):
    ### members
    name = None
    type = None
    ID = None
    max_size = None
    queue_size = None
    direction = None
    def do_something(self):
        pass

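All of those members are class attributes. There's a single name shared by every Connection_XML instance, a single type, etc. So, even if your instances are all unique objects, they all read the same underlying values, and mutating one of those values in place (or rebinding it on the class) is visible through every instance.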
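To make the lookup rule concrete, here is a minimal sketch using a hypothetical `Connection` class (not the real `Connection_XML`, whose base class isn't shown). It illustrates that reading an attribute falls back to the class, while assigning through an instance only creates an attribute on that one instance:

```python
class Connection(object):
    name = None  # class attribute: stored once, on the class itself


a = Connection()
b = Connection()

# Neither instance has its own "name" yet, so both lookups
# fall back to the single value stored on the class.
print(a.name is b.name)   # True

# Assigning through an instance creates an instance attribute on `a` only;
# the class attribute (and therefore `b`) is untouched.
a.name = "eth0"
print(vars(a))            # {'name': 'eth0'}
print(vars(b))            # {}
print(b.name)             # None
print(Connection.name)    # None
```

So plain assignment of immutable values tends to hide the sharing; it only becomes visible when the shared value is something that gets modified in place.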

You want instance attributes: a separate name, etc., for each instance. The way you do that is to just create the attributes dynamically, usually in the __init__ method:

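class Connection_XML(XML_Element_Class):
    def __init__(self, *args, **kwargs):
        # Assumption: XML_Element_Class defines __init__ (the question calls
        # Connection_XML(xml_connection)), so forward the arguments to it.
        XML_Element_Class.__init__(self, *args, **kwargs)
        # Instance attributes: each object gets its own set.
        self.name = None
        self.type = None
        self.ID = None
        self.max_size = None
        self.queue_size = None
        self.direction = None
    def do_something(self):
        pass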

It's hard to be sure this is your problem without a real SSCCE. In this toy example, all of the attributes have the value None, which is immutable, so it won't really lead to these kinds of problems. But if one of them is, say, a list, or an object that has its own attributes, it will.
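The list case does appear in the code you posted: `connections = copy.deepcopy([])` in Application_XML sits in the class body, so it runs once when the class is defined and produces one list for the whole class. That would be consistent with your printout, where app2's list contains addresses from app1. (Root_XML has the same pattern and probably only "works" because there is a single root.) Here is a minimal sketch of the effect and the fix, using hypothetical stand-in classes `SharedApp` and `FixedApp` rather than your real ones:

```python
class SharedApp(object):
    connections = []  # one list, created once, shared by every instance

    def add(self, conn):
        # No instance attribute exists, so this appends to the class-level list.
        self.connections.append(conn)


class FixedApp(object):
    def __init__(self):
        self.connections = []  # a fresh list for each instance

    def add(self, conn):
        self.connections.append(conn)


app1, app2 = SharedApp(), SharedApp()
app1.add("connection1")
app2.add("connection3")
print(app1.connections is app2.connections)  # True
print(app2.connections)                      # ['connection1', 'connection3']

fixed1, fixed2 = FixedApp(), FixedApp()
fixed1.add("connection1")
fixed2.add("connection3")
print(fixed1.connections)                    # ['connection1']
print(fixed2.connections)                    # ['connection3']
```

If this is what's going on, moving `connections` into `__init__` should be enough; the `copy.deepcopy` calls and the `del connection` would not be needed.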

Licensed under: CC-BY-SA with attribution