Question

I need to suck data from stdin and create a object.

The incoming data is between 5 and 10 lines long. Each line has a process number and either an IP address or a hash. For example:

pid=123 ip=192.168.0.1 - some data
pid=123 hash=ABCDEF0123 - more data
hash=ABCDEF123 - More data
ip=192.168.0.1 - even more data

I need to put this data into a class like:

class MyData():
  pid = None
  hash = None
  ip = None
  lines = []

I need to be able to look up the object by IP, HASH, or PID.

The tough part is that there are multiple streams of data intermixed coming from stdin. (There could be hundreds or thousands of processes writing data at the same time.)

I have regular expressions pulling out the PID, IP, and HASH that I need, but how can I access the object by any of those values?

My thought was to do something like this:

myarray = {}

for each line in sys.stdin.readlines():
  if pid and ip:  #If we can get a PID out of the line
     myarray[pid] = MyData().pid = pid #Create a new MyData object, assign the PID, and stick it in myarray accessible by PID.
     myarray[pid].ip = ip #Add the IP address to the new object
     myarray[pid].lines.append(data) #Append the data

     myarray[ip] = myarray[pid] #Take the object by PID and create a key from the IP.
  <snip>do something similar for pid and hash, hash and ip, etc...</snip>

This gives my an array with two keys (a PID and an IP) and they both point to the same object. But on the next iteration of the loop, if I find (for example) an IP and HASH and do:

myarray[hash] = myarray[ip]

The following is False:

myarray[hash] == myarray[ip]

Hopefully that was clear. I hate to admit that waaay back in the VB days, I remember being able handle objects byref instead of byval. Is there something similar in Python? Or am I just approaching this wrong?

Was it helpful?

Solution

Make two separate dicts (and don't call them arrays!), byip and byhash -- why do you need to smoosh everything together and risk conflicts?!

BTW, you't possibly can have the following two lines back to back:

myarray[hash] = myarray[ip]
assert not(myarray[hash] == myarray[ip])

To make the assert hold you must be doing something else in between (perturbing the misnamed myarray).

BTW squared, assignments in Python are always by reference to an object -- if you want a copy you must explicitly ask for one.

OTHER TIPS

Python only has references.

Create the object once, and add it to all relevant keys at once.

class MyData(object):
  def __init__(self, pid, ip, hash):
    self.pid = pid
     ...

for line in sys.stdin:
  pid, ip, hash = process(line)
  obj = MyData(pid=pid, ip=ip, hash=hash)
  if pid:
    mydict[pid] = obj
  if ip:
    mydict[ip] = obj
  if hash:
    mydict[hash] = obj
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top