Python help understanding sample code

https://stackoverflow.com/questions/10589933

08-06-2021

Question

I've been trying to learn python recently, and ran across something that I'm having a hard time understanding exactly how it works. Primarily, it is the design of a list.

The list in question is from this security article talking about a simple fuzzing tool: http://blog.securestate.com/post/2009/10/06/How-a-simple-python-fuzzer-brought-down-SMBv2-in-2-seconds.aspx

The actual list in question is:

#Negotiate Protocol Request
packet = [chr(int(a, 16)) for a in """
00 00 00 90
ff 53 4d 42 72 00 00 00 00 18 53 c8 00 00 00 00
00 00 00 00 00 00 00 00 ff ff ff fe 00 00 00 00
00 6d 00 02 50 43 20 4e 45 54 57 4f 52 4b 20 50
52 4f 47 52 41 4d 20 31 2e 30 00 02 4c 41 4e 4d
41 4e 31 2e 30 00 02 57 69 6e 64 6f 77 73 20 66
6f 72 20 57 6f 72 6b 67 72 6f 75 70 73 20 33 2e
31 61 00 02 4c 4d 31 2e 32 58 30 30 32 00 02 4c
41 4e 4d 41 4e 32 2e 31 00 02 4e 54 20 4c 4d 20
30 2e 31 32 00 02 53 4d 42 20 32 2e 30 30 32 00
""".split()]

He pulls a single byte (I think?) from it using the following lines:

what = packet[:]
where = choice(range(len(packet)))
which = chr(choice(range(256)))
what[where] = which

I have never seen a list designed this way, and can't seem to follow how it is selecting whatever it does. What is confusing me most is the packet = [chr(int(a, 16)) for a in """, where he houses all of that stuff in what appears to be a comment block... then does .split(). 0_o

I know this is a vague question, but if anyone could either explain this to me or point me in the direction of some documentation that explains that style of list building I'd be exceptionally happy. This looks like a very efficient way to store/pull out a large number of bytes.

Solution

Let's break it down, and simplify it for readability:

    bytes = """
            00 00 00 90
            ff 53 4d 42 72 00 00 00 00 18 53 c8 00 00 00 00
            00 00 00 00 00 00 00 00 ff ff ff fe 00 00 00 00
            00 6d 00 02 50 43 20 4e 45 54 57 4f 52 4b 20 50
            52 4f 47 52 41 4d 20 31 2e 30 00 02 4c 41 4e 4d
            41 4e 31 2e 30 00 02 57 69 6e 64 6f 77 73 20 66
            6f 72 20 57 6f 72 6b 67 72 6f 75 70 73 20 33 2e
            31 61 00 02 4c 4d 31 2e 32 58 30 30 32 00 02 4c
            41 4e 4d 41 4e 32 2e 31 00 02 4e 54 20 4c 4d 20
            30 2e 31 32 00 02 53 4d 42 20 32 2e 30 30 32 00
            """
    packet = [chr(int(a, 16)) for a in bytes.split()]

bytes is a string. The """ syntax is most often seen in Python docstrings, but you can use it anywhere in code to create very long, multiline strings (the drawback is that any indentation inside the quotes becomes part of the string).

bytes.split() with no argument splits on any run of whitespace (spaces and newlines alike) and returns a list of the individual parts of the string.

print bytes.split()

['00', '00', '00', '90', 'ff', '53', '4d', '42', '72', 
 '00', '00', '00', '00', '18', '53', 'c8', '00', '00' ... ] # and more
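
As a minimal sketch of the whitespace splitting described above (the name hex_text and the shortened data are only for illustration, not from the original code), compare split() with no argument against splitting on a single space:

```python
# Shortened sample of the hex dump; hex_text is an illustrative name.
hex_text = """
00 00 00 90
ff 53 4d 42
"""

# No argument: split on any run of whitespace, including newlines.
tokens = hex_text.split()
print(tokens)  # ['00', '00', '00', '90', 'ff', '53', '4d', '42']

# Splitting on a single space leaves the newlines glued to the tokens.
by_space = hex_text.split(" ")
print(by_space)  # ['\n00', '00', '00', '90\nff', '53', '4d', '42\n']
```

This is why the original code can spread the hex dump over many lines and still get one clean token per byte.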

So then this:

packet = [chr(int(a, 16)) for a in bytes.split()]

This is a list comprehension:

split bytes and get that list as above
for each element in the list (a here), perform int(a, 16) on it, which parses the string as a base-16 number and returns the integer (e.g. 'ff' becomes 255).
Then call chr on that value, which gives you back the one-character string whose character code is that integer.

So packet will be a list of one-character strings, one per byte.

print packet
['\x00', '\x00', '\x00', '\x90', '\xff', 'S', 'M', 'B', 'r', '\x00', '\x00', '\x00',
 '\x00', '\x18', 'S', '\xc8', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', '\x00', 
 '\x00', '\x00', '\x00', '\x00', '\x00', '\xff', '\xff', '\xff', '\xfe', '\x00', 
 '\x00', '\x00', '\x00', '\x00', 'm', '\x00', '\x02', 'P', 'C', ' ', 'N', 'E', 'T', 
 'W', 'O', 'R', 'K', ' ', 'P', 'R', 'O', 'G', 'R', 'A', 'M', ' ', '1', '.', '0', 
 '\x00', '\x02', 'L', 'A', 'N', 'M', 'A', 'N', '1', '.', '0', '\x00', '\x02', 'W', 'i', 
 'n', 'd', 'o', 'w', 's', ' ', 'f', 'o', 'r', ' ', 'W', 'o', 'r', 'k', 'g', 'r', 'o', 
 ... more ]
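
The steps above can be sketched one token at a time. This is only an illustration on a four-byte sample; the variable names are made up here. Note that in Python 3 chr returns a Unicode character rather than a raw byte, so the printed form of '\xff' differs between Python 2 and 3 even though the comparison below holds in both.

```python
token = "ff"

value = int(token, 16)  # parse the string as base 16 -> 255
char = chr(value)       # one-character string with code 255
print(value)            # 255
print(ord(char))        # 255 (ord is the inverse of chr)

# The same two steps applied to every token via a list comprehension.
tokens = "ff 53 4d 42".split()
packet = [chr(int(t, 16)) for t in tokens]
print(packet == ["\xff", "S", "M", "B"])  # True
```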

OTHER TIPS

This

"""
00 00 00 90
ff 53 4d 42 72 00 00 00 00 18 53 c8 00 00 00 00
00 00 00 00 00 00 00 00 ff ff ff fe 00 00 00 00
00 6d 00 02 50 43 20 4e 45 54 57 4f 52 4b 20 50
52 4f 47 52 41 4d 20 31 2e 30 00 02 4c 41 4e 4d
41 4e 31 2e 30 00 02 57 69 6e 64 6f 77 73 20 66
6f 72 20 57 6f 72 6b 67 72 6f 75 70 73 20 33 2e
31 61 00 02 4c 4d 31 2e 32 58 30 30 32 00 02 4c
41 4e 4d 41 4e 32 2e 31 00 02 4e 54 20 4c 4d 20
30 2e 31 32 00 02 53 4d 42 20 32 2e 30 30 32 00
"""

is just a multiline string.

"""
00 00 00 90
ff 53 4d 42 72 00 00 00 00 18 53 c8 00 00 00 00
""".split()

splits that string on whitespace and produces:

['00', '00', '00', '90', 'ff', '53', '4d', '42', '72', '00', '00', '00', '00', '18', '53', 'c8', '00', '00', '00', '00']

And this:

[chr(int(a, 16)) for a in ['00', '00', '00', '90', 'ff', '53', '4d', '42', '72', '00', '00', '00', '00', '18', '53', 'c8', '00', '00', '00', '00']]

is a list comprehension which goes through the formed list and converts all the values applying chr(int(a,16)) to each a.

int(a, 16) converts a string containing a hexadecimal number into an int.

chr converts this integer into the character with that code.

The result is:

>>> [chr(int(a, 16)) for a in ['00', '00', '00', '90', 'ff', '53', '4d', '42', '72', '00', '00', '00', '00', '18', '53', 'c8', '00', '00', '00', '00']]
['\x00', '\x00', '\x00', '\x90', '\xff', 'S', 'M', 'B', 'r', '\x00', '\x00', '\x00', '\x00', '\x18', 'S', '\xc8', '\x00', '\x00', '\x00', '\x00']

The

   """
content
"""

format is a simple way to define multiline string literals in python. This is not a comment block.
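
A quick way to convince yourself that this is an ordinary string and not a comment is to assign it to a name and inspect it. A minimal sketch (the name text is illustrative):

```python
text = """
00 00
ff
"""

print(type(text).__name__)  # str
print(repr(text))           # '\n00 00\nff\n'

# Because it is a normal str object, string methods can be called
# directly on the literal, which is what the original code does.
print("""
00 00
ff
""".split())                # ['00', '00', 'ff']
```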

The [chr(int(a, 16)) for a in "00 00 00...".split()] is a list comprehension. The large string is split into a list (on whitespace), and each item in the list is parsed as a hexadecimal number (int(a, 16) means: turn string a into an int, reading a as base 16). chr(...) then returns the character whose code is that integer.

packet[:] returns a shallow copy of the list packet.
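
To see why the copy matters, here is a small sketch comparing a slice copy with a plain assignment (the short list and the name alias are just for illustration):

```python
packet = ["a", "b", "c"]

what = packet[:]   # new list object holding the same elements
alias = packet     # no copy: a second name for the same list

what[0] = "X"      # changes only the copy
print(packet)      # ['a', 'b', 'c']

alias[1] = "Y"     # changes the original, since alias is packet
print(packet)      # ['a', 'Y', 'c']
print(what)        # ['X', 'b', 'c']
```

In the fuzzer this means the template packet stays intact while the copy is modified.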

choice(range(len(packet))) returns a random valid index into packet, i.e. an integer from 0 to len(packet) - 1.

chr(choice(range(256))) picks a random number from 0 to 255 and converts it to the character with that code, and then the final statement overwrites the element at the randomly selected index with that character (the list's length does not change).
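
Putting the four lines together, a rough sketch of the mutation step might look like the following. The function name mutate_one_byte is hypothetical, and random.randrange(n) is swapped in for choice(range(n)); both pick an integer from 0 to n - 1.

```python
import random

def mutate_one_byte(packet):
    """Return (copy, index): a copy of packet with one element replaced."""
    what = packet[:]                        # copy, so packet is untouched
    where = random.randrange(len(packet))   # random valid index
    which = chr(random.randrange(256))      # random character, code 0..255
    what[where] = which                     # overwrite, do not insert
    return what, where

original = [chr(int(t, 16)) for t in "ff 53 4d 42 72 00".split()]
mutated, index = mutate_one_byte(original)

print(len(mutated) == len(original))  # True: the length never changes

# At most one position differs; it can be zero positions, because the
# random character may happen to equal the one it replaces.
changed = [i for i in range(len(original)) if original[i] != mutated[i]]
print(changed in ([], [index]))       # True
```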

You're running into a couple different concepts here. Just slowly work backwards and you'll figure it out.

The """00 00 00 90 ff 53 4d 42 72 00 00 00 00 18 53 c8 00 00 00 00""" stuff is just a big string. The .split on it breaks it into a list on whitespace, so at that point you have something like ['00', '00', '00', '90' ....]

The rest of that line is a list comprehension -- it's a fancy way of doing this:

new_list = []
for a in that_list_we_split_above:
    new_list.append( chr( int(a, 16) ) )

The int function converts the string to an int, reading it as base 16 - http://docs.python.org/library/functions.html#int

The chr function then returns the character whose code is that number.

So at the end of all that nonsense you have a list, packet.

The line defining where takes the length of that list, builds the sequence of every number from 0 to the length minus one (i.e. every valid index into it), and randomly selects one of them.

The line defining which picks a random int from 0 to 255 (range(256) stops before 256) and gets the character for it.

The last line replaces the item at index where in the copied list, what, with the random character stored in which.
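
The off-by-one detail in both range calls is easy to check directly. A small sketch on a four-element stand-in list (not the real packet):

```python
packet = ["a", "b", "c", "d"]

# range(len(packet)) covers exactly the valid indices: 0 .. len - 1.
indices = list(range(len(packet)))
print(indices)  # [0, 1, 2, 3]

# range(256) covers exactly the values one byte can hold: 0 .. 255.
byte_values = range(256)
print(min(byte_values), max(byte_values))  # 0 255
```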

tl;dr: go find different code to learn on - this is both confusing and uninspired

The code sample in question seems to replace a randomly chosen byte in a copy of the original packet with another random byte (which, I believe, is one of the ideas behind fuzzing).

packet = [chr(int(a, 16)) for a in """
 00 00 00 90 .... """.split()]

This means "split the string on whitespace, then read each substring as a hex integer and turn it into the character with that code" (the second argument to int is the base).

what = packet[:]

Python idiom for "copy the packet array into what".

where = choice(range(len(packet)))

Choose a random index in the packet.

which = chr(choice(range(256)))

Make a random character.

what[where] = which

Substitute it at the previously chosen index.
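
The original code is written for Python 2, where a list of one-character strings doubles as a list of bytes. As a hedged aside, one possible Python 3 equivalent (not what the article uses) is bytes.fromhex plus a mutable bytearray. The hex data below is shortened to the first twelve bytes of the packet, and whitespace is stripped by hand so the sketch does not depend on how a given Python 3 version treats newlines in fromhex.

```python
import random

hex_text = """
00 00 00 90
ff 53 4d 42 72 00 00 00
"""

# Remove all whitespace, then decode pairs of hex digits into bytes.
packet = bytes.fromhex("".join(hex_text.split()))

what = bytearray(packet)             # mutable copy of the packet
where = random.randrange(len(what))  # random valid index
what[where] = random.randrange(256)  # bytearray items are ints 0..255

print(len(packet), len(what))  # 12 12
print(packet[4:8])             # b'\xffSMB'
```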

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow