Question

I store md5 hashes as binary(16) in a MSSQL Database. I need to insert several thousand values with pymssql, which is very slow if I insert the values one by one with cursor.execute() instead of cursor.executemany(). But the problem is, that i cannot insert binary data, because executemany takes my list of md5 hashes and treat it as a string...

my insert line looks like this:

# generated from hashlib.md5('somestring').hexdigest()
md5list = ['9e107d9d372bb6826bd81d3542a419d6', 'e4d909c290d0fb1ca068ffaddf22cbd0']
# query should look like this: insert into hashes (md5) values (0x9e107d9d372bb6826bd81d3542a419d6)
cursor.executemany("insert into hashes (md5) values (0x%s)", md5list) 
# now we have a query like: insert into hashes (md5) values (0x'9e107d9d372bb6826bd81d3542a419d6')

is there a way to insert the hashes with executemany?

Was it helpful?

Solution

Query parameter for Cursor.executemany should sequence of sequences.

Use `int(.., 16) to convert hexadecimal string to int.

>>> int('9e107d9d372bb6826bd81d3542a419d6', 16)
210103647840849757586127012022035159510L
>>> hex(int('9e107d9d372bb6826bd81d3542a419d6', 16))
'0x9e107d9d372bb6826bd81d3542a419d6L'

How about using list comprehension?

md5list = ['9e107d9d372bb6826bd81d3542a419d6',
           'e4d909c290d0fb1ca068ffaddf22cbd0']
params = [[int(h, 16)] for h in md5list]
cursor.executemany("insert into hashes (md5) values (%s)", params)
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top