I have settled on using pyPdf. It has a simple method that extracts the text from a PDF. I have written a few simple functions to find the information I need in this text, splitting the text into a list of lines so the data is easy to identify.
I have also written a loop that picks up the relevant files with a glob search and feeds them into the parser.
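import pyPdf

# filename: path to the PDF being parsed
with open(filename, "rb") as f:
    pdf = pyPdf.PdfFileReader(f)
    data = ''
    for page in pdf.pages:
        # extractText() returns the text of a single page
        data += page.extractText()

# One list entry per line of extracted text
data2 = data.split('\n')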
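The glob loop and the lookup functions are not shown above. A minimal sketch of how they might look follows. The names `find_value` and `parse_folder`, the "label at the start of a line" convention, and the `extract_text` callback are all assumptions for illustration, not part of pyPdf or of the original code.

```python
import glob
import os


def find_value(lines, label):
    """Return the text after `label` on the first line starting with it, else None."""
    for line in lines:
        if line.startswith(label):
            return line[len(label):].strip()
    return None


def parse_folder(pattern, extract_text, labels):
    """Apply `extract_text` to every file matching `pattern` and look up each label.

    `extract_text` is a hypothetical callback that takes a path and returns the
    file's text as one string, for example a wrapper around the pyPdf code above.
    """
    results = {}
    for path in sorted(glob.glob(pattern)):
        lines = extract_text(path).split('\n')
        results[os.path.basename(path)] = dict(
            (label, find_value(lines, label)) for label in labels
        )
    return results
```

Passing the extractor in as a function keeps the loop independent of the PDF library, so the lookup logic can be tried on plain text files first. Text extracted from real PDFs often breaks lines in unexpected places, so the `startswith` match may need loosening for a given document layout.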