Question

I have a big set of .xls (Excel 97-2003 Workbook) files. A few of them contain VBA macros, and I would like to find a way to filter them out automatically without opening them one by one in MS Excel.

There is an old post which is similar to my question. I have downloaded the OLE Doc viewer, but I cannot open the .zip file; it seems to be out of date...

Does anyone know if there is any API or tool to check whether an .xls file contains VBA macros without opening it in MS Excel? For now, I don't need to know the content of the macros.

PS: for a .xlsx or .xlsm file, we can change the file extension to .zip; the archive contains some .xml files and, if there are VBA macros, a vbaProject.bin. However, this approach does not work for an .xls file: renaming it does not make a valid .zip file.
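
For the zip-based formats, the check described above can be scripted without renaming anything. Here is a minimal sketch using Python's standard zipfile module; the function name ooxml_has_vba is made up for illustration, and it assumes the macro part is always named vbaProject.bin:

```python
import zipfile


def ooxml_has_vba(path):
    """Return True if a .xlsx/.xlsm package contains a vbaProject.bin part."""
    # An .xls file is not a zip archive, so it is reported as False here.
    if not zipfile.is_zipfile(path):
        return False
    with zipfile.ZipFile(path) as package:
        # The part usually lives at xl/vbaProject.bin; compare case-insensitively.
        return any(name.lower().endswith("vbaproject.bin")
                   for name in package.namelist())
```

This only inspects the archive's list of entries; nothing inside the workbook is parsed or executed.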


Solution

Here is a simple Python 2.7 solution I've cooked up for you:

It depends only on the OleFileIO_PL module, which is available from the project page. The good thing about the OleFile parser is that it is not aware of the Excel-specific contents of the file; it only knows the higher-level OLE storage. So it analyzes the file quickly, and there is no risk that a potentially harmful macro would execute.

import argparse
import sys

import OleFileIO_PL

if __name__ == '__main__':
    parser = argparse.ArgumentParser(
        description='Determine if an Excel 97-2003 (.xls) file contains macros.',
        epilog='Will exit successfully (0) only if the provided file is a valid Excel file containing macros.')
    parser.add_argument('filename', help='file name to analyse')
    args = parser.parse_args()

    # Test if the file is an OLE container:
    if not OleFileIO_PL.isOleFile(args.filename):
        sys.exit("This document is not a valid OLE document.")

    # Open the OLE file:
    ole = OleFileIO_PL.OleFileIO(args.filename)

    # Test if the stream expected in an Excel workbook exists:
    if not ole.exists('Workbook'):
        sys.exit("This document is not a valid Excel 97-2003 document.")

    # Test if the VBA-specific storage exists:
    if not ole.exists('_VBA_PROJECT_CUR'):
        sys.exit("This document does not contain VBA macros.")

    print("Valid Excel 97-2003 workbook WITH macros")

I tested it on a couple of files with success. Let me know if it's suitable for you.
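
Since the question is about a big set of files, the same check can be wrapped in a directory scan. The following is only a sketch: it assumes Python 3 and the olefile package (the name under which OleFileIO_PL is now published), and the helper names xls_has_macros and find_macro_workbooks are made up for illustration.

```python
import os


def xls_has_macros(path):
    """Return True if `path` is an OLE workbook with a VBA project storage."""
    # Third-party package (assumption: installed via "pip install olefile").
    # Imported here so the directory walk below works without it.
    import olefile

    if not olefile.isOleFile(path):
        return False
    ole = olefile.OleFileIO(path)
    try:
        # Same two checks as the script above: Excel stream + VBA storage.
        return ole.exists('Workbook') and ole.exists('_VBA_PROJECT_CUR')
    finally:
        ole.close()


def find_macro_workbooks(root, has_macros=xls_has_macros):
    """Walk `root` and return the paths of .xls files that `has_macros` flags."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if name.lower().endswith('.xls'):
                full_path = os.path.join(dirpath, name)
                if has_macros(full_path):
                    hits.append(full_path)
    return hits
```

Calling find_macro_workbooks on a folder returns the list of workbooks to review. As with the script above, only the OLE directory is read, so no macro code runs.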

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow