Question

I am creating a program to help my customers recover passwords placed on Office documents such as Word and Excel files. The program works just fine, but it is MUCH slower than similar products that you can download for free. I would like to use my own program because I feel that a lot of the free ones aren't completely safe and lack some of the controls I would like to have.

More to the point... I need help figuring out why my program is so much slower. I created an Excel document with a simple 3-letter password, "TFX". The program I downloaded finds the password almost as fast as I can let go of the mouse button after clicking on 'go'. My program takes 10 minutes. Here's the 3-character loop:

    private string ThreeCharPass(string file, Microsoft.Office.Interop.Excel.Application exApp, char[] combarr)
    {
        for (int three = 0; three < combarr.Length; three++)
        {
            for (int two = 0; two < combarr.Length; two++)
            {
                for (int one = 0; one < combarr.Length; one++)
                {
                    try
                    {
                        string pass = combarr[three].ToString() + combarr[two].ToString() + combarr[one].ToString();
                        exApp.Workbooks.Open(file, false, true, Type.Missing, pass, Type.Missing, true, Type.Missing, Type.Missing, false, false, Type.Missing, Type.Missing, Type.Missing, Type.Missing);
                        return pass;
                    }
                    catch
                    {
                    }
                }
            }
        }
        return string.Empty;
    }

The array 'combarr' is an array of chars containing all the possible characters in the password. It's generated earlier in the program based on user-selected options. I'm thinking the issue has to be in the way I'm looping through the array to create the password combinations, because just in this 3-character password method it spends more than 5 minutes where other 'professional' programs spend seconds. Any feedback would be greatly appreciated!!


Solution

There are some minor optimizations that you could make to your code, but the most likely culprit is the exApp.Workbooks.Open call. I suspect that call is very slow, and you can confirm that with a profiler.
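Short of a full profiler, a quick way to check this is to time the loops on their own, without the Open call. The following is only a rough sketch: the class and method names (LoopTiming, CountThreeCharCandidates, Measure) are made up for illustration, and the alphabet stands in for the question's 'combarr'.

```csharp
using System;
using System.Diagnostics;

public static class LoopTiming
{
    // Builds every three-character candidate the same way the question's loop does,
    // but without calling Excel, and returns how many candidates were built.
    public static int CountThreeCharCandidates(char[] alphabet)
    {
        int count = 0;
        for (int three = 0; three < alphabet.Length; three++)
        {
            for (int two = 0; two < alphabet.Length; two++)
            {
                for (int one = 0; one < alphabet.Length; one++)
                {
                    string pass = alphabet[three].ToString() + alphabet[two].ToString() + alphabet[one].ToString();
                    if (pass.Length == 3) count++;
                }
            }
        }
        return count;
    }

    // Runs the loop once and reports how long pure candidate generation took.
    public static TimeSpan Measure(char[] alphabet, out int count)
    {
        Stopwatch sw = Stopwatch.StartNew();
        count = CountThreeCharCandidates(alphabet);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

With the 26 uppercase letters this builds 26 * 26 * 26 = 17,576 candidates. If Measure reports a tiny fraction of the total run time, the loops are not the problem and the time is going into the interop call.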

What other tools do is read the actual document structure (DOC, DOCX format) and check whether the password is correct in exactly the same way that Word would. I don't know the exact details, but it is very likely that the file contains something that lets Word know the password was correct. An example could be a unique string that, when decrypted correctly, has the expected value; or a checksum that adds up. When you know the specification of the format, you can do that test yourself, saving you the expensive interop call.
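To make the idea concrete, here is a minimal sketch of such a check. It is NOT the real Office algorithm: the scheme (SHA-256 over salt plus password) and the names VerifierSketch, ComputeVerifier and IsCorrect are assumptions made purely for illustration. The real key derivation and verifier layout have to come from the format specification.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class VerifierSketch
{
    // Hypothetical scheme: verifier = SHA-256(salt || UTF-8(password)).
    // The actual Office formats define their own derivation; this only shows the shape of the test.
    public static byte[] ComputeVerifier(string password, byte[] salt)
    {
        byte[] pw = Encoding.UTF8.GetBytes(password);
        byte[] input = new byte[salt.Length + pw.Length];
        Buffer.BlockCopy(salt, 0, input, 0, salt.Length);
        Buffer.BlockCopy(pw, 0, input, salt.Length, pw.Length);
        using (SHA256 sha = SHA256.Create())
        {
            return sha.ComputeHash(input);
        }
    }

    // Compares the value computed from a candidate with the value stored in the file.
    public static bool IsCorrect(string candidate, byte[] salt, byte[] storedVerifier)
    {
        byte[] computed = ComputeVerifier(candidate, salt);
        if (computed.Length != storedVerifier.Length) return false;
        for (int i = 0; i < computed.Length; i++)
        {
            if (computed[i] != storedVerifier[i]) return false;
        }
        return true;
    }
}
```

The point is that the salt and stored verifier are read from the file once, and every candidate is then tested with a few in-memory operations instead of a round trip through Excel.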

This page has detailed information about many of the Microsoft Office formats. It is a lot of work to implement parts of those specifications, but it will surely speed things up. Only once you've removed the interop call should you look at more efficient loops, multithreading, and other strategies.

Note that the Office formats are proprietary, so not all information may be available, complete, up-to-date or reliable.

OTHER TIPS

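For .docx documents there is a much easier way, provided the password only restricts editing. (A file that needs a password just to open is encrypted and is not a plain ZIP archive, so this will not work on it.)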

  1. Rename the extension to .zip.
  2. Open the archive; inside you will find the folder "word".
  3. In that folder, delete "settings.xml".
  4. Close the archive and rename the file back to .docx.

Finished. Found on http://www.nextofwindows.com/how-to-remove-password-from-protected-word-file-in-word-2007-and-2010

For Excel there is a very similar method for protected sheets, but it involves a bit more work:

  1. Rename the extension to .zip.
  2. Open the archive; inside you will find the folder "xl" and, under that, the folder "worksheets".
  3. There should be one or more files with names like sheet1.xml (sheet2.xml, etc.). Inside one or more of those files is an XML tag: <sheetProtection password=… />. Delete that entire XML tag.
  4. Close the archive and rename the file back to .xlsx, or to .xlsm if it is a macro-enabled workbook.

Found on http://blog.bitcollectors.com/adam/2011/10/how-to-unprotect-a-password-protected-xlsx-file/
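If you would rather not do this by hand, the Excel steps can be sketched in C# roughly like this. It is only a sketch under some assumptions: the class and method names (SheetUnprotector, RemoveSheetProtection) are mine, it works on any seekable read/write stream such as a FileStream opened on a copy of the workbook, and it only helps when the file can be opened as a ZIP archive at all.

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

public static class SheetUnprotector
{
    // Removes every <sheetProtection> element from the worksheet parts of an .xlsx/.xlsm stream.
    // Returns the number of elements removed. The stream must be readable, writable and seekable.
    public static int RemoveSheetProtection(Stream workbook)
    {
        int removed = 0;
        using (var archive = new ZipArchive(workbook, ZipArchiveMode.Update, leaveOpen: true))
        {
            // Snapshot the matching entries first so the archive is not enumerated while being edited.
            var sheets = archive.Entries
                .Where(e => e.FullName.StartsWith("xl/worksheets/", StringComparison.Ordinal)
                         && e.FullName.EndsWith(".xml", StringComparison.Ordinal))
                .ToList();

            foreach (ZipArchiveEntry entry in sheets)
            {
                using (Stream s = entry.Open())
                {
                    XDocument doc = XDocument.Load(s);
                    var tags = doc.Descendants()
                        .Where(x => x.Name.LocalName == "sheetProtection")
                        .ToList();
                    if (tags.Count == 0) continue;

                    foreach (XElement tag in tags) tag.Remove();
                    removed += tags.Count;

                    // Overwrite the entry with the edited XML.
                    s.SetLength(0);
                    s.Position = 0;
                    doc.Save(s);
                }
            }
        }
        return removed;
    }
}
```

Work on a copy of the file, for example with File.Open(path, FileMode.Open, FileAccess.ReadWrite), so the original stays intact if something goes wrong.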

Alternatively, the following method worked fine for me: http://www.instructables.com/id/VBA-Code-To-Unlock-A-Locked-Excel-Sheet/

Good luck!

Completely the wrong approach.

  1. Use multi-threading.
  2. Unless you know the specific password length and that it is random/secure (few users choose truly random passwords), you'd likely do better with a dictionary attack and/or rainbow tables.
  3. With something as common as Excel, I'd first look for known information on the algorithm and its weaknesses in order to determine things like password length and how Excel actually decrypts/unlocks the file.
  4. Using the API directly to test each password is likely to be slow. It would be better to find a known value at a known position and test against that. If with #3 you can determine how Excel decrypts/unlocks the file, you can likely replicate it in your own code, so that a full check takes roughly as long as the interop overhead alone costs when you call the Excel method.
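Points 1 and 2 could be combined along these lines. This is a rough sketch, not a tuned implementation: CandidateSearch, BruteForce and Find are illustrative names, and isCorrect is assumed to be a fast, thread-safe, in-process check (the kind described in point 4), not the Excel interop call.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class CandidateSearch
{
    // Lazily yields every string of exactly `length` characters over `alphabet`.
    public static IEnumerable<string> BruteForce(char[] alphabet, int length)
    {
        if (length == 0)
        {
            yield return string.Empty;
            yield break;
        }
        foreach (string prefix in BruteForce(alphabet, length - 1))
        {
            foreach (char c in alphabet)
            {
                yield return prefix + c;
            }
        }
    }

    // Tries dictionary words first, then brute force for lengths 1..maxLength,
    // testing candidates in parallel. Returns null when nothing matches.
    public static string Find(IEnumerable<string> dictionary, char[] alphabet, int maxLength, Func<string, bool> isCorrect)
    {
        IEnumerable<string> candidates = dictionary.Concat(
            Enumerable.Range(1, maxLength).SelectMany(n => BruteForce(alphabet, n)));
        return candidates.AsParallel().FirstOrDefault(isCorrect);
    }
}
```

Because the candidates are produced lazily, nothing beyond what is actually tested needs to be held in memory, and PLINQ spreads the checks over the available cores.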

That being said, should you really be doing this?

Licensed under: CC-BY-SA with attribution