Automatically take a list of terms, feed it into the Windows search function (for content), and export lists of results. (AutoIT?)

StackOverflow https://stackoverflow.com/questions/13954785

Question

My next big challenge is to write a script (I assume it would be in AutoIT, an area I have little experience with) to automate the Windows search function.

The end goal is to take a list of search terms from a .txt file (one string per line), and search the contents of every document on the computer for said search terms (one at a time).

I can make this happen by hand: turn on the search-by-content function, index all files on all attached drives, search the terms one by one, and for each term highlight all results, shift-click, choose Copy as path, paste into Notepad, and save as [searchterm].txt.

However, I need to automate that whole process. I understand that I might need to write a separate script for each version of Windows it would be used with (XP, Vista, 7, 8).

Is this an easy enough task to accomplish, or would it take a lot of programming hours? Can anyone point me in the right direction? All help is appreciated.


Solution

Assuming your text file of queries is large and you don't want to scan the entire file system for each query, you are describing a classic information retrieval problem. It is usually solved in two steps:

  1. Index the data from your file system (a preprocessing step that is done only once).
  2. For each query, look it up in the index and get the relevant documents.
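The two steps above can be sketched in a few lines of Python. This is for illustration only and is not a substitute for a tested library: it assumes plain-text `.txt` documents, matches whole words rather than exact phrases (a multi-word term matches files that contain all of its words), and the names `build_index`, `search`, and `export_results` are made up for this example. It does not use Windows Search or Lucene.

```python
import os
import re
from collections import defaultdict

TOKEN = re.compile(r"\w+")


def build_index(root):
    """Step 1: walk `root` once and map each lowercase word to the files containing it."""
    index = defaultdict(set)
    for folder, _dirs, files in os.walk(root):
        for name in files:
            if not name.lower().endswith(".txt"):
                continue  # assumption: only plain-text files are indexed
            path = os.path.join(folder, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue  # skip files that cannot be read
            for word in TOKEN.findall(text.lower()):
                index[word].add(path)
    return index


def search(index, query):
    """Step 2: return the sorted paths of files that contain every word of `query`."""
    words = TOKEN.findall(query.lower())
    if not words:
        return []
    hits = set(index.get(words[0], set()))
    for word in words[1:]:
        hits &= index.get(word, set())
    return sorted(hits)


def export_results(index, terms_file, out_dir):
    """Read one search term per line and write the matching paths to <term>.txt."""
    os.makedirs(out_dir, exist_ok=True)
    with open(terms_file, encoding="utf-8") as f:
        terms = [line.strip() for line in f if line.strip()]
    for term in terms:
        # Replace characters that are awkward in file names.
        safe_name = re.sub(r"[^\w.-]+", "_", term)
        out_path = os.path.join(out_dir, safe_name + ".txt")
        with open(out_path, "w", encoding="utf-8") as out:
            out.write("\n".join(search(index, term)))
```

A call such as `export_results(build_index(docs_dir), terms_file, out_dir)` would then produce one result file per term, where `docs_dir`, `terms_file`, and `out_dir` are placeholders for your own paths. Keep the terms file and the output folder outside the indexed tree so they do not show up in the results.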

The field of Information Retrieval is a huge area of research, and I really don't encourage you to try implementing it from scratch.

I do encourage using an existing library that has already been developed and tested for this. For example, a popular choice in Java is Lucene, which is very widely used for search.

If you are not familiar with Java, I am also aware of a Python wrapper (PyLucene) and a .NET port (Lucene.NET) of this library.


To learn more about information retrieval, I recommend Introduction to Information Retrieval by Manning, Raghavan, and Schütze.

Licensed under: CC-BY-SA with attribution