Automatically take a list of terms, feed them into the Windows search function (searching file contents), and export lists of results. (AutoIT?)

StackOverflow https://stackoverflow.com/questions/13954785

Question

My next big challenge is to write a script (I assume it would be in AutoIT, an area I have little experience with) to automate the Windows search function.

The end goal is to take a list of search terms from a .txt file (one string per line), and search the contents of every document on the computer for said search terms (one at a time).

I can make this happen by hand - turn on the search by content function, index all files on all attached drives, search the terms one by one, and highlight all > shift-click > Copy as path > paste in notepad, and save as [searchterm].txt.

However, I need to automate that whole process. I understand that I might need to write a separate script for each version of Windows it would be used with (XP, Vista, 7, 8).

Is this an easy enough task to accomplish, or would it take a lot of programming hours? Can anyone point me in the right direction? All help is appreciated.


Solution

Well, assuming your text file of queries is large enough, and you don't want to scan the entire file system once per query, you are describing a classic information retrieval problem.

  1. Index the data from your file system (a preprocessing step that is done only once).
  2. For each query, search for it in the index and get the relevant documents.
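
To make the two steps concrete, here is a minimal sketch in plain Python rather than a real search library. It builds a simple inverted index (word to set of file paths) in one pass, then answers each query from the index and writes one result file per term. The function names (`build_index`, `search`, `export_results`) are my own and purely illustrative. The sketch also assumes that only plain-text files are indexed and that a multi-word term should match files containing all of its words, not necessarily as an exact phrase.

```python
import os
import re
from collections import defaultdict
from pathlib import Path

TOKEN_RE = re.compile(r"\w+")


def tokenize(text):
    """Split text into lowercase word tokens."""
    return [t.lower() for t in TOKEN_RE.findall(text)]


def build_index(root, extensions=(".txt",)):
    """Step 1: walk `root` once and map each token to the files containing it.

    Assumption: only files with the given extensions are read, as plain text.
    """
    index = defaultdict(set)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.lower().endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            try:
                with open(path, encoding="utf-8", errors="ignore") as fh:
                    text = fh.read()
            except OSError:
                continue  # skip unreadable files instead of aborting the run
            for token in set(tokenize(text)):
                index[token].add(path)
    return index


def search(index, query):
    """Step 2: return the sorted paths of files containing every word in `query`."""
    tokens = tokenize(query)
    if not tokens:
        return []
    result = set(index.get(tokens[0], set()))
    for token in tokens[1:]:
        result &= index.get(token, set())
    return sorted(result)


def export_results(index, terms_file, out_dir):
    """Read one search term per line and write the matches to <term>.txt in `out_dir`."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for line in Path(terms_file).read_text(encoding="utf-8").splitlines():
        term = line.strip()
        if not term:
            continue
        # Replace characters that are not safe in file names.
        safe_name = re.sub(r"[^\w.-]+", "_", term)
        (out / f"{safe_name}.txt").write_text(
            "\n".join(search(index, term)), encoding="utf-8"
        )
```

Usage would look something like `export_results(build_index("C:\\Users"), "terms.txt", "results")`, where the paths are placeholders. The expensive walk over the disk happens once, and each query is then just a few set lookups. Anything beyond plain text (Word documents, PDFs, ranking, phrase queries) is where a real library earns its keep.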

The field of Information Retrieval is a huge area of research, and I really don't encourage you to try implementing it from scratch.

I do encourage using existing libraries that are already developed and tested for this. For example, in Java a popular choice is Lucene, which is very widely used for search.

If you are not familiar with Java, I am also aware of Python (PyLucene) and .NET (Lucene.NET) versions of this library.


To learn more about Information Retrieval, I recommend Manning's Introduction to Information Retrieval.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow