Question

I'm trying to create an application that searches through files, much like the search in Windows XP. I'm using 4 threads that search through the specified directories and open every file to look for a string. Each thread does this by calling a static method of a static class. The method checks the file extension and dispatches to a private method for that file type. So far I've only implemented reading plain text files. Here is the code:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;

namespace Searcher
{
    static public class Searching 
    {
        static public bool Query(string file, string q)
        {
            file = file.ToLower();

            if (file.EndsWith(".txt")) // plain textfiles
            {
                return txt(file, q);
            } // #####################################
            else if (file.EndsWith(".doc"))
            {
                return false;
            } // #####################################
            else if (file.EndsWith(".dll")) // Ignore these
            {
                return false;
            }
            else if (file.EndsWith(".exe")) // Ignore these
            {
                return false;
            }
            else // will try reading as a textfile
            {
                return txt(file, q);
            }
        }

        static private bool txt(string file, string q)
        {
            string contents;
            TextReader read = new StreamReader(file);
            contents = read.ReadToEnd();
            read.Dispose();
            read.Close();

            return contents.ToLower().Contains(q);
        }

        static private bool docx(string file, string q)
        {
            return false;
        }
    }
}

Query checks the extension and forwards the file to the matching handler. Since only plain text files are handled at the moment, there is not much to choose from. Before the search begins I also tell my program to read every file it can.

Here is my problem: although the reader only handles plain text, it also reads images and applications (.exe/.dll). That is expected, since it tries to read everything. The weird thing is that it reports a match for them. I've searched the files in Notepad++ and found no matches. I also pulled the content out with a breakpoint right after the file is read into the contents variable and searched that, again without a match. So it looks as if String.Contains() does not search the content properly and wrongly believes that the query is in the file.

I hope someone knows what the problem could be. The string I searched for was "test", and the program works when searching text files.


Solution

Glad you found a solution.

I'd still like to see some of the offending "false positive" files to be able to have a look.

In the meantime, as a bit of a tangent that is still relevant, I'd change your txt function to:

private static bool txt(string file, string q)
{
    string contents;
    // "using" disposes (and so closes) the reader even if ReadToEnd throws.
    using (TextReader read = new StreamReader(file))
    {
        contents = read.ReadToEnd();
    }

    return contents.ToLower().Contains(q);
}

Cleaner that way.

Edit:
Well, the reason they return true is that those files do contain the string "test" once the content is lower-cased. Specifically: [CCP_TEST RMCCPSearchValidateProductIDSetODBCFoldersAllocateRegistrySpaceNOT] in the MSI and [OnUpdateString] (the "teSt" that straddles "Update" and "String") in the dll. So String.Contains() is working properly.
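To make the match easier to see, the check can be reproduced on the two fragments quoted above. This is only a minimal sketch: the names ContainsDemo and MatchesLikeTxt are made up for illustration, and the method simply mirrors the lower-casing and substring search done in txt().

```csharp
using System;

static class ContainsDemo
{
    // Hypothetical helper that mirrors the check in txt():
    // lower-case the content, then do a plain substring search.
    public static bool MatchesLikeTxt(string contents, string q)
    {
        return contents.ToLower().Contains(q);
    }
}
```

Calling ContainsDemo.MatchesLikeTxt("OnUpdateString", "test") should return true, because the lower-cased text "onupdatestring" contains "test". The same holds for the "CCP_TEST" fragment from the MSI.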

So, back to filtering what you search. Either keep a list of known text file extensions, or let the user choose which file types to search.
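One way to keep such a list is a case-insensitive set of extensions that is checked before a file is opened. The sketch below is an assumption, not part of the original code: the class TextFileFilter, the method IsKnownTextFile, and the extensions listed are all placeholders to adapt.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

static class TextFileFilter
{
    // Hypothetical whitelist; extend it or fill it from the user's choice.
    private static readonly HashSet<string> TextExtensions =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            ".txt", ".log", ".csv", ".xml", ".ini"
        };

    // True only when the file's extension is on the whitelist.
    public static bool IsKnownTextFile(string path)
    {
        // Path.GetExtension returns the extension including the leading dot,
        // or an empty string when the file name has no extension.
        return TextExtensions.Contains(Path.GetExtension(path));
    }
}
```

Query could then return false for anything that fails IsKnownTextFile(file), instead of falling back to txt() for every unknown extension. Because the set compares case-insensitively, the path does not need to be lower-cased first.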

Another thing you might want to consider is matching whole words only, so that "test" no longer matches inside OnUpdateString :)
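A whole-word match can be sketched with a regular expression and word boundaries. The names WholeWordSearch and ContainsWord are hypothetical, and the sketch assumes the query starts and ends with a letter, digit, or underscore, since that is what \b works against.

```csharp
using System;
using System.Text.RegularExpressions;

static class WholeWordSearch
{
    // Hypothetical helper: case-insensitive search for q as a whole word.
    public static bool ContainsWord(string contents, string q)
    {
        // Regex.Escape makes q literal text; \b anchors it at word boundaries.
        string pattern = @"\b" + Regex.Escape(q) + @"\b";
        return Regex.IsMatch(contents, pattern, RegexOptions.IgnoreCase);
    }
}
```

With this, "test" is found in "this is a Test." but not in "OnUpdateString". It is also not found in "CCP_TEST", because the underscore counts as a word character, which may or may not be what you want.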

Text extensions: on wiki, on fileinfo

OTHER TIPS

I tried it with a .dll and an .exe file, and it worked fine for me. You are getting true because the value you are searching for is present in the file. Try opening the file in Notepad and searching for the value.

Also try searching for some other string, like "eafrd", instead of "test" (a dictionary word that can easily occur in .dll or .exe files). That returned false for me.

You can also pick any random word from the file you opened in Notepad and try searching for that.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow