MongoDB query: how to check if a string stored in a document is contained in another string

StackOverflow https://stackoverflow.com/questions/7805787

  •  25-10-2019

Question

I have a collection with 8k+ strings and I need to check if a particular string is contained in another string. For example:

StringInDb = "this is a string"
TheOtherString = "this is a long string that contains this is a string"

With LINQ I used something like:

from s in DB.Table 
where TheOtherString.IndexOf(s.StringInDb ) > -1
select s.StringInDb;

How can I do this (efficiently) in MongoDB (even better, using the C# .NET driver)?

Solution

To me this sounds like you need to use map/reduce: map out all your strings from the DB and reduce to the ones contained in your long string. I can't remember the C# off the top of my head, but I can find it later if you want.

Update: The native language of MongoDB is JavaScript, and Map/Reduce runs "inside the MongoDB engine", which implies that the map and reduce functions must be written in JavaScript, not C#. They can be called from C#, though, as illustrated by this example taken from the official MongoDB C# driver documentation (http://www.mongodb.org/display/DOCS/CSharp+Driver+Tutorial#CSharpDriverTutorial-MapReducemethod). The example counts how many times each key is found in a collection:

var map =
  "function() {" +
  "    for (var key in this) {" +
  "        emit(key, { count : 1 });" +
  "    }" +
  "}";

var reduce =
  "function(key, emits) {" +
  "    var total = 0;" +
  "    for (var i in emits) {" +
  "        total += emits[i].count;" +
  "    }" +
  "    return { count : total };" +
  "}";

var mr = collection.MapReduce(map, reduce);
foreach (var document in mr.GetResults()) {
  Console.WriteLine(document.ToJson());
}
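
For the question as asked, the map function would test each stored string against the long string rather than count keys. A minimal sketch of that idea in plain JavaScript follows; it simulates emit locally so it runs without a server. The field name StringInDb comes from the question, while theOtherString (which on a real server would presumably be supplied through the map/reduce scope option), runMapReduce, and the sample documents are assumptions made for illustration.

```javascript
// Long string to search in. On a server this would have to be made
// visible to the map function, e.g. via the "scope" option (assumption).
var theOtherString = "this is a long string that contains this is a string";

// Placeholder for the emit function that the server normally provides.
var emit = null;

// Map: emit the stored string only if the long string contains it.
var map = function () {
  if (theOtherString.indexOf(this.StringInDb) > -1) {
    emit(this.StringInDb, { count: 1 });
  }
};

// Reduce: sum the counts emitted for the same stored string.
var reduce = function (key, emits) {
  var total = 0;
  for (var i = 0; i < emits.length; i++) {
    total += emits[i].count;
  }
  return { count: total };
};

// Hypothetical local stand-in for the server: runs map over every
// document, groups the emitted values by key, then reduces each group.
function runMapReduce(docs, mapFn, reduceFn) {
  var groups = {};
  emit = function (key, value) {
    (groups[key] = groups[key] || []).push(value);
  };
  docs.forEach(function (doc) { mapFn.call(doc); });
  return Object.keys(groups).map(function (key) {
    return { _id: key, value: reduceFn(key, groups[key]) };
  });
}

// Made-up sample documents.
var sampleDocs = [
  { StringInDb: "this is a string" },
  { StringInDb: "not present" },
  { StringInDb: "long string" }
];

var results = runMapReduce(sampleDocs, map, reduce);
console.log(JSON.stringify(results));
```

Only the documents whose StringInDb occurs inside theOtherString end up in the result; the map and reduce bodies are the parts that would be sent to the server as strings.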

OTHER TIPS

In MongoDB, a "contains" check requires a regular expression, so the C# query looks like this:

var query = Query.Matches("StringParamName", 
     BsonRegularExpression.Create(".*this is a string.*", "i"));

Once the query is built, pass it to the Collection.FindAs<DocType>(query) method.

The "i" option means ignore case.

Regular expressions in MongoDB are slow because an unanchored pattern like this one cannot use an index efficiently. For a collection of 8k documents, though, it should still be pretty quick.
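
As a rough sketch of the kind of filter the regex approach produces, here is an equivalent query document in JavaScript using the $regex and $options operators. The helper names escapeRegex and containsFilter are hypothetical, and escaping the search text is an addition that is not part of the C# snippet.

```javascript
// Escape regex metacharacters so the search text is matched literally.
function escapeRegex(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a case-insensitive "contains" filter for a single field.
function containsFilter(fieldName, needle) {
  var filter = {};
  filter[fieldName] = { $regex: escapeRegex(needle), $options: "i" };
  return filter;
}

var filter = containsFilter("StringParamName", "this is a string");
console.log(JSON.stringify(filter));

// Local check of what the pattern matches, using JavaScript's own
// RegExp engine rather than a MongoDB server.
var spec = filter.StringParamName;
var pattern = new RegExp(spec.$regex, spec.$options);
console.log(pattern.test("Prefix THIS IS A STRING suffix"));
```

The leading and trailing ".*" are left out here because an unanchored pattern already matches anywhere in the field.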

This is a wrapper used in my production system. You should always call GetBsonValue(), and it will do the rest of the work for you:

/// <summary>
/// Makes a Bson object for current value object
/// </summary>
/// <returns>The Bson object for current value object</returns>
private BsonValue GetBsonValue()
{
    if (!_value.Contains(_wildCard))
        return _value;
    string pattern = ApplyWildCard();
    return BsonRegularExpression.Create(new Regex(pattern, RegexOptions.IgnoreCase));
}

/// <summary>
/// Finds wildcard characters and substitutes them in the value  string
/// </summary>
/// <returns></returns>
private string ApplyWildCard()
{
    return string.Format("^{0}$", _value.Replace(_wildCard, ".*"));
}
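
The wildcard handling in GetBsonValue() and ApplyWildCard() can be sketched in JavaScript as follows. Unlike the C# wrapper, this version also escapes regex metacharacters in the literal parts of the value. That escaping, the "*" wildcard character, and the function names are assumptions for illustration, not part of the wrapper itself.

```javascript
// Assumed wildcard character; the wrapper does not show what _wildCard is.
var WILDCARD = "*";

// Escape regex metacharacters in a literal fragment.
function escapeLiteral(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the plain string when there is no wildcard; otherwise returns
// an anchored, case-insensitive RegExp with each wildcard turned into ".*".
function toQueryValue(value) {
  if (value.indexOf(WILDCARD) === -1) {
    return value;
  }
  var pattern = value.split(WILDCARD).map(escapeLiteral).join(".*");
  return new RegExp("^" + pattern + "$", "i");
}

console.log(toQueryValue("exact value"));
console.log(toQueryValue("this is*string"));
```

Splitting on the wildcard before escaping keeps the wildcard itself from being escaped along with the rest of the text.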

From outside you call the following method, so there is no possibility that you would forget:

public QueryComplete BuildQuery()
{
    return Query.EQ(_key, GetBsonValue());
}

"$where" : "\"This is a long string that contains this is a string\".match(this.YourFieldName)"

Is this what you want?
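
To make clear what that $where expression evaluates for each document, here is a sketch in plain JavaScript. String.prototype.match converts a string argument into a regular expression, so a stored value containing regex metacharacters would be interpreted as a pattern; indexOf, as in the LINQ query from the question, compares literally. The field name YourFieldName is taken from the expression above, while the function names and sample documents are made up for illustration.

```javascript
var longString = "This is a long string that contains this is a string";

// Same test as the $where expression: `this` is the current document.
function matchPredicate() {
  return longString.match(this.YourFieldName) !== null;
}

// Literal alternative mirroring IndexOf(...) > -1 from the question.
function indexOfPredicate() {
  return longString.indexOf(this.YourFieldName) > -1;
}

// Made-up sample documents.
var docs = [
  { YourFieldName: "this is a string" },
  { YourFieldName: "missing" },
  { YourFieldName: "l.ng" } // matches "long" as a regex, but not as literal text
];

docs.forEach(function (doc) {
  console.log(doc.YourFieldName, matchPredicate.call(doc), indexOfPredicate.call(doc));
});
```

The two predicates agree on the first two documents and differ on the third, which is the case to watch for if the stored strings can contain characters such as "." or "*".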

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow