Question

I'm writing a class that filters a lot of values. What's the best way to search/remove/replace a string in another string?

For example:

name +value (email)

How should I extract the email: with LINQ, .Split(), or regular expressions?

Which would have the best performance?

Currently I'm using this:

string[] parts = val.Split('(');

string Email = parts[1].Replace(")", String.Empty);

Solution

On my machine, a variation of your code is the fastest (yours comes in second).

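NOTE THE UNITS!! A Stopwatch tick is 100 nanoseconds only when Stopwatch.Frequency is 10,000,000; on other machines, divide ElapsedTicks by Stopwatch.Frequency to get seconds.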

SplitReplace takes 0.961795 ticks per call
Split takes 0.747009 ticks per call      
Regex takes 2.512739 ticks per call
WithLinq takes 2.59299 ticks per call
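
To put those figures into wall-clock units, the conversion can be sketched as below. It assumes the timings come from Stopwatch.ElapsedTicks, whose tick length is 1/Stopwatch.Frequency seconds; the TickConverter class and its method name are made up for illustration.

```csharp
using System;
using System.Diagnostics;

public static class TickConverter
{
    // Converts a Stopwatch tick count to nanoseconds.
    // frequency is the timer resolution in ticks per second (Stopwatch.Frequency).
    public static double ToNanoseconds(double stopwatchTicks, long frequency)
    {
        return stopwatchTicks * 1e9 / frequency;
    }

    public static void Main()
    {
        // The frequency is machine dependent, so print it alongside the result.
        Console.WriteLine("Frequency: {0} ticks per second", Stopwatch.Frequency);
        Console.WriteLine("0.747009 ticks = {0} ns",
            ToNanoseconds(0.747009, Stopwatch.Frequency));
    }
}
```

With a frequency of 10,000,000 one tick is exactly 100 ns, so the Split result above would be roughly 75 ns per call on such a machine.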

My variation only splits, on both parentheses, so no Replace is needed:

string[] parts = val.Split('(', ')');
return parts[1];
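
Both Split versions index parts[1] directly, so an input with no parentheses throws an IndexOutOfRangeException. If the class has to filter arbitrary values, a guarded variant might look like the sketch below. It swaps Split for IndexOf/Substring, which was not part of the benchmark here, and the EmailExtractor.TryExtract name is hypothetical.

```csharp
using System;

public static class EmailExtractor
{
    // Returns true and the text between the first '(' and the next ')'
    // when both are present; otherwise returns false and leaves email null.
    public static bool TryExtract(string val, out string email)
    {
        email = null;
        if (val == null) return false;

        int open = val.IndexOf('(');
        if (open < 0) return false;

        int close = val.IndexOf(')', open + 1);
        if (close < 0) return false;

        email = val.Substring(open + 1, close - open - 1);
        return true;
    }

    public static void Main()
    {
        string email;
        if (TryExtract("name +value (email)", out email))
            Console.WriteLine(email); // prints "email"

        Console.WriteLine(TryExtract("no parentheses here", out email)); // prints "False"
    }
}
```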

The testing code...

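// Assumes NUnit for [Test] and Assert, plus these namespaces:
// System, System.Diagnostics, System.Linq, System.Text.RegularExpressions, NUnit.Framework
[Test]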
public void SO()
{
    const string input = "name +value (email)";

    TestGivenMethod(input, SplitReplace, "SplitReplace");
    TestGivenMethod(input, JustSplit, "Split");
    TestGivenMethod(input, WithRegex, "Regex");
    TestGivenMethod(input, WithLinq, "WithLinq");
}

private void TestGivenMethod(string input, Func<string, string> method, string name)
{
    Assert.AreEqual("email", method(input));

    var sw = Stopwatch.StartNew();
    string res = "";

    for (int i = 0; i < 1000000; i++)
    {
        var email = method(input);
        res = email;
    }

    sw.Stop();

    Assert.AreEqual("email", res);
    Console.WriteLine("{1} takes {0} ticks per call", sw.ElapsedTicks/1000000.0, name);
}

string SplitReplace(string val)
{
    string[] parts = val.Split('(');
    return parts[1].Replace(")", String.Empty);
}

string JustSplit(string val)
{
    string[] parts = val.Split('(', ')');
    return parts[1];
}

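// Constructed once and reused across calls. Note that [\w@]+ does not match '.' or '-',
// so a real address such as someone@test.com would need a broader class, e.g. [^)]+.
private static Regex method3Regex = new Regex(@"\(([\w@]+)\)");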
string WithRegex(string val)
{
    return method3Regex.Match(val).Groups[1].Value;
}

string WithLinq(string val)
{
    return new string(val.SkipWhile(c => c != '(').Skip(1).TakeWhile(c => c != ')').ToArray());
}

OTHER TIPS

I would recommend a regular expression, since I think regular expressions were designed for exactly this purpose: searching within a string and replacing parts of it.

If I understand your question correctly, you're trying to replace the literal (email) with an email address, likely provided from another source:

var text = "name +value (email)";
var emailAddress = "someone@test.com";
text = Regex.Replace(text, @"\(email\)", emailAddress);

The code block above replaces '(email)' with the contents of the emailAddress variable.

Be sure to add the appropriate using directive to the top of your code file:

using System.Text.RegularExpressions;

String.Split would be the simplest and easiest approach to understand compared to a regular expression, and I am not sure how LINQ would fit here.

As far as performance is concerned, it would be best to profile against your own test data to see the actual difference between a regular expression and String.Split.
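
A minimal sketch of such a measurement follows, assuming a small array of representative inputs stands in for your real data. The sample values, the Profiler class, and the iteration count are placeholders, and the regex uses [^)]+ rather than the pattern from the answer above so that dotted addresses match.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class Profiler
{
    // Built once; matches whatever sits between '(' and the next ')'.
    private static readonly Regex EmailRegex = new Regex(@"\(([^)]+)\)");

    public static string WithSplit(string val)
    {
        return val.Split('(', ')')[1];
    }

    public static string WithRegex(string val)
    {
        return EmailRegex.Match(val).Groups[1].Value;
    }

    // Runs the method over every sample `iterations` times and
    // returns the mean time per call in nanoseconds.
    public static double Measure(Func<string, string> method, string[] samples, int iterations)
    {
        // One untimed pass so JIT compilation is not included in the measurement.
        foreach (string s in samples) method(s);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            foreach (string s in samples)
                method(s);
        sw.Stop();

        double calls = (double)iterations * samples.Length;
        return sw.Elapsed.TotalMilliseconds * 1e6 / calls;
    }

    public static void Main()
    {
        // Placeholder inputs; substitute values drawn from your own data.
        string[] samples = { "name +value (someone@test.com)", "a (b@c.org)" };
        Console.WriteLine("Split: {0:F1} ns per call", Measure(WithSplit, samples, 100000));
        Console.WriteLine("Regex: {0:F1} ns per call", Measure(WithRegex, samples, 100000));
    }
}
```

Using sw.Elapsed instead of sw.ElapsedTicks sidesteps the tick-length question, because Elapsed is a TimeSpan already converted to real time.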

Licensed under: CC-BY-SA with attribution