Question

I am working on some POS-tagger analysis and I need to replace some tags. I am using a regular expression to identify the tags:

Regex regex = new Regex(@"/(?<firstMatch>[^\s]+)( )");

//anything between "/" and " ", sample tags: /NN, /VB, etc...

Now I am getting the tag name in the firstMatch group, so I can access it like this:

foreach (Match m in regex.Matches(allText))
{
    Console.WriteLine(m.Groups["firstMatch"].Value);
}

What I want to do is replace each tag name with another tag, depending on its name. For example, if the tag name is DTI, I want to replace it with DT. If it's NNS, I want to replace it with NN. And so on, from a list of tags that I have. Can I do that? I was wondering whether there is a per-match replace that I could use inside that foreach loop.

Thanks!


Solution

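// Fill the map once. Calling tags.Add inside the method would throw
// an ArgumentException (duplicate key) on the second call.
Dictionary<string,string> tags = new Dictionary<string,string>
{
    { "DTI", "DT" },
    { "NNS", "NN" },
    { "LongAnnoyingTag", "ShortTag" }
};

public string UpdateInput(String input)
{
    MatchEvaluator evaluator = new MatchEvaluator(ModifyTag);
    // The lookarounds keep "/" and the trailing space out of the match,
    // so only the tag name itself is replaced.
    return Regex.Replace(input,@"(?<=/)(?<firstMatch>[^\s]+)(?= )", evaluator);
}

public string ModifyTag(Match match)
{
    // Tags that are not in the map are left unchanged instead of
    // throwing a KeyNotFoundException.
    string replacement;
    return tags.TryGetValue(match.Value, out replacement) ? replacement : match.Value;
}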
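
For reference, here is a minimal, self-contained sketch of the same approach as a console program. The class name TagReplacer, the method name Normalize, and the sample sentence are made up for illustration, and leaving unknown tags untouched is an assumption about the desired behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TagReplacer
{
    // Hypothetical mapping; extend it with your own tag list.
    private static readonly Dictionary<string, string> Tags = new Dictionary<string, string>
    {
        { "DTI", "DT" },
        { "NNS", "NN" }
    };

    // Matches the tag name only: preceded by "/" and followed by a space.
    private static readonly Regex TagPattern = new Regex(@"(?<=/)[^\s]+(?= )");

    public static string Normalize(string input)
    {
        // The lambda is converted to a MatchEvaluator and called once per match.
        return TagPattern.Replace(input, m =>
        {
            string replacement;
            // Assumption: tags missing from the map are kept as they are.
            return Tags.TryGetValue(m.Value, out replacement) ? replacement : m.Value;
        });
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("some/DTI dogs/NNS bark/VB "));
    }
}
```

With this sample input the program prints "some/DT dogs/NN bark/VB " (including the trailing space). Because the pattern requires a space after the tag, a tag at the very end of the text with no trailing space is not matched.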

Edit: handling composed tags.

To support composed tags (several tags joined by "+"), you only need to change the ModifyTag method.

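public string ModifyTag(Match match)
{
    // Split on "+" and map each part; a simple tag yields a single part,
    // so both cases go through the same code.
    string[] parts = match.Value.Split('+');
    for (int i = 0; i < parts.Length; i++)
    {
        string replacement;
        // Parts that are not in the map are left unchanged.
        if (tags.TryGetValue(parts[i], out replacement))
        {
            parts[i] = replacement;
        }
    }
    return String.Join("+", parts);
}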
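
As a worked example of the composed-tag case, the sketch below runs the idea end to end on a sample string. The names ComposedTagDemo, MapTag and Normalize, the mapping, and the input are all hypothetical; it also assumes that a composed tag may have any number of parts and that unknown parts should pass through unchanged.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

public static class ComposedTagDemo
{
    // Hypothetical mapping, for illustration only.
    private static readonly Dictionary<string, string> Tags = new Dictionary<string, string>
    {
        { "DTI", "DT" },
        { "NNS", "NN" }
    };

    // Maps one simple tag; tags missing from the map are returned unchanged.
    private static string MapTag(string tag)
    {
        string replacement;
        return Tags.TryGetValue(tag, out replacement) ? replacement : tag;
    }

    public static string Normalize(string input)
    {
        // Split each matched tag on "+", map every part, and join the parts again.
        return Regex.Replace(input, @"(?<=/)[^\s]+(?= )",
            m => string.Join("+", m.Value.Split('+').Select(part => MapTag(part))));
    }

    public static void Main()
    {
        Console.WriteLine(Normalize("a/DTI+NNS b/NNS+VB c/DTI "));
    }
}
```

For this input the output is "a/DT+NN b/NN+VB c/DT ": each part of a composed tag is mapped on its own, and VB is kept because it is not in the map.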

OTHER TIPS

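If I understood your question correctly:

Regex.Replace(input,@"(?<=/)(?<firstMatch>[^\s]+)[^\s](?= )","${firstMatch}");

This replaces each tag name with the same name minus its last character (NNS becomes NN, DTI becomes DT). The pattern needs a verbatim string (@"...") because "\s" is not a valid escape in a regular C# string, and the lookbehind keeps the "/" in the output. Note that it shortens every tag of two or more characters, so NN would also become N.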

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow