Question

I'm working in Microsoft Visual C# 2008 Express.

Let's say I have a string and the contents of the string is: "This is my <myTag myTagAttrib="colorize">awesome</myTag> string."

I'm telling myself that I want to do something to the word "awesome" - possibly call a function that does something called "colorize".

What is the best way in C# to go about detecting that this tag exists and getting that attribute? I've worked a little with XElements and such in C#, but mostly to do with reading in and out XML files.

Thanks!

-Adeena

Solution

Another solution:

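// Requires: using System.Linq; using System.Xml; using System.Xml.Linq; using System.Xml.XPath;
var myString = "This is my <myTag myTagAttrib='colorize'>awesome</myTag> string.";
try
{
    // Wrap the fragment in a root element so that it parses as a single document
    var document = XDocument.Parse("<root>" + myString + "</root>");
    // "//" searches at any depth; a bare "myTag" evaluated on the document would only test the top-level element, which is <root>
    var matches = ((System.Collections.IEnumerable)document.XPathEvaluate("//myTag|//myTag2")).Cast<XElement>();
    foreach (var element in matches)
    {
        switch (element.Name.ToString())
        {
            case "myTag":
                // do something with myTag, such as looking up attribute values and calling other methods
                break;
            case "myTag2":
                // do something else with myTag2
                break;
        }
    }
}
catch (XmlException)
{
    // the string was not well-formed XML
}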

I also took into account your comment to Dabblernl, where you said you want to parse multiple attributes on multiple elements.
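As a rough illustration of handling several attributes on several elements, here is a minimal sketch. The `TagScanner` class, its `Describe` method, and the extra `size` attribute and `myTag2` element in the usage line are hypothetical names chosen for the example, not anything from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class TagScanner
{
    // Hypothetical helper: returns one line per element found in the fragment,
    // in the form "name[attr=value,...]:text".
    public static List<string> Describe(string fragment)
    {
        // Wrap the fragment so that it has a single root element.
        XElement root = XElement.Parse("<root>" + fragment + "</root>");
        var result = new List<string>();
        foreach (XElement element in root.Descendants())
        {
            // Collect every attribute on this element as "name=value".
            string attributes = string.Join(",",
                element.Attributes().Select(a => a.Name.LocalName + "=" + a.Value).ToArray());
            result.Add(element.Name.LocalName + "[" + attributes + "]:" + element.Value);
        }
        return result;
    }
}
```

Calling `TagScanner.Describe("This is my <myTag myTagAttrib='colorize' size='2'>awesome</myTag> <myTag2>string</myTag2>.")` should yield one entry for `myTag` with both attributes and one for `myTag2` with none. In real code you would probably dispatch on the attribute values instead of building strings.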

OTHER TIPS

You can extract the XML with a regular expression, load the extracted XML string into an XElement, and go from there:

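// Requires: using System.Text.RegularExpressions; using System.Xml.Linq;
string text = @"This is my <myTag myTagAttrib='colorize'>awesome</myTag> text.";
// Regex is case-sensitive by default, so the pattern must say myTag, not MyTag; .*? stops at the first closing tag
Match match = Regex.Match(text, @"(<myTag.*?</myTag>)");
string xml = match.Captures[0].Value;
XElement element = XElement.Parse(xml);
XAttribute attribute = element.Attribute("myTagAttrib");
if (attribute.Value == "colorize") DoSomethingWith(element.Value); // Value = "awesome"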

This code throws an exception if no myTag element is found, but that can be remedied by wrapping the lines after the match in a check:

if (match.Success)
{...}

It gets even more interesting if the string could hold more than just the myTag tag...
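For the case of several different tags in one string, one possible approach is `Regex.Matches` with a named group and a backreference, so that each closing tag has to repeat its opening name. This is only a sketch: the `TagExtractor` class and `Extract` method are hypothetical names, and the pattern assumes flat tags (an element nested inside another element of the same name will be cut short, and `XElement.Parse` will throw on any match that is not well-formed).

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;
using System.Xml.Linq;

public static class TagExtractor
{
    // Matches <name ...>...</name> for any tag name.
    // \k<name> forces the closing tag to use the same name as the opening tag.
    private static readonly Regex TagPattern =
        new Regex(@"<(?<name>\w+)\b[^>]*>.*?</\k<name>>", RegexOptions.Singleline);

    // Hypothetical helper: parses every matched tag into an XElement.
    public static List<XElement> Extract(string text)
    {
        var elements = new List<XElement>();
        foreach (Match match in TagPattern.Matches(text))
        {
            elements.Add(XElement.Parse(match.Value));
        }
        return elements;
    }
}
```

Each returned element can then be inspected with `Name`, `Attribute(...)`, and `Value` as in the snippet above.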

I'm a little confused about your example, because you switch between the string (text content), tags, and attributes. But I think what you want is XPath.

So if your XML stream looks like this:

<adeena><parent><child x="this is my awesome string">This is another awesome string</child></parent></adeena>

You'd use an XPath expression that looks like this to find the attribute:

//child/@x

and one like this to find the text value under the child tag:

//child

I'm a Java developer, so I don't know what XML libraries you'd use to do this. But you'll need a DOM parser to create a W3C Document class instance for you by reading in the XML file and then using XPath to pluck out the values.

There's a good XPath tutorial at W3Schools if you need it.

UPDATE:

If you're saying that you already have an XML stream as String, then the answer is to not read it from a file but from the String itself. Java has abstractions called InputStream and Reader that handle streams of bytes and chars, respectively. The source can be a file, a string, etc. Check your C# DOM API to see if it has something similar. You'll pass the string to a parser that will give back a DOM object that you can manipulate.
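In C#, LINQ to XML covers this: `XDocument.Parse` accepts the string directly, and the extension methods in `System.Xml.XPath` evaluate XPath expressions against the result. A minimal sketch follows, assuming the input is well-formed; `XPathDemo`, `AttributeX`, and `ChildText` are hypothetical names for the example.

```csharp
using System;
using System.Collections;
using System.Linq;
using System.Xml.Linq;
using System.Xml.XPath;

public static class XPathDemo
{
    // Returns the value of the first x attribute on any child element, or null if there is none.
    public static string AttributeX(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        // An XPath expression that selects attributes evaluates to a sequence of XAttribute objects.
        var attributes = ((IEnumerable)document.XPathEvaluate("//child/@x")).Cast<XAttribute>();
        XAttribute first = attributes.FirstOrDefault();
        return first == null ? null : first.Value;
    }

    // Returns the text under the first child element, or null if there is none.
    public static string ChildText(string xml)
    {
        XDocument document = XDocument.Parse(xml);
        XElement child = document.XPathSelectElement("//child");
        return child == null ? null : child.Value;
    }
}
```

Both methods parse from a string in memory, so no file or stream is needed.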

Since the input is not well-formed XML, you won't be able to parse it with any of the built-in XML libraries. You'd need a regular expression to extract the well-formed piece. You could probably use one of the more forgiving HTML parsers, like HtmlAgilityPack on CodePlex.

The XmlTextReader can parse XML fragments with a special constructor which may help in this situation, but I'm not positive about that.
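A related option, swapped in here for the XmlTextReader constructor, is `XmlReader.Create` with `ConformanceLevel.Fragment`, which lets the reader accept text and elements without a single root. The sketch below assumes flat, non-nested tags; `FragmentReader` and `Read` are hypothetical names for the example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FragmentReader
{
    // Hypothetical helper: returns "elementName:attributeValue:text" for each
    // element in the fragment that carries the given attribute.
    public static List<string> Read(string fragment, string attributeName)
    {
        var settings = new XmlReaderSettings();
        // Fragment mode allows top-level text and more than one top-level element.
        settings.ConformanceLevel = ConformanceLevel.Fragment;

        var found = new List<string>();
        string currentName = null;
        string currentAttribute = null;
        using (XmlReader reader = XmlReader.Create(new StringReader(fragment), settings))
        {
            while (reader.Read())
            {
                switch (reader.NodeType)
                {
                    case XmlNodeType.Element:
                        // Empty elements such as <a/> produce no EndElement node, so skip them here.
                        bool empty = reader.IsEmptyElement;
                        currentName = empty ? null : reader.Name;
                        currentAttribute = empty ? null : reader.GetAttribute(attributeName);
                        break;
                    case XmlNodeType.Text:
                        // Only record text that sits inside an element carrying the attribute.
                        if (currentAttribute != null)
                        {
                            found.Add(currentName + ":" + currentAttribute + ":" + reader.Value);
                        }
                        break;
                    case XmlNodeType.EndElement:
                        currentName = null;
                        currentAttribute = null;
                        break;
                }
            }
        }
        return found;
    }
}
```

With the string from the question, `FragmentReader.Read(text, "myTagAttrib")` should report the `myTag` element, its `colorize` attribute, and the word inside it, while ignoring the surrounding text.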

There's an in-depth article here:

http://geekswithblogs.net/kobush/archive/2006/04/20/75717.aspx

This is my solution to match any type of xml using Regex: C# Better way to detect XML?

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow