Question

I am working on a program that will automatically get your character's stats and whatnot from the WoW Armory. I already have the HTML, and I can identify where the string is, but I need to get the "this.effective" value, which in this case is 594. But since it's always changing (and so are the other values), I can't just take it at a fixed position. Any help would be GREATLY appreciated.

Thanks

Matt

This is the HTML snippet:

    function strengthObject() {
        this.base="168";
        this.effective="594";
        this.block="29";
        this.attack="1168";

        this.diff=this.effective - this.base;



Solution

You can do it using regular expressions:

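    using System;
    using System.Text.RegularExpressions;

    class Program
    {
        public static void Main()
        {
            // Sample input: the JavaScript fragment embedded in the armory page.
            string html = @"        function strengthObject() {
                    this.base=""168"";
                    this.effective=""594"";
                    this.block=""29"";
                    this.attack=""1168"";";

            // The dot is escaped so it only matches a literal '.';
            // the parentheses capture the digits between the quotes.
            string regex = @"this\.effective=""(\d+)""";

            Match match = Regex.Match(html, regex);
            if (match.Success)
            {
                // Groups[1] holds the text matched by the first capture group.
                int effective = int.Parse(match.Groups[1].Value);
                Console.WriteLine("Effective = " + effective);
                // etc..
            }
            else
            {
                // Handle failure...
            }
        }
    }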
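
Since the question mentions that the other values change as well, the same idea can be stretched to pull every `this.name="number"` pair in one pass. This is only a sketch: `StatScraper` and `ParseStats` are names made up for the example, and it assumes the text passed in covers a single function such as `strengthObject()`, because a repeated name simply overwrites the earlier value.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

static class StatScraper
{
    // Matches assignments of the form  this.name="123"  and captures name and digits.
    private static readonly Regex StatPattern =
        new Regex(@"this\.(?<name>\w+)=""(?<value>\d+)""");

    // Returns every numeric this.<name>="<digits>" pair found in the text.
    public static Dictionary<string, int> ParseStats(string html)
    {
        var stats = new Dictionary<string, int>();
        foreach (Match m in StatPattern.Matches(html))
        {
            // A repeated name overwrites the earlier value.
            stats[m.Groups["name"].Value] = int.Parse(m.Groups["value"].Value);
        }
        return stats;
    }
}

class StatScraperDemo
{
    static void Main()
    {
        string html = @"function strengthObject() {
            this.base=""168"";
            this.effective=""594"";
            this.block=""29"";
            this.attack=""1168"";";

        Dictionary<string, int> stats = StatScraper.ParseStats(html);
        Console.WriteLine("Effective = " + stats["effective"]);               // Effective = 594
        Console.WriteLine("Bonus = " + (stats["effective"] - stats["base"])); // Bonus = 426
    }
}
```

The line `this.diff=this.effective - this.base;` from the question is skipped, because its right-hand side is not a quoted number.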

OTHER TIPS

It's much easier to extract the information from the XML version of the website.

If you make a request to a URL like this (only with a valid character name), you get back an XML document, and you can use an XML parser to easily extract the data.

http://eu.wowarmory.com/character-sheet.xml?r=Nordrassil&cn=Someone

The URLs are the same as the ones you see in your web browser.

Please note though that you MUST set the User-Agent field of the request to that of a browser that supports the XML version of the file, or you get back HTML instead. I use "Mozilla/5.0 Firefox/2.0.0.1" as the user agent in my program and it works fine.
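
As a rough sketch of those two steps in C#, the code below builds a request that carries that user agent and then reads a value out of an XML string. The XML layout is an assumption: the `strength` element and its `effective` attribute are guessed from the field names in the question, and `sampleXml` is a made-up document so the example runs offline. `ArmoryXml`, `BuildRequest` and `ReadEffectiveStrength` are hypothetical names. Actually sending the request (for instance with `HttpClient.SendAsync`) is left out.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Xml.Linq;

static class ArmoryXml
{
    // User agent quoted in the answer above.
    public const string UserAgent = "Mozilla/5.0 Firefox/2.0.0.1";

    // Builds (but does not send) a GET request carrying the browser-like User-Agent.
    public static HttpRequestMessage BuildRequest(string url)
    {
        var request = new HttpRequestMessage(HttpMethod.Get, url);
        request.Headers.TryAddWithoutValidation("User-Agent", UserAgent);
        return request;
    }

    // ASSUMPTION: the document contains a <strength effective="..."> element.
    // Returns null when the element or attribute is missing or not a number.
    public static int? ReadEffectiveStrength(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        XAttribute attr = doc.Descendants("strength")
                             .Select(e => e.Attribute("effective"))
                             .FirstOrDefault(a => a != null);
        return attr != null && int.TryParse(attr.Value, out int value) ? value : (int?)null;
    }
}

class ArmoryXmlDemo
{
    static void Main()
    {
        // Hypothetical response body, not a real armory document.
        string sampleXml =
            "<page><characterInfo><baseStats>" +
            "<strength base=\"168\" effective=\"594\" block=\"29\" attack=\"1168\"/>" +
            "</baseStats></characterInfo></page>";

        HttpRequestMessage request = ArmoryXml.BuildRequest(
            "http://eu.wowarmory.com/character-sheet.xml?r=Nordrassil&cn=Someone");
        Console.WriteLine(string.Join(" ", request.Headers.GetValues("User-Agent")));
        Console.WriteLine("Effective = " + ArmoryXml.ReadEffectiveStrength(sampleXml));
    }
}
```

If the real document uses different element names, only the `Descendants("strength")` and `Attribute("effective")` lookups should need changing.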

Oh, also don't make more than a few requests per second, or an average of more than one request every 3 or 4 seconds, or the site blocks your IP for a few hours.
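
One simple way to stay under that limit is to force a minimum gap between requests. The sketch below is single-threaded and not tuned to the site's actual limits: `RequestThrottle` is a hypothetical helper, and the 4-second interval is an assumption taken from the cautious end of the "3 or 4 seconds" above.

```csharp
using System;
using System.Threading;

// Enforces a minimum gap between requests made through it (not thread-safe).
class RequestThrottle
{
    private readonly TimeSpan _minInterval;
    private DateTime _lastRequestUtc = DateTime.MinValue;

    public RequestThrottle(TimeSpan minInterval)
    {
        _minInterval = minInterval;
    }

    // How long the caller still has to wait at time 'nowUtc' (zero if not at all).
    public TimeSpan DelayNeeded(DateTime nowUtc)
    {
        TimeSpan elapsed = nowUtc - _lastRequestUtc;
        return elapsed >= _minInterval ? TimeSpan.Zero : _minInterval - elapsed;
    }

    // Blocks until the interval has passed, then records the request time.
    public void WaitTurn()
    {
        TimeSpan delay = DelayNeeded(DateTime.UtcNow);
        if (delay > TimeSpan.Zero)
        {
            Thread.Sleep(delay);
        }
        _lastRequestUtc = DateTime.UtcNow;
    }
}

class ThrottleDemo
{
    static void Main()
    {
        // ASSUMPTION: 4 seconds between requests.
        var throttle = new RequestThrottle(TimeSpan.FromSeconds(4));
        for (int i = 0; i < 3; i++)
        {
            throttle.WaitTurn();
            Console.WriteLine("Request " + i + " at " + DateTime.UtcNow.ToString("HH:mm:ss"));
            // Issue the HTTP request here.
        }
    }
}
```

The first call goes through immediately; each later call sleeps for whatever remains of the interval.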

One way would be to use a regular expression to extract this value from the HTML source:

    this\.effective="(\d+)"

Note that HTML scraping is not an ideal solution (for example, it may break when the format of the HTML changes). However, I don't know the "wow armory" or what other ways there are to get this information.
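
Because the markup can change, it may be worth making the scraping code a bit forgiving and having it report failure instead of throwing. A small sketch in C#, where `StatReader.TryGetStat` is a hypothetical helper: it tolerates spaces around the `=` and either quote style, which are guesses at the kind of change that might happen, and returns false when nothing matches.

```csharp
using System;
using System.Text.RegularExpressions;

static class StatReader
{
    // Looks for  this.<statName> = "<digits>"  with optional spaces and either quote style.
    public static bool TryGetStat(string html, string statName, out int value)
    {
        string pattern = @"this\." + Regex.Escape(statName) + @"\s*=\s*[""'](\d+)[""']";
        Match match = Regex.Match(html, pattern);
        if (match.Success && int.TryParse(match.Groups[1].Value, out value))
        {
            return true;
        }
        value = 0;
        return false;
    }
}

class StatReaderDemo
{
    static void Main()
    {
        string html = "this.base = '168'; this.effective=\"594\";";

        if (StatReader.TryGetStat(html, "effective", out int effective))
        {
            Console.WriteLine("Effective = " + effective);
        }
        else
        {
            Console.WriteLine("effective not found; the page format may have changed");
        }
    }
}
```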

Licensed under: CC-BY-SA with attribution