Question

I have a class that creates multiple WebClient instances with different proxies on multiple threads simultaneously.

Unfortunately, some WebClient instances take a long time to finish. Most of the hundreds of threads I spawn finish quickly, but I usually end up with about 20 threads that take a few minutes each.

I tried extending the WebClient class and setting its Timeout property to 20 seconds (as posted here), but it didn't change anything.

I'm not showing the whole code, because there would be quite a lot of it (WebClient is wrapped in another class). Still, I know the bottleneck is WebClient.DownloadString(url), because whenever I pause the debugger during that last stage of execution, all of the worker threads are sitting on this specific line.
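
For reference, a timeout-enabled subclass of this kind usually looks something like the sketch below. The real WebDownload class isn't shown in this post, so the constructor, the default value, and the ReadWriteTimeout line are assumptions for illustration, not the actual code.

```csharp
using System;
using System.Net;

// Hypothetical reconstruction: the actual WebDownload class is not shown in the post.
public class WebDownload : WebClient
{
    // Timeout in milliseconds.
    public int Timeout { get; set; }

    // Assumed default of 20 seconds, matching the value mentioned in the question.
    public WebDownload() : this(20000) { }

    public WebDownload(int timeout)
    {
        Timeout = timeout;
    }

    protected override WebRequest GetWebRequest(Uri address)
    {
        var request = base.GetWebRequest(address);
        if (request != null)
        {
            // Limits how long obtaining the response may take.
            request.Timeout = Timeout;

            // Assumption, not in the original: also cap reads on the response stream.
            // HttpWebRequest.ReadWriteTimeout defaults to 300,000 ms (5 minutes).
            var http = request as HttpWebRequest;
            if (http != null)
            {
                http.ReadWriteTimeout = Timeout;
            }
        }
        return request;
    }
}
```

Note that WebRequest.Timeout only covers obtaining the response, not reading the body from the response stream, which may be one reason a subclass that sets Timeout alone does not stop slow downloads.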

Here's how I use the extended WebClient:

public string GetHtml(string url)
{
    this.CheckValidity(url);

    var html = "";

    using (var client = new WebDownload())
    {
        client.Proxy = this.Proxy;
        client.Headers[HttpRequestHeader.UserAgent] = this.UserAgent;
        client.Timeout = this.Timeout;

        html = client.DownloadString(url);
    }

    return html;
}

EDIT

I have just run a few tests, and some of the threads take up to 7 minutes to finish, all of them stuck on the WebClient.DownloadString() statement.

Furthermore, I tried setting ServicePointManager.DefaultConnectionLimit to int.MaxValue, but to no avail.


Solution

Here's what I ended up doing.

I realized that I simply needed to cancel WebClient.DownloadString() once it reached the specified timeout. Since I couldn't find anything in WebClient that would do this for the synchronous call, I called WebClient.DownloadStringTaskAsync() instead. This way, I can use Task.WaitAll with a timeout to wait for the download, and then check whether the task actually completed (as opposed to timing out).

Here's the code:

public string GetHtml(string url)
{
    var html = "";

    using (var client = new WebClient())
    {
        // Assign all the important stuff
        client.Proxy = this.Proxy;
        client.Headers[HttpRequestHeader.UserAgent] = this.UserAgent;

        // Run DownloadString() as a task.
        var task = client.DownloadStringTaskAsync(url);

        // Wait for the task to finish, or timeout
        Task.WaitAll(new Task<string>[] { task }, this.Timeout);

        // If timeout was reached, cancel task and throw an exception.
        if (task.IsCompleted == false)
        {
            client.CancelAsync();
            throw new TimeoutException();
        }

        // Otherwise, happy. :)
        html = task.Result;
    }

    return html;
}
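
The wait-then-cancel pattern above can also be factored into a small reusable helper. This is only a sketch: TimeoutHelper and WaitOrThrow are hypothetical names, not part of WebClient or the .NET library, and it uses Task.Wait(int) on the single task rather than Task.WaitAll.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper, not part of the .NET library.
public static class TimeoutHelper
{
    // Blocks until the task completes or timeoutMs elapses.
    // On timeout, runs the optional cancel callback and throws TimeoutException.
    public static T WaitOrThrow<T>(Task<T> task, int timeoutMs, Action onTimeout)
    {
        // Task.Wait returns false if the timeout elapsed before completion.
        // If the task faulted, Wait rethrows the failure as an AggregateException.
        if (!task.Wait(timeoutMs))
        {
            if (onTimeout != null)
            {
                onTimeout();
            }
            throw new TimeoutException();
        }

        return task.Result;
    }
}
```

With this in place, the body of GetHtml could be reduced to something like `html = TimeoutHelper.WaitOrThrow(client.DownloadStringTaskAsync(url), this.Timeout, client.CancelAsync);`, assuming this.Timeout is expressed in milliseconds.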
License: CC-BY-SA with attribution
Not affiliated with StackOverflow