Вопрос

I just spent some time scratching my head when a breakpoint seemed to magically appear twice in the same spot, inside an enumerator.

It turned out the bug was a straight forward oversight:

    protected override void Extract()
    {
        LogGettingOffers();
        var offerIds = CakeMarketingUtility.OfferIds(advertiserId);
        LogExtractingClicks(offerIds);
        foreach (var offerId in offerIds)
        {
            int rowCount;
            var clicks = RetryUtility.Retry(3, 10000, new[] { typeof(Exception) }, () =>
            {
                return CakeMarketingUtility.EnumerateClicks(dateRange, advertiserId, offerId);
            });
            foreach (var clickBatch in clicks.InBatches(1000))
            {
                LogExtractedClicks(offerId, clickBatch);

                // SHOULD BE clickBatch, NOT clicks
                Add(clicks);
            }
        }
        End();
    }

This leads me to wonder what (if any) kind of preventative measures one might take to write code which catches an error like this.

Note, I'm not positive it makes sense to go down this line of thought - maybe the answer is "don't write incorrect code", which I'm willing to accept..

Here's the actual code that yields the results:

    public static IEnumerable<Click> EnumerateClicks(DateRange dateRange, int advertiserId, int offerId)
    {
        // initialize to start at the first row
        int startAtRow = 1;

        // hard code an upper limit for the max number of rows to be returned in one call
        int rowLimitForOneCall = 5000;

        bool done = false;
        int total = 0;
        while (!done)
        {
            Logger.Info("Extracted a total of {0} rows, checking for more, starting at row {1}..", total, startAtRow);

            // prepare the request
            var request = new ClicksRequest
            {
                start_date = dateRange.FromDate.ToString("MM/dd/yyyy"),
                end_date = dateRange.ToDate.ToString("MM/dd/yyyy"),
                advertiser_id = advertiserId,
                offer_id = offerId,
                row_limit = rowLimitForOneCall,
                start_at_row = startAtRow
            };

            // create the client, call the service and check the response
            var client = new ClicksClient();
            var response = client.Clicks(request);
            if (!response.Success)
            {
                throw new Exception("ClicksClient failed");
            }

            // update the running total
            total += response.RowCount;

            // return result
            foreach (var click in response.Clicks)
                yield return click;

            // update stopping condition for loop
            done = (response.RowCount < rowLimitForOneCall);

            // increment start row for next iteration
            startAtRow += rowLimitForOneCall;
        }

        Logger.Info("Extracted a total of {0}, done.", total);
    }
Это было полезно?

Решение

For this specific issue, I'd say that the solution is "don't write incorrect code". Especially when results can be generated without altering any state (like when you enumerate elements from a list), I think it should be okay to create multiple enumerators from any enumerable.

You could create an IEnumerable wrapper that would ensure GetEnumerator is called only once, but what if you actually legitimately need to call it twice? What you really want is to catch mistakes, not to catch enumerables being enumerated multiple times, and that's not something you can easily put in a software solution.

Maybe the issue was that clickBatch and clicks have the same type, so the compiler couldn't distinguish between either.

Другие советы

There are times when I need to make sure that enumerables that I expose are only called once. For example: returning streaming information that I only have one read available, or very expensive queries.

Try the following extension class:

public static class Extensions
{
    public static IEnumerable<T> SingleEnumeration<T>(this IEnumerable<T> source)
    {
        return new SingleEnumerator<T>(source);
    }
}

public class SingleEnumerator<T> : IEnumerable<T>
{
    public SingleEnumerator(IEnumerable<T> source)
    {
        this.source = source;
    }

    public IEnumerator<T> GetEnumerator()
    {
        // return an empty stream if called twice (or throw)
        if (source == null)
            return (new T[0]).AsEnumerable().GetEnumerator();

        // return the actual stream
        var result =source.GetEnumerator();
        source = null;
        return result;
    }

    System.Collections.IEnumerator System.Collections.IEnumerable.GetEnumerator()
    {
        // return an empty stream if called twice (or throw)
        if (source == null)
            return (new T[0]).AsEnumerable().GetEnumerator();

        var result = source.GetEnumerator();
        source = null;
        return result;
    }

    private IEnumerable<T> source;
}
Лицензировано под: CC-BY-SA с атрибуция
Не связан с StackOverflow
scroll top