Domanda

There are many Subset-sum questions and answers around at SO, but somehow I can't find a solution for my specific problem.

I need to find the quantity and length of track-segments to build a track of length n. The segments are of length: 8, 10, 12, 14, 16, 18, 20, 22, 24 ft The quantity can be up to 4 segments The track length is ~ between 20 and 100 ft. (and always an even number)

This is a real track. Order of segments does not matter. However, there are preferred size combinations. All of equal length or close to each other is preferred over Large/small combinations.

i.e:

  • 70 => 20,20,20,10 would be an easy pick, but 16,18,18,18 would be preferred
  • 60 => 20,20,20 is better then 14,14,14,18

I can trow on more examples if required.

I'm not looking for the one best solution, but a small set of possible best fit solution. Ultimately a person will choose, this is about suggestion best options.

What I did so far is the following. It is working, just seems way to complex.

I took the basic algorithm from this post Algorithm to find which numbers from a list of size n sum to another number. All I changed in this code is turn it into integer. It return all possible combinations. up to 5 or more tracks.

To further reduce result set, I do a few Linq's

    List<int> nums = new List<int>() { 8, 8, 8, 8, 10, 10, 10, 10, 12, 12, 12, 12, 14, 14, 14, 14, 16, 16, 16, 16, 18, 18, 18, 18, 20, 20, 20, 20, 22, 22, 22, 22, 24, 24, 24, 24 };
    int width = 60;
    Console.WriteLine("Total width: " + width);
    Solver solver = new Solver();
    List<List<int>> res = solver.Solve(width, nums.ToArray());

    Console.WriteLine("total :"+res.Count);
    var res1 = res.Distinct(new SequenceComparer<List<int>, int>()).ToList();
    Console.WriteLine("total res1:" + res1.Count);

    var res2 = res1.Where(l => l.Count == 4).ToList();
    Console.WriteLine("total res2:" + res2.Count); //reduce to 4 integer solutions

    var res3 = (from row in res2
             where row[0] == row[1] || row[0] == row[2] || row[0] == row[3] ||
                   row[1] == row[2] || row[1] == row[3] ||
                   row[2] == row[3]
             select row).ToList();
    Console.WriteLine("total res3:" + res3.Count); //reduce to at least two of equal length

    var res4 = (from row in res3
            where row[0] == row[1] && row[0] == row[2] || 
                  row[0] == row[1] && row[0] == row[3] ||
                  row[1] == row[2] && row[1] == row[3] 
            select row).ToList();
     Console.WriteLine("total res4:" + res4.Count); //reduce to  three of equal length

    var res5 = (from row in res4
            where row[0] == row[1] && row[0] == row[2] && row[0] == row[3]
           select row).ToList();

     Console.WriteLine("total res5:" + res5.Count); //reduce to 4 of equal length
     Console.WriteLine("-------------------------------------");
     Console.WriteLine("res4:");
     foreach (List<int> result in res4)
     {
         foreach (int value in result)
         {
              Console.Write("{0}\t", value);
         }
         Console.WriteLine();
      }
      Console.WriteLine("-------------------------------------");
      Console.WriteLine("res5:");
      foreach (List<int> result in res5)
      {
           foreach (int value in result)
           {
               Console.Write("{0}\t", value);
           }
           Console.WriteLine();
      }
      Console.ReadLine();

This code will produce the following outcome, run with 60

    Total width: 60
    total :10726
    total res1:74
    total res2:31
    total res3:20
    total res4:3
    total res5:0
    -------------------------------------
    res4:
    12      12      12      24
    12      16      16      16
    14      14      14      18
     -------------------------------------
    res5:

or with 80 this

    Total width: 80
    total :101560
    total res1:237
    total res2:15
    total res3:13
    total res4:3
    total res5:1
    ------------------------------------
    res4:
    8       24      24      24
    14      22      22      22
    20      20      20      20
    ------------------------------------
    res5:
    20      20      20      20

So my final results (4&5) are, in fact, close to what I need.

BUT I would have to code the same again for any possible 3 track solution, and maybe 2 tracks.

Then does results would need to be compared to each other (somehow, not sure how). ALL of this makes me feel like I am missing something. It feels to complex, it feels wrong. What am I missing?

AM I using the wrong algorithm to begin with? Are their better once out their for my problem?

È stato utile?

Soluzione 3

Math to the rescue!

You can check that every even number larger than 8 is a linear combination of elements from this set - ask on Math Overflow for a proof ;).

So let's rephrase the question in math:

  • We have an overcomplete dictionary of basis vectors (because 16, for example, is a multiple of 8),
  • in which we can represent every even number larger than 8 as a linear combination of these basis vectors, and
  • we are trying to minimize the zero "norm" of this input vector.

The good news: This is a very interesting problem, with many application domains, so it is pretty well researched.

The bad news: this is still a hard (NP-hard) problem.

But, hey, at least now you know.

edit: And just so I don't get accused of handwaving philosophical answer, here's a modified (completely non-optimized) version of Solver.recursiveSolve that exhaustively searches for a combination of segments that match the goal; and a zero norm comparer class with which you can sort your results:

    private void RecursiveSolve(int goal, int currentSum,
        List<int> included, List<int> segments, int startIndex)
    {

        for (int index = 0; index < segments.Count; index++)
        {
            int nextValue = segments[index];
            if (currentSum + nextValue == goal)
            {
                List<int> newResult = new List<int>(included);
                newResult.Add(nextValue);
                mResults.Add(newResult);
            }
            else if (currentSum + nextValue < goal)
            {
                List<int> nextIncluded = new List<int>(included);
                nextIncluded.Add(nextValue);
                RecursiveSolve(goal, currentSum + nextValue,
                    nextIncluded, segments, startIndex++);
            }
        }
    }

class ZeroNormAndSDComparer : IComparer<List<int>>
{
    private int l0_norm(List<int> v)
    {
        int norm = 0;
        HashSet<int> seen = new HashSet<int>();

        for (int i = 0; i < v.Count; ++i)
        {
            if (!seen.Contains(v[i]))
            {
                norm++;
                seen.Add(v[i]);
            }
        }
        return norm;
    }

    private static double StandardDeviation(List<int> v)
    {
        double M = 0.0;
        double S = 0.0;
        int k = 1;
        foreach (double value in v)
        {
            double tmpM = M;
            M += (value - tmpM) / k;
            S += (value - tmpM) * (value - M);
            k++;
        }
        return Math.Sqrt(S / (k - 1));
    }

    public int Compare(List<int> x, List<int> y)
    {
        int normComparison = l0_norm(x).CompareTo(l0_norm(y));
        if  (normComparison == 0)
        {
            return StandardDeviation(x).CompareTo(StandardDeviation(y));
        }
        return normComparison;
    }
}

And your modified Main (now sorting is done after results have been reduced to distinct, 4-segment results):

        List<int> nums = new List<int>() { 8, 10, 12, 14, 16, 18, 20, 22, 24 };

        int width = 80;
        Console.WriteLine("Total width: " + width);

        Solver solver = new Solver();
        List<List<int>> res = solver.Solve(width, nums.ToArray());
        Console.WriteLine("total :" + res.Count);
        var res1 = res.Distinct(new SequenceComparer<List<int>, int>()).ToList();
        Console.WriteLine("total res1:" + res1.Count);

        var res2 = res1.Where(l => l.Count == 4).ToList();
        Console.WriteLine("total res2:" + res2.Count); //reduce to 4 integer solutions

        res2.Sort(new ZeroNormAndSDComparer());

Altri suggerimenti

Let's divide everything by 2 since everything is even. We now have track pieces with length 4 to 12 for a total length of about 10 to 50.

Name n the length we have to achieve. For every possible number k of track pieces (1 to 4 in general, but 1 to 3 for n<16 or 3 to 4 for n>36 for example), suggest taking n%k pieces of length n/k+1 and k-n%k pieces of length n/k.

'/' designates integer division and '%' the remainder.

You really have a special case of the sum-set. While it's NP-hard, the solution space is restricted enough that brute force would probably work fine, as there are only 10000 (10 ^ 4) possible solutions total (which is about equal to the actual number of time steps needed), since you also have to consider 0 as a possible length.

Here is the code, in psudo-Python. Thought about trying it in C#, but not actually familiar with it, and so it probably wouldn't work out well.

lengths = [0, 8, 10, 12, 14, 16, 18, 20, 22, 24]
solutions = []
for len1 in lengths:
    for len2 in lengths:
        for len3 in lengths:
            for len4 in lengths:
                if (len1 + len2 + len3 + len4) == n:
                    solutions.append([len1,len2,len3,len4])
return solutions

Once you have all the valid solutions, you can then simply determine which one(s) you want to show the user, or you can simply write an algorithm to pick the best one. Of course, you'll probably want to not actually include any lengths of size 0.

You can improve this algorithm somewhat using a greedy method that will only find all valid solutions. However, again, the problem as it's stated isn't likely to be complex enough to need it unless things are very constricted in terms of space or time.

As a bonus, if you are expecting multiple queries (for example user asks for n = 40 and later n = 50), you can remove the if statement and simply store all 10000 solutions in a hash-table keyed to the sum value, allowing for O(1) querying.

Narrowing down the solution set:

What you need here is a comparison algorithm that basically compares this solution to that solution and says, "this solution is better/worst than that solution". This allows you to write a sorting algorithm that you can use to sort the solutions to get the best so many, or you can simply just find the best solution.

You can resolve the vast majority of cases by simply calculating the standard deviation for each solution then compare the standard deviations. This will give a number that shows how much variance is in the lengths of the solution. If you use "lower standard deviation is better", then that'll give you "All of equal length or close to each other is preferred over Large/small combinations". The algorithm for standard deviation is fairly straightforward, so I'll leave it to you try to implement. Actually, there's a good chance that C# has the function built in. Just be sure not to include any zero lengths, actually they should probably be removed before you add them to the solution set to avoid issues, which requires a bit of tweaking to the code I gave.

However, you then have the tricky part, dealing with cases where different solutions have the same standard deviation. I'm thinking there are only two cases where this happens.

  1. The first occurs only there are multiple ideal solutions. Such as if n = 24, then three of the solutions will be [8,8,8], [12,12], and [24].

  2. The second occurs due to the brute force nature of the algorithm, and is why there are so many solutions. This is because for every solution like [8,10,12,14] (four unique lengths) there are 24 ways to arrange those lengths, such as [14,12,10,8] and [12,10,14,8]. So the best way to improve upon the brute force algorithm is to have an algorithm that multichooses 4 elements from [0, 8, 10, 12, 14, 16, 18, 20, 22, 24]. This narrows the solution set to only 715 solutions. Of course, if you actually want [8,10,12,14], [14,12,10,8] and [12,10,14,8] as different solutions, then there isn't much you can do.

The above two paragraphs fall squarely in the realm of "it depends". You'll have to decide what rules the algorithm should follow in each case, but I'm thinking those are the only two cases where you could find identical standard deviations.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top