Question

NumberLine line = new NumberLine();
line.AddRange(1, 5);
line.AddRange(20, 30);

line.CheckRange(10, 25);

NumberLine is a class which represents a number line. I want to mark on it different ranges of numbers. The CheckRange method should return which parts from 10-25 I marked and which I did not. In this case it should return that 10-20 is not marked, and that 20-25 is marked.

How can I implement an effective implementation of this, which wouldn't do o(n)?

Thank you.

NOTE: This is NOT homework. I need this for my custom database implementation transactions. I'm learning programming alone.

Was it helpful?

Solution

The solution is simple: Add all highlighted values to an AVL or Red-black tree. I mean when you do AddRange(1,3), insert the integer values 1,2 and 3 in the tree.

When checking ranges, simply lookup the endpoint values. That takes O(log n) which is considerably faster than O(n).

Note: Insertion and delete all take O(log n).

OTHER TIPS

Use a HashSet<T>:

public class NumberLine : HashSet<int>
{
    public void AddRange(int start, int end)
    {
        int count = (end-start)+1;

        UnionWith(Enumerable.Range(start, count));
    }

    public IEnumerable<int> CheckRange(int start, int end)
    {
        NumberLine other = new NumberLine();

        other.AddRange(start, end);

        other.IntersectWith(this); // marked
        // other.ExceptWith(this); // not marked

        return other;
    }
}

Not sure what you wanted to return from CheckRange, or if you just wanted it to print a string. For something simple like the ranges you specified, you could use:

public string CheckRange(int start, int end)
{
    NumberLine other = new NumberLine();

    other.AddRange(start, end);

    IEnumerable<int> marked = other.Intersect(this);
    IEnumerable<int> notMarked = other.Except(this);

    int markedMin = marked.Min();
    int markedMax = marked.Max();
    int notMarkedMin = notMarked.Min();
    int notMarkedMax = notMarked.Max();

    string markedString = (markedMin == markedMax)
            ? markedMin.ToString()
            : string.Format("{0} - {1}", markedMin, markedMax);

    string notMarkedString = (notMarkedMin == notMarkedMax)
            ? notMarkedMin.ToString()
            : string.Format("{0} - {1}", notMarkedMin, notMarkedMax);

    return string.Format("Marked: {0}\r\nNot Marked: {1}", markedString, notMarkedString);
}

It won't handle split ranges like:

Marked: 10-15, 20-25
Not Marked: 16-19

But it should get you on the right track.

OK, I see where you're going with this.

Lucene does this with very large bit fields.

Say your possible range of numbers goes from 1 to 64, each of these numbers corresponds to the bit in that position on a 64 bit int. (No 1 is bit 0, No 2 is bit 1).

If you add a number to a range, you switch on that bit (in your example you'd switch on bits 0 through 4 and 19 through 29).

Now to check a range of numbers, you create another 64 bit int with that range of bits switched on, and perform a bitwise And (&) on the two bit fields. The 1 bits in the result is the overlapping range.

For more numbers than 64, just scale up the number of bits (possibly by working with arrays or lists of integers)

Hope this helps :)

Update: Scalability

Lets say you're working on 64 bit architecture and you can AND 64 bit integers in a single operation. Ideally you'd use 64 bit ints.

Now, lets say your possible range of numbers runs from 1 to 64,000, for this you need 1000 64 bit ints.

Now lets look at a couple use cases

  1. I want to check the range of 70 - 80. To do this we don't need another 1000 ints for the check, just one int, and we know we're checking it against the 2nd element in our array.

  2. I want to check the range of 2000 - 10,000 Again, we only need one int, calculate it's position in the array 31st (I think) and set the bits accordingly and compare. Then you iterate through the list until you hit 10,000 (position 156?), comparing along the way, and building you're list of integers to return.

Update 2: That's not O(1)

Depending of the size of the range to check, you can implement this as O(1)

However using this algorithm the general case is still O(n)

What about if you store the ranges themselves within the NumberLine. You can do a merge when overlapping ranges are added. CheckRange then can query the ranges that are stored inside NumberLine instead of the individual elements. This then becomes O(N) in the number of ranges, instead of O(N) in the number of elements. If you do merge ranges when possible then the number of ranges becomes smaller than the number of calls to AddRange.

See the code sample below. I'm not an expert on .Net collections so a more efficient implementation might be possible by selecting better collection types. _NT suggested storing values in a tree structure. You can apply this also to ranges and store them by the start number. This makes searching the ranges faster, both when adding and checking. In my current implementation adding Ranges to the end is slower than adding ranges at the beginning. When storing this in an efficient tree the complexity becomes O(log N) in the number of ranges.

using System;
using System.Collections.Generic;
using System.Collections.ObjectModel;

namespace NumberLine
{
    class Program
    {
        static void Main(string[] args)
        {
            NumberLine line = new NumberLine();
            line.AddRange(1, 5);
            line.AddRange(10, 12);
            line.AddRange(20, 30);

            List<Range> ranges = line.CheckRange(10, 25);
            foreach (Range r in ranges)
            {
                for (int i = r.Start; i <= r.End; i++)
                {
                    Console.WriteLine(i);
                }
            }
        }
    }

    class Range
    {
        public int Start;
        public int End;
    }

    class NumberLine
    {
        private SortedList<int, Range> Ranges = new SortedList<int, Range>();

        public void AddRange(int start, int end)
        {
            if (Ranges.Count == 0)
            {
                 Ranges.Add(start, new Range() { Start = start, End = end });
            }
            else
            {
                foreach (Range currentRange in Ranges.Values)
                {
                    if (start <= currentRange.Start) 
                    {
                        if (end >= currentRange.End)
                        {
                            currentRange.Start = start;
                            currentRange.End = end;
                        }
                        else
                        {
                            currentRange.Start = start;
                        }
                        Ranges.RemoveAt(start);
                        Ranges.Add(start, currentRange);
                        break;
                    } 
                    else
                    {
                        if (start <= currentRange.End)
                        {
                            currentRange.End = end;
                            break;
                        }
                        else
                        {
                            Ranges.Add(start, new Range(){ Start = start, End = end });
                            break;
                        }
                    }
                }           
            }
        }

        public List<Range> CheckRange(int start, int end)
        {
            List<Range> result = new List<Range>();
            foreach (Range currentRange in Ranges.Values)
            {
                if (start <= currentRange.End)
                {
                    if (end <= currentRange.End)
                    {
                        result.Add(new Range() { Start = currentRange.Start, End = end });
                        break;
                    }
                    else
                    {
                        if (start <= currentRange.Start)
                        {
                            result.Add(new Range() { Start = currentRange.Start, End = currentRange.End });
                        }
                        else
                        {
                            result.Add(new Range() { Start = start, End = currentRange.End });
                        }
                    }
                }
            }
            return result;
        }
    }

}

O(n) means varies as the number of elements O(1) means constant time

I can't think of an O(1) way of implementing this, either.

I'm not sure about the specifics of the app, but my intuition tells me this would be much better handled in the database, since it is a set based operation.

I.E.

Select
*
from numberlines
where 
number_group = @group_id
marked = 1
and number >= @min_range
and number <= @max_range

If you tried to tackle this problem in iteratively that may help. For instance load your LineNumber class up with a list of Ranges, Those ranges have a start and end int in them. Then instead of a 'checkrange(a,b)' method, just implement a 'hasNumber(a)' method. Do this simply by looping through your list of Ranges and call a method 'isInRange(a) on the Range class So your Data model might be:

LineNumber {
 List<Range> ranges;
 aadRange(a,b);
 // Loops through all ranges and calls isInRange on each method
 isInRange(a);

 //just iterates over isInRange from a to b
 checkRange(a,b)

}

Range {
 Range(a,b)
 isInRange(a);
}

This will get you some working code, and an interface. It may not be fast enough, but you don't know that yet. Leave the lucene implementation for later. :)

This is not a full solution, but perhaps a different approach could help yield better results.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top