Question

I'm numbering generated files with two digits 00-99 and I want to retain the last 50. The algorithm should tell me the number of the current file that I'm about to save and which of the previous files to remove.

If I scale it down to a toy version with one digit, here's how I want it to behave.

  1. As long as there are fewer than 5 files, we can just add files. This is represented in the first block below.
  2. Once there are 5 previous files, the oldest one should be deleted. This is trivial as long as we haven't reached the maximum number, here 9. This is represented in the second block below.
  3. Once the highest number has been reached, the numbering should roll around, and here is where things get tricky. See the third block below.

prev           curr      drop
<no files>     0         null
0              1         null
0 1            2         null
0 1 2          3         null
0 1 2 3        4         null

0 1 2 3 4      5          0 
1 2 3 4 5      6          1
2 3 4 5 6      7          2
3 4 5 6 7      8          3
4 5 6 7 8      9          4

5 6 7 8 9      0          5
6 7 8 9 0      1          6
7 8 9 0 1      2          7
8 9 0 1 2      3          8
9 0 1 2 3      4          9

Some pseudocode to illustrate my progress:

if length(prev) < 5
   curr = length(prev)
   drop = null

else

if 9 not in prev
   curr = (1+max(prev)) mod(10)
   drop = curr-5

if 9 in prev

... and this is where I'm stuck. I've experimented with subtracting 10 from each element in prev.

I'm sure it's trivial and that it'll hit me when I least expect it. I also have a nagging feeling I'll find an expression that generalizes both cases (with and without max in prev).


Solution

What you are trying to implement is a fixed-size first-in-first-out (FIFO) data structure, commonly called a circular queue, circular buffer, cyclic buffer, or ring buffer.

On top of the usual circular-queue requirements, you want the numbers themselves to wrap around an interval such as 0 to 99. Since the numbers in use are always contiguous, two numbers are enough to represent them: the first and the last of the run (the smallest and the largest when the run does not wrap around).
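
To make the two-number idea concrete, here is a minimal sketch. The name fromEndpoints and its parameters are made up for illustration; it assumes the numbers in use form one contiguous run from oldest to newest, possibly wrapping past limit-1.

```javascript
// Hypothetical helper: derive the answer from the two endpoints of the run.
// oldest/newest are the first and last numbers in use (null if no files yet).
function fromEndpoints(oldest, newest, limit, maximumKept) {
    if (oldest === null || newest === null) {
        return { drop: null, next: 0 };
    }
    // Length of the run, counting across the wrap-around if there is one.
    const size = ((newest - oldest + limit) % limit) + 1;
    const next = (newest + 1) % limit;
    // Only drop the oldest number once the run is full.
    const drop = (size >= maximumKept) ? oldest : null;
    return { drop, next };
}

console.log(fromEndpoints(null, null, 10, 5)); // { drop: null, next: 0 }
console.log(fromEndpoints(0, 3, 10, 5));       // { drop: null, next: 4 }
console.log(fromEndpoints(7, 1, 10, 5));       // { drop: 7, next: 2 }
```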

In fact, since the number of files is constant once the maximum number of files has been reached, it is enough to keep track of the count of all numbers ever used. This allows a much simpler, special-purpose implementation of the circular queue.

As in your code, we use the modulo technique. Here is a simple, fully working JavaScript implementation. Both the get operation and the update operation run in $O(1)$.

// From 0 to limit-1 are all usable numbers.
// At any moment, at most maximumKept numbers can be retained.
// For example, if we use 00-99 to parametrize names of at most 50 files,
// we should use new CreateCircularQueue(100, 50).
function CreateCircularQueue(limit, maximumKept) {
    this.limit = limit;
    this.kept = maximumKept;

    this.count = 0;  // count of all numbers ever used.

    // return the number to remove and the next available number.
    this.get = function () {
        if (this.count <= this.kept - 1) {
            return [null, this.count];
        }
        else {
            const next = this.count % this.limit;
            let toRemove = next - this.kept;

            return [toRemove + ((toRemove >= 0) ? 0 : this.limit), next];
        }
    };

    this.update = function () {
        this.count += 1;  // this will NOT overflow before 2**53-1=9007199254740991.
    }

}

The above implementation works for all pairs of positive integers with limit >= maximumKept. In particular, it works even if limit equals maximumKept.

Here is a sample usage:

function myLog(wantedNumbers) {
    console.log("number to remove: " + wantedNumbers[0]
        + "    next number to use: " + wantedNumbers[1])
}

let cq = new CreateCircularQueue(5, 3);
myLog(cq.get());
cq.update();
myLog(cq.get());
cq.update(); cq.update();
myLog(cq.get());
cq.update();
myLog(cq.get());
cq.update(); cq.update();
myLog(cq.get());
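
For reference, the same arithmetic can be written as a pure function of the running count and checked against the one-digit table in the question. This is only a sketch: nextAndDrop is a hypothetical name, and it assumes limit >= maximumKept >= 1.

```javascript
// Hypothetical stateless variant: count is the number of files ever saved.
// Returns [numberToRemove, nextNumberToUse], like get() above.
function nextAndDrop(count, limit, maximumKept) {
    const next = count % limit;
    if (count < maximumKept) {
        return [null, next];  // still filling up, nothing to remove
    }
    // Step back maximumKept positions without ever going negative.
    const drop = (next + (limit - maximumKept)) % limit;
    return [drop, next];
}

// Walk through the three blocks of the toy example (digits 0-9, keep 5).
for (let count = 0; count < 15; count++) {
    const [drop, next] = nextAndDrop(count, 10, 5);
    console.log("curr: " + next + "    drop: " + drop);
}
```

The loop should reproduce the curr and drop columns of the question's table row by row.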

OTHER TIPS

It's funny how explaining a problem makes you approach it differently (rubber duck, anyone?).

I think I've managed to cobble together the behaviour that I'm looking for. The idea is that if the list contains the last number (i.e. n-1, here 9), n is added to every number less than n/2, so that the numbers that have already rolled around sort after the ones that haven't.

Now, JavaScript is definitely not my first language, so please be kind and try to see the idea rather than the mess that is my JS ...

The code is below. Of course I'd be grateful for a better solution.

function roll(n) {
  const half = n/2 ;
  const last = n-1 ;
  var iter = half * 3 ;
  var prev = [] ;
  var curr = null ;
  var drop = n ;
  var temp ;

  while(iter--) {
    console.log(prev) ;
    curr = prev.length ;
    if (curr < half) {
      prev = update(prev,curr,drop) ;
      continue ;
    }
    temp = [].concat(prev) ;
    if (temp.includes(last)) {
      for (var t=0 ; t<temp.length ; t++) {
        if (temp[t] < half)
          temp[t] += n ;
      }
    }
    drop = Math.min.apply(null, temp) % n ;
    curr = Math.max.apply(null, temp) + 1 ;
    curr %= n ;
    prev = update(prev,curr,drop) ;
  }
}

function update(p,c,d) {
  for (var i=0 ; i<p.length ; i++) {
    if (p[i] == d)
      p.splice(i,1) ;
  } 
  p.push(c) ;
  return p ;
}

roll(10) ;

The behavior of the modulo function in many programming languages is a bit quirky, and might depend on the sign of the argument. However, if you compute $a \bmod b$ for positive $a,b$, you will get an answer in the range $0,\ldots,b-1$.

Now suppose we want to compute a difference $x-y \pmod{b}$, and our goal is to obtain an answer in the range $0,\ldots,b-1$. We can assume that $x,y$ are in the same range. If $y > x$, then $x - y$ is negative, and so we might get a negative answer. To prevent that, we can compute $(x + (b-y)) \bmod b$ instead. Since $b - y$ is positive, the argument is never negative, so this always gives the correct answer.
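
In JavaScript, for instance, % takes the sign of its left operand, so a naive difference can come out negative. Here is a minimal sketch of the fix described above, with modDiff as a made-up helper name and assuming x and y are integers in the range 0..b-1:

```javascript
// Hypothetical helper: (x - y) mod b, always in the range 0..b-1.
// Assumes 0 <= x < b and 0 <= y < b.
function modDiff(x, y, b) {
    return (x + (b - y)) % b;
}

console.log((1 - 3) % 5);      // -2, the naive difference is negative
console.log(modDiff(1, 3, 5)); // 3
console.log(modDiff(4, 3, 5)); // 1
```

Applied to the toy numbering in the question, modDiff(curr, 5, 10) gives the number to drop once five files exist; for example, modDiff(2, 5, 10) is 7.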

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange