Modifying an element of a list in place in J: can it be done?

https://stackoverflow.com/questions/7451323

21-01-2021

Question

I have been playing with an implementation of lookandsay (OEIS A005150) in J. I have made two versions, both very simple, using while. type control structures. One recurs, the other loops. Because I am compulsive, I started running comparative timing on the versions.

Look and say is the sequence 1 11 21 1211 111221: that is, one one, two ones, etc.

For early elements of the list (up to around 20) the looping version wins, but only by a tiny amount. Around element 30 the recursive version wins, by a large enough margin that it might be preferred if the stack space were adequate to support it. I looked at why, and I believe that it has to do with handling intermediate results. The 30th number in the sequence has 5808 digits. (32nd number, 9898 digits; 34th, 16774.)

When you are doing the problem with recursion, you can hold the intermediate results in the recursive call, and the unstacking at the end builds the results so that there is minimal handling of the results.

In the list version, you need a variable to hold the result. Every loop iteration causes you to need to add two elements to the result.

The problem, as I see it, is that I can't find any way in J to modify an extant array without completely reassigning it. So I am saying

try. o =. o,e,(0&{y) catch. o =. e,(0&{y) end.

to put an element into o where o might not have a value when we start. That may be notably slower than

o =. i.0
.
.
.
o =. (,o),e,(0&{y)

The point is that the result gets the wrong shape without the ravels, or so it seems. It is inheriting a shape from i.0 somehow.

But even functions like } (amend) don't modify a list; they return a list that has a modification made to it, and if you want to save the list you need to assign it. As the size of the assigned list increases (as you walk the number from the beginning to the end making the next number), the assignment seems to take more and more time. This assignment is really the only thing I can see that would make element 32 (9898 digits) take less time in the recursive version while element 20 (408 digits) takes less time in the loopy version.

The recursive version builds the return with:

e,(0&{y),(,lookandsay e }. y)

The above line is both the return line from the function and the recursion, so the whole return vector gets built at once as the call gets to the end of the string and everything unstacks.

In APL I thought that one could say something on the order of:

 a[1+rho a] <- new element

But when I try this in NARS2000 I find that it causes an index error. I don't have access to any other APL. I might be remembering this idiom from APL Plus; I doubt it worked this way in APL\360 or APL\1130. I might be misremembering it completely.

I can find no way to do that in J. It might be that there is no way to do that, but the next thought is to pre-allocate an array that could hold results, and to change individual entries. I see no way to do that either - that is, J does not seem to support the APL idiom:

a<- iota 5
a[3] <- -1

Is this one of those side effect things that is disallowed because of language purity?

Does the interpreter recognize a=. a,foo or some of its variants as a thing that it should fastpath to a[>:#a]=.foo internally?

This is the recursive version, just for the heck of it. I have tried a bunch of different versions and I believe that the longer the program, the slower, and generally, the more complex, the slower. Generally, the program can be chained so that if you want the nth number you can do lookandsay^: n ] y. I have tried a number of optimizations, but the problem I have is that I can't tell what environment I am sending my output into. If I could tell that I was sending it to the next iteration of the program I would send it as an array of digits rather than as a big number.

I also suspect that if I could figure out how to make a tacit version of the code, it would run faster, based on my finding that when I add something to the code that should make it shorter, it runs longer.

lookandsay=: 3 : 0
  if. 0 = # ,y do. return. end.   NB. return on empty argument
  if. 1 ~: #@$ y do.              NB. convert rank 0 argument to list of digits
    y =. (10&#.^:_1) x: y
    f =. 1
    assert. 1 = #@$ y             NB. the converted argument must be rank 1
  else.
    NB. yw =. y
    f =. 0
  end.
  NB. e should be a count of the digits that match the leading digit.
  e =. +/*./\y=0&{y

  if. f do.
    o =. e,(0&{y),(,lookandsay e }. y)
    assert. e = 0&{ o
    10&#. x: o
    return.
  else.
    e,(0&{y),(,lookandsay e }. y)
    return.
  end.
)

I was interested in the characteristics of the numbers produced. I found that if you start with a 1, the numerals never get higher than 3. If you start with a numeral higher than 3, it will survive as a singleton, and you can also get a number into the generated numbers by starting with something like 888888888 which will generate a number with one 9 in it and a single 8 at the end of the number. But other than the singletons, no digit gets higher than 3.

Edit: I did some more measuring. I had originally written the program to accept either a vector or a scalar, the idea being that internally I'd work with a vector. I had thought about passing a vector from one layer of code to the other, and I still might, using a left argument to control the code. When I pass the top level a vector the code runs enormously faster, so my guess is that most of the CPU is being eaten by converting very long numbers to and from vectors of digits. The recursive routine always passes down a vector when it recurs, which might be why it is almost as fast as the loop.

That does not change my question.

I have an answer for this which I can't post for three hours. I will post it then, please don't do a ton of research to answer it.

Solution

Assignments like

arr=. 'z' 15} arr

are executed in place. (See the JWiki article for other supported in-place operations.) The interpreter determines that only a small portion of arr is updated and does not create an entire new list to reassign.

What happens in your case is not that the array is being reassigned, but that it grows many times in small increments, causing repeated memory allocation and reallocation.

If you preallocate the array (by assigning it some large chunk of data), you can then modify it with } without too much penalty.
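A minimal sketch of the preallocate-then-amend pattern, assuming a loop that knows its final length up front. The verb name fill and the squares it stores are hypothetical, chosen only for illustration; the point is the shape of the assignment, name =. value index} name, which is the form described above as eligible for in-place execution.

```j
NB. Hypothetical example: fill a preallocated list with amend
NB. instead of growing it one item at a time.
fill =: 3 : 0
  buf =. y $ 0                 NB. preallocate y zeros
  i =. 0
  while. i < y do.
    buf =. (i * i) i} buf      NB. same name on both sides of =.
    i =. i + 1
  end.
  buf
)
```

Under these assumptions, fill 5 returns 0 1 4 9 16. Whether the amend really happens in place depends on the interpreter version, so it is worth timing it rather than taking it on trust.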

OTHER TIPS

After I asked this question, to be honest, I lost track of this web site.

Yes, the answer is that the language has no form that means "update in place", but if you use one of these two forms

x =: x , most anything

x =: most anything } x

then the interpreter recognizes those as special and does update in place unless it can't. There are a number of other specials recognized by the interpreter, like:

199(1000&|@^)199

That combined operation is modular exponentiation. It never calculates the whole exponentiation, as

199(1000&|^)199

would - that just ends as _ without the @.

So it is worth reading the article on specials. I will mark someone else's answer up.

The link that sverre provided above ( http://www.jsoftware.com/jwiki/Essays/In-Place%20Operations ) shows the various operations that support modifying an existing array rather than creating a new one. They include:

    myarray=: myarray,'blah'
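As a rough illustration of the append form, here is a sketch of a loop that builds its result with name =. name , value. The verb name squares and the values appended are made up for the example; only the assignment pattern comes from the article linked above.

```j
NB. Hypothetical example: grow a list with the name =. name , value form.
squares =: 3 : 0
  acc =. i. 0                  NB. start from an empty rank-1 list
  for_k. i. y do.
    acc =. acc , k * k         NB. append; acc appears on both sides of =.
  end.
  acc
)
```

With this definition, squares 5 gives 0 1 4 9 16, and the result is a plain rank-1 list even though it started from i. 0, so no ravel is needed in this sketch.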

If you are interested in a tacit version of the lookandsay sequence see this submission to RosettaCode:

   las=: ,@((# , {.);.1~ 1 , 2 ~:/\ ])&.(10x&#.inv)@]^:(1+i.@[)
   5 las 1
11 21 1211 111221 312211

Licensed under: CC-BY-SA with attribution
