Question

Say, I want to add two vectors (in a mathematical sense) of something numerical. Naturally I'd do something like:

T[] add(T)(T[] a, T[] b) {
    assert(a.length == b.length);
    T[] res = a.dup;
    foreach (i; 0 .. a.length) {
        res[i] = a[i] + b[i];
    }
    return res;
}

Well, it's OK, but I suspect a and b are copied on every call, which is not so great. So I declare them ref.

T[] add(T)(ref T[] a, ref T[] b) { ...

It works nicely when passing variables, but in a test I use array literals:

assert(add([1, 2, 3], [4, 5, 6]) == [5, 7, 9]);

And this fails as it can't deduce refs for arrays. I've managed to figure out a workaround:

T[] add(T)(T[] a, T[] b) {
    return add(a, b);
}

Which seems to solve the problem, but looks rather silly. What is the better design for my problem?

Or, putting it into smaller questions: do I really have to declare the arguments as ref to avoid copying? Could the compiler optimize this for me, since I don't modify a and b? How would I declare the arguments immutable, by the way (I tried the immutable keyword, but it looks like I'm using it wrong)? Would res really be copied twice in the workaround, or are return values moved?


Solution

You really should read http://dlang.org/d-array-article.html. It goes into detail about how arrays in D work. But for a short answer, all that's getting copied when passing arguments to

T[] add(T)(T[] a, T[] b) {...}

is the array's underlying pointer and length. None of the elements are copied. Rather, the array is "sliced." The resulting arrays inside of add are slices of add's arguments, meaning that they refer to the same memory that the original arrays did. Mutating the elements of a slice will mutate the elements of the original array, because they are the same elements. However, mutating the array itself (e.g. assigning another array to it or appending to it) does not affect the original, and if appending to the array results in its memory being reallocated to make room (or if a new array is assigned to the array), then that array will no longer refer to the same elements as the original. The only place in your code that a copy of an array is being made is a.dup.
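A minimal sketch of that slicing behaviour, assuming an int array and a hypothetical helper named modify (neither is from the question): writing to an element is visible to the caller, while appending only changes the callee's own slice.

```d
// Hypothetical helper, for illustration only.
void modify(int[] arr)
{
    arr[0] = 42; // writes through to the caller's elements (same memory)
    arr ~= 99;   // changes only the local slice's length (and possibly its pointer)
}

void main()
{
    int[] original = [1, 2, 3];
    modify(original);

    // The element write is visible, the append is not.
    assert(original == [42, 2, 3]);
    assert(original.length == 3);
}
```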

What marking the arrays with ref does is make it so that they're not sliced. Rather, they are the original arrays instead of slices. So, if anything is appended to the local array or if it's reassigned, then that will affect the original array (whereas it wouldn't have if you hadn't used ref).
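To illustrate the difference, here is a small sketch with two hypothetical helpers, appendByValue and appendByRef (names invented for this example). Only the ref version changes the caller's array when appending.

```d
// Hypothetical helpers, for illustration only.
void appendByValue(int[] arr)
{
    arr ~= 4; // affects only the local slice
}

void appendByRef(ref int[] arr)
{
    arr ~= 4; // arr is the caller's array, so the caller sees the append
}

void main()
{
    int[] a = [1, 2, 3];

    appendByValue(a);
    assert(a == [1, 2, 3]);    // unchanged

    appendByRef(a);
    assert(a == [1, 2, 3, 4]); // grew by one element
}
```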

Also, ref only accepts lvalues (meaning values that can go on the left-hand side of an assignment), and array literals are rvalues (meaning that they can only go on the right-hand side of an assignment), so you can't pass them to a function that takes its argument by ref. If you want to accept both, you either have to not accept by ref, overload your function so that you have a ref and a non-ref version (which appears to be what you've used as your solution), or use auto ref instead of ref, in which case it'll accept both (but auto ref only works with templated functions, and it's basically just shorthand for duplicating the function yourself, because that's what auto ref does). In general though, if you don't want to be mutating the original, you shouldn't be passing by ref.
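As a sketch of the auto ref option (reusing the loop from the question; since add is already a template, auto ref is allowed), the same function then accepts both variables and array literals. Given that nothing is mutated here, plain by-value parameters would work just as well, so this is only to show the syntax.

```d
T[] add(T)(auto ref T[] a, auto ref T[] b)
{
    assert(a.length == b.length);
    T[] res = a.dup;
    foreach (i; 0 .. a.length)
        res[i] = a[i] + b[i];
    return res;
}

void main()
{
    int[] x = [1, 2, 3];
    int[] y = [4, 5, 6];

    assert(add(x, y) == [5, 7, 9]);                 // lvalues bind by ref
    assert(add([1, 2, 3], [4, 5, 6]) == [5, 7, 9]); // rvalues bind by value
}
```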

One suggestion to make your code faster: there's no reason to dup a and then read it again inside the loop to add it to b, since res already holds a copy of a's elements. You might as well just use += and do something more like

T[] add(T)(T[] a, T[] b)
{
    assert(a.length == b.length);
    auto res = a.dup;
    foreach (i; 0 .. a.length)
        res[i] += b[i];
    return res;
}

Or even better, you could use array vector operations and skip the loop entirely:

T[] add(T)(T[] a, T[] b)
{
    assert(a.length == b.length);
    auto res = a.dup;
    res[] += b[];
    return res;
}
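The question also asks how to declare that the arguments won't be modified. One possible sketch, which is my assumption about what is wanted rather than something covered above, is to take the elements as const. A const(T)[] parameter accepts mutable, const, and immutable arrays alike, and the function can no longer write to their elements.

```d
T[] add(T)(const(T)[] a, const(T)[] b)
{
    assert(a.length == b.length);
    auto res = a.dup; // dup gives a fresh mutable copy for plain numeric T
    res[] += b[];     // a[i] = ... would not compile here, since elements are const
    return res;
}

void main()
{
    int[] x = [1, 2, 3];
    immutable(int)[] y = [4, 5, 6];

    assert(add(x, y) == [5, 7, 9]);                 // mutable and immutable mix
    assert(add([1, 2, 3], [4, 5, 6]) == [5, 7, 9]); // literals work too
}
```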

But again, you really should read http://dlang.org/d-array-article.html if you want to properly understand how arrays work in D.

Licensed under: CC-BY-SA with attribution