Can (the underlying array of) a slice with large starting index in Go be allocated memory-efficiently?

StackOverflow https://stackoverflow.com/questions/22296232

  •  12-06-2023

Question

I'm trying to use a slice, say mySlice, with a very large starting index. Rather than explicitly subtracting the starting index by always using it as mySlice[index - mySliceStartIndex], I am tempted to simply define the slice in such a way that I can use it without such arithmetic as mySlice[index]. Can this be done without allocating memory for all the unused low indices?

The naive way of doing this, allocating a slice and then reslicing it (e.g. mySlice = mySlice[3*1024*1024*1024:4*1024*1024*1024]), is obviously memory-inefficient because the underlying array not only needs to be allocated for the entire range but also remains allocated. It does not even work, because after reslicing, the data formerly at index 3*1024*1024*1024 is at index 0, whereas my goal is to keep it at its original index.

Can I allocate the slice (or its underlying array) in such a way that indices below the slice's start are not allocated, ideally not even initially?


Solution

This is not possible without actually allocating the unused parts. In Go, a slice is described by a reflect.SliceHeader:

type SliceHeader struct {
    Data uintptr
    Len  int
    Cap  int
}

It contains no starting-index field, merely a reference to an underlying, fixed-size array. That underlying array holds your actual data. The slice is simply a 'window' into that array, and the window always begins at index 0, wherever that position may fall in the underlying array.

For instance, consider the following code:

a := []int{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}
b := a[2:8]
c := a[8:]
d := b[2:4]

This yields a memory layout as follows:

fixed array: [ 0 1 2 3 4 5 6 7 8 9 ]  > [10]int at address 273785072
slice a    :   . . . . . . . . . .    > SliceHeader{Data:273785072 Len:10 Cap:10}
slice b    :       . . . . . .        > SliceHeader{Data:273785080 Len:6 Cap:8}
slice c    :                   . .    > SliceHeader{Data:273785104 Len:2 Cap:2}
slice d    :           . .            > SliceHeader{Data:273785088 Len:2 Cap:6}

The values for Data are simply addresses inside the fixed array, and all four slices share the underlying storage. In this example sizeof(int) is 4 bytes:

a =:= $273785072
b =:= $273785080 =:= $a + sizeof(int)*2 =:= $a + 8
c =:= $273785104 =:= $a + sizeof(int)*8 =:= $a + 32
d =:= $273785088 =:= $b + sizeof(int)*2 =:= $a + sizeof(int)*4 =:= $a + 16

At whatever index you re-slice an existing slice, the new slice will always be indexed from 0 to len(s)-1, because its Data pointer marks the position in the underlying fixed array that becomes index 0.
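Since the slice itself cannot carry a starting index, the subtraction has to happen somewhere. One possible way to keep it out of the calling code is a small wrapper type. The following is a minimal sketch, not something from the original answer: the names OffsetSlice, NewOffsetSlice, At and Set are hypothetical, and indices are taken as int64 so that a base of 3*1024*1024*1024 also works where int is 32 bits.

```go
package main

import "fmt"

// OffsetSlice (hypothetical) stores only the elements for the index range
// [base, base+len(data)); nothing is allocated for indices below base.
type OffsetSlice struct {
	base int64
	data []int
}

// NewOffsetSlice allocates room for n elements starting at index base.
func NewOffsetSlice(base int64, n int) *OffsetSlice {
	return &OffsetSlice{base: base, data: make([]int, n)}
}

// At returns the element at the original index i.
// It panics, like a normal slice, if i is outside the stored range.
func (s *OffsetSlice) At(i int64) int { return s.data[i-s.base] }

// Set stores v at the original index i.
func (s *OffsetSlice) Set(i int64, v int) { s.data[i-s.base] = v }

func main() {
	const start int64 = 3 * 1024 * 1024 * 1024

	s := NewOffsetSlice(start, 4) // only 4 ints are allocated
	s.Set(start+2, 7)

	fmt.Println(s.At(start+2), len(s.data)) // 7 4
}
```

The arithmetic is still performed on every access; the wrapper only moves it into one place.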

Memory mapping

If your data is loaded from a file on disk, you have a different option: use syscall.Mmap to provide access to the data through a slice, starting at the desired offset. The returned slice is still indexed from 0, but it covers only the range you specified.

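import (
    "os"
    "syscall"
)

// mmap maps size bytes of fd, beginning at byte offset start.
// start must be a multiple of the page size (os.Getpagesize()).
func mmap(fd *os.File, start, size int) ([]byte, error) {
    _, err := fd.Seek(0, 0)
    if err != nil {
        return nil, err
    }

    // syscall.Mmap expects the offset as an int64.
    return syscall.Mmap(int(fd.Fd()), int64(start), size,
        syscall.PROT_READ, syscall.MAP_SHARED)
}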

Do not forget to call syscall.Munmap on the returned slice when you are done using it.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow