Understanding Golang Slice

Updated At Sat, 09 Jul 2022 17:22:22 GMT
Photo from pexels

Photo by Pexels

Slice is one of the data structure that is commonly used in go. But have we used it to its full potential? In this post, we will try to understand how go slice works so we can fully utilize it. But before we start using we have to understand what it is. I mean if you want to use a hammer correctly then you have to know what is hammer right?

What is Slice?

Well based on golang's spec we found this definition of a slice :

Slice Definition
A slice is a descriptor for a contiguous segment of an underlying array and provides access to a numbered sequence of elements from that array. A slice type denotes the set of all slices of arrays of its element type. The number of elements is called the length of the slice and is never negative. The value of an uninitialized slice is nil.

So based on that definition, we are aware that slice is just a descriptor of an array. That's mean there is an array inside of it (Let's prove it later). Also based on the definition above, we know that slice i meant to provide (instant) access to a numbered sequence of elements. So we can deduce that it best used when you know where to find the element given it's sequence number. It also has length and also a capacity.

Now let's take a look at slice's implementation in golang's source code. Below are the code taken from golang repo

type%20slice%20struct%20%7B%0A%09array%20unsafe.Pointer%0A%09len%20%20%20int%0A%09cap%20%20%20int%0A%7D

Well, it's proven that slice indeed have length and capacity. It's also backed by array in it's unsafe.Pointer form, maybe it's for pointer arithmetic? Let's find out later about it. Also, i think that's it for the description, we can safely move to the how to use slice.

How to use it?

In this part we will learn some operation with slice that is listed below :

  • How to make it.
  • How to access it.
  • How to insert new element into it.

So maybe we should start from the basic first?

How to make slice

Based on the golang's spec, a slice can be made from 2 different syntax, the first one is make([]T, length, capacity) and the second one is []T{elems...} . Tough both of them is used quite differently. Let's take a look at the first syntax first.

In the make([]T, length, capacity) syntax, the length and the capacity is optional so you can omit it. Below are the valid syntax if you decide to use it.

sliceB%20:=%20make(%5B%5Dint,%201)%0AsliceC%20:=%20make(%5B%5Dint,%201,%202)

  1. sliceB will have length of 1 and capacity bigger or equal than 1.
  2. sliceC will have length of 1 and capacity 2.

Bear in mind that you simply can't access uninitialized length. So if you try to access index 0 in sliceA then it will panic. Same as sliceB and sliceC, if you attempt to access index 1 then it will panic despite there is enough capacity in sliceC. We will talk more about it later. Now we can move to the second syntax.

In the []T{elems...} syntax you just need to define the element in the slice that you want to make. it's useful if you know ahead of time what value you need to be exist in the slice. If you don't define anything inside of it then it will just be empty slice. Below are the syntax to use it :

sliceA%20:=%20%5B%5Dint%7B%7D%0AsliceB%20:=%20%5B%5Dint%7B1,2,3%7D

In code above, sliceA will have empty data while sliceB will have data 1,2,3 on index 0,1,2. Accessing any index that is outside the slice's data will cause it to panic.

Both of those syntax are pretty straight forward right? But let's look at what go does behind the scene.

func%20makeslice(et%20*_type,%20len,%20cap%20int)%20unsafe.Pointer%20%7B%0A%09mem,%20overflow%20:=%20math.MulUintptr(et.size,%20uintptr(cap))%0A%09if%20overflow%20%7C%7C%20mem%20%3E%20maxAlloc%20%7C%7C%20len%20%3C%200%20%7C%7C%20len%20%3E%20cap%20%7B%0A%09%09//%20NOTE:%20Produce%20a%20'len%20out%20of%20range'%20error%20instead%20of%20a%0A%09%09//%20'cap%20out%20of%20range'%20error%20when%20someone%20does%20make(%5B%5DT,%20bignumber).%0A%09%09//%20'cap%20out%20of%20range'%20is%20true%20too,%20but%20since%20the%20cap%20is%20only%20being%0A%09%09//%20supplied%20implicitly,%20saying%20len%20is%20clearer.%0A%09%09//%20See%20golang.org/issue/4085.%0A%09%09mem,%20overflow%20:=%20math.MulUintptr(et.size,%20uintptr(len))%0A%09%09if%20overflow%20%7C%7C%20mem%20%3E%20maxAlloc%20%7C%7C%20len%20%3C%200%20%7B%0A%09%09%09panicmakeslicelen()%0A%09%09%7D%0A%09%09panicmakeslicecap()%0A%09%7D%0A%0A%09return%20mallocgc(mem,%20et,%20true)%0A%7D

Well, looking at the code above we know that go will try to make a slice that can accommodate storing N number of data based on the cap.  You can also see some of the check that golang perform such as whether or not len is negative, and whether or not the cap is below than len.

How to access slice

After we learn how to make a slice, we can start learning how to access it. There is only one way to access the slice which to use the index operator which is []. However, there is some condition where a slice can be safely accessed. Some of it are listed below :

  • You can only access element that is within defined length.
  • You can only access non nil slice.

If you don't follow the rules above, you might encounter a panic. Anyway the syntax to access a slice is as simple as variable[index] . You can think that doing so will de-reference the underlying data (or array). You can directly modify the de-referenced data if you wish by modifying the value directly.

//%20Playground%20link%20:%20https://go.dev/play/p/HAa-8ygCZpo%0Apackage%20main%0A%0Aimport%20%22fmt%22%0A%0Afunc%20main()%20%7B%0A%09sliceA%20:=%20%5B%5Dint%7B1,%202,%203,%204%7D%0A%09fmt.Println(sliceA%5B1%5D)%20//%20Will%20output%202%0A%7D

Remember in the previous paragraph that slice is backed by an array right? So why don't we try to access the array value it self? Tough this is highly discouraged to be performed in production code, but this might help you understand how slice indexer work. Let's do that experiment shall we?

//%20Playground%20link%20:%20https://go.dev/play/p/1JLWLSy_tLe%0Apackage%20main%0A%0Aimport%20(%0A%09%22fmt%22%0A%09%22unsafe%22%0A)%0A%0Afunc%20main()%20%7B%0A%09sliceA%20:=%20%5B%5Dint64%7B1,%202,%203,%204,%205%7D%0A%09sliceAPtr%20:=%20unsafe.Pointer(&sliceA)%0A%09arrayPtr%20:=%20*(*uintptr)(sliceAPtr)%0A%09//%20We%20tried%20to%20access%20the%20fourth%20element%20via%20pointer%20arithmetic%0A%09indexOffset%20:=%203%20*%208%0A%09elementPtr%20:=%20arrayPtr%20+%20uintptr(indexOffset)%0A%09fmt.Println(sliceA%5B3%5D,%20sliceAPtr,%20unsafe.Pointer(arrayPtr),%20unsafe.Pointer(elementPtr),%20*(*int64)(unsafe.Pointer(elementPtr)))%0A%09//%20Output%20:%204%200xc00000c030%200xc000100030%200xc000100048%204%0A%7D%0A%0A

Let's examine the above code. In this code we try to obtain the array inside the slice. Since we know from the struct definition in the previous paragraph, we know that the array pointer will reside in the first byte in the memory. Based on this knowledge we can just directly de-reference the pointer to get the first pointer to the array (stored as arrayPtr). After we got the underlying pointer to array, we can perform pointer arithmetic to get the N th element that we need (In this case it's the fourth element). This is possible to be performed because remember in the definition that array contains contiguous element, which meant N th element is just an offset of another element. So to get the fourth element we just need to add 3 times 8 (since int64 is 8 bytes) to the array's first address. And tada, we have proven that slice use array in the background.

How to insert elements into slice

After we learned how to access it, we can start learning on how to insert into slice. Prepare your seat belt (if you have one) as this topic is very important and might cause performance issues if not done properly.

To insert a new element to the slice there is a special built in syntax that we use, and that is append.  This syntax is so easy to use that you can actually misuse it. Anyway here is an example on how to insert an element to the array :

//%20Playground%20link%20:%20https://go.dev/play/p/h484Ox4SNz1%0Apackage%20main%0A%0Aimport%20%22fmt%22%0A%0Afunc%20main()%20%7B%0A%09sliceA%20:=%20%5B%5Dint%7B1,%202,%203,%204%7D%0A%09sliceA%20=%20append(sliceA,%205)%0A%09fmt.Println(sliceA)%0A%09//%20Output%20:%20%5B1%202%203%204%205%5D%0A%7D%0A

Pretty simple right? Well, the devil is in the detail. Well, remember that slice is backed by an array right? (We just talks about it in the previous paragraph). Well turns out that array isn't that flexible. Golang has to perform reallocation when the capacity is exceed limit that the array can hold. Let's do some experiment to see the phenomenon.

//%20Playground%20link%20:%20https://go.dev/play/p/4vCS5x5GZfT%0Apackage%20main%0A%0Aimport%20(%0A%09%22fmt%22%0A%09%22reflect%22%0A%09%22unsafe%22%0A)%0A%0Afunc%20main()%20%7B%0A%09sliceA%20:=%20%5B%5Dint%7B1,%202,%203,%204%7D%0A%09sliceB%20:=%20sliceA%0A%09sliceB%5B1%5D%20=%20100%0A%09fmt.Println(sliceA,%20sliceB)%0A%09//%20Output%20:%20%5B1%20100%203%204%5D%20%5B1%20100%203%204%5D%0A%09sliceAPtr%20:=%20(*reflect.SliceHeader)(unsafe.Pointer(&sliceA))%0A%09sliceBPtr%20:=%20(*reflect.SliceHeader)(unsafe.Pointer(&sliceB))%0A%09fmt.Println(sliceA,%20%22BACKING%20ARRAY%20ADDR%20FOR%20A:%22,%20sliceAPtr.Data)%0A%09fmt.Println(sliceB,%20%22BACKING%20ARRAY%20ADDR%20FOR%20B:%22,%20sliceBPtr.Data)%0A%09//%20%5B1%20100%203%204%5D%20BACKING%20ARRAY%20ADDR%20FOR%20A:%20824634777600%0A%09//%20%5B1%20100%203%204%5D%20BACKING%20ARRAY%20ADDR%20FOR%20B:%20824634777600%0A%09sliceA%20=%20append(sliceA,%205)%0A%09fmt.Println(sliceA,%20sliceB)%0A%09//%20%5B1%20100%203%204%205%5D%20%5B1%20100%203%204%5D%0A%09fmt.Println(sliceA,%20%22BACKING%20ARRAY%20ADDR%20FOR%20A:%22,%20sliceAPtr.Data)%0A%09fmt.Println(sliceB,%20%22BACKING%20ARRAY%20ADDR%20FOR%20B:%22,%20sliceBPtr.Data)%0A%09//%20%5B1%20100%203%204%205%5D%20BACKING%20ARRAY%20ADDR%20FOR%20A:%20824634785856%0A%09//%20%5B1%20100%203%204%5D%20BACKING%20ARRAY%20ADDR%20FOR%20B:%20824634777600%0A%09sliceB%5B1%5D%20=%2010000%0A%09fmt.Println(sliceA,%20sliceB)%0A%09//%20Output%20:%20%5B1%20100%203%204%205%5D%20%5B1%2010000%203%204%5D%0A%7D%0A

Based on our experiment, we notice here that when we copy sliceA into sliceB we also copied the backing array's address. Any change to sliceB will impact sliceA since both has the same backing array. However, if we insert more data into sliceA which already at max capacity, we notice that sliceA and sliceB now has different backing array. This meant that sliceA is now an entirely new slice and any change from sliceB won't impact sliceA. Let's see golang's source code to find how a reallocation is performed in golang.

//%20Note%20:%20Some%20part%20of%20the%20code%20has%20been%20removed%0Afunc%20growslice(et%20*_type,%20old%20slice,%20cap%20int)%20slice%20%7B%0A%0A...%0A%0A%09newcap%20:=%20old.cap%0A%09doublecap%20:=%20newcap%20+%20newcap%0A%09if%20cap%20%3E%20doublecap%20%7B%0A%09%09newcap%20=%20cap%0A%09%7D%20else%20%7B%0A%09%09if%20old.cap%20%3C%201024%20%7B%0A%09%09%09newcap%20=%20doublecap%0A%09%09%7D%20else%20%7B%0A%09%09%09//%20Check%200%20%3C%20newcap%20to%20detect%20overflow%0A%09%09%09//%20and%20prevent%20an%20infinite%20loop.%0A%09%09%09for%200%20%3C%20newcap%20&&%20newcap%20%3C%20cap%20%7B%0A%09%09%09%09newcap%20+=%20newcap%20/%204%0A%09%09%09%7D%0A%09%09%09//%20Set%20newcap%20to%20the%20requested%20cap%20when%0A%09%09%09//%20the%20newcap%20calculation%20overflowed.%0A%09%09%09if%20newcap%20%3C=%200%20%7B%0A%09%09%09%09newcap%20=%20cap%0A%09%09%09%7D%0A%09%09%7D%0A%09%7D%0A%0A...%0A%09memmove(p,%20old.array,%20lenmem)%0A%0A%09return%20slice%7Bp,%20old.len,%20newcap%7D%0A%7D

From the above code we can see some good info about what happen when slice is growing. We can notice that by default it's double the capacity. When the requested capacity is higher than that then it will just use the highest capacity. For big slice, it will just incrementally increase the capacity by fourth's value of current capacity and if it's overflowed, then it will just use the requested capacity. Also, we can notice that golang will just copy the data from the old array to the new array and make a new slice with current length and with the new capacity.

There is a best practice that we can do to avoid this reallocation. This is named preallocation. We can preallocate by length or by capacity, or even both if you want. Here is the difference that we can make when we perform preallocation.

Name Ops Time/Op Memory Allocation
BenchmarkAppendNormal-4 145998 9040 ns/op 16376 B/op 11 allocs/op
BenchmarkAppendPreallocated-4 667046 1813 ns/op 0 B/op 0 allocs/op

And here is the source code for the benchmark part.

package%20main%0A%0Aimport%20%22testing%22%0A%0Afunc%20BenchmarkAppendNormal(b%20*testing.B)%20%7B%0A%09for%20n%20:=%200;%20n%20%3C%20b.N;%20n++%20%7B%0A%09%09sliceA%20:=%20%5B%5Dint%7B%7D%0A%09%09for%20i%20:=%200;%20i%20%3C%201000;%20i++%20%7B%0A%09%09%09sliceA%20=%20append(sliceA,%20i)%0A%09%09%7D%0A%09%7D%0A%7D%0A%0Afunc%20BenchmarkAppendPreallocated(b%20*testing.B)%20%7B%0A%09for%20n%20:=%200;%20n%20%3C%20b.N;%20n++%20%7B%0A%09%09sliceA%20:=%20make(%5B%5Dint,%200,%201000)%0A%09%09for%20i%20:=%200;%20i%20%3C%201000;%20i++%20%7B%0A%09%09%09sliceA%20=%20append(sliceA,%20i)%0A%09%09%7D%0A%09%7D%0A%7D%0A

Notice that preallocating the slice's capacity significantly speed up append operation. This is because the slice will now use the same array instead of creating many array and then throw it away when no longer used. Here is a simple trick that i usually use to make easier decision whether which part should i preallocate to help with the performance :

  1. If you know the exact number of data that you will use, then use preallocation by length. Example make([]int, 1000)
  2. If you know only the maximum amount of data that you will use, then use preallocation by capacity. Example make([]int,0,1000)
  3. If you don't know any of them, then consider using linked list based data structure.

Anyway that's all from me for now. Hope you find this article useful for your next adventure.

Reference

  • Photo : https://www.pexels.com/photo/cargo-container-lot-906494/
  • Slice Spec : https://go.dev/ref/spec#Slice_types
  • Index Spec : https://go.dev/ref/spec#Index_expressions
  • Slice Definition : https://github.com/golang/go/blob/master/src/runtime/slice.go