If this blog post interests you, there’s a good chance you work in tech. Given the nature of our work and the effects of the internet, many of us can have an outsized impact on people’s lives & livelihoods. This is an important moment and also an opportunity to examine, challenge, and change problematic parts of our society. And for us in tech, it is critical that we take this moment to better the work we do and how we do it. Please familiarise yourself with ways you can help & actions you can take.
Go: there is no free. There is only garbage collection.
– Rob Pike
As with all languages, memory management inevitably becomes a concern when writing performant Go programs. And because the language specification covers garbage collection and memory allocation, there is a high cost for bypassing the standard memory management components. Particular care is needed to avoid violating the Go memory model. As a result, well designed Go programs tend to focus on minimizing garbage creation instead of optimizing garbage collection.
There are a few common techniques for minimizing garbage and they each address different types of garbage. So, typically, the best approach is to first identify the type of garbage before reaching for one of these solutions.
Short lived objects are often a big source of garbage in transactional
programs. The allocation cost of some temporary objects can be amortized by
pooling and reusing them. A
the standard way to pool temporary objects, however it’s not a magic
and it’s important to understand the
Be sure to always zero out memory returned to a pool!
Intermediate slices are slices allocated and returned by a function call, and
typically the data is read/copied only once and then left unused. A useful
technique for eliminating these intermediate slices is to add a destination
argument (typically the first argument) to the function, so
By allowing the caller of the function to specify the destination, the callee function can write directly to the destination slice and eliminate an extra copy.
Slice reallocations are another common source of garbage. During an
when the combined new and old data exceeds the capacity of a slice, a new slice
(and backing array) is allocated and the data is copied from old to new.
These reallocations can often be avoided by initially allocating the destination slice with enough capacity to hold the finished data. Typically, it’s better to over allocate the initial capacity if doing so can prevent possible reallocations.
These sources of garbage are especially common when converting between internal & external representations of data, such as object serialization/deserialization and working with network protocols.
When encoding messages with header data before the payload data, particularly variable
length messages, the payload is often constructed first followed by the header
data. This often requires extra allocating and copying to combine the header
and payload data. Smart use of
append and slice capacity can minimize this
overhead, but extra bookkeeping is still required because in order to
append to “prepend” slice data.
github.com/benburkert/padded package aims to eliminate the cognitive
overhead required to efficiently prepend data to the front of a slice.
padded package provides a padded byte slice type. The padding is a
scratch space allocated in front of the slice and behaves roughly the same as
the capacity. The slice type is conceptually a hybrid of
sk_buff in the Linux kernel and the
sds string encoding: internally they are
represented the same as regular byte slices, and are safe to pass to functions
that read but don’t modify the slice.
Before explaining how the padded slice works, it’s important to first
understand how regular slices and
Let’s start by looking at how a regular byte slice works:
s := make(byte, 4, 6) ┌────cap────┐ ┌───────────┐ │0|0|0|0|0|0│ └───────────┘ └──len──┘
Allocating the byte slice
s with a length of
4 and capacity of
6 gives us
a contiguous block of memory (i.e. a backing array) with each cell initialized
to the zero value (
byte values). As Russ explains in his blog
s is really a struct that holds a
pointer to the beginning of the backing array, along with a length & capacity value.
s = append(s, 1) ┌────cap────┐ ┌───────────┐ │0|0|0|0|1|0│ └───────────┘ └───len───┘
s manipulates both the data in the backing array and the length
padded.Make function adds an extra argument for the padding size of the new slice.
s := padded.Make(4, 6, 2) ┌─pad─┬────cap────┐ ┌─────────────────┐ │p|p|p|0|0|0|0|0|0│ └─────────────────┘ └──len──┘
s is a padded slice with a length of
4, capacity of
6, and padding of
2. The value of
p depends on the position of the cell in the padding: each
p value is the number of cells of remaining padding. The first
p always has
a value of
0 to indicate the left end of the backing array.
Because the padding is encoded in the cell data (unlike the length & capacity
of a slice, which are struct fields), a
padded.Slice is just a
cap, and slicing works the same.
The size of each
p value’s cell is the size of an integer for the machine
8 bytes on 64bit machines and
4 bytes on 32bit machines. In
this example, each
p value is one byte in length, but in reality they will
span multiple bytes in the backing array.
pvalues span multiple bytes, prepending a number of bytes that is not a multiple of the integer size causes the prepend to fall back to a regular
s = padded.Prepend(s, 1) ┌pad┬────cap──────┐ s = padded.Append(s, 2, 3) ┌─────────────────┐ │p|p|1|0|0|0|0|3|4│ └─────────────────┘ └─────len─────┘
Prepend call places the
1 into the first spot in
s, then updates
s points to the location of the
1, and increases the capacity &
padded package can give a big speed boost to
Prepend operations when
used well with appropriate slack space. (Check out the package’s benchmarks for
a comparison.) And the API can improve the bookkeeping & readability of code
append to prepend. However, there is one major downside to the
The Go language specification covers slices but it does not include a
description of the backing arrays. So the trick used in the
padded package to
grow into unreferenced parts of a backing array is not covered by the language
spec. At the moment, this is not a problem because the garbage collector will
not sweep unreferenced parts of a backing array. However, in the future, the
implementation of the garbage collector and/or allocator may change and break
this assumption. Be aware that by using the
padded package you may be
sacrificing compatibility with future versions (and alternate implementations)
of the Go runtime. It’s for this reason that I have only ever used this
package once in production code, even though prepending to a byte slice is
something I do regularly in the code I write.