Strings in Golang

Golang strings are immutable.

In general, immutable data is simpler to reason about, but it also means your program must allocate more memory to “change” that data. Sometimes, your program can’t afford that luxury. For example, there might not be any more memory to allocate. Another reason: you don’t want to create more work for the garbage collector.

In C, a string is a null-terminated sequence of chars — char*. Each char is a single byte, and the string keeps going until there’s a '\0' character. If you pointed at an arbitrary memory location and called it a C string, you’d see every byte in order until you hit a zero.

In Go, string is its own data type. At its core, it’s still a sequence of bytes, but:

  • It’s a fixed length. It doesn’t just continue until a zero appears.
  • It comes with extra information: its length.
  • “Characters” or runes may span multiple bytes.
  • It’s immutable.

So string in Go carries some additional structure compared to char* in C. How does it do this? It’s actually a struct:
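
A minimal sketch of that struct, modeled on the header types in the reflect package. The type below is declared locally for illustration, and its Data field is an unsafe.Pointer (an assumption chosen so that the garbage collector still sees the pointer); sameSizeAsString is a hypothetical helper that only checks that the struct and a real string occupy the same number of bytes.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringHeader mirrors the two-word shape of a Go string value.
// It is a local, illustrative type, not part of any library.
type StringHeader struct {
	Data unsafe.Pointer // address of the first byte
	Len  int            // length in bytes, not in runes
}

// sameSizeAsString reports whether the header occupies exactly as
// many bytes as a string value does.
func sameSizeAsString() bool {
	var s string
	return unsafe.Sizeof(StringHeader{}) == unsafe.Sizeof(s)
}

func main() {
	fmt.Println(sameSizeAsString())
}
```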

Data here is analogous to the C string, and Len is the length. Go lays out struct fields in declaration order, so if you were to look at a string under the microscope, you’d see a pointer to the string's contents first and then Len. (You can find documentation of these header structs in the reflect package, where Data is declared as uintptr; reflect.StringHeader and reflect.SliceHeader are deprecated as of Go 1.20.)
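
The layout can be inspected rather than taken on faith. A small sketch, assuming the locally declared StringHeader type used in this article, that prints the byte offset of each field with unsafe.Offsetof; offsets is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringHeader is a local, illustrative copy of the string layout.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// offsets returns the byte offset of each field within the struct.
func offsets() (data, length uintptr) {
	var h StringHeader
	return unsafe.Offsetof(h.Data), unsafe.Offsetof(h.Len)
}

func main() {
	d, l := offsets()
	fmt.Printf("Data at offset %d, Len at offset %d\n", d, l)
}
```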

Before we start inspecting strings by looking at their StringHeader fields, how do we cast a string to a StringHeader in the first place? When you really need to convert from one Go type to another, use the unsafe package:
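
A sketch of that conversion, assuming the illustrative StringHeader type from this article; headerOf is a hypothetical helper name. The returned pointer aliases the original string variable, so it is only valid while that variable is.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringHeader is a local, illustrative copy of the string layout.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// headerOf reinterprets a *string as a *StringHeader.
// No bytes are copied; both pointers refer to the same two words.
func headerOf(s *string) *StringHeader {
	return (*StringHeader)(unsafe.Pointer(s))
}

func main() {
	s := "hello"
	h := headerOf(&s)
	fmt.Println(h.Len, h.Data != nil)
}
```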

unsafe.Pointer is an untyped pointer. It can point to any kind of value. It’s a way to tell the compiler, “Step aside. I know what I’m doing.” In this case, we convert a *string to an unsafe.Pointer, and then that unsafe.Pointer to a *StringHeader.

Now we have access to the underlying representation of the string. Ever wondered how len("hello") works? We can implement it ourselves:
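
One way this might look, assuming the illustrative StringHeader type from this article; strLen is a hypothetical name. It reads the Len field directly, so it counts bytes, just as the built-in len does.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringHeader is a local, illustrative copy of the string layout.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// strLen returns the byte length stored in the string's header.
func strLen(s string) int {
	return (*StringHeader)(unsafe.Pointer(&s)).Len
}

func main() {
	fmt.Println(strLen("hello"), len("hello"))
	fmt.Println(strLen("héllo"), len("héllo")) // é takes two bytes in UTF-8
}
```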

Getting the length of a string is nice, but what about setting it? Here’s what happens if we artificially extend the length of a string:
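
A sketch of the idea that deliberately stays inside memory the program owns: it first slices a longer string, then stretches the short string's Len back over bytes that are known to exist. withLen is a hypothetical helper. Setting Len past the end of the real allocation would read whatever happens to sit there, which is undefined behavior.

```go
package main

import (
	"fmt"
	"unsafe"
)

// StringHeader is a local, illustrative copy of the string layout.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

// withLen returns a string that shares s's bytes but reports n as
// its length. Nothing checks that n bytes really exist at that address.
func withLen(s string, n int) string {
	(*StringHeader)(unsafe.Pointer(&s)).Len = n
	return s
}

func main() {
	full := string([]byte("hello, world")) // built at run time
	short := full[:5]                      // same bytes, Len = 5
	fmt.Println(short)
	fmt.Println(withLen(short, len(full))) // Len stretched back to 12
}
```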

By changing the Len field of the string header, we can expand the string to include other parts of memory. It’s interesting to observe this behavior, but it’s not something you’d actually want to use.

Data: unsafe.Pointer

You may have noticed that StringHeader has an unsafe.Pointer field which points to the string’s sequence of bytes. []byte also has a sequence of bytes. In fact, we can build a []byte from this pointer. Here’s what a slice actually looks like:
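
A sketch of the slice layout, modeled on reflect.SliceHeader but declared locally with an unsafe.Pointer Data field (an assumption, matching the StringHeader used in this article). sameSizeAsSlice is a hypothetical helper that checks the struct is as large as a real []byte value.

```go
package main

import (
	"fmt"
	"unsafe"
)

// SliceHeader mirrors the three-word shape of a Go slice value.
// It is a local, illustrative type, not part of any library.
type SliceHeader struct {
	Data unsafe.Pointer // address of the first element
	Len  int            // number of elements in use
	Cap  int            // number of elements available from Data
}

// sameSizeAsSlice reports whether the header occupies exactly as
// many bytes as a []byte value does.
func sameSizeAsSlice() bool {
	var b []byte
	return unsafe.Sizeof(SliceHeader{}) == unsafe.Sizeof(b)
}

func main() {
	fmt.Println(sameSizeAsSlice())
}
```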

It’s a lot like StringHeader, except it also has a Cap (capacity) field. What happens if we build a SliceHeader from the fields of a StringHeader?
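
A sketch of that construction, assuming the illustrative header types from this article; stringToBytes is a hypothetical name. Cap is set equal to Len so that an append cannot write past the string's bytes. The result must be treated as read-only.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Local, illustrative copies of the string and slice layouts.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

type SliceHeader struct {
	Data unsafe.Pointer
	Len  int
	Cap  int
}

// stringToBytes builds a []byte that shares the string's bytes.
// No copy is made, so the caller must not write to the result.
func stringToBytes(s string) []byte {
	sh := (*StringHeader)(unsafe.Pointer(&s))
	bh := SliceHeader{Data: sh.Data, Len: sh.Len, Cap: sh.Len}
	return *(*[]byte)(unsafe.Pointer(&bh))
}

func main() {
	b := stringToBytes("hello")
	fmt.Println(b, len(b), cap(b))
}
```

Since Go 1.20 the unsafe package also offers unsafe.StringData and unsafe.Slice, which are the supported way to express the same conversion.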

We’ve converted a string into a []byte. It’s just as easy to go the other direction:
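
The reverse direction might be sketched like this, again assuming the illustrative header types from this article; bytesToString is a hypothetical name. The slice's Cap is simply dropped, since a string has no capacity.

```go
package main

import (
	"fmt"
	"unsafe"
)

// Local, illustrative copies of the string and slice layouts.
type StringHeader struct {
	Data unsafe.Pointer
	Len  int
}

type SliceHeader struct {
	Data unsafe.Pointer
	Len  int
	Cap  int
}

// bytesToString builds a string that shares the slice's bytes.
// No copy is made, so later writes to b show through the string.
func bytesToString(b []byte) string {
	bh := (*SliceHeader)(unsafe.Pointer(&b))
	sh := StringHeader{Data: bh.Data, Len: bh.Len}
	return *(*string)(unsafe.Pointer(&sh))
}

func main() {
	fmt.Println(bytesToString([]byte("hello")))
}
```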

Both string and []byte headers are using the same Data pointer, so they share memory. If you ever need to convert between string and []byte but there isn’t enough memory to perform a copy, this might be useful.

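A word of caution, however: string is meant to be immutable, but []byte is not. If you cast a string to []byte and then write to the bytes, the program can crash: string literals typically live in read-only memory, so the write triggers a segmentation fault. A string built at run time may not fault, but the write then silently changes a value the rest of the program assumes is constant.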

Casting in the other direction doesn’t cause a segmentation fault, but then your supposedly immutable string can change:
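
A short sketch of that effect. For brevity, bytesToString here reinterprets the slice header directly as a string, which relies on the first two words of a slice (Data, Len) matching the string layout; mutateThroughSlice is a hypothetical name.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets the slice's first two words as a string.
// The string shares the slice's backing array.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

// mutateThroughSlice shows a string changing after it was created.
func mutateThroughSlice() string {
	b := []byte("hello")
	s := bytesToString(b)
	b[0] = 'j' // s has no copy of its own, so it changes too
	return s
}

func main() {
	fmt.Println(mutateThroughSlice())
}
```

Any code holding such a string, for example as a map key, may misbehave once the bytes change, so the slice should not be modified for as long as the string is in use.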


Kalpit Sharma
