Go Rune

A rune is an alias for int32, Go's 32-bit signed integer type. A rune value represents a Unicode code point, that is, an integer that uniquely identifies a character.

For example, the rune literal 'a' has the numeric value 97.
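
As a minimal sketch of this equivalence, the program below prints a rune literal both as a number and as a character. The helper name codePoint is hypothetical and exists only for illustration.

```go
package main

import "fmt"

// codePoint is a hypothetical helper that returns the numeric
// Unicode code point of a rune as a plain int.
func codePoint(r rune) int {
	return int(r)
}

func main() {
	r := 'a' // r has type rune, which is an alias for int32

	fmt.Println(codePoint(r)) // 97
	fmt.Printf("%c\n", r)     // a
	fmt.Printf("%T\n", r)     // int32
	fmt.Println(r == 97)      // true
}
```

Because rune is only an alias, the %T verb reports the underlying type int32 rather than "rune".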


Runes

In older programming languages such as C, there is no distinction between a character and a byte: char is a one-byte type. As a reminder, a byte is a sequence of 8 bits, so its value ranges from 0 to 255.

In Go, however, a rune holds a single Unicode code point, and one character may occupy more than one byte when a string is encoded as UTF-8. For example, the character é (U+00E9) is encoded as two bytes: 0xc3 and 0xa9. This multi-byte encoding is how characters outside the original ASCII table (the characters used by American computers in the 1960s) are represented.

Thus a slice of bytes and a slice of runes are not the same thing, and the same string can have a different length in each form.

var s string = "é"
fmt.Println(len(s))         // 2, the number of bytes
fmt.Println(len([]rune(s))) // 1, the number of runes
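
To make the two-byte encoding visible, here is a small sketch that prints the raw bytes and the code point of the same string. The helper byteAndRuneCounts is a hypothetical name introduced for this example, and the string is written with the \u00e9 escape so that it is certain to be the single code point é.

```go
package main

import "fmt"

// byteAndRuneCounts is a hypothetical helper that returns the number
// of bytes and the number of runes in s.
func byteAndRuneCounts(s string) (int, int) {
	return len([]byte(s)), len([]rune(s))
}

func main() {
	s := "\u00e9" // the character é

	fmt.Printf("% x\n", []byte(s)) // c3 a9
	fmt.Printf("%U\n", []rune(s))  // [U+00E9]

	b, r := byteAndRuneCounts(s)
	fmt.Println(b, r) // 2 1
}
```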

Rune literals

A rune literal is a single character enclosed in single quotes, for example 'Ñ' or '€'. Inside the quotes you can also use the \u escape sequence followed by four hexadecimal digits, for example '\u00A1' or '\u03B2', in case you don't have the character on your keyboard.

A rune literal can be assigned to a variable of type rune:

var r rune = 'Ñ'
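
The following sketch shows that a character typed directly and the same character written as a \u escape are the same rune. The function escapedRunes is a hypothetical helper used only to group the examples.

```go
package main

import "fmt"

// escapedRunes is a hypothetical helper that returns a few runes
// written with \u escapes: an inverted exclamation mark, a Greek
// beta, and a euro sign.
func escapedRunes() []rune {
	return []rune{'\u00A1', '\u03B2', '\u20AC'}
}

func main() {
	var typed rune = 'Ñ'
	var escaped rune = '\u00D1' // the same character as an escape

	fmt.Println(typed == escaped) // true
	fmt.Println(string(escaped))  // Ñ

	for _, r := range escapedRunes() {
		fmt.Printf("%c %U\n", r, r)
	}
}
```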

The len() function

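Note that len() applied to a string returns the number of bytes, not the number of characters. The two values match only when every character in the string fits in a single byte, as in plain ASCII text. Converting the string to []rune first, as in the example in the introduction, gives the number of runes, but it allocates a new slice just to count its elements.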

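Finding the length of a string in runes

The idiomatic way to count the runes in a string is utf8.RuneCountInString from the standard unicode/utf8 package, which counts them without building an intermediate slice: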

package main

import (
	"fmt"
	"unicode/utf8"
)

func main() {
	fmt.Println(utf8.RuneCountInString("Hello"))     // 5
	fmt.Println(utf8.RuneCountInString("Hello, 世界")) // 9
}

Output

5
9
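
For comparison, this sketch prints the byte length and the rune count side by side for the same two strings. The helper lengths is a hypothetical name, not part of any library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// lengths is a hypothetical helper that returns the byte length and
// the rune count of s.
func lengths(s string) (int, int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	for _, s := range []string{"Hello", "Hello, 世界"} {
		b, r := lengths(s)
		fmt.Printf("%q: %d bytes, %d runes\n", s, b, r)
	}
}
```

The second string has 13 bytes but 9 runes, because each of the two Chinese characters takes three bytes in UTF-8.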

© 2023 W3Basic. All rights reserved.
