Revision 1 as of 2020-10-17 01:06:08
Go Strings
Unicode
A string is an immutable sequence of bytes (uint8 values). Strings are meant to represent text encoded in UTF-8, but a string can hold arbitrary bytes, so its contents are not guaranteed to be valid UTF-8.
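As a minimal sketch of the point above, the following program builds one valid and one invalid string and inspects them as raw bytes. The helper name describe is hypothetical and exists only for this illustration; len and utf8.ValidString are from the language and standard library.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper for this example. It returns the
// length of s in bytes and whether s is entirely valid UTF-8.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	valid := "h\u00e9llo" // U+00E9 takes two bytes in UTF-8
	invalid := "a\xffb"   // the byte 0xFF never appears in valid UTF-8

	n, ok := describe(valid)
	fmt.Println(n, ok) // 6 true

	n, ok = describe(invalid)
	fmt.Println(n, ok) // 3 false

	// Indexing yields the raw byte, with no decoding involved.
	fmt.Printf("%x\n", invalid[1]) // ff
}
```

Note that len counts bytes, not characters, and that indexing never replaces or rejects an invalid byte.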
There are only a few instances where a string is implicitly decoded as UTF-8:
in for i, r := range s, the r is a Unicode rune
in rs := []rune(s), the whole string is decoded and rs is a slice of Unicode runes
In these cases, each byte that is not part of a valid UTF-8 sequence is decoded as U+FFFD (the replacement character). The original string itself is left unchanged.
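The two cases listed above can be sketched as follows, using a string that contains one invalid byte. The helper name rangeRunes is hypothetical and only serves this example.

```go
package main

import "fmt"

// rangeRunes is a hypothetical helper for this example. It collects
// the runes produced by a range loop over s.
func rangeRunes(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	s := "a\xffb" // 0xFF is not valid UTF-8

	// Both forms decode the invalid byte as U+FFFD.
	fmt.Printf("%U\n", rangeRunes(s)) // [U+0061 U+FFFD U+0062]
	fmt.Printf("%U\n", []rune(s))     // [U+0061 U+FFFD U+0062]

	// The string still holds the original byte.
	fmt.Printf("%x\n", s[1]) // ff
}
```

Because decoding happens only in the loop variable and in the converted slice, converting back with string(rs) would produce the UTF-8 encoding of U+FFFD rather than the original 0xFF byte.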