Go Strings


Unicode

A string is an immutable sequence of uint8 bytes. Strings are meant to represent text encoded in UTF-8, but the bytes of a string are not guaranteed to form valid UTF-8.
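As a minimal sketch of the point above, the following program checks the byte length and UTF-8 validity of two strings. The helper name describe and the sample strings are assumptions made for illustration; utf8.ValidString comes from the standard unicode/utf8 package.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// describe is a hypothetical helper: it reports the byte length of s
// and whether those bytes form valid UTF-8.
func describe(s string) (int, bool) {
	return len(s), utf8.ValidString(s)
}

func main() {
	valid := "h\u00e9llo" // U+00E9 occupies two bytes in UTF-8
	invalid := "a\xffb"   // the byte 0xff never appears in valid UTF-8

	for _, s := range []string{valid, invalid} {
		n, ok := describe(s)
		fmt.Printf("%q: %d bytes, valid UTF-8: %t\n", s, n, ok)
	}

	// Indexing a string yields a single byte (uint8), not a rune.
	fmt.Printf("%T %#x\n", invalid[1], invalid[1])
}
```

Note that len counts bytes, not characters, so the first string reports 6 even though it reads as five characters.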

There are only a few instances where strings are implicitly decoded as UTF-8:

  • in for i, r := range s, each r is a Unicode rune and i is the byte offset at which it starts

  • in rs := []rune(s), rs is a slice of Unicode runes

In these cases, each byte that is not part of a valid UTF-8 sequence is converted to U+FFFD (the replacement character).
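The two cases above can be sketched as follows. This is an illustration only, assuming a sample string with one invalid byte; the helper name decodeRange is hypothetical.

```go
package main

import "fmt"

// decodeRange is a hypothetical helper that collects the runes
// produced by ranging over s.
func decodeRange(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	s := "a\xffb" // the middle byte is not valid UTF-8

	// range decodes one rune per iteration; i is a byte offset.
	for i, r := range s {
		fmt.Printf("byte offset %d: %U\n", i, r)
	}

	// The conversion decodes the whole string at once.
	rs := []rune(s)
	fmt.Println(len(rs), rs[1] == '\uFFFD')
	fmt.Println(len(decodeRange(s)) == len(rs))

	// The original string keeps its bytes; only the decoded runes
	// carry the replacement character.
	fmt.Printf("%x\n", s)
	// Converting the runes back encodes U+FFFD as three bytes.
	fmt.Printf("%x\n", string(rs))
}
```

The substitution happens only in the decoded runes. The string s itself is unchanged, so a round trip through []rune is not guaranteed to reproduce the original bytes.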


CategoryRicottone

Go/Strings (last edited 2023-01-08 05:56:16 by DominicRicottone)