= Go Strings =

Strings are a sequence data type. The standard library also provides the `strings` package for working with them.

== Type ==

A string is an immutable sequence of bytes (`uint8` values). Like other sequence types, strings support:

 * `len()` to count bytes
 * addition operators (`+` and `+=`) for concatenation
 * indexing and slicing by byte offset (`str[2]`, `str[2:]`, `str[:3]`, and `str[2:3]`)

The bytes are not guaranteed to be valid UTF-8. They are decoded into Unicode code points in these cases:

 * in `for _, r := range str`, `r` is a Unicode rune
 * in `rs := []rune(str)`, `rs` is a slice of Unicode runes

In these cases, each invalid byte is decoded as `U+FFFD` (the replacement character).
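
The difference between byte-level and rune-level operations can be sketched as follows. The helper names `byteAndRuneCounts` and `decodedRunes` are hypothetical and exist only for illustration.

```go
package main

import "fmt"

// byteAndRuneCounts returns the number of bytes in s and the number of
// runes obtained by converting s to []rune.
func byteAndRuneCounts(s string) (int, int) {
	return len(s), len([]rune(s))
}

// decodedRunes collects the runes yielded by ranging over s.
func decodedRunes(s string) []rune {
	var out []rune
	for _, r := range s {
		out = append(out, r)
	}
	return out
}

func main() {
	s := "h\u00e9llo" // U+00E9 occupies two bytes in UTF-8
	b, r := byteAndRuneCounts(s)
	fmt.Println(b, r)               // 6 5
	fmt.Println(s[1:3] == "\u00e9") // true: slicing uses byte offsets

	bad := "a\xffb"                       // 0xff is never valid in UTF-8
	fmt.Printf("%U\n", decodedRunes(bad)) // [U+0061 U+FFFD U+0062]
	fmt.Println(len(bad))                 // 3: the stored bytes are unchanged
}
```

Note that decoding does not modify the string itself: the invalid byte is still present in `bad`, and only the decoded view substitutes the replacement character.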

----



== Package ==



=== Split ===

There are several similar functions for splitting.

||'''Name''' ||'''Meaning''' ||
||`Split(a, b string) []string` ||Split `a` at every instance of `b` ||
||`SplitN(a, b string, n int) []string` ||Split `a` at instances of `b`, returning at most `n` substrings ||
||`SplitAfter(a, b string) []string` ||Split `a` after every instance of `b` ||
||`SplitAfterN(a, b string, n int) []string` ||Split `a` after instances of `b`, returning at most `n` substrings ||
||`SplitSeq(a, b string) iter.Seq[string]` ||Split `a` at every instance of `b` ||
||`SplitAfterSeq(a, b string) iter.Seq[string]` ||Split `a` after every instance of `b` ||

If `b` is not in `a`, the return value is a slice of length 1 containing `a`. If `b` is empty, then `a` is split between every UTF-8 sequence.

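Passing a negative `n` (such as -1) to `SplitN` is equivalent to `Split`. With `n` of 0 the result is `nil`; with a positive `n`, the last substring holds the unsplit remainder.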

To demonstrate the `After` variants:

{{{
strings.Split("a,b,c", ",") // ["a" "b" "c"]
strings.SplitAfter("a,b,c", ",") // ["a," "b," "c"]
}}}
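
The effect of the `n` argument and of the edge cases described earlier can be checked with a short sketch. The wrapper `splitN` is a hypothetical helper that quotes the result so that element boundaries stay visible.

```go
package main

import (
	"fmt"
	"strings"
)

// splitN calls strings.SplitN and formats the result with %q so that
// each element is shown in quotes.
func splitN(s, sep string, n int) string {
	return fmt.Sprintf("%q", strings.SplitN(s, sep, n))
}

func main() {
	fmt.Println(splitN("a,b,c", ",", 2))    // ["a" "b,c"]
	fmt.Println(splitN("a,b,c", ",", -1))   // ["a" "b" "c"]
	fmt.Println(splitN("a,b,c", ",", 0))    // []
	fmt.Println(splitN("abc", ";", -1))     // ["abc"]
	fmt.Println(splitN("h\u00e9y", "", -1)) // ["h" "é" "y"]
}
```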

The `Seq` variants (added in Go 1.24) yield the same substrings but return a single-use iterator. This can be more efficient because no result slice is allocated.



=== Replace ===

There are several similar functions for replacement of substrings.

||'''Name''' ||'''Meaning''' ||
||`Replace(a, b, c string, n int) string`||Returns a copy of `a` with the first `n` instances of `b` replaced with `c`; a negative `n` replaces all of them||
||`ReplaceAll(a, b, c string) string` ||Returns a copy of `a` with all instances of `b` replaced with `c` ||
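
The two functions in the table can be compared with a short sketch. The wrapper `limited` is a hypothetical helper with a fixed input string, used only to vary `n`.

```go
package main

import (
	"fmt"
	"strings"
)

// limited replaces at most n hyphens in a fixed sample string with plus
// signs; a negative n means no limit.
func limited(n int) string {
	return strings.Replace("a-b-c-d", "-", "+", n)
}

func main() {
	fmt.Println(limited(2))                              // a+b+c-d
	fmt.Println(limited(-1))                             // a+b+c+d
	fmt.Println(strings.ReplaceAll("a-b-c-d", "-", "+")) // a+b+c+d
}
```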

A `Replacer` applies several replacements in a single pass. `NewReplacer` takes pairs of old and new strings.

{{{
message := "What??? IMPOSSIBLE!"
replacer := strings.NewReplacer("!", "", "?", "")
replacer.Replace(strings.ToLower(message)) // "what impossible"
}}}



=== Builder ===

A `Builder` composes a large string efficiently by appending to an internal buffer, avoiding the copy that each `+=` concatenation makes.

{{{
package main

import (
    "bufio"
    "fmt"
    "os"
    "regexp"
    "strings"
)

var Pattern = regexp.MustCompile(`PING`)

func main() {
    count := 0

    // Initialize the Builder
    var content strings.Builder

    // Initialize the Scanner using STDIN
    scanner := bufio.NewScanner(os.Stdin)

    // Loop over scanned lines
    for scanner.Scan() {
        line := scanner.Text()

        matches := Pattern.FindAllStringIndex(line, -1)

        // Build lines
        for i := len(matches)-1; i >= 0; i-- {
            line = line[:matches[i][0]] + fmt.Sprintf("[%d]", count+i) + line[matches[i][1]:]
        }
        content.WriteString(line)
        content.WriteString("\n")

        count += len(matches)
    }

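    // Check for scanner errors
    if err := scanner.Err(); err != nil {
        fmt.Fprintln(os.Stderr, err)
    }

    // Print to STDOUT; Print is used rather than Printf so that any
    // '%' in the input is not interpreted as a formatting verb
    fmt.Print(content.String())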
}
}}}

{{{
$ cat test
Hello, this is PING your PING friend.
I am PING testing your work. PING.
PING PING PING test PING.

$ cat test | ./scan-and-build
Hello, this is [0] your [1] friend.
I am [2] testing your work. [3].
[4] [5] [6] test [7].

}}}
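
A smaller sketch that isolates the `Builder` itself, without the scanner or the regular expression, might look like this. The function `numberedLines` is hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// numberedLines builds n numbered lines in a single Builder instead of
// concatenating with +=, which would copy the growing string each time.
func numberedLines(n int) string {
	var b strings.Builder
	for i := 1; i <= n; i++ {
		// *strings.Builder implements io.Writer, so Fprintf can write to it.
		fmt.Fprintf(&b, "line %d\n", i)
	}
	return b.String()
}

func main() {
	fmt.Print(numberedLines(3))
}
```

Because a `Builder` also offers `WriteString`, `WriteByte`, and `WriteRune`, formatting through `fmt.Fprintf` is only one of several ways to append to it.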

----



== See also ==

[[https://pkg.go.dev/strings|strings package]]



Go/Strings (last edited 2025-10-10 15:20:38 by DominicRicottone)