Does Go have a built-in method, or is there a recommended way, to check whether a string contains only ASCII characters? What is the right way to do it?
From my research, one solution is to check whether any character is greater than 127.
func isASCII(s string) bool {
	for _, c := range s {
		if c > unicode.MaxASCII {
			return false
		}
	}
	return true
}
For comparison, in Python 3.7 str, bytes, and bytearray gained support for the new isascii() method, which can be used to test whether a string or bytes object contains only ASCII characters.
Check if a string contains only ASCII: str.isascii() returns True if all characters in the string are ASCII characters (U+0000 - U+007F). Symbols such as + and - also return True. Non-ASCII characters such as hiragana return False.
isascii() checks whether the string is ASCII. Control characters count as ASCII too: "\x03".isascii() is also True.
In Go, we care about performance, so we would benchmark your code:
func isASCII(s string) bool {
	for _, c := range s {
		if c > unicode.MaxASCII {
			return false
		}
	}
	return true
}
BenchmarkRange-4 20000000 82.0 ns/op
A faster (better, more idiomatic) version, which avoids unnecessary rune conversions:
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}
BenchmarkIndex-4 30000000 55.4 ns/op
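The byte-wise loop is safe because UTF-8 encodes every non-ASCII code point using only bytes with the high bit set (0x80 and above), so no byte of a multi-byte sequence can be mistaken for an ASCII character. The following is a minimal sketch that exercises the indexed version on a few inputs; the sample strings are assumptions chosen for illustration, not part of the benchmark.

```go
package main

import (
	"fmt"
	"unicode"
)

// isASCII reports whether s consists solely of bytes in the range 0x00-0x7F.
// It indexes bytes directly, so no runes are decoded.
func isASCII(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func main() {
	// Illustrative inputs: empty string, plain text, a control character,
	// and two strings containing multi-byte UTF-8 sequences.
	samples := []string{"", "hello, world", "\x03", "héllo", "日本語"}
	for _, s := range samples {
		fmt.Printf("%q -> %v\n", s, isASCII(s))
	}
}
```

Note that the empty string and control characters such as "\x03" are reported as ASCII, while any string containing a multi-byte sequence is not.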
ascii_test.go:
package main

import (
	"testing"
	"unicode"
)

// isASCIIRange ranges over the string, decoding each rune.
func isASCIIRange(s string) bool {
	for _, c := range s {
		if c > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func BenchmarkRange(b *testing.B) {
	str := ascii()
	b.ResetTimer()
	for N := 0; N < b.N; N++ {
		is := isASCIIRange(str)
		if !is {
			b.Fatal("notASCII")
		}
	}
}

// isASCIIIndex indexes the string byte by byte, with no rune decoding.
func isASCIIIndex(s string) bool {
	for i := 0; i < len(s); i++ {
		if s[i] > unicode.MaxASCII {
			return false
		}
	}
	return true
}

func BenchmarkIndex(b *testing.B) {
	str := ascii()
	b.ResetTimer()
	for N := 0; N < b.N; N++ {
		is := isASCIIIndex(str)
		if !is {
			b.Fatal("notASCII")
		}
	}
}

// ascii returns a string containing every ASCII character (0x00-0x7F) once.
func ascii() string {
	byt := make([]byte, unicode.MaxASCII+1)
	for i := range byt {
		byt[i] = byte(i)
	}
	return string(byt)
}
Output:
$ go test ascii_test.go -bench=.
BenchmarkRange-4 20000000 82.0 ns/op
BenchmarkIndex-4 30000000 55.4 ns/op
$
Another option:
package main

import "golang.org/x/exp/utf8string"

func main() {
	{
		b := utf8string.NewString("south north").IsASCII()
		println(b) // true
	}
	{
		b := utf8string.NewString("🧡💛💚💙💜").IsASCII()
		println(b) // false
	}
}
https://pkg.go.dev/golang.org/x/exp/utf8string#String.IsASCII
It looks like your way is best.
ASCII is simply defined as:
ASCII encodes 128 specified characters into seven-bit integers
As such, characters have values 0 to 2^7-1 (that is, 0-127, or 0x00-0x7F).
Go's standard library provides no built-in function to check that every rune in a string (or byte in a slice) has a numerical value in a specific range, so your code seems to be the best way to do it.
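Because ASCII is exactly the seven-bit range, the same test can be phrased as "no byte is 0x80 or above". Below is a minimal sketch for the byte-slice case mentioned above, assuming a hypothetical helper name isASCIIBytes; it compares against utf8.RuneSelf, the standard library constant equal to 0x80, instead of unicode.MaxASCII.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// isASCIIBytes (a hypothetical helper name) reports whether every byte in b
// is below utf8.RuneSelf (0x80), i.e. fits in seven bits.
func isASCIIBytes(b []byte) bool {
	for _, c := range b {
		if c >= utf8.RuneSelf {
			return false
		}
	}
	return true
}

func main() {
	fmt.Println(isASCIIBytes([]byte("south north")))
	fmt.Println(isASCIIBytes([]byte{0x7F}))
	fmt.Println(isASCIIBytes([]byte{0x80}))
}
```

The boundary values make the range explicit: 0x7F is the last ASCII value and passes, while 0x80 is the first value that fails.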