We know we can print each character of a String as UTF-8 code units. But if we only have the code units of those characters, how can we create a String from them?
To convert a String into UTF-8 in Java, use the getBytes() method. getBytes() encodes a String into a sequence of bytes and returns a byte array. The getBytes(String charsetName) overload takes charsetName, the name of the charset used to encode the String into an array of bytes.
To convert an Int value to a String value in Swift, use String(). String() accepts an integer as its argument and returns a String value created from that integer.
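As a minimal sketch of the Int-to-String conversion described above (the variable names are illustrative, not from the original post):

```swift
let count = 42

// String(_:) builds the decimal text representation of the integer.
let text = String(count)

// String interpolation produces the same text.
let interpolated = "\(count)"

// String(_:radix:) is the standard-library variant for other bases.
let binary = String(count, radix: 2)

print(text, interpolated, binary) // prints: 42 42 101010
```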
Swift 5 switches the preferred encoding of strings from UTF-16 to UTF-8 while preserving efficient Objective-C-interoperability.
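One way to observe the UTF-8 storage is the withUTF8(_:) method, which, as far as I know, is available from Swift 5.1 onward. The sketch below assumes a native Swift string (not one bridged from Objective-C) and only counts the bytes in the buffer:

```swift
// "Café" written with an escaped precomposed é (U+00E9) so the byte count is unambiguous.
var greeting = "Caf\u{E9}"

// withUTF8(_:) is a mutating method: it may first move the string into
// contiguous UTF-8 storage, then passes that storage to the closure.
let byteCount = greeting.withUTF8 { buffer in
    buffer.count
}

print(byteCount) // prints: 5 (C, a, f are one byte each; é takes two)
```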
UTF-8 encodes a character as a sequence of one, two, three, or four bytes. UTF-16 encodes a Unicode character as either two or four bytes. The names reflect the size of the code unit each encoding uses: 8 bits for UTF-8 and 16 bits for UTF-16.
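The byte counts can be checked directly with the utf8 and utf16 views of a String. The sample characters below are my own choice, picked so that each UTF-8 length from one to four bytes appears once:

```swift
// One sample per UTF-8 length: "A", "é" (U+00E9), "€" (U+20AC), "😀" (U+1F600).
let samples = ["A", "\u{E9}", "\u{20AC}", "\u{1F600}"]

// utf8.count is the number of 8-bit code units, i.e. bytes.
let utf8ByteCounts = samples.map { $0.utf8.count }

// utf16.count is the number of 16-bit code units, so multiply by 2 for bytes.
let utf16ByteCounts = samples.map { $0.utf16.count * 2 }

print(utf8ByteCounts)  // prints: [1, 2, 3, 4]
print(utf16ByteCounts) // prints: [2, 2, 2, 4]
```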
It's possible to convert UTF-8 code units to a Swift String idiomatically using the standard library's UTF8 codec type. Converting from String to UTF-8 is much easier!
    import Foundation
    import XCTest

    // Updated to Swift 5 syntax: makeIterator(), repeat-while, lowercase enum cases.
    public class UTF8Encoding {
      // Builds a String from an array of UTF-8 code units.
      public static func encode(bytes: Array<UInt8>) -> String {
        var encodedString = ""
        var decoder = UTF8()
        var generator = bytes.makeIterator()
        var finished: Bool = false
        repeat {
          let decodingResult = decoder.decode(&generator)
          switch decodingResult {
          case .scalarValue(let scalar):
            // Append to the scalar view so grapheme clusters are formed correctly.
            encodedString.unicodeScalars.append(scalar)
          case .emptyInput:
            finished = true
          /* ignore errors: stop decoding at the first invalid sequence */
          case .error:
            finished = true
          }
        } while (!finished)
        return encodedString
      }

      // Returns the UTF-8 code units of a String.
      public static func decode(str: String) -> Array<UInt8> {
        return Array(str.utf8)
      }
    }

    func testUTF8Encoding() {
      let testString = "A UTF8 String With Special Characters: 😀🍎"
      let decodedArray = UTF8Encoding.decode(str: testString)
      let encodedString = UTF8Encoding.encode(bytes: decodedArray)
      XCTAssert(encodedString == testString, "UTF8Encoding is lossless: \(encodedString) != \(testString)")
    }

Of the other alternatives suggested:
Using NSString invokes the Objective-C bridge;
Using UnicodeScalar is error-prone because it converts UnicodeScalars directly to Characters, ignoring complex grapheme clusters; and
Using String.fromCString is potentially unsafe as it uses pointers.
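To illustrate the grapheme cluster concern with the UnicodeScalar approach, here is a small sketch using the French flag, which is a single Character made of two Unicode scalars. The variable names are illustrative only:

```swift
// 🇫🇷 is two regional-indicator scalars (F and R) forming one grapheme cluster.
let flag = "\u{1F1EB}\u{1F1F7}"
let scalars = Array(flag.unicodeScalars)

// Converting each scalar to its own Character splits the flag in two.
let perScalarCharacters = scalars.map { Character($0) }

print(flag.count)                // prints: 1
print(perScalarCharacters.count) // prints: 2

// Appending the scalars to the unicodeScalars view instead lets String
// work out the grapheme boundaries itself.
var rebuilt = ""
rebuilt.unicodeScalars.append(contentsOf: scalars)
print(rebuilt == flag, rebuilt.count) // prints: true 1
```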
With Swift 5, you can choose one of the following ways in order to convert a collection of UTF-8 code units into a string.
String's init(_:) initializer

If you have a String.UTF8View instance (i.e. a collection of UTF-8 code units) and want to convert it to a string, you can use the init(_:) initializer. init(_:) has the following declaration:

    init(_ utf8: String.UTF8View)

Creates a string corresponding to the given sequence of UTF-8 code units.
The Playground sample code below shows how to use init(_:):
    let string = "Café 🇫🇷"
    let utf8View: String.UTF8View = string.utf8

    let newString = String(utf8View)
    print(newString) // prints: Café 🇫🇷

Swift's init(decoding:as:) initializer

init(decoding:as:) creates a string from the given Unicode code units collection in the specified encoding:
    let string = "Café 🇫🇷"
    let codeUnits: [Unicode.UTF8.CodeUnit] = Array(string.utf8)

    let newString = String(decoding: codeUnits, as: UTF8.self)
    print(newString) // prints: Café 🇫🇷

Note that init(decoding:as:) also works with a String.UTF8View parameter:
    let string = "Café 🇫🇷"
    let utf8View: String.UTF8View = string.utf8

    let newString = String(decoding: utf8View, as: UTF8.self)
    print(newString) // prints: Café 🇫🇷

transcode(_:from:to:stoppingOnError:into:) function

The following example transcodes the UTF-8 representation of an initial string into Unicode scalar values (UTF-32 code units) that can be used to build a new string:
    let string = "Café 🇫🇷"
    let bytes = Array(string.utf8)

    var newString = ""
    _ = transcode(bytes.makeIterator(), from: UTF8.self, to: UTF32.self, stoppingOnError: true, into: {
        newString.append(String(Unicode.Scalar($0)!))
    })
    print(newString) // prints: Café 🇫🇷

Array's withUnsafeBufferPointer(_:) method and String's init(cString:) initializer

init(cString:) has the following declaration:
    init(cString: UnsafePointer<CChar>)

Creates a new string by copying the null-terminated UTF-8 data referenced by the given pointer.
The following example shows how to use init(cString:) with a pointer to the content of a CChar array (i.e. a well-formed, null-terminated UTF-8 code unit sequence) in order to create a string from it:
    let bytes: [CChar] = [67, 97, 102, -61, -87, 32, -16, -97, -121, -85, -16, -97, -121, -73, 0]

    let newString = bytes.withUnsafeBufferPointer({ (bufferPointer: UnsafeBufferPointer<CChar>) in
        return String(cString: bufferPointer.baseAddress!)
    })
    print(newString) // prints: Café 🇫🇷

Unicode.UTF8's decode(_:) method

To decode a code unit sequence, call decode(_:) repeatedly until it returns UnicodeDecodingResult.emptyInput:
    let string = "Café 🇫🇷"
    let codeUnits = Array(string.utf8)

    var codeUnitIterator = codeUnits.makeIterator()
    var utf8Decoder = Unicode.UTF8()
    var newString = ""

    Decode: while true {
        switch utf8Decoder.decode(&codeUnitIterator) {
        case .scalarValue(let value):
            newString.append(Character(Unicode.Scalar(value)))
        case .emptyInput:
            break Decode
        case .error:
            print("Decoding error")
            break Decode
        }
    }

    print(newString) // prints: Café 🇫🇷

String's init(bytes:encoding:) initializer

Foundation gives String an init(bytes:encoding:) initializer that you can use as indicated in the Playground sample code below:
    import Foundation

    let string = "Café 🇫🇷"
    let bytes: [Unicode.UTF8.CodeUnit] = Array(string.utf8)

    let newString = String(bytes: bytes, encoding: String.Encoding.utf8)
    print(String(describing: newString)) // prints: Optional("Café 🇫🇷")
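The optional result of init(bytes:encoding:) matters when the input is not valid UTF-8. The sketch below compares it with init(decoding:as:) on a deliberately broken byte sequence. The input bytes are my own example, and the behavior shown is what I understand the two initializers to do: the Foundation initializer fails, while the standard-library one repairs the input with U+FFFD.

```swift
import Foundation

// "Caf" followed by 0xFF, a byte that can never appear in valid UTF-8.
let invalidBytes: [UInt8] = [0x43, 0x61, 0x66, 0xFF]

// init(decoding:as:) never fails: the invalid byte becomes U+FFFD.
let repaired = String(decoding: invalidBytes, as: UTF8.self)

// init(bytes:encoding:) returns nil when the bytes are not valid UTF-8.
let strict = String(bytes: invalidBytes, encoding: .utf8)

print(repaired == "Caf\u{FFFD}") // prints: true
print(strict == nil)             // prints: true
```

Which one fits depends on whether silently repaired text or an explicit failure is the safer outcome for the data at hand.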