Handling grapheme clusters in Dart

From what I can tell, Dart does not support grapheme clusters, though there is talk of adding support:

  • Dart Strings should support Unicode grapheme cluster operations #34
  • Minimal Unicode grapheme cluster support #49

Until it is implemented, what are my options for iterating through grapheme clusters? For example, if I have a string like this:

String family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}'; // 👨‍👩‍👧
String myString = 'Let me introduce my $family to you.';

and the cursor is positioned immediately after the five-code-point family emoji, how would I move the cursor one user-perceived character to the left?

(In this particular case I know the size of the grapheme cluster so I could do it, but what I am really asking about is finding the length of an arbitrarily long grapheme cluster.)

Update

I see from this article that Swift uses the system's ICU library. Something similar may be possible in Flutter.

Supplemental code

For those who want to play around with my example above, here is a demo project. The buttons move the cursor to the right or left. It currently takes 8 button presses to move the cursor past the family emoji, because each press moves the cursor by one UTF-16 code unit.

main.dart

import 'package:flutter/material.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Grapheme cluster testing')),
        body: BodyWidget(),
      ),
    );
  }
}

class BodyWidget extends StatefulWidget {
  @override
  _BodyWidgetState createState() => _BodyWidgetState();
}

class _BodyWidgetState extends State<BodyWidget> {

  TextEditingController controller = TextEditingController(
      text: 'Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.'
  );

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        TextField(
          controller: controller,
        ),
        Row(
          children: <Widget>[
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: ElevatedButton( // RaisedButton was removed from current Flutter releases
                child: Text('<<'),
                onPressed: () {
                  _moveCursorLeft();
                },
              ),
            ),
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: ElevatedButton( // RaisedButton was removed from current Flutter releases
                child: Text('>>'),
                onPressed: () {
                  _moveCursorRight();
                },
              ),
            ),
          ],
        )
      ],
    );
  }

  void _moveCursorLeft() {
    int currentCursorPosition = controller.selection.start;
    // selection.start is -1 until the field has a selection, so guard with <=.
    if (currentCursorPosition <= 0)
      return;
    // Offsets are UTF-16 code units, so this steps one code unit, not one grapheme cluster.
    int newPosition = currentCursorPosition - 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }

  void _moveCursorRight() {
    int currentCursorPosition = controller.selection.end;
    if (currentCursorPosition >= controller.text.length)
      return;
    // Same limitation as above: one code unit per press.
    int newPosition = currentCursorPosition + 1;
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }
}
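
As a side note on the eight presses: Dart's String.length and Flutter's TextSelection offsets count UTF-16 code units, not user-perceived characters. The following is a minimal sketch that uses only dart:core (no Flutter needed) to show how the family emoji from the question breaks down:

```dart
void main() {
  // The family emoji from the question: three emoji joined by two ZWJs.
  const family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}';

  // String.length counts UTF-16 code units. Each of the three emoji lies
  // outside the Basic Multilingual Plane and needs a surrogate pair
  // (2 units); each zero-width joiner needs 1 unit.
  print(family.length); // 8

  // String.runes iterates Unicode code points.
  print(family.runes.length); // 5
}
```

Neither count matches the single character the user sees, which is why stepping the cursor by one offset at a time lands inside the emoji.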

asked Feb 01 '19 16:02 by Suragch


2 Answers

Update: use https://pub.dartlang.org/packages/icu

Sample code:

import 'dart:async';

import 'package:flutter/material.dart';
import 'package:icu/icu.dart';

void main() => runApp(MyApp());

class MyApp extends StatelessWidget {
  @override
  Widget build(BuildContext context) {
    return MaterialApp(
      home: Scaffold(
        appBar: AppBar(title: Text('Grapheme cluster testing')),
        body: BodyWidget(),
      ),
    );
  }
}

class BodyWidget extends StatefulWidget {
  @override
  _BodyWidgetState createState() => _BodyWidgetState();
}

class _BodyWidgetState extends State<BodyWidget> {
  final ICUString icuText = ICUString('Let me introduce my \u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467} to you.\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}');
  TextEditingController controller;
  _BodyWidgetState() {
    controller = TextEditingController(
      text: icuText.toString(),
    );
  }

  @override
  Widget build(BuildContext context) {
    return Column(
      children: <Widget>[
        TextField(
          controller: controller,
        ),
        Row(
          children: <Widget>[
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: ElevatedButton( // RaisedButton was removed from current Flutter releases
                child: Text('<<'),
                onPressed: () async {
                  await _moveCursorLeft();
                },
              ),
            ),
            Padding(
              padding: const EdgeInsets.all(8.0),
              child: ElevatedButton( // RaisedButton was removed from current Flutter releases
                child: Text('>>'),
                onPressed: () async {
                  await _moveCursorRight();
                },
              ),
            ),
          ],
        )
      ],
    );
  }

  // Returns Future<void> so that the button handler can await it.
  Future<void> _moveCursorLeft() async {
    int currentCursorPosition = controller.selection.start;
    if (currentCursorPosition == 0)
      return;
    int newPosition = await icuText.previousGraphemePosition(currentCursorPosition);
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }

  // Returns Future<void> so that the button handler can await it.
  Future<void> _moveCursorRight() async {
    int currentCursorPosition = controller.selection.end;
    if (currentCursorPosition == controller.text.length)
      return;
    int newPosition = await icuText.nextGraphemePosition(currentCursorPosition);
    controller.selection = TextSelection(baseOffset: newPosition, extentOffset: newPosition);
  }
}


Original answer:

Until Dart/Flutter fully implements ICU, I think your best bet is to use a platform channel to pass the Unicode string to native code (Swift 4+ on iOS, or Java/Kotlin on Android), iterate over or manipulate it there, and send the result back.

  • In Swift 4+, this works out of the box, as described in the article you mention (this does not apply to Swift 3 and earlier, or to Objective-C).
  • In Java/Kotlin, replace Oracle's BreakIterator (java.text.BreakIterator) with the ICU library's BreakIterator, which works much better. No changes are needed apart from the import statements.

I suggest doing the manipulation natively rather than in Dart because Unicode has too many things to handle, such as normalization, canonical equivalence, ZWNJ, ZWJ, and ZWSP.

Leave a comment if you need sample code.

answered Oct 09 '22 14:10 by TruongSinh


2020 update

Use the characters package by the Dart team. It's now the official way to handle grapheme clusters.

Use text.characters to get the grapheme clusters, and use text.characters.iterator to move over them. I'm still working out how to convert a CharacterRange to a TextSelection. I'll update this answer later when I have more details.
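
One possible way to get cursor offsets out of the characters package, offered as a sketch rather than an official recipe: split the text at the cursor and measure the cluster on either side. The helper names previousBoundary and nextBoundary are made up for this illustration, and both assume the incoming offset already sits on a cluster boundary (if it falls inside a cluster, substring would split that cluster and the result would be off).

```dart
import 'package:characters/characters.dart';

/// Hypothetical helper: UTF-16 offset of the cluster boundary before [offset].
/// Assumes [offset] is already on a grapheme cluster boundary of [text].
int previousBoundary(String text, int offset) {
  if (offset <= 0) return 0;
  // The last cluster of the text before the cursor is the one to step over.
  final before = text.substring(0, offset).characters;
  return offset - before.last.length;
}

/// Hypothetical helper: UTF-16 offset of the cluster boundary after [offset].
/// Assumes [offset] is already on a grapheme cluster boundary of [text].
int nextBoundary(String text, int offset) {
  if (offset >= text.length) return text.length;
  // The first cluster of the text after the cursor is the one to step over.
  final after = text.substring(offset).characters;
  return offset + after.first.length;
}

void main() {
  const family = '\u{1F468}\u{200D}\u{1F469}\u{200D}\u{1F467}';
  final text = 'Let me introduce my $family to you.';

  // Place the cursor right after the family emoji (offset 28).
  final cursor = text.indexOf(family) + family.length;

  print(previousBoundary(text, cursor)); // 20: one step skips all 8 code units
  print(nextBoundary(text, 20)); // 28
  print(text.characters.length); // 30 user-perceived characters
}
```

In the demo app from the question, the result could then be applied with controller.selection = TextSelection.collapsed(offset: previousBoundary(controller.text, controller.selection.start)). Creating a substring on every key press is simple but not the cheapest option; a CharacterRange kept in sync with the cursor would avoid the copies.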

Note: This is a complete rewrite of my old answer. See the edit history for details.

answered Oct 09 '22 15:10 by Suragch