
Tkinter and 32-bit Unicode duplicating – any fix?

I only want to show Chip, but I get both Chip AND Dale. It doesn't seem to matter which 32-bit character (one above U+FFFF) I put in; tkinter seems to duplicate all of them, not just chipmunks.

I'm thinking that I may have to render them to png and then place them as images, but that seems a bit ... heavy-handed.

Any other solutions? Is tkinter planning on fixing this?

import tkinter as tk

# Python 3.8.3
class Application(tk.Frame):
    def __init__(self, master=None):
        self.canvas = None
        self.quit_button = None
        tk.Frame.__init__(self, master)
        self.grid()
        self.create_widgets()

    def create_widgets(self):
        self.canvas = tk.Canvas(self, width=500, height=420, bg='yellow')
        self.canvas.create_text(250, 200, font="* 180", text='\U0001F43F')
        self.canvas.grid()

        self.quit_button = tk.Button(self, text='Quit', command=self.quit)
        self.quit_button.grid()

app = Application()
app.master.title('Emoji')
app.mainloop()

Chip and Dale on Mac OS

  • Apparently this works fine on Windows, so maybe it's a macOS issue.
  • I've run it on two separate Macs, both on the latest OS (Catalina 10.15.5), and both show the problem.
  • The bug shows with the standard Python installer from python.org: Python 3.8.3 with Tcl/Tk 8.6.8.
  • Supposedly it might be fixed in Tcl/Tk 8.6.10, but I don't really see how I can upgrade Tcl/Tk using the normal installer.
  • This is also reported as a bug; cf. https://bugs.python.org/issue41212
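To see which Tcl/Tk a given Python build is actually linked against, the patch level can be queried from Tcl itself. The following is a minimal sketch: the helper names (`parse_patchlevel`, `at_least_8_6_10`, `installed_patchlevel`) are made up for illustration, and the 8.6.10 threshold is only the version mentioned above as a possible fix, not a guarantee.

```python
import re

# Threshold taken from the discussion above; whether it really fixes
# the duplicated glyph on a given platform is an assumption.
MINIMUM = (8, 6, 10)


def parse_patchlevel(patchlevel):
    """Turn strings such as '8.6.8' or '8.7a5' into a comparable tuple.

    Alpha/beta markers ('a', 'b') are treated like a plain separator,
    which is good enough for a coarse "at least 8.6.10" check.
    """
    match = re.match(r"(\d+)\.(\d+)(?:[.ab](\d+))?", patchlevel)
    if match is None:
        raise ValueError(f"unrecognised Tcl patch level: {patchlevel!r}")
    major, minor, patch = match.groups()
    return int(major), int(minor), int(patch or 0)


def at_least_8_6_10(patchlevel):
    """Return True if the patch level is 8.6.10 or newer."""
    return parse_patchlevel(patchlevel) >= MINIMUM


def installed_patchlevel():
    """Ask the Tcl interpreter bundled with this Python for its version."""
    # Imported lazily so the pure helpers above work without Tk installed.
    import tkinter

    # tkinter.Tcl() creates an interpreter without opening a window.
    return tkinter.Tcl().eval("info patchlevel")
```

Calling `at_least_8_6_10(installed_patchlevel())` on the python.org 3.8.3 build described above should report `False`, since that build ships Tcl/Tk 8.6.8.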

One of the Python contributors believes that Tcl/Tk cannot or will not support variable-width encodings (it always converts internally to a fixed-width encoding), which suggests to me that Tcl/Tk is not suitable for general UTF-8 development.

Konchog asked Jul 03 '20 10:07


2 Answers

The fundamental problem is that Tcl and Tk are not very happy with non-BMP characters (those outside the Unicode Basic Multilingual Plane). Prior to 8.6.10, what happens is anyone's guess; the implementation simply assumed such characters didn't exist and was known to be buggy when they actually turned up (there are several tickets on various aspects of this). 8.7 will have stronger fixes in place (see TIP #389 for the details): the basic aim is that if you feed non-BMP characters in, they can be got out at the other side, so they can be written to a UTF-8 file or displayed by Tk if the font engine deigns to support them. Some operations will still be wrong, though, as the string implementation will still be using surrogates. 9.0 will fix things properly (by changing the fundamental character storage unit to be large enough to accommodate any Unicode codepoint), but that's a disruptive change.
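For readers unfamiliar with the term, the "surrogates" mentioned above are the two 16-bit code units that UTF-16 uses to represent one non-BMP character. The sketch below only illustrates that encoding in plain Python; it does not reproduce what Tcl/Tk does internally, and the helper names are made up for this example.

```python
def utf16_code_units(text):
    """Return the UTF-16 code units of text as a list of integers."""
    data = text.encode("utf-16-be")
    # Each code unit is two bytes, most significant byte first.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]


def non_bmp_chars(text):
    """Return the characters of text that lie outside the BMP (above U+FFFF)."""
    return [ch for ch in text if ord(ch) > 0xFFFF]


chipmunk = "\U0001F43F"
# One Python character, but two UTF-16 code units (a surrogate pair).
print(len(chipmunk), [hex(unit) for unit in utf16_code_units(chipmunk)])
# prints: 1 ['0xd83d', '0xdc3f']
```

A helper like `non_bmp_chars` could also be used to find which characters in a label would need a fallback (such as an image) on an affected build.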

With released versions, if you can get the surrogates over the wall from Python to Tcl, they'll probably end up in the GUI engine, which might do the right thing in some cases (not including any build I've currently got, FWIW, but I've got strange builds, so don't read very much into that). With 8.7, sending over UTF-8 will be able to work; that's part of the functionality profile that will be guaranteed. (The encoding functions exist in older versions, but with 8.6 releases they will do the wrong thing with non-BMP UTF-8, and they break weirdly with versions older than that.)

Donal Fellows answered Nov 08 '22 13:11


The problem

One of two things could have happened:
  • That is simply what the emoji looks like. In that case there is no fix other than changing the source emoji.
  • Tk and/or Tcl are confused by the emoji: unsure which glyph to draw, they draw two chipmunks. When I tried that emoji on my Linux computer, it threw an error.

The solution

The only solution may be to save the emoji as an image file and then display that image. But there could be other, slightly more complicated ways. For example, you could place a rectangle or a Frame over the second chipmunk to hide it.

cs1349459 answered Nov 08 '22 12:11