
Java SystemClipboard contains additional bytes

I have the following setup: Ubuntu 12.04, Mathematica 9 and IntelliJ IDEA 12. Every time I copy some text from Mathematica and paste it into IDEA, there are a lot of additional bytes at the end of the pasted text. What first appeared to be a bug in IDEA now seems rather to be a bug in Java itself. I have appended a minimal Java example which shows the behavior.

When I type Plot inside Mathematica, select and copy it, and then run the example, I get the following output, where the first line is the printed form and the second line shows the bytes:

[Image: screenshot of the program output]

As you can see the Plot is followed by a 0 byte and some other, not necessarily zero, bytes. Throughout all of my tests, I found that a valid solution is to use the string until the first 0 is found, but that does not solve the underlying problem. I really want to see this fixed, because I often copy code between Mathematica and IntelliJIDEA, but first I need to know who to blame for this.
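
The workaround mentioned above (keep only the text before the first 0) could be sketched roughly as follows. The class and method names are made up for illustration, and the sketch assumes the clipboard content has already been read into a String. It only hides the symptom and says nothing about which program is at fault.

```java
// Hypothetical helper, not part of any library: it only illustrates the
// "cut at the first 0" workaround.
final class ClipboardTextCleaner {

  /** Returns the part of text before the first NUL character, or text unchanged if there is none. */
  static String untilFirstNul(String text) {
    final int nul = text.indexOf('\0');
    return nul < 0 ? text : text.substring(0, nul);
  }

  public static void main(String[] args) {
    // Sample shaped like the pasted data: "Plot" followed by a 0 and some other characters.
    final String pasted = "Plot\u0000\u007f\u0000\u0000";
    System.out.println(untilFirstNul(pasted).length()); // prints 4
  }
}
```

A text without any NUL character is returned unchanged, so the helper should be safe to apply to every paste.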

Question:

How can I find out whether Mathematica or Java is doing something wrong here? I can copy Mathematica content to different editors, browsers, etc., and I have never seen anything like this. On the other hand, I have never seen IntelliJ (Java) pasting garbage either. What is a good way to find out whether Mathematica is using the clipboard incorrectly or Java has a bug?

Minimal example

Select some text in Mathematica, press Ctrl+C, and run the following:

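import java.awt.Toolkit;
import java.awt.datatransfer.Clipboard;
import java.awt.datatransfer.DataFlavor;

public class CopyPasteTest {

  public static void main(String[] args) {
    try {
      final Clipboard systemClipboard =
        Toolkit.getDefaultToolkit().getSystemClipboard();
      // Ask the clipboard for its content as a plain Java String.
      final String text = (String) systemClipboard.getData(DataFlavor.stringFlavor);
      // First line of output: the text as printed.
      System.out.println(text);
      // Second line of output: the bytes in the platform default charset.
      for (byte a : text.getBytes()) {
        System.out.print(a + " ");
      }
    } catch (Exception e) {
      // For example UnsupportedFlavorException or IOException from getData.
      e.printStackTrace();
    }
  }
}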

Further information requested in comments

Could you just take a look at the clipboard contents after the copy-from-Mathematica operation?

Sure. Unfortunately, that returns absolutely nothing. When I select and copy something from the browser instead, for instance "this here", I get:

patrick@lenerd:~$ xclip -out | hexdump -C
00000000  74 68 69 73 20 68 65 72  65                       |this here|
00000009

Edit

I tried the following, always using the same "Plot" string copied from Mathematica. First, I tried the larger test class from David, as suggested in his comment. With both the Oracle JRE and the OpenJDK JRE that comes with Ubuntu, I got the following output:

===========
Plot[00][7f][00][00]
===========
Obtained transferrable of type sun.awt.datatransfer.ClipboardTransferable
Plot[00][7f][00][00]
===========

My short snippet from above gives the same result (although not in hex representation). Then I tried the different selections with xclip, and using the clipboard selection produced the following:
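
For comparison with the hex representation above, a bracketed rendering of this kind could be produced by something like the following sketch. This is not David's test class, which is not shown here; the class and method names are invented, and the choice to print only printable ASCII verbatim is an assumption.

```java
// Hypothetical formatter (not the test class from the comment): prints
// printable ASCII as-is and every other character as [hex].
final class BracketHex {

  static String render(String text) {
    final StringBuilder out = new StringBuilder();
    for (int i = 0; i < text.length(); i++) {
      final char c = text.charAt(i);
      if (c >= 0x20 && c < 0x7f) {
        out.append(c);
      } else {
        // Two lowercase hex digits at minimum, e.g. 0 becomes [00].
        out.append('[').append(String.format("%02x", (int) c)).append(']');
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    // Prints Plot[00][7f][00][00]
    System.out.println(render("Plot\u0000\u007f\u0000\u0000"));
  }
}
```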

patrick@lenerd:~$ xclip -o -verbose -selection clipboard | hexdump -C
Connected to X server.
Using selection: XA_CLIPBOARD
Using UTF8_STRING.
00000000  50 6c 6f 74 00 00 00 00                           |Plot....|
00000008

It is important to note that when I don't use verbose output with xclip, I only see "Plot" in the terminal. Above, you can see that there are exactly 4 more bytes in the buffer, which are probably not displayed because they start with a 00. Additionally, the four extra bytes are 00 00 00 00, at least according to what is displayed, whereas in Java there is a 7f (127) in the second position.

I guess this all suggests that the bug comes from Mathematica, since it copies additional data into the buffer, and Java is just a bit sloppy because it doesn't cut at the first 00.

asked Nov 12 '13 09:11 by halirutan



1 Answer

These conclusions look sound.

I found the following references about the behaviour of the X clipboard:

X11r6 Inter-Client Communication Conventions Manual, in particular Peer-to-Peer Communication by Means of Selections, and also a more compressed explanation (and Python test tools) at Developer’s corner: copy-paste in Linux

Thus, "Plot[00][7f][00][00]" or maybe "Plot[00][00][00][00]" is the data that Mathematica actually provides on request to the application that "reads" the clipboard. I can only imagine that Mathematica says "here is the string with eight bytes" and the reading application tries to deal with this, reading past the end of the actual character array.

It could also be a bug in X (but Ubuntu 12.04 doesn't use Mir yet, so probably not.)

Note that Java Strings are not NUL-terminated, so "Plot[00][7f][00][00]" is indeed a valid String.
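
A small sketch of that point: a Java String carries its own length, so embedded NUL characters are ordinary content and nothing is cut off at the first one. The class name and the sample value are only illustrative.

```java
// Illustrative only: shows that NUL characters count as regular String content.
final class NulInString {

  /** Sample shaped like the clipboard data discussed above. */
  static String sample() {
    return "Plot\u0000\u007f\u0000\u0000";
  }

  public static void main(String[] args) {
    final String s = sample();
    System.out.println(s.length());       // prints 8
    System.out.println(s.indexOf('\0'));  // prints 4
    System.out.println(s.equals("Plot")); // prints false
  }
}
```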

A quick glance at the source of xclip (obtained with yumdownloader --source xclip on my Fedora) seems to reveal that it just calls XFetchBuffer or memcpy (not fully sure) to obtain bytes, then calls fwrite on those, so it will happily write the NULs to the output.

answered Nov 08 '22 03:11 by David Tonhofer
