I have a document A in encoding A displayed in tool A and a document B in encoding B displayed in tool B. If I cut and paste (part of) B into A what might be the resultant character encoding? I realise this depends on tool A and tool B and the information held in the paste buffer (which presumably can contain an encoding?) and the operating system.
What should high-quality tools do? and in practice how many of the common tools (e.g. Word, TextPad, various IDEs, etc.) do a good job?
Problem. Computers store text as a sequence of numbers where each character has a unique number according to an agreed upon "character encoding standard". The problem is that there are many standards and each standard assigns different numbers to the same character.
A character encoding provides a key to unlock (ie. crack) the code. It is a set of mappings between the bytes in the computer and the characters in the character set. Without the key, the data looks like garbage.
Clipboard holds a copy and then at a later time pastes it into the receiving app. Clipboard does not change character encoding or other attributes on the way by. Old special characters might cause problems either source or receiving app, so you should update these characters.
These encoding systems also conflict with one another. That is, two encodings can use the same number for two different characters, or use different numbers for the same character.
First of all, a text editor's internal representation of text has no bearing on how the text is encoded (serialized) when you save the file. So a document is not "in" an encoding; it's a sequence of abstract characters. When the document is saved to a file (or transmitted over the network) then it gets encoded.
It's up to each application to decide what it puts on the clipboard. Typically, a windows app that knows what it's doing will put a number of different representations on the clipboard. When you paste in the other app, the app will look for the representation that best suits its need.
In your case, a text editor (that knows what it's doing) will put a Unicode representation of a selected string onto the clipboard (where Unicode, in Windows, is typically moved around as UTF-16, but that's not important). When you paste in the other app, it will insert that sequence of Unicode characters into the document at the selection point.
There's an app floating around called "ClipSpy" that will help you see what I'm talking about, interactively.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With