
Is there an "invisible" hyphen character in Unicode / HTML?

Tags: html, css, unicode

I've found the soft hyphen character (U+00AD SHY) very useful, but I am wondering whether there is an equivalent that tells the browser where it may break long words for wrapping without displaying any character at all.

For example, let's say you have a narrow column in HTML with newspaper justification and there is a long URL written out in the text itself. You could add the soft hyphen I mentioned, but then when a user copies and pastes the URL it will contain those hyphen characters. The ideal would be the same visual result without a hyphen character, so that the user can copy and paste the long word(s) intact.

Thoughts or suggestions?

I tried searching for this, but most of what I come up with is non-breaking space characters, and I am essentially looking for the opposite.

UPDATE: I found ZERO WIDTH SPACE (U+200B), but it has the same problem: the character is preserved when copying and pasting into the address bar, so the result is even more confusing to the end user.

Neil C. Obremski, asked Nov 15 '16 16:11



1 Answer

You want the HTML5 tag <wbr>, which is specified to do exactly what you are asking for.

If you can't rely on HTML5, U+200B ZERO WIDTH SPACE (&#8203;) should also work.
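The two options above might be used roughly as follows. This is a minimal sketch: the URL, the `narrow` class name, and the column width are made up for illustration, and the break points are placed after each slash only as one plausible choice.

```html
<!-- Illustrative only: the URL, class name, and width are assumptions. -->
<style>
  .narrow { width: 12em; text-align: justify; border: 1px solid #ccc; }
</style>

<!-- Option 1: <wbr> marks an optional break point; no hyphen is drawn
     when the line actually breaks there. -->
<p class="narrow">
  See http://www.example.com/<wbr>a/<wbr>very/<wbr>long/<wbr>path/<wbr>to/<wbr>some/<wbr>page.html
  for details.
</p>

<!-- Option 2: U+200B ZERO WIDTH SPACE, written as a numeric character
     reference, for documents that cannot use <wbr>. -->
<p class="narrow">
  See http://www.example.com/&#8203;a/&#8203;very/&#8203;long/&#8203;path/&#8203;to/&#8203;some/&#8203;page.html
  for details.
</p>
```

Both paragraphs should wrap the same way visually. The difference shows up on copy: as the question's update notes, the U+200B characters are real text and can travel with the pasted URL, whereas `<wbr>` is markup rather than a character, so it is worth checking the copy-and-paste result in the browsers you care about.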

(The effects of copying text out of an HTML document are, unfortunately, underspecified. If <wbr> doesn't do what you want upon copy-and-paste, you might want to bring it up with the WHATWG; the easiest way to do that is probably to file a GitHub issue on the spec.)

zwol, answered Sep 30 '22 04:09