I have an HTML page that I generate from data stored in a database. The database sometimes contains long strings that the browser can't break because they contain no breakable characters (space, period, comma, etc.).
Is there any way to fix this using HTML, CSS, or even JavaScript?
See this link for an example of the problem.
Yes, you can. Just set this CSS property on the box:
.some_selector {
    word-wrap: break-word;
}
Edit: Some testing shows that it does work with a div or a p - a block level element - but it does not work with a table cell, nor when the div is put inside a table cell.
Tested and works in IE6, IE7, IE8, Firefox 3.5.3 and Chrome.
Works:
<div style="word-wrap: break-word">aaaaaaaaaaaaaaaaaaaaaaddddddddddddddddddddddddddddddddddddddddddaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa </div>
Based on this article and this one as well: the "Shy Hyphen" or "Soft Hyphen" can be written in HTML as &shy; / &#173; / &#xAD; (173 dec = AD hex). They all convert to the U+00AD character.
The textContent and nodeValue of DOM text nodes are not entity-encoded; they contain the actual characters, so an entity such as &shy; written into them would show up literally. To write this character from JavaScript you must therefore produce it yourself: \xAD is a simple way to write it in a JavaScript string. String.fromCharCode(173) would also work.
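As a quick check of the point above, here is a small sketch of the equivalent spellings of U+00AD in a JavaScript string. The variable names are only illustrative.

```javascript
// Three ways to write U+00AD (soft hyphen) in a JavaScript string.
const shyHex = "\xAD";                        // two-digit hex escape
const shyUnicode = "\u00AD";                  // four-digit Unicode escape
const shyFromCode = String.fromCharCode(173); // from the decimal code point

// All three produce the same one-character string.
console.log(shyHex === shyUnicode && shyUnicode === shyFromCode); // true

// HTML entity text is NOT decoded inside a JavaScript string literal:
// this is five ordinary characters, not a soft hyphen.
const notDecoded = "&shy;";
console.log(notDecoded.length); // 5
```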
Based on your own VERY good answer - a jQuery Plugin version:
$.fn.replaceInText = function(oldText, newText) {
  // contents() gets all child dom nodes -- each lets us operate on them
  this.contents().each(function() {
    if (this.nodeType == 3) { // text node found, do the replacement
        if (this.textContent) {
            this.textContent = this.textContent.replace(oldText, newText);
        } else { // fallback for old IE, which lacks textContent
            this.nodeValue = this.nodeValue.replace(oldText, newText);
        }
    } else {
      // other types of nodes - scan them for same replace
      $(this).replaceInText(oldText, newText);
    }
  });
  return this;
};
$(function() {
    $('div').replaceInText(/\w{10}/g, "$&\xAD");
});
A side note:
I think the place this should happen is NOT in JavaScript; it should be in the server-side code. If this is only a page used to display data, you could easily do a similar regexp replace on the text before it is sent to the browser. However, the JavaScript solution offers one advantage (or disadvantage, depending on how you look at it): it doesn't add any extraneous characters to the data until the script executes, which means any robots crawling your HTML output for data won't see the shy hyphens. Although the HTML spec interprets it as a "hyphenation hint" and an invisible character, that isn't guaranteed across the rest of the Unicode world (quote from the Unicode standard via the second article I linked):
U+00AD soft hyphen indicates a hyphenation point, where a line-break is preferred when a word is to be hyphenated. Depending on the script, the visible rendering of this character when a line break occurs may differ (for example, in some scripts it is rendered as a hyphen -, while in others it may be invisible).
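For the server-side variant suggested in this side note, a minimal sketch might look like the following. It assumes the server code is JavaScript (Node.js, for instance), which the question does not say; the same regexp idea carries over to other languages. The names addSoftHyphens and stripSoftHyphens are hypothetical helpers, and the lookahead is a small variation on the plugin's pattern so that no soft hyphen is left at the very end of a word.

```javascript
// Hypothetical helper: insert a soft hyphen after every `chunkSize`
// consecutive word characters, but only when another word character
// follows, so nothing is appended at the end of a word.
function addSoftHyphens(text, chunkSize = 10) {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const pattern = new RegExp("\\w{" + chunkSize + "}(?=\\w)", "g");
  return text.replace(pattern, "$&\xAD"); // "$&" re-inserts the matched text
}

// Hypothetical inverse: remove the hints again, e.g. before storing,
// comparing, or exporting the text.
function stripSoftHyphens(text) {
  return text.replace(/\xAD/g, "");
}

const raw = "a".repeat(25);
const hinted = addSoftHyphens(raw);
console.log(hinted.length);                    // 27 (two soft hyphens added)
console.log(stripSoftHyphens(hinted) === raw); // true
```

Having the strip function next to the insert function is one way to keep the stored data clean, which is the concern raised above about extraneous characters.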
Another Note:
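Found in this other SO question: it seems that the "Zero Width Space" character, &#8203; / &#x200B; / U+200B, is another option you might want to explore. It would be \u200B as a JavaScript string; a two-digit \x escape cannot reach that code point, so \x20\x0b would give a space followed by a vertical tab instead.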
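A short sketch of the zero-width space in JavaScript, under the assumption that you want the same every-ten-characters rule as the plugin above. Because U+200B lies above 0xFF, it needs the four-digit \u escape. The helper name addZeroWidthSpaces is made up for illustration.

```javascript
// U+200B (zero-width space) needs the four-digit \u escape.
const zwsp = "\u200B";
console.log(zwsp.length);                          // 1
console.log(zwsp === String.fromCharCode(0x200B)); // true

// Two \x escapes are two separate characters: a space and a vertical tab.
console.log("\x20\x0B".length);                    // 2

// Hypothetical helper: same replacement idea as with the soft hyphen,
// but the browser may break here without drawing a hyphen.
function addZeroWidthSpaces(text) {
  return text.replace(/\w{10}(?=\w)/g, "$&" + zwsp);
}

console.log(addZeroWidthSpaces("a".repeat(15)).length); // 16
```

Which of the two characters fits better depends on whether a visible hyphen at the break is wanted.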