
Replace newlines in a Unicode string

I am trying to strip newline characters from a Unicode string, but I seem to be missing some of the character codes.

Specifically, I am working on App Engine and trying to store titles from HTML pages in a db.StringProperty() in my model.

So I do something like:

link.title = unicode(page_title,"utf-8").replace('\n','').replace('\r','')

and I get:

Property title is not multi-line

Are there other codes I should be using for the replace?

Jackson Miller asked Feb 04 '10 17:02




2 Answers

Try ''.join(unicode(page_title, 'utf-8').splitlines()). splitlines() should let the standard library take care of all the possible crazy Unicode line breaks, and then you just join them all back together with the empty string to get a single-line version.
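The suggestion above can be sketched as follows. This is a minimal illustration written for Python 3, where str plays the role of Python 2's unicode; the helper name collapse_to_single_line and the sample title are hypothetical and not part of the original question.

```python
# Python 3 sketch; in Python 2 the same idea applies to unicode objects.
def collapse_to_single_line(raw_title, encoding="utf-8"):
    """Decode bytes if needed, then drop every line boundary splitlines() recognizes."""
    if isinstance(raw_title, bytes):
        text = raw_title.decode(encoding)
    else:
        text = raw_title
    # splitlines() discards the line-break characters themselves,
    # so joining the pieces yields a single-line string.
    return "".join(text.splitlines())


# Hypothetical title containing several different kinds of line break.
page_title = "Breaking\u2028News\r\n from\u0085 the web\n".encode("utf-8")
single_line = collapse_to_single_line(page_title)
```

Joining with the empty string glues adjacent words together wherever a break had no surrounding space; joining with " " instead may read better for titles.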

Hank Gay answered Oct 14 '22 01:10


Python uses these characters for splitting in unicode.splitlines():

  • U+000A LINE FEED (\n)
  • U+000D CARRIAGE RETURN (\r)
  • U+001C FILE SEPARATOR
  • U+001D GROUP SEPARATOR
  • U+001E RECORD SEPARATOR
  • U+0085 NEXT LINE
  • U+2028 LINE SEPARATOR
  • U+2029 PARAGRAPH SEPARATOR

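As Hank says, using splitlines() will let Python take care of all of the details for you, but if you need to do it manually, this should be the complete list for unicode.splitlines(). Note that Python 3's str.splitlines() additionally treats U+000B (vertical tab) and U+000C (form feed) as line boundaries.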
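If you do want to handle the list manually, one possible sketch is a translation table that deletes each listed character in a single pass. This is written for Python 3; the names LINE_BREAKS, DELETE_LINE_BREAKS, and strip_line_breaks are hypothetical, and the character set is simply the list given in this answer.

```python
# The characters listed above, in the same order.
LINE_BREAKS = "\n\r\x1c\x1d\x1e\x85\u2028\u2029"
# Add "\x0b\x0c" here if vertical tab and form feed should be removed as well.

# str.translate() deletes any character whose ordinal maps to None.
DELETE_LINE_BREAKS = {ord(ch): None for ch in LINE_BREAKS}


def strip_line_breaks(text):
    """Remove every character in LINE_BREAKS from text."""
    return text.translate(DELETE_LINE_BREAKS)


# Hypothetical sample with one of each listed character between letters.
sample = "a\nb\rc\x1cd\x1de\x1ef\x85g\u2028h\u2029i"
cleaned = strip_line_breaks(sample)
```

Compared with a chain of .replace() calls, the table keeps the full character set in one place, which makes it easier to check against the list.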

Ian Clelland answered Oct 14 '22 03:10