Are null bytes allowed in Unicode strings? I am not asking about UTF-8; I mean the high-level object representation of a Unicode string.

Background: We store Unicode strings containing null bytes in PostgreSQL via Python. When we read them back, the strings are cut off at the null byte.


Are null bytes allowed in unicode strings in PostgreSQL via Python?

2 Answers

On the database side, PostgreSQL does not allow the null byte ('\0') in values of char/text/varchar fields, so if you try to store a string containing it, you get an error. Example:

postgres=# SELECT convert_from('foo\000bar'::bytea, 'unicode');
ERROR:  22021: invalid byte sequence for encoding "UTF8": 0x00

If you really need to store such data, you can use the bytea data type on the PostgreSQL side. Make sure to encode it correctly.
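A minimal sketch of what "encode it correctly" could look like on the Python side, assuming Python 3 and UTF-8 as the chosen encoding. The driver call in the comment assumes psycopg2, and the table and column names (`docs`, `body`) are hypothetical; only the encode/decode round trip is actually executed here.

```python
# Sketch: prepare a str containing a null character for a bytea column.
text = "foo\x00bar"

# Encode to bytes; UTF-8 represents U+0000 as a single 0x00 byte.
payload = text.encode("utf-8")

# With a driver such as psycopg2 (assumption), `payload` would be passed as a
# bytes parameter for the bytea column, for example:
#   cur.execute("INSERT INTO docs (body) VALUES (%s)", (payload,))
# `docs` and `body` are hypothetical names used only for illustration.

# A driver may hand the bytea value back as bytes or a memoryview;
# bytes(...) normalizes either before decoding with the same encoding.
restored = bytes(memoryview(payload)).decode("utf-8")

print(repr(restored))
```

The important part is using the same encoding in both directions, so the null character survives the round trip instead of being rejected by a text column.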

answered Oct 29 '22 22:10

MatheusOl

Python itself has no trouble with byte strings or Unicode strings that contain null characters (code point zero). However, if you call out to a library implemented in C, that library may follow the C convention of treating the first null character as the end of the string.
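The difference can be illustrated with a small sketch using only the standard library. It uses `ctypes` as a stand-in for "a library implemented in C"; this is an assumption made for demonstration and not necessarily where the truncation happens in the asker's setup.

```python
import ctypes

text = "foo\x00bar"
data = text.encode("utf-8")

# Python keeps the full length of both the str and the bytes object.
str_len = len(text)
bytes_len = len(data)

# A C-style buffer: .value reads up to the first NUL, like a C string,
# while .raw exposes the whole buffer, including the terminating NUL
# that create_string_buffer appends.
buf = ctypes.create_string_buffer(data)
c_view = buf.value
full_view = buf.raw

print(str_len, bytes_len, c_view, full_view)
```

Here `c_view` ends at the first null byte even though the underlying buffer still holds the complete data, which matches the kind of truncation described in the question.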

answered Oct 29 '22 22:10

Mark Ransom



Tags: python, postgresql, unicode

guettli
