
Getting Hibernate and SQL Server to play nice with VARCHAR and NVARCHAR


I'm currently in the process of enabling Unicode characters in some tables of a large database. The affected columns in these tables already use the SQL Server type NVARCHAR. I also have several columns that use VARCHAR.

There is a well-known issue with Hibernate's interactions with the JDBC driver (see, e.g., "Mapping to varchar and nvarchar in hibernate"). In short, Hibernate/JDBC generates SQL that passes all strings as Unicode, regardless of the underlying SQL type. When a non-Unicode (VARCHAR) column in the database is compared to a Unicode input string, the index on that column does not match the encoding of the parameter, so a full table scan is performed. Both JDBC drivers (jTDS and Microsoft's) have a parameter to pass Unicode strings as ASCII, but this is an all-or-nothing proposition that prevents international characters from being entered into the database.

Most posts I've seen on this issue come up with one of two solutions: 1) change everything in the database to NVARCHAR, or 2) set sendStringParametersAsUnicode=false. My question, then, is this: is there any known solution for having VARCHAR and NVARCHAR play nicely together? Changing everything to NVARCHAR is a huge issue in my environment because of downstream dependencies and other external constraints.
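For concreteness, here is a minimal plain-JDBC sketch of option 2, together with one per-parameter approach that may be worth testing. It assumes the Microsoft driver with JDBC 4.0 support, whose documentation describes the national-character setters such as setNString as sending Unicode regardless of sendStringParametersAsUnicode. The class name, the helper names, and the `national` flag are hypothetical, and none of this has been verified against jTDS or against Hibernate's own parameter binding.

```java
import java.sql.PreparedStatement;
import java.sql.SQLException;

public class MixedStringBinding {

    // Hypothetical helper: builds a Microsoft-driver connection URL with
    // Unicode string sending switched off for ordinary setString calls.
    static String buildUrl(String host, int port, String database) {
        return "jdbc:sqlserver://" + host + ":" + port
                + ";databaseName=" + database
                + ";sendStringParametersAsUnicode=false";
    }

    // Binds one string parameter. 'national' is an assumption of this sketch:
    // the caller sets it to true for parameters compared against NVARCHAR columns.
    static void bindString(PreparedStatement ps, int index, String value, boolean national)
            throws SQLException {
        if (national) {
            ps.setNString(index, value); // JDBC 4.0 national-character setter
        } else {
            ps.setString(index, value);  // follows the connection-level setting
        }
    }
}
```

Whether this helps in a Hibernate application depends on whether the mapping layer can be made to call the national-character setters for the NVARCHAR columns only.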

asked Mar 08 '11 19:03 by dfb




1 Answer

import java.sql.Types;

import org.hibernate.type.StandardBasicTypes;

// Custom dialect that makes Hibernate use N-prefixed (Unicode) column types on SQL Server.
public class SQLServerUnicodeDialect extends org.hibernate.dialect.SQLServerDialect {
    public SQLServerUnicodeDialect() {
        super();
        // Map the non-national JDBC string types to Unicode column types.
        registerColumnType(Types.CHAR, "nchar(1)");
        registerColumnType(Types.LONGVARCHAR, "nvarchar(max)");
        // Lengths up to 4000 become nvarchar(length); anything longer falls back to nvarchar(max).
        registerColumnType(Types.VARCHAR, 4000, "nvarchar($l)");
        registerColumnType(Types.VARCHAR, "nvarchar(max)");
        registerColumnType(Types.CLOB, "nvarchar(max)");

        // Map the national JDBC string types the same way.
        registerColumnType(Types.NCHAR, "nchar(1)");
        registerColumnType(Types.LONGNVARCHAR, "nvarchar(max)");
        registerColumnType(Types.NVARCHAR, 4000, "nvarchar($l)");
        registerColumnType(Types.NVARCHAR, "nvarchar(max)");
        registerColumnType(Types.NCLOB, "nvarchar(max)");

        // Tell Hibernate which of its types to use when a result column reports a national JDBC type.
        registerHibernateType(Types.NCHAR, StandardBasicTypes.CHARACTER.getName());
        registerHibernateType(Types.LONGNVARCHAR, StandardBasicTypes.TEXT.getName());
        registerHibernateType(Types.NVARCHAR, StandardBasicTypes.STRING.getName());
        registerHibernateType(Types.NCLOB, StandardBasicTypes.CLOB.getName());
    }
}
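
The answer does not show how the dialect is activated. As a rough sketch, a custom dialect is normally selected through the `hibernate.dialect` property. The package `com.example.dialect` and the class `DialectConfigSketch` below are hypothetical placeholders. Note that the registerColumnType mappings govern the column types Hibernate emits when it generates schema; how string parameters are sent is still decided by the driver settings discussed in the question.

```java
import java.util.Properties;

public class DialectConfigSketch {

    // Builds Hibernate settings that point at the custom dialect.
    static Properties hibernateProperties() {
        Properties props = new Properties();
        // Assumption: the dialect class was placed in the hypothetical package com.example.dialect.
        props.setProperty("hibernate.dialect", "com.example.dialect.SQLServerUnicodeDialect");
        return props;
    }

    public static void main(String[] args) {
        // Prints the configured dialect class name.
        System.out.println(hibernateProperties().getProperty("hibernate.dialect"));
    }
}
```

The same key can be set in hibernate.cfg.xml or persistence.xml instead of being built in code.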
answered Oct 24 '22 06:10 by urbanq