=== Unicode Considerations for Databases

The database schema in {{book.project.name}} accounts for Unicode strings only in the following special fields:

* Realms: display name, HTML display name
* Federation Providers: display name
* Users: username, given name, last name, attribute names and values
* Groups: name, attribute names and values
* Roles: name
* Descriptions of objects

In all other fields, characters are limited to those contained in the database encoding, which is often 8-bit. However,
for some database systems, it is possible to enable UTF-8 encoding of Unicode characters and use the full Unicode
character set in all text fields. This is often counterbalanced by a shorter maximum string length than with 8-bit
encodings.

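The length trade-off described above can be illustrated with a minimal Java sketch. It assumes a column whose limit is
counted in bytes rather than characters, and it uses ISO-8859-1 as a stand-in for a typical 8-bit encoding. The class
name `EncodedLengthDemo` and the sample strings are illustrative only and are not part of {{book.project.name}}.

```java
import java.nio.charset.StandardCharsets;

final class EncodedLengthDemo {

    /** Number of bytes needed to store the text in UTF-8. */
    static int utf8Bytes(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    /**
     * Number of bytes needed to store the text in ISO-8859-1.
     * Characters outside that encoding are replaced by '?', so the result
     * is only meaningful for text the 8-bit encoding can represent.
     */
    static int latin1Bytes(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        // Unicode escapes keep this source file independent of its own encoding.
        String accented = "Jos\u00e9";                 // four characters, one of them non-ASCII
        String cyrillic = "\u0410\u043d\u043d\u0430";  // four Cyrillic characters

        System.out.println("8-bit bytes for accented sample: " + latin1Bytes(accented));
        System.out.println("UTF-8 bytes for accented sample: " + utf8Bytes(accented));
        System.out.println("UTF-8 bytes for Cyrillic sample: " + utf8Bytes(cyrillic));
    }
}
```

Under these assumptions, text made of multi-byte characters fills a byte-limited column faster than ASCII text of the
same character count, which is why the usable string length can shrink once UTF-8 is enabled.
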
Some databases require special settings in the database and/or the JDBC driver to handle Unicode characters. The
settings for each database are given below. Note that a database that is not listed here can still work properly,
provided it handles UTF-8 encoding correctly at the level of both the database and the JDBC driver.

Technically, the key criterion for Unicode support in all fields is whether the database allows setting a Unicode
character set for `VARCHAR` and `CHAR` fields. If it does, there is a high chance that Unicode will work, usually at
the expense of field length. If it supports Unicode only in `NVARCHAR` and `NCHAR` fields, Unicode support for all text
fields is unlikely because the Keycloak schema uses `VARCHAR` and `CHAR` fields extensively.

==== Oracle Database

Unicode characters are properly handled provided the database was created with Unicode support in `VARCHAR` and `CHAR`
fields (for example, by using `AL32UTF8` as the database character set). No special settings are needed for the JDBC
driver.

If the database character set is not Unicode, then to use Unicode characters in the special fields, the JDBC driver needs
to be configured with the connection property `oracle.jdbc.defaultNChar` set to `true`. It might be wise, though not
strictly necessary, to also set the `oracle.jdbc.convertNcharLiterals` connection property to `true`. These properties
can be set either as system properties or as connection properties. Note that setting `oracle.jdbc.defaultNChar`
may have a negative impact on performance. For details, refer to the Oracle JDBC driver configuration documentation.

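As a rough illustration, the two connection properties could be supplied programmatically through
`java.util.Properties`. This is a sketch only: the class and method names, the user name, and the JDBC URL shown in the
comment are hypothetical, and a real deployment is more likely to set these properties in its datasource configuration.

```java
import java.util.Properties;

final class OracleUnicodeProperties {

    /** Builds connection properties for an Oracle database whose character set is not Unicode. */
    static Properties unicodeConnectionProperties(String user, String password) {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Required for Unicode in the special fields when the database character set is not Unicode.
        props.setProperty("oracle.jdbc.defaultNChar", "true");
        // Optional according to the text above.
        props.setProperty("oracle.jdbc.convertNcharLiterals", "true");
        return props;
    }

    public static void main(String[] args) {
        Properties props = unicodeConnectionProperties("keycloak", "change-me");
        // With the Oracle JDBC driver on the classpath, the properties would be passed like this
        // (hypothetical URL):
        // DriverManager.getConnection("jdbc:oracle:thin:@//db.example.com:1521/SERVICE", props);
        System.out.println("defaultNChar = " + props.getProperty("oracle.jdbc.defaultNChar"));
    }
}
```

Because `oracle.jdbc.defaultNChar` may affect performance, it is probably best to set it only when the database
character set cannot be changed to a Unicode one.
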
==== Microsoft SQL Server Database

Unicode characters are properly handled only for the special fields. No special settings of the JDBC driver or the
database are necessary.

==== IBM DB2 Database

Unicode characters are properly handled for all fields; however, length reduction applies to non-special fields. No
special settings of the JDBC driver or the database are necessary.

==== MySQL Database

Unicode characters are properly handled provided the database was created with Unicode support in `VARCHAR` and `CHAR`
fields in the `CREATE DATABASE` command (for example, by using the `utf8` character set as the default database
character set in MySQL 5.5). Note that the `utf8mb4` character set does not work because its storage requirements differ
from those of the `utf8` character set footnote:[Tracked as https://issues.jboss.org/browse/KEYCLOAK-3873]. With the
`utf8` character set, the length restriction on non-special fields does not apply because columns are created to
accommodate the given number of characters, not bytes. If the database default character set does not allow storing
Unicode, only the special fields allow storing Unicode values.

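A `CREATE DATABASE` command of the kind described above might look like the statement produced by the following Java
sketch. The class name, the method name, and the database name `keycloak` are assumptions made for illustration; only
the `utf8` character set comes from the text above.

```java
final class MySqlCreateDatabase {

    /** Returns a CREATE DATABASE statement that makes utf8 the default character set. */
    static String createDatabaseSql(String databaseName) {
        // Accept only simple identifiers so the name can be embedded safely in the statement.
        if (!databaseName.matches("[A-Za-z_][A-Za-z0-9_]*")) {
            throw new IllegalArgumentException("Unsupported database name: " + databaseName);
        }
        return "CREATE DATABASE " + databaseName + " CHARACTER SET utf8";
    }

    public static void main(String[] args) {
        // The statement would be run by a database administrator, for example in the mysql client.
        System.out.println(createDatabaseSql("keycloak"));
    }
}
```
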
On the JDBC driver side, it is necessary to add the connection property `characterEncoding=UTF-8` to the JDBC
connection settings.

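One way to add the property is to append it to the JDBC URL, as in the following minimal sketch. The helper class, its
method name, and the host, port, and database name in the example URL are hypothetical; only the
`characterEncoding=UTF-8` property comes from the text above.

```java
final class MySqlUnicodeUrl {

    /** Appends characterEncoding=UTF-8 to a MySQL JDBC URL unless an encoding is already set. */
    static String withUtf8Encoding(String jdbcUrl) {
        if (jdbcUrl.contains("characterEncoding=")) {
            return jdbcUrl; // an encoding is already configured; leave the URL unchanged
        }
        // Use '?' for the first URL parameter and '&' for any further one.
        String separator = jdbcUrl.contains("?") ? "&" : "?";
        return jdbcUrl + separator + "characterEncoding=UTF-8";
    }

    public static void main(String[] args) {
        System.out.println(withUtf8Encoding("jdbc:mysql://localhost:3306/keycloak"));
        System.out.println(withUtf8Encoding("jdbc:mysql://localhost:3306/keycloak?useSSL=false"));
    }
}
```

The same property can usually be set in the datasource configuration instead of the URL; the sketch only shows the
resulting connection string.
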
==== PostgreSQL Database

Unicode is supported when the database character set is `UTF8`. In that case, Unicode characters can be used in any
field; there is no reduction of field length for non-special fields. No special settings of the JDBC driver are
necessary.