70 lines
4 KiB
Text
70 lines
4 KiB
Text
|
|
=== Unicode Considerations for Databases
|
|
|
|
Database schema in {project_name} only accounts for Unicode strings in the following special fields:
|
|
|
|
* Realms: display name, HTML display name, localization texts (keys and values)
|
|
* Federation Providers: display name
|
|
* Users: username, given name, last name, attribute names and values
|
|
* Groups: name, attribute names and values
|
|
* Roles: name
|
|
* Descriptions of objects
|
|
|
|
Otherwise, characters are limited to those contained in database encoding which is often 8-bit. However, for some
|
|
database systems, it is possible to enable UTF-8 encoding of Unicode characters and use full Unicode character set in all
|
|
text fields. Often, this is counterbalanced by shorter maximum length of the strings than in case of 8-bit encodings.
|
|
|
|
Some of the databases require special settings to database and/or JDBC driver to be able to handle Unicode characters.
|
|
Please find the settings for your database below. Note that if a database is listed here, it can still work properly
|
|
provided it handles UTF-8 encoding properly both on the level of database and JDBC driver.
|
|
|
|
Technically, the key criterion for Unicode support for all fields is whether the database allows setting of Unicode
|
|
character set for `VARCHAR` and `CHAR` fields. If yes, there is a high chance that Unicode will be plausible, usually at
|
|
the expense of field length. If it only supports Unicode in `NVARCHAR` and `NCHAR` fields, Unicode support for all text
|
|
fields is unlikely as Keycloak schema uses `VARCHAR` and `CHAR` fields extensively.
|
|
|
|
==== Oracle Database
|
|
|
|
Unicode characters are properly handled provided the database was created with Unicode support in `VARCHAR` and `CHAR`
|
|
fields (e.g. by using `AL32UTF8` character set as the database character set). No special settings is needed for JDBC
|
|
driver.
|
|
|
|
If the database character set is not Unicode, then to use Unicode characters in the special fields, the JDBC driver needs
|
|
to be configured with the connection property `oracle.jdbc.defaultNChar` set to `true`. It might be wise, though not
|
|
strictly necessary, to also set the `oracle.jdbc.convertNcharLiterals` connection property to `true`. These properties
|
|
can be set either as system properties or as connection properties. Please note that setting `oracle.jdbc.defaultNChar`
|
|
may have negative impact on performance. For details, please refer to Oracle JDBC driver configuration documentation.
|
|
|
|
==== Microsoft SQL Server Database
|
|
|
|
Unicode characters are properly handled only for the special fields. No special settings of JDBC driver or database is
|
|
necessary.
|
|
|
|
==== MySQL Database
|
|
|
|
Unicode characters are properly handled provided the database was created with Unicode support in `VARCHAR` and `CHAR`
|
|
fields in the `CREATE DATABASE` command (e.g. by using `utf8` character set as the default database character set in
|
|
MySQL 5.5. Please note that `utf8mb4` character set does not work due to different storage requirements to `utf8`
|
|
character set footnote:[Tracked as https://issues.redhat.com/browse/KEYCLOAK-3873]). Note that in this case, length
|
|
restriction to non-special fields does not apply because columns are created to accommodate given amount of characters,
|
|
not bytes. If the database default character set does not allow storing Unicode, only the special fields allow storing
|
|
Unicode values.
|
|
|
|
At the side of JDBC driver settings, it is necessary to add a connection property `characterEncoding=UTF-8` to the JDBC
|
|
connection settings.
|
|
|
|
==== PostgreSQL Database
|
|
|
|
Unicode is supported when the database character set is `UTF8`. In that case, Unicode characters can be used in any
|
|
field, there is no reduction of field length for non-special fields. No special settings of JDBC driver is necessary.
|
|
|
|
The character set of a PostgreSQL database is determined at the time it is created. You can determine the default
|
|
character set for a PostgreSQL cluster with the SQL command
|
|
```SQL
|
|
show server_encoding;
|
|
```
|
|
|
|
If the default character set is not UTF 8, then you can create the database with UTF8 as its character set like this:
|
|
```SQL
|
|
create database keycloak with encoding 'UTF8';
|
|
```
|