Ensure that you use UTF-8 encoding wherever it is applicable.
For example, in your site/config.php check the following settings:
- $config->pageNameCharset = 'UTF8';
- $config->pageNameWhitelist = '... set it properly ...'; // the default is not good enough for some charsets
- $config->dbCharset = 'utf8';
- setlocale(LC_ALL, 'en_US.UTF-8');
Also adjust the .htaccess file if needed. See this post.
You can check the results by issuing these commands (using e.g. Tracy Debugger's console):
- $str = 'éáőúóüö'; // put your custom chars here
- d($str);
- d($sanitizer->pageNameUTF8($str, true));
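If the dumped result looks garbled, it can help to first rule out that the test string itself is not valid UTF-8 (for instance because the file or console input was saved in another encoding). A minimal sketch in plain PHP; the helper name isValidUtf8 is hypothetical and not part of ProcessWire:

```php
<?php
// Hypothetical helper (not a ProcessWire API): returns true when $str is valid UTF-8.
// The /u modifier makes PCRE validate the subject string; on invalid UTF-8
// preg_match() returns false instead of 1.
function isValidUtf8(string $str): bool
{
    return preg_match('//u', $str) === 1;
}

$str = 'éáőúóüö'; // put your custom chars here
var_dump(isValidUtf8($str));   // true only if this source was saved as UTF-8
var_dump(isValidUtf8("\xE9")); // a lone Latin-1 "é" byte is not valid UTF-8
```

If the first check fails, the problem is likely in how the string was entered or saved, not in the sanitizer settings.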
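In addition to choosing UTF-8 encoding, you also have to set a proper database collation.
The default is usually utf8_general_ci, which is accent-insensitive: it compares accented characters as if they were their unaccented base letters (e.g. 'é' equals 'e'), which is usually not what you want for such text.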
Check the collation:
- SHOW VARIABLES LIKE 'collation%';
- SELECT TABLE_NAME, COLUMN_NAME, COLLATION_NAME FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_SCHEMA='your-pw-db-name';
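The second query can return a long list, so a small filter may make the outliers easier to spot. The sketch below assumes the rows have already been fetched as associative arrays with the three selected column names as keys; the function name findMismatchedColumns, the sample rows, and the placeholder your_collation_name are all hypothetical:

```php
<?php
// Hypothetical helper: lists the text columns whose collation differs from $wanted.
// $rows are associative arrays keyed by TABLE_NAME, COLUMN_NAME and COLLATION_NAME.
function findMismatchedColumns(array $rows, string $wanted): array
{
    $mismatched = [];
    foreach ($rows as $row) {
        $collation = $row['COLLATION_NAME'] ?? null;
        // Non-text columns (INT, DATETIME, ...) have a NULL collation; skip them.
        if ($collation === null || $collation === $wanted) {
            continue;
        }
        $mismatched[] = $row['TABLE_NAME'] . '.' . $row['COLUMN_NAME'] . ' (' . $collation . ')';
    }
    return $mismatched;
}

// Sample rows for illustration only; real rows would come from the query above.
$rows = [
    ['TABLE_NAME' => 'field_title', 'COLUMN_NAME' => 'data', 'COLLATION_NAME' => 'utf8_general_ci'],
    ['TABLE_NAME' => 'field_title', 'COLUMN_NAME' => 'pages_id', 'COLLATION_NAME' => null],
];
print_r(findMismatchedColumns($rows, 'your_collation_name'));
```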
Choose and set the proper collation (e.g. uca1400_....as_ci for accent-sensitive and case-insensitive data handling) for the field tables and their columns. (phpMyAdmin might be a good choice for this task.)
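If you prefer SQL over a GUI, the statements can be generated rather than typed by hand. This is only a sketch under assumptions: the function name buildCollationStatements is hypothetical, your_collation_name is a placeholder for whatever collation you chose, and the chosen collation must belong to the character set you pass in. Note that ALTER TABLE ... CONVERT TO changes every text column of the table, so review the output and back up the database before running anything.

```php
<?php
// Hypothetical helper: builds one ALTER TABLE statement per table name.
// It only produces SQL text; nothing is executed here.
function buildCollationStatements(array $tables, string $charset, string $collation): array
{
    // Allow only plain identifiers so nothing unexpected ends up in the SQL.
    foreach (array_merge($tables, [$charset, $collation]) as $name) {
        if (!preg_match('/^[A-Za-z0-9_]+$/', $name)) {
            throw new InvalidArgumentException("Unexpected identifier: $name");
        }
    }

    $statements = [];
    foreach ($tables as $table) {
        $statements[] = sprintf(
            'ALTER TABLE `%s` CONVERT TO CHARACTER SET %s COLLATE %s;',
            $table,
            $charset,
            $collation
        );
    }
    return $statements;
}

// Example: print the statements for two field tables (names are illustrative).
foreach (buildCollationStatements(['field_title', 'field_body'], 'utf8', 'your_collation_name') as $sql) {
    echo $sql, PHP_EOL;
}
```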