CaseFolding

Name

CaseFolding -- choose alternative case mapping

indexer.conf search.htm

Synopsis

CaseFolding {default | turkish | turkish2}

Description

When storing word information in the database, indexer converts the words to lower case. Some languages have special rules for case mapping. With CaseFolding set to turkish, indexer applies special rules when converting to lower case: U+0049 LATIN CAPITAL LETTER I is mapped to U+0131 LATIN SMALL LETTER DOTLESS I, and U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE is mapped to U+0069 LATIN SMALL LETTER I, which is suitable for the Turkish and Azerbaijani languages.

With CaseFolding set to turkish2, the letters U+0049 LATIN CAPITAL LETTER I, U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE and U+0131 LATIN SMALL LETTER DOTLESS I are all mapped to U+0069 LATIN SMALL LETTER I, which is suitable for indexing Turkish and English sites at the same time.

With CaseFolding set to default, indexer applies "traditional" lower case mapping rules, i.e. U+0049 LATIN CAPITAL LETTER I is mapped to U+0069 LATIN SMALL LETTER I, while U+0130 LATIN CAPITAL LETTER I WITH DOT ABOVE doesn't change.
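
The three mappings described above can be compared side by side with a small Python sketch. This is an illustration only, not mnoGoSearch code: the names SPECIAL and fold are hypothetical, and characters other than the three forms of the letter I are assumed to go through ordinary lower-casing.

```python
# Illustrative sketch of the three CaseFolding modes (not mnoGoSearch code).
# SPECIAL and fold() are hypothetical names introduced for this example.
SPECIAL = {
    # default: U+0130 is left unchanged; U+0049 falls through to lower().
    "default": {"\u0130": "\u0130"},
    # turkish: I -> dotless i, I WITH DOT ABOVE -> i.
    "turkish": {"\u0049": "\u0131", "\u0130": "\u0069"},
    # turkish2: I, I WITH DOT ABOVE and dotless i all -> i.
    "turkish2": {"\u0049": "\u0069", "\u0130": "\u0069", "\u0131": "\u0069"},
}


def fold(word, mode="default"):
    """Lower-case a word, applying the special rules of the given mode."""
    table = SPECIAL[mode]
    # Assumption: every other character uses ordinary lower-casing.
    return "".join(table[ch] if ch in table else ch.lower() for ch in word)


for mode in ("default", "turkish", "turkish2"):
    print(mode, fold("I\u015eIK", mode), fold("\u0130STANBUL", mode))
```

In this sketch the same input word can produce a different lower-case form under each mode, so a word folded with one setting does not necessarily equal the same word folded with another.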

Note: indexer.conf and search.htm should set CaseFolding to the same value.

Examples


CaseFolding turkish