1

I'm building a trigram search using PostreSQL extension pg_trgm. I want the search to be case insensitive and to ignore accents. I've tried to achieve that by using this collation:

CREATE COLLATION "und-for-search" (provider = icu, locale = 'und-u-kn-true-ks-level1', deterministic = false);

When I test this collation on a simple comparison, it seems to work ok:

SELECT 'z' = 'ž' COLLATE "und-for-search" -- return true ✅

But in trigram search 'z', and 'ž' are not treated as equal:

SELECT similarity('z' COLLATE "und-for-search", 'ž' COLLATE "und-for-search") -- returns 0 ❌

If we check the trigrams being generated via SELECT show_trgm('ž' COLLATE "und-for-search"), which returns:

| show_trgm |
| - |
| {0xe0bfa7,0xfd71f4} |

Can somebody explain why the collation is not working correctly with trigrams? Is there any way to run trigram search that ignores accents without additional functions like unaccent?

1
  • Yo; I guess you'll have to use unaccent(). Commented Apr 2 at 7:32

0

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.