[$] Working with UTF-8 in the kernel

corbet

from LWN.net on 2019-03-28 17:34 (#4C11S)

In the real world, text is expressed in many languages using a wide variety of character sets; those character sets can be encoded in a lot of different ways. In the kernel, life has always been simpler; file names and other string data are just opaque streams of bytes. In the few cases where the kernel must interpret text, nothing more than ASCII is required. The proposed addition of case-insensitive file-name lookups to the ext4 filesystem changes things, though; now some kernel code must deal with the full complexity of Unicode. A look at the API being provided to handle encodings illustrates nicely just how complicated this task is.
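
To make the contrast between opaque bytes and interpreted text concrete, here is a minimal user-space sketch in Python. It is not the kernel API discussed in the article; the function names are hypothetical, and the choice of NFD normalization followed by case folding is an assumption made purely for illustration.

```python
import unicodedata


def names_match_bytes(a: str, b: str) -> bool:
    """Compare two names as opaque UTF-8 byte streams (no interpretation)."""
    return a.encode("utf-8") == b.encode("utf-8")


def fold_name(name: str) -> str:
    """Hypothetical helper: normalize to NFD, case-fold, then normalize again.

    The second normalization pass is there because case folding can, in
    general, produce text that is no longer in normalized form.
    """
    decomposed = unicodedata.normalize("NFD", name)
    return unicodedata.normalize("NFD", decomposed.casefold())


def names_match_folded(a: str, b: str) -> bool:
    """Compare two names after normalization and case folding."""
    return fold_name(a) == fold_name(b)


# The same visible name, once with precomposed "é" (U+00E9) and once with
# "e" followed by a combining acute accent (U+0301), in different case.
composed = "R\u00e9sum\u00e9.txt"
decomposed = "re\u0301sume\u0301.TXT"

print(names_match_bytes(composed, decomposed))
print(names_match_folded(composed, decomposed))
```

The byte-wise comparison treats the two names as different, while the folded comparison treats them as equal. Getting from the first behavior to the second requires normalization and case-mapping tables, which is the kind of machinery that ASCII-only code never needed.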

Source	RSS or Atom Feed
Feed Location	http://lwn.net/headlines/rss
Feed Title	LWN.net
Feed Link	https://lwn.net/