Value | Meaning |
---|---|
UTF8PROC_NULLTERM = 1 << 0 | The given UTF-8 input is NULL terminated. |
UTF8PROC_STABLE = 1 << 1 | Unicode Versioning Stability has to be respected. |
UTF8PROC_COMPAT = 1 << 2 | Compatibility decomposition (i.e. formatting information is lost). |
UTF8PROC_COMPOSE = 1 << 3 | Return a result with composed characters. |
UTF8PROC_DECOMPOSE = 1 << 4 | Return a result with decomposed characters. |
UTF8PROC_IGNORE = 1 << 5 | Strip "default ignorable characters" such as SOFT-HYPHEN or ZERO-WIDTH-SPACE. |
UTF8PROC_REJECTNA = 1 << 6 | Return an error if the input contains unassigned codepoints. |
UTF8PROC_NLF2LS = 1 << 7 | Indicates that NLF sequences (LF, CRLF, CR, NEL) represent a line break and should be converted to the codepoint for line separation (LS). |
UTF8PROC_NLF2PS = 1 << 8 | Indicates that NLF sequences represent a paragraph break and should be converted to the codepoint for paragraph separation (PS). |
UTF8PROC_NLF2LF = UTF8PROC_NLF2LS \| UTF8PROC_NLF2PS | Indicates that the meaning of NLF sequences is unknown. |
UTF8PROC_STRIPCC = 1 << 9 | Strips and/or converts control characters. NLF sequences are transformed into a space, unless one of the NLF2LS/PS/LF options is given. HorizontalTab (HT) and FormFeed (FF) are treated as NLF sequences in this case. All other control characters are simply removed. |
UTF8PROC_CASEFOLD = 1 << 10 | Performs Unicode case folding, which makes case-insensitive string comparison possible. |
UTF8PROC_CHARBOUND = 1 << 11 | Inserts 0xFF bytes at the beginning of each sequence that represents a single grapheme cluster (see UAX#29). |
UTF8PROC_LUMP = 1 << 12 | Lumps certain characters together, e.g. HYPHEN U+2010 and MINUS U+2212 are mapped to ASCII "-". See lump.md for details. If NLF2LF is set, this also converts paragraph and line separators to ASCII line feed (LF). |
UTF8PROC_STRIPMARK = 1 << 13 | Strips all character markings. This includes non-spacing, spacing, and enclosing marks (i.e. accents). Note: this option works only with UTF8PROC_COMPOSE or UTF8PROC_DECOMPOSE. |
UTF8PROC_STRIPNA = 1 << 14 | Strip unassigned codepoints. |
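
Because each value in the table is a distinct bit, options are combined with bitwise OR and tested with bitwise AND. The following is a minimal sketch of that idea, not code from the library: the enum is a local mirror of a few table rows so that the snippet compiles on its own (real code would take the constants from the library's header instead), and the helpers `has_option` and `stripmark_ok` are hypothetical names introduced only for illustration.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdbool.h>

/* Local mirror of some rows of the table above (assumption: copied here
 * only so the sketch is self-contained; do not combine with the real header). */
enum {
    UTF8PROC_NULLTERM  = 1 << 0,
    UTF8PROC_STABLE    = 1 << 1,
    UTF8PROC_COMPOSE   = 1 << 3,
    UTF8PROC_DECOMPOSE = 1 << 4,
    UTF8PROC_NLF2LS    = 1 << 7,
    UTF8PROC_NLF2PS    = 1 << 8,
    UTF8PROC_NLF2LF    = UTF8PROC_NLF2LS | UTF8PROC_NLF2PS,
    UTF8PROC_STRIPMARK = 1 << 13
};

/* NLF2LF is not a bit of its own: it is exactly the two NLF bits together. */
static_assert(UTF8PROC_NLF2LF == ((1 << 7) | (1 << 8)),
              "NLF2LF must equal NLF2LS | NLF2PS");

/* Hypothetical helper: true only if every bit of `flag` is set in `options`.
 * Comparing against `flag` (rather than against 0) matters for multi-bit
 * values such as UTF8PROC_NLF2LF. */
static bool has_option(int options, int flag)
{
    return (options & flag) == flag;
}

/* Hypothetical helper encoding the STRIPMARK note from the table:
 * if STRIPMARK is requested, COMPOSE or DECOMPOSE must be requested too. */
static bool stripmark_ok(int options)
{
    if (!has_option(options, UTF8PROC_STRIPMARK))
        return true;  /* STRIPMARK not requested, nothing to check */
    return (options & (UTF8PROC_COMPOSE | UTF8PROC_DECOMPOSE)) != 0;
}
```

With these definitions, `stripmark_ok(UTF8PROC_STRIPMARK | UTF8PROC_COMPOSE)` holds, while `stripmark_ok(UTF8PROC_STRIPMARK | UTF8PROC_STABLE)` does not. Likewise, `has_option(UTF8PROC_NLF2LS, UTF8PROC_NLF2LF)` is false, since only one of the two NLF bits is set.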