みーのぺーじ

みーが趣味でやっているPCやソフトウェアについて.Unity, Python, Processingなどのプログラミングや,脱獄, hackintoshなど

Unicode normalizeのメモ

Unicode 正規化形式を示す "NFC","NFD","NFKC","NFKD" の違いをよく忘れるので,メモします.

import unicodedata
>>> unicodedata.normalize("NFD",",")
','
>>> unicodedata.normalize("NFC",",")
','
>>> unicodedata.normalize("NFKD",",")
','
>>> unicodedata.normalize("NFKC",",")
','
>>> unicodedata.normalize("NFD","ダ")
'ダ'
>>> unicodedata.normalize("NFC","ダ")
'ダ'
>>> unicodedata.normalize("NFKD","ダ")
'ダ'
>>> unicodedata.normalize("NFKC","ダ")
'ダ'

DはDecomposition(分解),CはComposition(合成),KはCompatibility(互換性)のこと.