This parameter specifies how to normalize Chinese, Japanese, and Korean (CJK) data before extraction. It applies to all Eduction components.
You can specify the value of CJKNormalization as follows:
Kana. Half-width kana to full-width kana.
OldNew. Old kanji to new kanji.
Number. Chinese or kanji number characters to ASCII number characters.
HWNum. Full-width number characters to ASCII number characters.
HWAlpha. Full-width alphabet characters to ASCII alphabet characters.
SimpChi. Traditional Chinese to simplified Chinese.
FWJamo. Half-width jamo to full-width jamo.
Separate multiple options with a comma.
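To make the effect of a few of these options concrete, the following Python sketch approximates Kana, HWNum, and HWAlpha using Unicode NFKC normalization restricted to the relevant character ranges. It is an illustration only, not Eduction's implementation: the function name normalize_cjk and the range table are hypothetical, and the remaining options (OldNew, Number, SimpChi, FWJamo) need mapping tables that this sketch does not include, so it leaves them as no-ops.

```python
import re
import unicodedata

# Hypothetical mapping from option name to the character ranges it affects.
# These ranges come from the Unicode "Halfwidth and Fullwidth Forms" block.
_RANGES = {
    "Kana": r"[\uFF65-\uFF9F]+",                 # half-width katakana and marks
    "HWNum": r"[\uFF10-\uFF19]+",                # full-width digits
    "HWAlpha": r"[\uFF21-\uFF3A\uFF41-\uFF5A]+",  # full-width Latin letters
}


def normalize_cjk(text, setting):
    """Apply the comma-separated options in `setting` to `text`.

    Options that this sketch does not cover are skipped silently.
    """
    for option in (part.strip() for part in setting.split(",")):
        pattern = _RANGES.get(option)
        if pattern is None:
            continue  # not covered by this sketch
        # Normalize whole runs so that a half-width kana followed by a
        # half-width voiced sound mark composes into one full-width character.
        text = re.sub(
            pattern,
            lambda match: unicodedata.normalize("NFKC", match.group()),
            text,
        )
    return text


if __name__ == "__main__":
    print(normalize_cjk("ｶﾞｲﾄﾞ １２３", "Kana,HWNum"))
```

Restricting NFKC to specific ranges keeps each option independent, so enabling HWNum alone leaves full-width letters and half-width kana untouched.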
Type: | String |
Default: | None |
Required: | No |
Configuration Section: | Any section that you have defined for Eduction settings. |
Example: | CJKNormalization=SimpChi,Kana |