Notes for using Chinese LIWC
1. Save text files in UTF-8 format
Currently, the text files must be saved in UTF-8 format.
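If a corpus was saved in another encoding, it can be re-encoded before analysis. The following is a minimal sketch, not part of the LIWC distribution: the function name convert_to_utf8 and the file names are hypothetical, and you need to know (or guess) the source encoding yourself, for example "big5" for older traditional Chinese files or "gb18030" for simplified Chinese ones.

```python
from pathlib import Path


def convert_to_utf8(src: str, dst: str, source_encoding: str) -> None:
    """Read src using source_encoding and write the same text to dst as UTF-8.

    source_encoding is an assumption supplied by the caller; a wrong value
    raises UnicodeDecodeError or silently produces garbled text.
    """
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    Path("sample_big5.txt").write_text("中文", encoding="big5")
    convert_to_utf8("sample_big5.txt", "sample_utf8.txt", "big5")
    print(Path("sample_utf8.txt").read_text(encoding="utf-8"))
```

Writing to a separate output file keeps the original intact in case the source encoding turns out to be wrong.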
2. Segment the Chinese words with spaces first
Chinese sentences must be segmented into words separated by spaces before LIWC can process them. For traditional Chinese, the segmentation can be done with CKIPS. For simplified Chinese, we recommend the Stanford Word Segmenter.
3. Convert full-width punctuation to half-width
The segmented files must then be processed to convert punctuation from full-width to half-width form so that the LIWC program can recognize it. This step is critical; otherwise the total word count and the percentages will be calculated incorrectly. The related utilities can be downloaded from here.
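For readers who prefer to script this step, the conversion can be sketched in Python as below. This is an illustration under stated assumptions, not the utility referred to above: the names FULL_TO_HALF and to_half_width are hypothetical. It relies on the fact that the Unicode full-width forms U+FF01 to U+FF5E sit exactly 0xFEE0 above their ASCII counterparts. Mapping the CJK marks 。 and 、 to "." and "," is an additional assumption, and note that full-width letters and digits are converted as well.

```python
# Full-width ASCII variants (U+FF01..U+FF5E) are offset by 0xFEE0 from
# their half-width counterparts (U+0021..U+007E).
FULL_TO_HALF = {code: code - 0xFEE0 for code in range(0xFF01, 0xFF5F)}
FULL_TO_HALF[0x3000] = 0x20      # ideographic space -> ASCII space
# Assumption: replace CJK-only marks with the nearest ASCII punctuation.
FULL_TO_HALF[0x3002] = ord(".")  # ideographic full stop
FULL_TO_HALF[0x3001] = ord(",")  # ideographic comma


def to_half_width(text: str) -> str:
    """Return text with full-width punctuation replaced by half-width forms."""
    return text.translate(FULL_TO_HALF)


if __name__ == "__main__":
    segmented = "我 喜歡 你 ,真的 !"
    print(to_half_width(segmented))
```

Characters that are not in the table, including the Chinese words themselves, pass through unchanged.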