textstat: a text readability package
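textstat is a Python package for computing text readability. At the article, paragraph, or sentence level, it can compute:
- syllable counts (syllable_count)
- word counts (lexicon_count)
- sentence counts (sentence_count)
- a range of readability scores
Supported languages are currently English (en), German (de), Spanish (es), French (fr), Italian (it), Dutch (nl), Polish (pl), and Russian (ru). Chinese is not supported at present.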
The available readability measures are:
- The Flesch Reading Ease formula
- Flesch-Kincaid Grade Level
- The Fog Scale (Gunning FOG Formula)
- The SMOG Index
- Automated Readability Index
- The Coleman-Liau Index
- Linsear Write Formula
- Dale-Chall Readability Score
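For orientation, the first two measures in the list can be written out directly from their commonly published coefficients. The sketch below takes pre-computed counts as input; the helper names `flesch_reading_ease_from_counts` and `flesch_kincaid_grade_from_counts` are hypothetical and are not part of textstat, whose own results may differ slightly depending on how it counts syllables and sentences.

```python
def flesch_reading_ease_from_counts(words, sentences, syllables):
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade_from_counts(words, sentences, syllables):
    """Flesch-Kincaid Grade Level: approximate US school grade."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts for illustration: 100 words, 5 sentences, 150 syllables.
print(flesch_reading_ease_from_counts(100, 5, 150))
print(flesch_kincaid_grade_from_counts(100, 5, 150))
```

Both formulas depend only on average sentence length (words per sentence) and average word length (syllables per word), which is why the counting functions covered next matter.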
Installation
!pip3 install textstat
Syllable count
textstat.syllable_count(text)
import textstat
test = 'Playing games'
textstat.syllable_count(test)
Run
3
Word count
textstat.lexicon_count(text, removepunct=True)
test2 = "Playing games has always!"
textstat.lexicon_count(test2, removepunct=True)
Run
4
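To show what the removepunct flag changes, here is a simplified pure-Python approximation of a word count. It is only a sketch: `word_count_sketch` is a hypothetical helper, and it assumes that words are whitespace-separated tokens and that punctuation means ASCII punctuation. textstat's real implementation may treat some characters, such as apostrophes, differently.

```python
import string


def word_count_sketch(text, removepunct=True):
    """Count whitespace-separated tokens, optionally dropping punctuation first."""
    if removepunct:
        # Delete every ASCII punctuation character before splitting.
        text = text.translate(str.maketrans("", "", string.punctuation))
    return len(text.split())


print(word_count_sketch("Playing games has always!"))         # 4
print(word_count_sketch("Hello - world", removepunct=True))   # 2
print(word_count_sketch("Hello - world", removepunct=False))  # 3
```

The flag only changes the result when a punctuation mark stands alone as its own token, as the dash does in the second string.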
Readability
Each function takes text as input and returns a readability value.
- textstat.flesch_reading_ease(text)
- textstat.smog_index(text)
- textstat.flesch_kincaid_grade(text)
- textstat.coleman_liau_index(text)
- textstat.automated_readability_index(text)
- textstat.dale_chall_readability_score(text)
- textstat.difficult_words(text)
- textstat.linsear_write_formula(text)
- textstat.gunning_fog(text)
- textstat.text_standard(text)
For how each algorithm is computed and how to interpret its score, see the project's GitHub page:
https://github.com/shivam5992/textstat
test_data = "Playing games has always been thought to be important to the development of well-balanced \
and creative children; however, what part, if any, they should play in the lives of \
adults has never been researched that deeply. I believe that playing games is every bit \
as important for adults as for children. Not only is taking time out to play games with our \
children and other adults valuable to building interpersonal relationships but is also a wonderful way \
to release built up tension."
print(textstat.flesch_reading_ease(test_data))
print(textstat.smog_index(test_data))
print(textstat.flesch_kincaid_grade(test_data))
print(textstat.coleman_liau_index(test_data))
print(textstat.automated_readability_index(test_data))
print(textstat.dale_chall_readability_score(test_data))
print(textstat.difficult_words(test_data))
print(textstat.linsear_write_formula(test_data))
print(textstat.gunning_fog(test_data))
print(textstat.text_standard(test_data))
Run
52.23
12.5
12.8
11.03
15.5
6.72
9
16.333333333333332
12.38
12th and 13th grade
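As a rough guide to reading the first number, a Flesch Reading Ease score can be mapped onto the widely used difficulty bands. The helper below is a hypothetical illustration and not a textstat function; the band boundaries follow the conventional Flesch table.

```python
def describe_flesch_reading_ease(score):
    """Map a Flesch Reading Ease score to its conventional difficulty band."""
    bands = [
        (90, "Very Easy"),
        (80, "Easy"),
        (70, "Fairly Easy"),
        (60, "Standard"),
        (50, "Fairly Difficult"),
        (30, "Difficult"),
    ]
    # Bands are listed from highest to lowest, so the first match wins.
    for lower_bound, label in bands:
        if score >= lower_bound:
            return label
    return "Very Confusing"


print(describe_flesch_reading_ease(52.23))  # Fairly Difficult
```

The sample passage's 52.23 falls in the "Fairly Difficult" band, which fits the 12th and 13th grade estimate that text_standard reports.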