Inflections and the Dale-Chall-Formula #150

@LKirst

The textstat implementation of the Dale-Chall-Formula classifies several words as difficult that the original Dale-Chall-Formula would not. For example, Scotland, returned, giants, giant's, and strongest all appear in the output of textstat.difficult_words_list(text), even though the base forms return, giant, and strong are on the easy words list.

Dale and Chall (1948, pp. 38-49) suggest that the following word forms should be considered familiar:

  • names of persons and places
  • regular plurals and possessives of words on the list
  • third-person singular forms (s or ies from y), present-participle forms (ing), past-participle forms (n), and past-tense forms (ed or ied from y), when these are added to verbs appearing on the list
  • comparatives and superlatives of adjectives appearing on the list
  • adverbs formed by adding ly to a word on the list

The complete list of rules can be found in Dale & Chall (1948).
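
To illustrate the kind of check these rules imply, here is a minimal sketch that strips a few common suffixes and looks the result up in an easy-word set. It does not use textstat's internals: `candidate_base_forms`, `is_familiar`, and the tiny `easy` set are hypothetical names made up for this example. It only approximates some of the rules above, it over-generates stems, and it does not handle names of persons and places (Scotland would still be flagged).

```python
def candidate_base_forms(word):
    """Return possible base forms of `word` (hypothetical helper, rough heuristic)."""
    w = word.lower()
    candidates = [w]

    # Possessives: giant's -> giant, giants' -> giants
    if w.endswith("'s"):
        w = w[:-2]
    elif w.endswith("'"):
        w = w[:-1]
    candidates.append(w)

    # Forms built from a final y: carries -> carry, happiest -> happy
    for suffix in ("ies", "ied", "ier", "iest"):
        if w.endswith(suffix):
            candidates.append(w[: -len(suffix)] + "y")

    # Plain suffixes: plurals, verb forms, comparatives, superlatives, -ly adverbs
    for suffix in ("s", "es", "ed", "d", "ing", "n", "er", "r", "est", "st", "ly"):
        if w.endswith(suffix) and len(w) > len(suffix) + 1:
            candidates.append(w[: -len(suffix)])

    return candidates


def is_familiar(word, easy_words):
    """True if the word or one of its candidate base forms is in `easy_words`."""
    return any(c in easy_words for c in candidate_base_forms(word))


# Toy easy-word set for illustration only, not the real Dale-Chall list.
easy = {"return", "giant", "strong", "carry", "quick"}
words = ["Scotland", "returned", "giants", "giant's", "strongest"]
still_difficult = [w for w in words if not is_familiar(w, easy)]
print(still_difficult)  # ['Scotland']
```

A real implementation would need the full rule set from Dale & Chall (1948), including its exceptions, plus some way of recognising proper names.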

I understand that most of these rules are not easy to implement in the textstat package. But to avoid confusion, and perhaps to prompt users to check the list returned by textstat.difficult_words_list(text), could the README point out the deviation from the original Dale & Chall formula?

Source: Dale, E., & Chall, J. (1948). A Formula for Predicting Readability: Instructions. Educational Research Bulletin, 27(2), 37-54. Retrieved August 11, 2021, from http://www.jstor.org/stable/1473669
