If you use these data, please cite:
- the original source
Wichmann, Søren, Eric W. Holman, Cecil H. Brown, Matthew S. Dryer, and Qibin Ran (eds.). 2025. The ASJP Database (version 21).
- the derived dataset, using the DOI of the specific released version you used
This dataset is licensed under a CC-BY-4.0 license.
Available online at https://asjp.clld.org
Conceptlists in Concepticon:
The database of the Automated Similarity Judgment Program (ASJP) aims to contain 40-item word lists of all the world's languages.
- Varieties: 11,540 (linked to 6,126 different Glottocodes)
- Concepts: 100 (linked to 100 different Concepticon concept sets)
- Lexemes: 568,820
- Sources: 11,335
- Synonymy: 1.10
- Invalid lexemes: 0
- Tokens: 2,380,315
- Segments: 336 (6 BIPA errors, 6 CLTS sound class errors, 327 CLTS modified)
- Inventory size (avg): 23.79
Languages linked to bookkeeping languoids in Glottolog:
- BAYNUNK_GUJAXER bain1260
- BAYNUNK_GUJAXER_2 bain1260
- BAYNUNK_GUJAXER_3 bain1260
- BAYNUNK_GUJAXER_4 bain1260
- BAYNUNK_GUJAXER_5 bain1260
- BAYNUNK_GUJAXER_6 bain1260
- BUMANG buma1247
- GENGLE geng1243
- IHIEE ihie1238
- JEWISH_BERBER jude1262
- KATUKINA nucl1668
- KEMIE_YUNNAN_MANMI kemi1240
- MENBA_XIZANG_CUONA tawa1289
- NOCAMAN noca1240
- OLD_TURKIC oldt1247
- RUFIJI rufi1234
- SAMRE samr1245
- YI_YUNNAN_PUER_MOJIANG sout3128
Entries missing sources: 12015/568820 (2.11%)
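As a rough illustration of how a figure like the one above could be recomputed, the following sketch counts rows whose source cell is empty. It assumes the forms are stored in `cldf/forms.csv` with a `Source` column (the conventional CLDF names, not confirmed by this README); the function name `missing_source_share` is hypothetical.

```python
import csv
from pathlib import Path


def missing_source_share(rows, source_column="Source"):
    """Return (missing, total, percent) for rows lacking a source reference.

    `rows` is any iterable of dicts, e.g. a csv.DictReader.
    `source_column` is an assumed column name.
    """
    rows = list(rows)
    missing = sum(
        1 for row in rows if not (row.get(source_column) or "").strip()
    )
    percent = 100.0 * missing / len(rows) if rows else 0.0
    return missing, len(rows), percent


if __name__ == "__main__":
    forms = Path("cldf/forms.csv")  # assumed location of the form table
    if forms.exists():
        with forms.open(newline="", encoding="utf-8") as handle:
            missing, total, percent = missing_source_share(csv.DictReader(handle))
        print(f"Entries missing sources: {missing}/{total} ({percent:.2f}%)")
```

The reported percentage is simply the ratio of the two counts: 12015 / 568820 is about 2.11%.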
| Name | GitHub user | Description | Role |
|---|---|---|---|
| Søren Wichmann | | | Author, Distributor, DataCurator, Editor, DataCollector |
| André Müller | | | DataCollector |
| Ann-Katrin Wett | | | DataCollector |
| Viveka Velupillai | | | DataCollector |
| Julia Bischoffberger | | | DataCollector |
| Eric W. Holman | | | Author, Editor |
| Cecil H. Brown | | | DataCollector, Author, Editor |
| Sebastian Sauppe | | | DataCollector |
| Zarina Molochieva | | | DataCollector |
| Pamela Brown | | | DataCollector |
| Oleg Belyaev | | | DataCollector |
| Johann-Mattis List | @LinguList | | DataCollector |
| Dmitry Egorov | | | DataCollector |
| Matthias Urban | | | DataCollector |
| Robert Mailhammer | | | DataCollector |
| Agustina Carrizo | | | DataCollector |
| Matthew S. Dryer | | | DataCollector |
| Evgenia Korovina | | | DataCollector |
| David Beck | | | DataCollector |
| Helen Geyer | | | DataCollector |
| Patience Epps | | | DataCollector |
| Anthony Grant | | | DataCollector |
| Arjan Mossel | | | DataCollector |
| Darja Appelganz | | | DataCollector |
| Dickson Pagente | | | DataCollector |
| Danli Wu | | | DataCollector |
| Guillaume Segerer | | | DataCollector |
| Ke Xu | | | DataCollector |
| Mark Donohue | | | DataCollector |
| Matthias Pache | | | DataCollector |
| Pengfei Chen | | | DataCollector |
| Paul Sidwell | | | DataCollector |
| Qibin Ran | | | DataCollector |
| Tessa de Mol-van Valen | | | DataCollector |
| Yuzhu Liang | | | DataCollector |
| Yue Sun | | | DataCollector |
| Robert Forkel | @xrotwang | patron, code | Other |
| Tiago Tresoldi | @tresoldi | profile, language mapping refinement | Other |
The following CLDF datasets are available in the cldf directory:
- CLDF Wordlist at cldf/cldf-metadata.json
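A minimal sketch of how the metadata file could be inspected with the Python standard library alone. CLDF metadata is a CSVW-style JSON document with a `tables` array; the helper name `list_cldf_tables` is hypothetical, and the set of tables actually present in this dataset is not stated here.

```python
import json
from pathlib import Path


def list_cldf_tables(metadata_path):
    """Return (url, conformsTo) pairs for each table declared in the metadata."""
    metadata = json.loads(Path(metadata_path).read_text(encoding="utf-8"))
    return [
        (table.get("url"), table.get("dc:conformsTo"))
        for table in metadata.get("tables", [])
    ]


if __name__ == "__main__":
    path = Path("cldf/cldf-metadata.json")
    if path.exists():
        # Print each table file and the CLDF component it conforms to, if any.
        for url, component in list_cldf_tables(path):
            print(url, component)
```

For anything beyond a quick look, a dedicated CLDF reader such as the `pycldf` package is likely the more robust choice, since it also applies the column datatypes declared in the metadata.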