Wrong identification of text with mixed languages

I noticed that Lingua misidentifies a text when it contains a few foreign words.

Here is an example of a Korean text that includes the string "CA", which is not Korean (it is probably a person's initials):
 ( 웃음 ) CA : 실패하는군요 . 안타깝네요 .

This text is identified as "Romanian".
This is a bit strange, since there are three Korean tokens and only one token that is (probably) not Korean.
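
To make the token count concrete, here is a minimal sketch that labels each whitespace-separated token by the Unicode script of its first letter. It uses only the Python standard library and says nothing about how Lingua works internally; `script_of` and `token_scripts` are hypothetical helper names chosen for this illustration.

```python
import unicodedata
from collections import Counter

SAMPLE = "( 웃음 ) CA : 실패하는군요 . 안타깝네요 ."

def script_of(token):
    """Return a rough script label taken from the token's first letter."""
    for ch in token:
        if ch.isalpha():
            # The Unicode character name starts with the script, e.g. "HANGUL SYLLABLE ..."
            return unicodedata.name(ch, "UNKNOWN").split()[0]
    return "PUNCT"  # no letters at all: brackets, colons, full stops

def token_scripts(text):
    """Count whitespace-separated tokens per script label."""
    return Counter(script_of(tok) for tok in text.split())

counts = token_scripts(SAMPLE)
print(counts["HANGUL"], counts["LATIN"], counts["PUNCT"])  # 3 1 5
```

By this rough count, the Hangul tokens outnumber the single Latin token three to one, which is why the "Romanian" result is surprising.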

In fact, the punctuation marks should also count as "Korean".

Any idea why this wrong identification occurs?

Wrong identification of text with mixed languages #76
