[BUG] Windows-1252 encoding is not detected in Turkish text #407

`charset_normalizer` returns `None` for this file, while `chardetect` and `file -i` each report an encoding:

```
$ chardetect star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt
star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt: Windows-1252 with confidence 0.73

$ file -i star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt
star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt: application/x-subrip; charset=iso-8859-1

$ python -c "import charset_normalizer; print(charset_normalizer.from_path('star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt').best())"
None
```

Who is right? `chardetect` is. The expected encoding is Windows-1252.

Decoding as `iso-8859-1` turns byte `0x85` into the control character [<U+0085>](https://www.fileformat.info/info/unicode/char/85/index.htm), which `less` renders as an ugly `<U+0085>` (the code point; the UTF-8 bytes are `c2 85`). In Windows-1252 the same byte is the ellipsis `…`.

[unicode-explorer.com/c/0085](https://unicode-explorer.com/c/0085)

> U+0085: The "Next Line" (NEL) control character was used in the 1970s for controlling printers and displays (e.g. VT100). Moves to the first position of the next line.
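
The difference can be reproduced without the subtitle file. This is a minimal sketch: the byte string `raw` is an assumption, reconstructed from subtitle 505 in the diff, and is not copied from the actual file.

```python
# Assumed bytes for subtitle 505: 0xFE is the "þ", 0x85 is the trailing character.
raw = b"Adil davranmaktan bahsetmi\xfeken\x85"

as_latin1 = raw.decode("iso-8859-1")
as_cp1252 = raw.decode("cp1252")

# ISO-8859-1 maps 0x85 to the C1 control U+0085 (NEL).
print(f"iso-8859-1: U+{ord(as_latin1[-1]):04X}")
# Windows-1252 maps 0x85 to U+2026 (HORIZONTAL ELLIPSIS).
print(f"cp1252:     U+{ord(as_cp1252[-1]):04X}")

# Re-encoding the ISO-8859-1 result as UTF-8 gives the c2 85 byte pair.
print(as_latin1[-1].encode("utf-8").hex())
```

Both codecs agree on `0xFE`, so the only visible difference in this line is the final character.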

```diff
--- star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt.iso-8859-1
+++ star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt.Windows-1252
@@ -2242,7 +2242,7 @@
 
 505
 00:43:04,098 --> 00:43:05,428
-Adil davranmaktan bahsetmiþken<U+0085>
+Adil davranmaktan bahsetmiþken…
 
 506
 00:43:06,771 --> 00:43:09,777
```
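
As a possible workaround until detection is fixed, the ambiguity shown in the diff can be checked for directly. The sketch below is a hypothetical heuristic, not part of `charset_normalizer`: the function name `looks_like_cp1252` is made up for illustration, and it assumes the input is already known to be a single-byte Western encoding.

```python
def looks_like_cp1252(raw: bytes) -> bool:
    """Hypothetical heuristic: prefer cp1252 over iso-8859-1.

    Bytes 0x80-0x9F are C1 control characters in ISO-8859-1, which
    almost never occur in subtitles, but are printable characters
    such as the ellipsis in Windows-1252.
    """
    if not any(0x80 <= b <= 0x9F for b in raw):
        # No bytes in the disputed range: both codecs give the same text.
        return False
    try:
        # Python's cp1252 codec rejects the five bytes it leaves undefined.
        raw.decode("cp1252")
    except UnicodeDecodeError:
        return False
    return True


print(looks_like_cp1252(b"bahsetmi\xfeken\x85"))
print(looks_like_cp1252(b"bahsetmi\xfeken"))
```

This only distinguishes the two encodings named in this report; it says nothing about other code pages.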

## input file

`star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt`

- [005093393.star.trek.the.next.generation.s01.e02.the.naked.now.(1987).tur.1cd.(5093393).zip](https://github.com/Ousret/charset_normalizer/files/13799054/005093393.star.trek.the.next.generation.s01.e02.the.naked.now.1987.tur.1cd.5093393.zip)
- https://www.opensubtitles.org/en/subtitles/5093393

