charset_normalizer returns None
$ chardetect star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt
star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt: Windows-1252 with confidence 0.73
$ file -i star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt
star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt: application/x-subrip; charset=iso-8859-1
$ python -c "import charset_normalizer; print(charset_normalizer.from_path('star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt').best())"
None
Who is right? chardetect is right: the expected encoding is Windows-1252.
iso-8859-1 decodes byte 0x85 as U+0085, which shows up as an ugly <U+0085> when piped to less
(0085 as UTF-16 hex, or c285 as UTF-8 hex bytes). Windows-1252 decodes the same byte as … (U+2026).
unicode-explorer.com/c/0085
U+0085: The "Next Line" (NEL) control character was used in the 1970s for controlling printers and displays (e.g. VT100). Moves to the first position of the next line.
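To make the difference concrete, here is a minimal sketch using only Python's built-in codecs. The byte string is a hypothetical stand-in for the end of the affected subtitle line, not data read from the attached file.

```python
# Hypothetical sample: the tail of the subtitle line as raw bytes.
# 0xFE is the "þ" seen in the diff, 0x85 is the byte in dispute.
raw = b"bahsetmi\xfeken\x85"

as_latin1 = raw.decode("iso-8859-1")  # 0x85 -> U+0085 (NEL control character)
as_cp1252 = raw.decode("cp1252")      # 0x85 -> U+2026 (horizontal ellipsis)

# Code point of the last character under each decoding.
print(f"U+{ord(as_latin1[-1]):04X}")  # U+0085
print(f"U+{ord(as_cp1252[-1]):04X}")  # U+2026

# The same character re-encoded as UTF-8, as hex bytes.
print(as_latin1[-1].encode("utf-8").hex())  # c285
print(as_cp1252[-1].encode("utf-8").hex())  # e280a6
```

Every other byte in the sample decodes identically under both codecs, so the two encodings only disagree on the 0x80-0x9F range.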
--- star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt.iso-8859-1
+++ star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt.Windows-1252
@@ -2242,7 +2242,7 @@
505
00:43:04,098 --> 00:43:05,428
-Adil davranmaktan bahsetmiþken<U+0085>
+Adil davranmaktan bahsetmiþken…
506
00:43:06,771 --> 00:43:09,777
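One possible tie-breaker between the two candidates, sketched here as an assumption and not as something charset_normalizer or chardet is known to do: bytes in 0x80-0x9F are C1 control characters in iso-8859-1 but mostly printable punctuation in Windows-1252, so their presence in a text file hints at Windows-1252. The function name `prefer_cp1252` is hypothetical.

```python
def prefer_cp1252(raw: bytes) -> str:
    """Pick between 'cp1252' and 'iso-8859-1' for single-byte text.

    Heuristic (an assumption, not a library algorithm): if the data
    contains bytes in 0x80-0x9F and all of them are defined in cp1252,
    prefer cp1252; otherwise fall back to iso-8859-1.
    """
    has_c1_range = any(0x80 <= b <= 0x9F for b in raw)
    if not has_c1_range:
        # Both codecs decode the data identically, so either works.
        return "iso-8859-1"
    try:
        # Python's cp1252 codec rejects the five undefined bytes
        # (0x81, 0x8D, 0x8F, 0x90, 0x9D).
        raw.decode("cp1252")
    except UnicodeDecodeError:
        return "iso-8859-1"
    return "cp1252"


print(prefer_cp1252(b"bahsetmi\xfeken\x85"))  # cp1252
print(prefer_cp1252(b"bahsetmi\xfeken"))      # iso-8859-1
```

This only separates these two encodings; it says nothing about other single-byte code pages that may also decode the file without errors.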
input file
star_trek_tng_-_season_1_ep_03_-_the_naked_now.srt