|
@SunsetTechuila Could you please add the |
Should be implemented on the vscode side |
Could you clarify what you mean by this? My understanding is that the charset option indicates the encoding to use when reading/writing the file, which is important if VS Code can't guess it correctly. This is the problem I have: VS Code guesses UTF-8 as the encoding for old projects that use ISO-8859-1 and that I can't convert. So, I would create a .editorconfig with charset=latin1. I guess the conversion would make sense if the file is UTF-16 and the .editorconfig charset is set to UTF-8 or vice versa. The problem is that, if there is no BOM, there is no bulletproof way to determine the current file encoding (which is the whole point of the charset option). So, maybe the logic should be to use:
|
Opening with autodetection and then switching to the specified encoding
For this use case, the behavior I described would be problematic. Maybe we should add an option to disable encoding conversion. Or, should we just not introduce it at all? @xuhdev, what's your opinion? How do other plugins handle this? |
|
Hi, is there any progress on this PR? Would be very useful to have this feature. |
No, first I need to check how other editors and plugins handle encoding - do they simply read files in the target encoding, or detect the file's actual encoding and re-save it in the target one? Feel free to share info on this if you'd like to help speed up merging the PR |
|
As far as I know, Qt Creator, for example, displays a warning if the specified encoding differs from the one used in the files. Neovim, on the other hand, opens the file in the target encoding and saves it in the target encoding. IMO the safest way is to ASSUME that the file is in the target encoding, because detecting encodings is always a gamble. |
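To illustrate why detection is a gamble, here is a minimal sketch showing the same bytes decoded under two different assumptions. The helper names `decodeLatin1` and `decodeUtf8` are made up for this example and are not part of the extension or of the VS Code API.

```typescript
// Decode bytes as ISO-8859-1: every byte maps to the code point of the same value.
function decodeLatin1(bytes: Uint8Array): string {
  return Array.from(bytes, (b) => String.fromCharCode(b)).join("");
}

// Decode bytes as UTF-8; invalid sequences become U+FFFD instead of throwing.
function decodeUtf8(bytes: Uint8Array): string {
  return new TextDecoder("utf-8").decode(bytes);
}

// "café" stored as ISO-8859-1: é is the single byte 0xE9.
const latin1Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xe9]);
// "café" stored as UTF-8: é is the two bytes 0xC3 0xA9.
const utf8Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xc3, 0xa9]);

console.log(decodeLatin1(latin1Bytes)); // café
console.log(decodeUtf8(latin1Bytes)); // caf\uFFFD (replacement character)
console.log(decodeLatin1(utf8Bytes)); // cafÃ© (decodes without error, but is not the intended text)
```

The last line is the awkward case: any byte sequence is valid ISO-8859-1, so a wrong guess in that direction produces no error at all.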
|
@styx3r If a UTF BOM is present, there's no guesswork: it indicates the encoding the file uses. So I believe the only sensible thing to do is to open the file with the UTF encoding indicated by the BOM, if present, otherwise use what
@SunsetTechuila I think the main issue here is that the
Personally, I would be concerned about silently converting a file's encoding without the user's explicit consent. I did a code search on this project to see what other plugins do, and it looks like most of them don't support it at all. So there doesn't seem to be a de facto standard to follow. |
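A rough sketch of the "BOM first, configured charset otherwise" idea from this comment. The names `detectBom` and `chooseReadEncoding` are hypothetical, and the returned strings are simply the EditorConfig charset values; nothing here is the extension's actual code.

```typescript
type BomCharset = "utf-8-bom" | "utf-16be" | "utf-16le";

// Returns the charset implied by a leading byte order mark, or undefined if none is found.
// Caveat: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE and would be reported as utf-16le.
function detectBom(bytes: Uint8Array): BomCharset | undefined {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return "utf-8-bom";
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return "utf-16be";
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return "utf-16le";
  }
  return undefined;
}

// Prefer the BOM when present; otherwise fall back to the charset from .editorconfig (if any).
function chooseReadEncoding(bytes: Uint8Array, configured?: string): string | undefined {
  return detectBom(bytes) ?? configured;
}

// A file starting with EF BB BF is read as UTF-8 even if .editorconfig says latin1.
console.log(chooseReadEncoding(new Uint8Array([0xef, 0xbb, 0xbf, 0x41]), "latin1")); // utf-8-bom
// Without a BOM, the configured charset is used as-is.
console.log(chooseReadEncoding(new Uint8Array([0x41, 0xe9]), "latin1")); // latin1
```

Note that this only inspects the first bytes; it does not decide whether the file is text at all, so a binary file that happens to start with those bytes would still match.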
|
@marcoburato I agree with the BOM argument. BUT you could always end up opening a binary file which starts with those 3 bytes, and then you would tinker with the binary file. IMO it should always be the user's responsibility to ensure correct encoding in the used repository. This is IMO also the reason why there is no official definition of how to implement the
+ if the encoding is changed, it should be caught during PR review anyhow. |
Sorry, I don't understand this point... A binary file is by definition not a text file; there's no correct text encoding to use to read it. It makes no sense to open a binary file in a text editor. Of course, one can always open a binary file by mistake. In that case, we should avoid corrupting it by attempting an automatic text encoding conversion when the file is opened. This is a good example of why an automatic conversion could be problematic. In my opinion, we should never attempt to convert a file when it's just opened. If anything, it should be done when saving it. But again, I think it would do more harm than good. Anyway, it would be appropriate to have a test case for this scenario. So, it's good that you brought it up.
I agree, better do whatever possible to leave the files as they are and let the users deal with conversions.
Well, if the spec doesn't fully define how it works, what's the point of the spec? I could understand if this was some kind of edge case, like very old encodings not officially supported by .editorconfig, but the issues we're discussing feel pretty substantial. Perhaps this is why virtually no plugin actually supports the
I think there's not much point in implementing it in a certain way in VS Code without improving the spec. Otherwise, other plugins could eventually be implemented with incompatible behaviour. |
@marcoburato that's what I meant. One could open a binary file by accident and then change the encoding without any manual interaction.
As already mentioned, nvim supports it https://neovim.io/doc/user/plugins.html#editorconfig by default.
where it states
So as far as I understand it, nvim DOES NOT convert the file to UTF-8 while reading it. It only converts it when it's saved to disk. IMO VS Code could do it similarly. |
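A minimal sketch of the "leave the bytes alone on open, apply the charset only on save" approach. `encodeForSave` and the `Charset` type are invented for this illustration, only three of the EditorConfig charset values are covered, and this is neither how nvim nor how VS Code implements it.

```typescript
type Charset = "latin1" | "utf-8" | "utf-8-bom";

// Encodes the in-memory text with the configured charset at save time.
// Nothing is converted when the file is opened; this is the only place bytes are produced.
function encodeForSave(text: string, charset: Charset): Uint8Array {
  switch (charset) {
    case "utf-8":
      return new TextEncoder().encode(text);
    case "utf-8-bom": {
      const body = new TextEncoder().encode(text);
      const out = new Uint8Array(body.length + 3);
      out.set([0xef, 0xbb, 0xbf], 0); // UTF-8 byte order mark
      out.set(body, 3);
      return out;
    }
    case "latin1": {
      const out = new Uint8Array(text.length);
      for (let i = 0; i < text.length; i++) {
        const code = text.charCodeAt(i);
        // Refuse to write characters that ISO-8859-1 cannot represent
        // rather than silently replacing them.
        if (code > 0xff) {
          throw new RangeError(`U+${code.toString(16).toUpperCase()} is not representable in latin1`);
        }
        out[i] = code;
      }
      return out;
    }
  }
}

console.log(encodeForSave("caf\u00e9", "latin1")); // 4 bytes: 63 61 66 e9
console.log(encodeForSave("caf\u00e9", "utf-8")); // 5 bytes: 63 61 66 c3 a9
```

Throwing on unrepresentable characters is one possible design choice; it keeps a save from silently losing data, in line with the concern about converting without explicit user consent.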
resolves #35
Please fill in this template.
tsc w/o errors (same as npm run build).
npm run lint w/o errors.