Setting the encoding for a text file in the URL (e.g. #UTF-8 at the end) does not work
I would like a web page to tell the browser that the text file referenced by a link has a specific encoding, in my case UTF-8. Neither specifying charset="UTF-8" on the HTML A element nor appending #UTF-8 to the href value works in Firefox, which seems to always use its default encoding. Is there a way to override Firefox's default encoding by specifying information on the linking page? (Text files have no header section that could hold meta tags.)
All Replies (5)
For text files opened locally there is a pref to force UTF-8 encoding. For links on a web page Firefox relies on what the server sends. If the server doesn't send an encoding then you can override this via "View -> Text Encoding". If the server sends an encoding and it is the wrong one then you may not be able to override it. These days servers should send files as UTF-8 by default and no longer use the 8-bit Western or Windows encodings.
I have checked with Inspect Element and found that the server seems to send utf-8 correctly for the linking page, while for the plain text file I see only a plaintext.css and nothing about the character set.
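One way to see which charset, if any, a server declares for a file is to read the Content-Type response header directly. The following is only a minimal sketch in Python, not something taken from this thread: the helper names charset_from_content_type and fetch_declared_charset are made up for illustration, and the URL you pass in is your own.

```python
from email.message import Message
from typing import Optional
from urllib.request import Request, urlopen


def charset_from_content_type(header_value: str) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, lowercased, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    charset = msg.get_param("charset")  # None when no charset parameter is present
    return charset.lower() if isinstance(charset, str) else None


def fetch_declared_charset(url: str) -> Optional[str]:
    """Send a HEAD request and report the charset the server declares, if any.

    Hypothetical helper: it needs network access and a URL of your own.
    """
    with urlopen(Request(url, method="HEAD")) as response:
        return charset_from_content_type(response.headers.get("Content-Type", ""))
```

If fetch_declared_charset returns None for the text file, the server is sending no charset at all and the browser has to fall back on its own guess or default.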
The linking page is HTML 4.01 Transitional, so the charset attribute on the link conforms to the HTML specs, yet Firefox does not seem to interpret it.
When I open the link, Central European (ISO) is selected automatically in View -> Text Encoding. I have played around with the preference Fonts & Colors -> Advanced -> Text Encoding for Legacy Content, but irrespective of my choice Central European (ISO) is always set in View -> Text Encoding. I have not restarted Firefox, just pressed Ctrl-F5 for both the linking page and the linked plain text file.
I would expect Firefox to properly handle at least one of
- charset="utf-8" in the linking A element or
- the #UTF-8 suffix in the URI.
I have even tried adding the attribute type="text/plain; charset=UTF-8" to the link, but the behavior has not changed [View -> Text Encoding: Central European (ISO) selected automatically].
In the meantime I have achieved what I wanted by playing around with .htaccess files. For some reason the server did not respect AddCharset, only AddDefaultCharset. Hence I had to create separate sub directories for files with different encodings.
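The effect of that per-directory arrangement is that the server appends a charset parameter to the Content-Type header depending on which directory the file lives in. The sketch below imitates this with Python's built-in http.server rather than Apache, so it is a stand-in and not the actual .htaccess setup. The directory names utf8 and latin2, the port, and the names content_type_for and CharsetHandler are all hypothetical.

```python
from http.server import HTTPServer, SimpleHTTPRequestHandler

# Hypothetical layout: one sub directory per encoding.
CHARSET_BY_DIRECTORY = {
    "utf8": "utf-8",
    "latin2": "iso-8859-2",
}


def content_type_for(url_path: str, base_type: str = "text/plain") -> str:
    """Append the charset configured for the first path segment, if there is one."""
    first_segment = url_path.lstrip("/").split("/", 1)[0]
    charset = CHARSET_BY_DIRECTORY.get(first_segment)
    return f"{base_type}; charset={charset}" if charset else base_type


class CharsetHandler(SimpleHTTPRequestHandler):
    """Serve files from the current directory, declaring a charset for text/plain."""

    def guess_type(self, path):
        base_type = super().guess_type(path)
        if base_type == "text/plain":
            # self.path is the requested URL path, e.g. /utf8/notes.txt
            return content_type_for(self.path, base_type)
        return base_type


if __name__ == "__main__":
    # Assumed address and port, for local experiments only.
    HTTPServer(("127.0.0.1", 8000), CharsetHandler).serve_forever()
```

With a declared charset in the response header, the browser no longer has to guess, which matches what the thread says about Firefox relying on what the server sends.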
However, the charset attribute of the A element became obsolete only in HTML 5. I would expect browsers to recognize it for valid HTML 4.01 documents.