fredand (Researcher)
02 November 2006 - 14:28 · 11 comments and 1 solution

Why doesn't encoding=utf-8 work for Swedish chars?

Hello!

I thought that encoding=utf-8 in an XML file would accept all characters in Unicode.

(I guess the Swedish chars "ÅÄÖ" are available in Unicode, and I also guess UTF-8 is Unicode.)

However, if I create an XML file like:
<?xml version="1.0" encoding="utf-8"?>
<head>
    <type>ABC</type>
</head>

This works fine in Internet Explorer.

But if I create one like:
<?xml version="1.0" encoding="utf-8"?>
<head>
    <type>ABC ÅÄÖ</type>
</head>

It doesn't work at all.

Have I missed something fundamental here?

If I use ISO-8859-1 it works, so why doesn't UTF-8 work, if I'm right in saying that UTF-8 is Unicode?

Best regards
Fredrik
webcreator (Beginner)
02 November 2006 - 14:50 #1
You need to save the file as UTF-8 - it's not enough to just "mark" the file as UTF-8 using the 'encoding' attribute.
webcreator (Beginner)
02 November 2006 - 14:51 #2
... your editor probably uses ISO-8859-1 or ISO-8859-15. Check your settings; you might be able to set it to use UTF-8 instead.
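
If the file is written from Java rather than from an editor, the same rule applies: the bytes on disk have to be UTF-8, not just the declaration. A minimal sketch of that, where the class name SaveAsUtf8, the helper toUtf8Bytes and the temp-file location are made up for illustration:

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

final class SaveAsUtf8 {
    // Encodes the XML text as UTF-8 so the bytes match the encoding declaration.
    static byte[] toUtf8Bytes(String xml) {
        return xml.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Unicode escapes keep this source file pure ASCII; they stand for the Swedish letters.
        String xml = "<?xml version=\"1.0\" encoding=\"utf-8\"?>\n"
                + "<head>\n"
                + "    <type>ABC \u00C5\u00C4\u00D6</type>\n"
                + "</head>\n";
        Path target = Files.createTempFile("swedish", ".xml"); // hypothetical location
        Files.write(target, toUtf8Bytes(xml));
        // Each of the three Swedish letters takes two bytes in UTF-8,
        // so the file is three bytes longer than the character count.
        System.out.println(Files.size(target) - xml.length()); // prints 3
    }
}
```

The point of passing the charset explicitly is that the result does not depend on the platform default encoding.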
janegil (Beginner)
02 November 2006 - 20:01 #3
Cherry sauce (körsbärsås) in UTF-8 Swedish:
körsbärsås - KÖRSBÄRSÅS

Bad things may happen to this before it reaches your clipboard, but try pasting it into your utf-8 xml.

PS: In Norwegian, we use blueberry jam: blåbærsyltetøy (blÃ¥bærsyltetøy).
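
The "Ã¥" in the parenthesised word is what the two UTF-8 bytes of "å" look like when they are read as ISO-8859-1. A small Java sketch of that effect and of how to undo it, with illustrative names (MojibakeDemo, misread, repair) that are not from any library:

```java
import java.nio.charset.StandardCharsets;

final class MojibakeDemo {
    // Encodes text as UTF-8, then (wrongly) decodes those bytes as ISO-8859-1.
    static String misread(String text) {
        return new String(text.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses misread(): recover the original bytes, then decode them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The Norwegian word from the post, written with escapes to keep the source ASCII.
        String word = "bl\u00E5b\u00E6rsyltet\u00F8y";
        String garbled = misread(word);
        // One extra character appears for each of the three non-ASCII letters.
        System.out.println(garbled.length() - word.length()); // prints 3
        System.out.println(repair(garbled).equals(word));     // prints true
    }
}
```

The repair only works because ISO-8859-1 maps every byte value to a character; with other single-byte encodings some bytes may be lost on the way.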
fredand (Researcher)
03 November 2006 - 12:25 #4
Hello!

The: körsbärsås
worked fine and was presented as "körsbärsås".

And the: KÖRSBÄRSÅS
worked fine and was presented as "KÖRSBÄRSÅS".

Both also looked fine in view source.

Is this something I could use? And how do I use it?

I'm not sure that I understand this magic.

Best regards
Fredrik
fredand (Researcher)
03 November 2006 - 12:28 #5
I also tried saving the original file as UTF-8.

It works fine then as well.


My problem is that a Java app creates an XML string and sends it over JMS, and it gets corrupted at the receiving end.

Could I use your ideas in some way?

Best regards
Fredrik
janegil (Beginner)
03 November 2006 - 12:48 #6
Maybe you should let your Java code create character entities. Again, I don't know what you will see when I write this: k&#x00F6;rsb&#x00E4;rs&#x00E5;s

But I have a page at http://styrheim.com/tools/entify.html that might be of some help. Or maybe Java has some function to create character entities for your non-ASCII characters?
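
If no ready-made function is at hand, the conversion is short enough to write by hand. A minimal sketch, assuming the input is already valid XML text and only the non-ASCII characters need replacing; the names NumericReferences and toNumericReferences are hypothetical:

```java
final class NumericReferences {
    // Replaces every code point above ASCII with a hexadecimal numeric
    // character reference such as &#x00F6; and leaves ASCII untouched.
    static String toNumericReferences(String text) {
        StringBuilder out = new StringBuilder(text.length());
        // codePoints() walks whole code points, so characters outside the
        // Basic Multilingual Plane are not split into surrogate halves.
        text.codePoints().forEach(cp -> {
            if (cp < 0x80) {
                out.append((char) cp);
            } else {
                out.append(String.format("&#x%04X;", cp));
            }
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // The Swedish word from the post, written with escapes to keep the source ASCII.
        System.out.println(toNumericReferences("k\u00F6rsb\u00E4rs\u00E5s"));
        // prints k&#x00F6;rsb&#x00E4;rs&#x00E5;s
    }
}
```

The output is pure ASCII, so it should survive a transport that mishandles the character encoding. It does not escape &, < or >, which a real XML writer would also need to handle.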
janegil (Beginner)
03 November 2006 - 12:48 #7
(yes, it looks as I intended)
janegil (Beginner)
03 November 2006 - 12:54 #8
The 'right' way is to use UTF-8 encoded characters. But character entities are safer: they generally let Unicode characters survive encodings where they are missing, as well as software conversion errors. Do not use &ouml;, though; use the XML-compliant &#x00F6; instead.
janegil (Beginner)
03 November 2006 - 12:56 #9
The magic: if the software assumes that one character is one byte, UTF-8 gets screwed up. In UTF-8, ASCII characters are one byte each, while all other characters take two to four bytes.
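
The byte counts are easy to check from Java. A small sketch with made-up helper names (ByteCounts, utf8Length, latin1Length):

```java
import java.nio.charset.StandardCharsets;

final class ByteCounts {
    // Number of bytes the text occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Number of bytes the text occupies when encoded as ISO-8859-1.
    static int latin1Length(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));        // prints 1 (ASCII)
        System.out.println(utf8Length("\u00C5"));   // prints 2 (A with ring above)
        System.out.println(utf8Length("\u20AC"));   // prints 3 (euro sign)
        System.out.println(latin1Length("\u00C5")); // prints 1 (single-byte encoding)
    }
}
```

Any code that treats the byte count as the character count will be off by one for every Swedish letter in a UTF-8 string.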
webcreator (Beginner)
03 November 2006 - 15:24 #10
I'm pretty sure UTF-8 is multibyte, while e.g. ISO-8859-1 is single-byte.
janegil (Beginner)
03 November 2006 - 16:56 #11
How do you control the output from Java? Do you control the Java code yourself? Or is changing the input the only way to change the output?
fredand (Researcher)
29 December 2010 - 21:04 #12
Hola amigos!

I'm closing this one since it is old!

Best regards
Fredrik