UTF-8 and Oracle Access Manager 10g


04/01/2010

OAM supports UTF-8 in incoming data, and can generate HTML pages encoded with UTF-8, but what about internally? Is UTF-8 data available in plugins? In HTTP header variables?

We tested 10.1.4.3 on Windows and were surprised that our UTF-8 data was being interpreted incorrectly in our managed plugins (though exec ppp plugins worked as expected).

The character Û (U with a circumflex) has a code point value of 219 (all numbers are decimal). In UTF-8 this is encoded as the bytes 195 and 155. However, when this text reaches our plugin it appears as Û (A with tilde followed by a single right-pointing angle quotation mark). In .NET, strings are Unicode, so we know the Identity Server is reinterpreting the bytes 195 and 155 in some other encoding and then handing us the result as a Unicode string. That encoding turns out to be Windows-1252, the default code page on our Windows system: 195 is Ã, while 155 is ›. Luckily there is a simple workaround: we get the Windows-1252 byte values of the string and then interpret those bytes as UTF-8.
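The workaround can be sketched in a few lines. This is an illustration in Python rather than the .NET code used in the plugin, and the function name repair_utf8 is made up for the example. It reproduces the misreading of the bytes 195 and 155 and then reverses it.

```python
def repair_utf8(misread: str) -> str:
    """Undo a UTF-8 value that was wrongly decoded as Windows-1252."""
    raw = misread.encode("cp1252")  # recover the original bytes (195, 155 for our example)
    return raw.decode("utf-8")      # interpret those bytes as UTF-8


original = "\u00db"                    # Û, code point 219
utf8_bytes = original.encode("utf-8")  # the two bytes 195 and 155
misread = utf8_bytes.decode("cp1252")  # what the plugin received: A-tilde plus angle quote
repaired = repair_utf8(misread)

print(list(utf8_bytes), misread, repaired)
```

One caveat for this sketch: Python's cp1252 codec leaves a handful of byte values undefined, so a string whose UTF-8 form contains one of those bytes cannot be reproduced this way in Python.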

Using Reflector I can see a few calls to   in the managed library, and I would guess a similar call like   is used for converting between unmanaged and managed memory, and this may be a cause of the issue.

This issue also exists in the Access Server. If you want to send a UTF-8 attribute value in a header, OAM is smart enough to Base64 encode it (according to RFC 2047). So our value should be encoded using this format  . Unfortunately, the text that gets encoded is incorrect: the Access Server Base64 encodes the Windows-1252 interpretation of the UTF-8 bytes. You'll need to Base64 decode the header text and then apply the re-encoding workaround described earlier to get the real value.
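The two steps for a header value, decoding the RFC 2047 encoded-word and then reversing the Windows-1252 misreading, might look like the following Python sketch. The header value here is built by hand for illustration and is not captured OAM output. It assumes the encoded-word declares UTF-8 as its charset and uses the B (Base64) encoding. The function name decode_oam_header is hypothetical.

```python
import base64
from email.header import decode_header


def decode_oam_header(header_value: str, code_page: str = "cp1252") -> str:
    """Decode an RFC 2047 header value, then undo the code page misreading."""
    parts = []
    for raw, charset in decode_header(header_value):
        # decode_header yields bytes for encoded-words and may yield str for plain text
        parts.append(raw.decode(charset or "ascii") if isinstance(raw, bytes) else raw)
    misread = "".join(parts)
    # Same workaround as in the plugin: code page bytes reinterpreted as UTF-8
    return misread.encode(code_page).decode("utf-8")


# Hypothetical header: the Windows-1252 misreading of the UTF-8 bytes, Base64 encoded
misread = "\u00db".encode("utf-8").decode("cp1252")
b64 = base64.b64encode(misread.encode("utf-8")).decode("ascii")
header = "=?UTF-8?B?" + b64 + "?="

print(header, decode_oam_header(header))
```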

One thing to note is that if your default code page is something other than Windows-1252, you'll probably have to interpret the string using that code page.
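To illustrate why the code page matters, the Python sketch below takes the code page as a parameter. Windows-1251 is used purely as an example of a different default code page, and the function name is hypothetical. On such a system the bytes 195 and 155 would show up as different characters, and trying to reverse them with Windows-1252 fails because those characters have no Windows-1252 byte values.

```python
def repair_with_code_page(misread: str, code_page: str) -> str:
    """Re-encode with the server's default code page, then decode as UTF-8."""
    return misread.encode(code_page).decode("utf-8")


utf8_bytes = "\u00db".encode("utf-8")         # bytes 195, 155
seen_on_cp1251 = utf8_bytes.decode("cp1251")  # misreading on a Windows-1251 system

# Using the matching code page recovers the original character
print(repair_with_code_page(seen_on_cp1251, "cp1251"))

# Using the wrong code page cannot even reproduce the original bytes
try:
    repair_with_code_page(seen_on_cp1251, "cp1252")
    wrong_page_failed = False
except UnicodeEncodeError:
    wrong_page_failed = True
print("cp1252 failed:", wrong_page_failed)
```

In Python, locale.getpreferredencoding(False) is one way to look up the default encoding of the machine the script runs on, though the value that matters is the code page of the server that misread the data.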
