Hi Erol --
When an XML document is parsed, one of two things happens to a numeric character reference like &#xAE;. Either it gets expanded, giving U+00AE as a code point, which will look like ® in the text and which might, in this Unicode world, be the best option; or the parser has a setting that escapes all the entity references, so you get the & turning into &amp;. There isn't any way to avoid one or the other because of some intractable XML rules about entities.
Remember that so far as a parsed XML document is concerned, there's no difference between a reference such as &#xAE; (or &#174;) and a literal ®; it's all the same U+00AE code point. Trying to maintain entities in a parsed XML document is never the simple way to do it.
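A quick way to see this for yourself, sketched here in Python with the standard-library xml.etree.ElementTree parser (my choice for illustration only; the element name and sample text are made up):

```python
import xml.etree.ElementTree as ET

# Three spellings of the same character: a hex reference,
# a decimal reference, and the literal character itself.
hex_ref = ET.fromstring("<p>Acme&#xAE;</p>")
dec_ref = ET.fromstring("<p>Acme&#174;</p>")
literal = ET.fromstring("<p>Acme\u00ae</p>")

# After parsing, each text node holds the same U+00AE code point;
# nothing records which spelling the source used.
texts = [hex_ref.text, dec_ref.text, literal.text]
print(texts)
```

Once the parser has done its work, all three strings are identical, so there is nothing left to "preserve".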
I'd generally approach this by converting all the entities to code point representations of the actual characters on the input side, and then worrying about which, if any, entity representations I needed on the output side. Either the serializer or a character map should be able to handle converting any code points you want to represent as HTML entities in the final output.
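As a rough sketch of that input-side/output-side split, again in Python with xml.etree.ElementTree: the OUTPUT_ENTITIES table and the serialize_with_entities helper are hypothetical names I've made up for illustration, standing in for whatever character mapping your own serializer provides. The replacement runs on the serialized string, after the serializer has done its own escaping, which is roughly where a character map applies.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping: the code points we want written as HTML entities.
OUTPUT_ENTITIES = {
    "\u00ae": "&reg;",
    "\u00a9": "&copy;",
}

def serialize_with_entities(root, entity_map=OUTPUT_ENTITIES):
    """Serialize the tree, then swap selected characters for entity names."""
    # Input side is already done: the parser turned references into code points.
    text = ET.tostring(root, encoding="unicode")
    # Output side: substitute after serialization so the '&' we add
    # is not escaped again as '&amp;'.
    for char, entity in entity_map.items():
        text = text.replace(char, entity)
    return text

root = ET.fromstring("<p>Acme&#xAE; &#169; 2024</p>")
result = serialize_with_entities(root)
print(result)  # <p>Acme&reg; &copy; 2024</p>
```

Note that the result is only suitable as HTML output (or as XML with a DTD that declares those entities); fed back to a plain XML parser, &reg; would be an undefined entity.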
-- Graydon