rebXR Unicode Behaviour
by earl, 6977 days ago
Unicode support, and string handling in general, is one of the shaky, unspecified parts of the XML-RPC spec, so here is an explicit declaration of rebXR's behaviour:

  • REBOL does not support Unicode

  • rebXR passes bytes through to the application layer unmangled, in whatever encoding they originally were

  • rebXR is able to parse character entity references as long as the character referenced is between 0x00 and 0xFF

    • rebXR handles the formats &#nnn; (decimal) and &#xnn; (hexadecimal)

    • rebXR handles the "well known" entities lt, gt, amp, quot and apos

    • if the character referenced is not between 0x00 and 0xFF, rebXR fails: on encountering e.g. a reference to Ā (U+0100), rebXR will throw an exception (a REBOL error!). no silent corruption takes place.
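
The rules in the list above can be sketched roughly as follows. This is an illustration in Python, not rebXR's REBOL code: the names `decode_references`, `WELL_KNOWN` and `_REFERENCE` are made up for the example, and leaving unknown named entities untouched is an assumption, since the note does not say what rebXR does with them.

```python
import re

# Hypothetical names throughout; this is a sketch, not rebXR's implementation.
WELL_KNOWN = {"lt": "<", "gt": ">", "amp": "&", "quot": '"', "apos": "'"}

# Matches &#nnn; and &#xnn; plus the five well-known entities.
# Anything else is left alone (an assumption of this sketch).
_REFERENCE = re.compile(rb"&(#[0-9]+|#x[0-9A-Fa-f]+|lt|gt|amp|quot|apos);")


def decode_references(data: bytes) -> bytes:
    """Replace character and entity references in a byte string.

    All other bytes are copied through untouched, whatever their encoding.
    """
    def replace(match):
        name = match.group(1).decode("ascii")
        if name.startswith("#x"):
            code = int(name[2:], 16)
        elif name.startswith("#"):
            code = int(name[1:], 10)
        else:
            code = ord(WELL_KNOWN[name])
        if code > 0xFF:
            # Fail loudly instead of silently corrupting the string.
            raise ValueError("character reference out of range: &%s;" % name)
        return bytes([code])

    # A single pass, so "&amp;lt;" becomes "&lt;" and is not decoded twice.
    return _REFERENCE.sub(replace, data)
```

Working on bytes rather than on decoded text mirrors the pass-through behaviour: a reference can only ever expand to a single byte, so any value above 0xFF has no representation and has to be rejected.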
