heurisko bug no. 6094

This applies to heurisko 6.3.0 and higher.

Symptoms

a) Although the encoding attribute in the XML files created by heurisko is set to "UTF-8", you cannot store string data that contains characters outside the ASCII character set (character codes 0 to 127).

b) Furthermore, strings with multiple lines cannot be read correctly.

Cause

a) Internally, heurisko strings are not UTF-8 encoded. They must therefore be converted when they are written to a UTF-8 encoded XML file and again when they are read back from the file. The current code lacks this conversion. The conversion is only needed for characters outside the ASCII character set, so all ASCII characters, including digits, are handled correctly. Characters outside the ASCII range are handled correctly only if their UTF-8 encoding happens to match the current internal encoding, which may depend on the regional settings of the operating system.
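
The effect described in a) can be illustrated with a minimal Python sketch. It is not heurisko code: the internal encoding is assumed to be Windows-1252 purely for illustration (the real one depends on the operating system settings), and the function names are hypothetical.

```python
# Assumption: the internal encoding is Windows-1252. This is only an example;
# the actual internal encoding depends on the operating system settings.
INTERNAL_ENCODING = "cp1252"


def internal_to_utf8(raw: bytes, internal: str = INTERNAL_ENCODING) -> bytes:
    """Conversion needed before writing to a UTF-8 encoded file."""
    return raw.decode(internal).encode("utf-8")


def utf8_to_internal(raw: bytes, internal: str = INTERNAL_ENCODING) -> bytes:
    """Conversion needed after reading from a UTF-8 encoded file."""
    return raw.decode("utf-8").encode(internal)


def is_valid_utf8(raw: bytes) -> bool:
    """Return True if the bytes can be decoded as UTF-8."""
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


ascii_bytes = "Value 42".encode(INTERNAL_ENCODING)
umlaut_bytes = "Größe".encode(INTERNAL_ENCODING)

# ASCII text has the same bytes in both encodings, so skipping the
# conversion goes unnoticed.
print(internal_to_utf8(ascii_bytes) == ascii_bytes)

# Non-ASCII text has different bytes; written without conversion, it is
# not even valid UTF-8 in this example.
print(internal_to_utf8(umlaut_bytes) == umlaut_bytes)
print(is_valid_utf8(umlaut_bytes))

# With both conversions in place, the round trip is lossless.
print(utf8_to_internal(internal_to_utf8(umlaut_bytes)) == umlaut_bytes)
```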

b) When strings are read from XML files, line breaks are used as separators between multiple strings. It is therefore not possible to distinguish between several single-line strings and one multiline string. Currently, multiline strings can be written, but they cannot be read back as multiline strings.
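
The ambiguity described in b) can be shown with a small Python sketch. The functions write_strings and read_strings are hypothetical stand-ins that only model the separator behaviour; they are not the actual heurisko routines, and the line break is assumed to be "\n".

```python
def write_strings(strings: list[str]) -> str:
    """Model of writing: the strings are stored separated by line breaks."""
    return "\n".join(strings)


def read_strings(text: str) -> list[str]:
    """Model of reading: every line break starts a new string."""
    return text.split("\n")


two_single_line_strings = ["first", "second"]
one_multiline_string = ["first\nsecond"]

# Both inputs produce exactly the same stored text ...
stored_a = write_strings(two_single_line_strings)
stored_b = write_strings(one_multiline_string)
print(stored_a == stored_b)

# ... so the reader cannot tell them apart and always returns two strings.
print(read_strings(stored_b))
```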

Workaround

a) Use ASCII characters only.

b) There is no general workaround for multiline strings. You may be able to define a string array with as many elements as there are lines in the multiline string and copy each line of the multiline string into its own array element before writing to the XML file.
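
The idea behind workaround b) can be sketched as follows. This is Python rather than heurisko script, the helper names are hypothetical, and the line break is assumed to be "\n". The code that reads the file must know that the array represents one multiline string and join the elements again.

```python
def multiline_to_array(text: str) -> list[str]:
    """Split a multiline string into one array element per line."""
    return text.split("\n")


def array_to_multiline(lines: list[str]) -> str:
    """Rebuild the multiline string from the array after reading."""
    return "\n".join(lines)


comment = "line one\nline two\nline three"

# Before writing: store the lines as separate single-line strings.
lines = multiline_to_array(comment)
print(lines)

# After reading: join the elements to restore the original string.
print(array_to_multiline(lines) == comment)
```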

Fix

Fixed in heurisko 6.4.0.2.
