Ruby on Rails, ReXML Document serializing / deserializing
I’m storing an XML document into a database field. I’m having a lot of trouble loading and saving the same XML in the database.
Here’s a script/console session:
>> xml.to_s => "<root value='' />" # seems ok!
>> xml.root.attributes['value'] = '<'
>> xml.to_s => "<root value='&lt;' />" # fine by me, no problem…
>> # now for the scary part ;-)
>> xml2 = REXML::Document.new( xml.to_s )
>> xml2.to_s => "<root value='&amp;lt;' />"
The HORROR!
REXML seems to escape values very nicely when they are set.
But it doesn't unescape them again when parsing with REXML::Document.new( … ), so the value gets escaped a second time on output.
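The check above can be turned into a small script to run against any installation. This is only a sketch: the starting document <root value=''/> is my assumption (the session above doesn't show how xml was created), and the variable names are made up for illustration.

```ruby
require 'rexml/document'

# Assumed starting point: a root element with an empty attribute.
doc = REXML::Document.new("<root value=''/>")
doc.root.attributes['value'] = '<'

# Serialize, parse the result, and serialize again.
first_pass  = doc.to_s
reparsed    = REXML::Document.new(first_pass)
second_pass = reparsed.to_s

# A correct round trip leaves the serialized form unchanged and
# gives back the original, unescaped attribute value.
round_trip_ok = (first_pass == second_pass) &&
                (reparsed.root.attributes['value'] == '<')

puts first_pass
puts second_pass
puts(round_trip_ok ? 'round trip OK' : 'round trip BROKEN')
```

On an installation with the problem described here, the second line printed should differ from the first.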
Current Progress:
* I found a method REXML::Document.write( ), which seems to behave the same way.
Today (24-8-2007) I'm a bit further. It seems to work correctly with the text content of elements:
>> xml.to_s => "<root value='' />" # seems ok!
>> xml.root.attributes['value'] = '<'
>> xml.root.text = '>'
>> xml.to_s => "<root value='&lt;'>&gt;</root>" # fine by me, no problem…
>> xml2 = REXML::Document.new( xml.to_s )
>> xml2.to_s => "<root value='&amp;lt;'>&gt;</root>"
Update 2 (24-8-2007): I found that my Windows Ruby on Rails installation with REXML (1.8.4) works perfectly. It seems to be a bug in the FreeBSD version, which is REXML (1.8.6).
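To compare two machines, it helps to print exactly which versions each one is running. A minimal sketch, assuming REXML::VERSION is defined by the installed library (the install_info name is just for illustration):

```ruby
require 'rexml/document'
require 'rexml/rexml' # defines REXML::VERSION

# Collect the version details worth quoting in a bug report.
install_info = "ruby #{RUBY_VERSION} (#{RUBY_PLATFORM}), REXML #{REXML::VERSION}"
puts install_info
```

Running this on both the working and the broken installation shows whether they differ in the Ruby release, the bundled REXML, or both.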
I’m trying to submit a bug report to the REXML authors, but the server keeps timing out :(
I found the solution: there is indeed a bug in REXML (1.8.6).
Comment out the following line, at around line 291 of text.rb ( /usr/local/lib/ruby/1.8/rexml/text.rb ):
copy = copy.gsub( "&", "&amp;" )
so that it becomes:
#copy = copy.gsub( "&", "&amp;" )
Ticket: http://www.germane-software.com/projects/rexml/ticket/122
Original change was made here: http://www.germane-software.com/projects/rexml/changeset/1235
REXML and Mangled Text…
A bare minimum amount of functionality that one would expect from an XML parsing library is the ability to round-trip data. If you parse a document and immediately reserialize the result, you would expect to get the origin…