
Ruby on Rails, ReXML Document serializing / deserializing

I'm storing an XML document in a database field, and I'm having a lot of trouble getting the same XML back out after saving it.

Here's a script/console session:

>> xml = REXML::Document.new("")   =>  ...   # strange response!!
>> xml.to_s   => " "   # seems ok!
>> xml.root.attributes['value'] = '<'  
>> xml.to_s => ""  # fine by me, no problem...

>> # now for the scary part ;-)
>> xml2 = REXML::Document.new( xml.to_s )    
>> xml2.to_s  => ""

The HORROR!

REXML escapes values very nicely when they are set,
but it doesn't seem to unescape them again when the string is parsed with REXML::Document.new( ... ).
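
For anyone who wants to check their own installation, here is a minimal sketch of the round trip. The element name `doc` and the helper `round_trip` are made up for illustration, and the script only compares the two serializations instead of assuming what either one looks like.

```ruby
require "rexml/document"

# Serialize a document, parse that string again and serialize the result.
# A well-behaved library should give back the same string both times.
def round_trip(doc)
  first = doc.to_s
  second = REXML::Document.new(first).to_s
  [first, second]
end

# "<doc/>" is a hypothetical stand-in for the real document.
xml = REXML::Document.new("<doc/>")
xml.root.attributes["value"] = "<"
xml.root.text = ">"

first, second = round_trip(xml)
puts first
puts second
puts(first == second ? "round trip OK" : "round trip MANGLED")
```

If the last line prints "round trip MANGLED", the installation is probably affected by the problem described here.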

Current progress:
* I found the method REXML::Document#write, which seems to behave the same way.
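
A small sketch of comparing the two, assuming a hypothetical `<doc/>` document and writing into a plain String instead of a file or socket:

```ruby
require "rexml/document"

# "<doc/>" is a hypothetical stand-in for the real document.
doc = REXML::Document.new("<doc/>")
doc.root.attributes["value"] = "<"

# Document#write appends to its output with <<, so a String works as a target.
buffer = String.new
doc.write(buffer)

puts buffer
puts(buffer == doc.to_s ? "write and to_s agree" : "write and to_s differ")
```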

Today (24-8-2007) I'm a bit further. It seems to work correctly with the text content of elements:

>> xml = REXML::Document.new("")   =>  ...   # strange response!!
>> xml.to_s   => " "   # seems ok!
>> xml.root.attributes['value'] = '<'  
>> xml.root.text = '>'
>> xml.to_s => ">"  # fine by me, no problem...

>> xml2 = REXML::Document.new( xml.to_s )    
>> xml2.to_s  => ">"

Update 2 (24-8-2007): I found that my Windows Ruby on Rails REXML (1.8.4) installation works perfectly. It seems to be a bug in the FreeBSD version, which is REXML (1.8.6).
I'm trying to submit a bug report to the REXML authors, but the server keeps timing out :(

I found the solution: there is indeed a bug in REXML (1.8.6).

Change the code at around line 291 in text.rb ( /usr/local/lib/ruby/1.8/rexml/text.rb ):

#copy = copy.gsub( EREFERENCE, '&amp;' )
copy = copy.gsub( "&", "&amp;" )

To

copy = copy.gsub( EREFERENCE, '&amp;' )
#copy = copy.gsub( "&", "&amp;" )
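
The difference between the two lines comes down to whether an ampersand that already starts an entity gets escaped a second time. Here is a rough sketch in plain Ruby, without REXML. The pattern `BARE_AMPERSAND` is only my approximation of what EREFERENCE matches, and both method names are made up for illustration.

```ruby
# Approximation of EREFERENCE: an ampersand that does NOT start a named
# entity such as "&lt;". The exact pattern is an assumption, not REXML's own.
BARE_AMPERSAND = /&(?![A-Za-z_][\w.\-]*;)/

# Escapes every ampersand, including ones that already belong to an entity.
def escape_every_ampersand(text)
  text.gsub("&", "&amp;")
end

# Escapes only ampersands that are not the start of a named entity.
def escape_bare_ampersands(text)
  text.gsub(BARE_AMPERSAND, "&amp;")
end

already_escaped = "&lt;"
puts escape_every_ampersand(already_escaped)  # => &amp;lt;
puts escape_bare_ampersands(already_escaped)  # => &lt;
puts escape_bare_ampersands("fish & chips")   # => fish &amp; chips
```

The first variant turns an already escaped value into a double escaped one, which matches the mangling seen after a second parse.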
