Ruby on Rails, REXML Document serializing / deserializing

I’m storing an XML document in a database field, and I’m having a lot of trouble saving the XML and then loading the same XML back from the database.

Here’s a script/console session:

>> xml = REXML::Document.new("<root value='' />")   => <UNDEFINED> ... </>  # strange response!!
>> xml.to_s   => "<root value='' />"   # seems ok!
>> xml.root.attributes['value'] = '<'
>> xml.to_s => "<root value='&lt;' />"  # fine by me, no problem...

>> # now for the scary part ;-)
>> xml2 = REXML::Document.new( xml.to_s )   
>> xml2.to_s  => "<root value='&amp;&lt;' />"

The HORROR!

REXML seems to escape values very nicely when they are set.
But it doesn’t unescape those values when parsing with REXML::Document.new( ... ), so they get escaped a second time on output.
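The same round trip can be reproduced outside the console with a short script. This is only a minimal sketch, assuming nothing beyond the standard `rexml/document` library; `round_trip` is a made-up helper name for illustration, not part of REXML. On an unaffected installation the two serializations should be identical.

```ruby
require 'rexml/document'

# Hypothetical helper: serialize a document, parse the result again,
# and return both serializations so they can be compared.
def round_trip(doc)
  first  = doc.to_s
  second = REXML::Document.new(first).to_s
  [first, second]
end

doc = REXML::Document.new("<root value=''/>")
doc.root.attributes['value'] = '<'   # REXML escapes this when writing

first, second = round_trip(doc)
puts first
puts second
puts(first == second ? 'round trip OK' : 'round trip BROKEN (double escaping?)')
```

Comparing the two strings rather than eyeballing them makes it easy to run the same check on each machine.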

Current Progress:
* I found a method, REXML::Document#write, which seems to do the same as to_s.

Today (24-8-2007) I’m a bit further. It seems to work correctly for the text content of elements; only the attribute value is mangled:

>> xml = REXML::Document.new("<root value='' />")   => <UNDEFINED> ... </>  # strange response!!
>> xml.to_s   => "<root value='' />"   # seems ok!
>> xml.root.attributes['value'] = '<'
>> xml.root.text = '>'
>> xml.to_s => "<root value='&lt;'>&gt;</root>"  # fine by me, no problem...

>> xml2 = REXML::Document.new( xml.to_s )   
>> xml2.to_s  => "<root value='&amp;&lt;'>&gt;</root>"
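Instead of comparing serialized strings, the attribute value and the text content can also be read back from the re-parsed document and compared with what was stored. A minimal sketch, again assuming only the standard `rexml/document` library; `values_of` is a hypothetical helper introduced here for illustration.

```ruby
require 'rexml/document'

# Hypothetical helper: the unescaped attribute value and text content
# of the root element, as REXML reports them.
def values_of(document)
  [document.root.attributes['value'], document.root.text]
end

doc = REXML::Document.new("<root value=''/>")
doc.root.attributes['value'] = '<'
doc.root.text = '>'

reparsed = REXML::Document.new(doc.to_s)

puts "original: #{values_of(doc).inspect}"
puts "reparsed: #{values_of(reparsed).inspect}"
puts(values_of(doc) == values_of(reparsed) ? 'values survive' : 'values were mangled')
```

This separates the two cases from the session: if only the first element of the pair differs, the problem is limited to attribute values.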

Update 2 (24-8-2007): I found that my Windows Ruby on Rails installation with REXML (1.8.4) is working perfectly. It seems to be a bug in the FreeBSD version, which has REXML (1.8.6).
I’m trying to submit a bug report to the REXML authors, but the server keeps timing out :(
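When comparing two machines like this, it helps to print exactly which Ruby and which REXML each one is running. The numbers 1.8.4 and 1.8.6 above look like Ruby release numbers, and REXML carries its own version string as well. A small sketch, assuming the `REXML::VERSION` constant is available once `rexml/document` is loaded; `environment_report` is a made-up helper name.

```ruby
require 'rexml/document'

# Hypothetical helper: one line describing the Ruby and REXML in use,
# handy for pasting into a bug report.
def environment_report
  "ruby #{RUBY_VERSION} (#{RUBY_PLATFORM}), rexml #{REXML::VERSION}"
end

puts environment_report
```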

I found the solution: there is indeed a bug in REXML (1.8.6).

Change the code at around line 291 in text.rb (/usr/local/lib/ruby/1.8/rexml/text.rb) from

#copy = copy.gsub( EREFERENCE, '&amp;' )
copy = copy.gsub( "&", "&amp;" )

to

copy = copy.gsub( EREFERENCE, ‘&amp;’ )
#copy = copy.gsub( "&", "&amp;" )
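The difference between the two gsub lines can be seen on plain strings, without touching REXML at all. The sketch below uses `BARE_AMPERSAND`, a simplified stand-in for REXML's `EREFERENCE` that is my own approximation and not copied from text.rb: it matches an ampersand that does not start a named entity reference. `escape_all` and `escape_bare` are likewise hypothetical names.

```ruby
# Assumption: a simplified approximation of EREFERENCE, for illustration only.
# Matches "&" unless it begins something shaped like "&name;".
BARE_AMPERSAND = /&(?![A-Za-z_:][\w.:-]*;)/

# Escapes every ampersand, including ones that already start an entity.
def escape_all(text)
  text.gsub('&', '&amp;')
end

# Escapes only ampersands that are not already part of an entity reference.
def escape_bare(text)
  text.gsub(BARE_AMPERSAND, '&amp;')
end

['&lt;', 'a & b', 'AT&T &amp; co'].each do |s|
  puts "#{s.inspect}: all=#{escape_all(s).inspect} bare=#{escape_bare(s).inspect}"
end
```

Escaping every ampersand turns an existing `&lt;` into `&amp;lt;`, while the pattern-based version leaves it alone and still escapes a stray `&`.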
 

3 Comments so far

  1. Sam Ruby on December 10th, 2007
  2. Sam Ruby on December 10th, 2007
  3. Sam Ruby on December 11th, 2007

    REXML and Mangled Text…

    A bare minimum amount of functionality that one would expect from an XML parsing library is the ability to round-trip data.  If you parse a document and immediately reserialize the result, you would expect to get the origin…
