 

Unescaped '<' not allowed in attributes values error in R

Tags: parsing, r, xml

I have a number of "raw" subject data files in XML format that I need to read into a data table in order to compute some summary statistics. The program I used for testing produces the following output (a snippet of one event within the file):

    <Event>
      <Data name="Relation1" value="<"></Data>
      <Data name="Relation2" value="4    R"></Data>
      <Data name="Group" value="0"></Data>
      <Data name="CorrResult" value="S"></Data>
      <Data name="Response" value="S"></Data>
      <Data name="RT" value="787"></Data>
      <Data name="Result" value="C"></Data>
      <Data name="Gap" value="0"></Data>
      <Data name="IntraGap" value="0"></Data>
      <Data name="ISI" value="0"></Data>
    </Event>

The first data field, "Relation1", always has the value "<" or ">". Is there a way to make R treat that character as a data value rather than as markup? I've tried a number of things using the XML and XML2R packages and always end up with the error in the title at the top of a long list of errors.

MikeFaneros asked Mar 17 '14 19:03



1 Answer

In XML, a literal < or & is illegal inside an attribute value (and in element text), which is why your file generates an error. Some other characters, such as >, are not strictly illegal but are better avoided. Use an entity reference instead: the entity reference for < is &lt; and for > it is &gt;

If you can't change the program that writes the files, you can fix them with a text-based preprocessing step: read the file line by line and, on every line where Relation1 is detected, replace the < or > inside the value with its entity reference. After that the file should parse. I can't show you how this is done in R, as I don't know the language.
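For readers working in R, the line-by-line fix described above might look like the following minimal sketch. It uses base R only and assumes that each `<Data>` element sits on its own line and that the offending attribute is written exactly as `value="<"` or `value=">"`, as in the question's snippet. The function name `escape_relation` and the file name in the comments are hypothetical.

```r
# Replace a bare < or > in the value attribute of Relation1 lines
# with the corresponding XML entity reference.
# Assumption: one <Data> element per line, as in the question's snippet.
escape_relation <- function(lines) {
  # Only touch lines that carry the Relation1 field
  is_rel <- grepl('name="Relation1"', lines, fixed = TRUE)
  lines[is_rel] <- sub('value="<"', 'value="&lt;"', lines[is_rel], fixed = TRUE)
  lines[is_rel] <- sub('value=">"', 'value="&gt;"', lines[is_rel], fixed = TRUE)
  lines
}

# Small in-memory example modelled on the question's data
raw <- c(
  '<Event>',
  '  <Data name="Relation1" value="<"></Data>',
  '  <Data name="RT" value="787"></Data>',
  '</Event>'
)
fixed <- escape_relation(raw)

# With a real file (hypothetical name), the cleaned text could then be
# handed to a parser, for example the XML package mentioned in the question:
# fixed <- escape_relation(readLines("subject01.xml"))
# doc <- XML::xmlParse(paste(fixed, collapse = "\n"), asText = TRUE)
```

Matching with `fixed = TRUE` keeps the `<` and `>` from being interpreted as regular-expression syntax, and restricting the substitution to Relation1 lines leaves every other line of the file untouched.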

I used the source below for reference.

http://www.w3schools.com/xml/xml_syntax.asp

pvl answered Nov 10 '22 22:11
