XmlXPathReader problem #1040

@rezahay

According to Richard, the cause of the XmlXPathReader problem described below is as follows:
"It looks like the processNode() method in XmlXPathReader expects the nodes selected by PARAM_XPATH_EXPRESSION to have sub-elements. It doesn't seem to consider that the nodes only have text content" (Richard).
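
The distinction Richard describes can be sketched with the JDK's own XPath API. This is a hypothetical illustration, not the XmlXPathReader source: the class `TextOnlyNodeSketch`, its methods `textOf` and `extract`, and the sample XML used with it are all assumptions made for the example. It shows one way a node handler could fall back to the node's own text when the selected node has no sub-elements.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical illustration; this is not the XmlXPathReader implementation.
final class TextOnlyNodeSketch {

    // Returns the text of a selected node. If the node has element children,
    // their texts are joined with newlines; otherwise the node's own text is used.
    static String textOf(Node node) {
        List<String> parts = new ArrayList<>();
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node child = children.item(i);
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                parts.add(child.getTextContent().trim());
            }
        }
        if (parts.isEmpty()) {
            // Text-only node: the case the quoted diagnosis says is not considered.
            return node.getTextContent().trim();
        }
        return String.join("\n", parts);
    }

    // Evaluates the XPath expression against the XML string and returns
    // one text entry per matched node, in document order.
    static List<String> extract(String xml, String expression) {
        try {
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                    .evaluate(expression, new InputSource(new StringReader(xml)),
                            XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(textOf(nodes.item(i)));
            }
            return texts;
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }
}
```

With such a fallback, an expression like "/topics/topic/description" that selects text-only nodes would yield their text instead of an error, while "/topics/topic" would still collect the text of each sub-element.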

The original problem description follows.
I have problems using XmlXPathReader. I tried to parse the XML file provided with the DKPro tests (full_tag_format.xml) using the following code:
================ 
CollectionReaderDescription reader = createReaderDescription(XmlXPathReader.class,
        XmlXPathReader.PARAM_LANGUAGE, "en",
        XmlXPathReader.PARAM_SOURCE_LOCATION, "input",
        XmlXPathReader.PARAM_XPATH_EXPRESSION, "/topics/topic/description",
        XmlXPathReader.PARAM_PATTERNS, new String[] { "[+]full*.xml" });
================ 
The resulting XMI contains sofaString="[Parse error]":
================ 
<xmi:XMI xmlns:WhatAliceDoesExample="http:///de/tudarmstadt/ukp/tutorial/gscl2013/ruta/WhatAliceDoesExample.ecore" xmlns:pos="http:///de/tudarmstadt/ukp/dkpro/core/api/lexmorph/type/pos.ecore" xmlns:tcas="http:///uima/tcas.ecore" xmlns:xmi="http://www.omg.org/XMI" xmlns:cas="http:///uima/cas.ecore" xmlns:type9="http:///org/apache/uima/ruta/type.ecore" xmlns:html="http:///org/apache/uima/ruta/type/html.ecore" xmlns:tweet="http:///de/tudarmstadt/ukp/dkpro/core/api/lexmorph/type/pos/tweet.ecore" xmlns:morph="http:///de/tudarmstadt/ukp/dkpro/core/api/lexmorph/type/morph.ecore" xmlns:dependency="http:///de/tudarmstadt/ukp/dkpro/core/api/syntax/type/dependency.ecore" xmlns:type5="http:///de/tudarmstadt/ukp/dkpro/core/api/semantics/type.ecore" xmlns:type8="http:///de/tudarmstadt/ukp/dkpro/core/api/transform/type.ecore" xmlns:type7="http:///de/tudarmstadt/ukp/dkpro/core/api/syntax/type.ecore" xmlns:type2="http:///de/tudarmstadt/ukp/dkpro/core/api/metadata/type.ecore" xmlns:type3="http:///de/tudarmstadt/ukp/dkpro/core/api/ner/type.ecore" xmlns:type4="http:///de/tudarmstadt/ukp/dkpro/core/api/segmentation/type.ecore" xmlns:type="http:///de/tudarmstadt/ukp/dkpro/core/api/coref/type.ecore" xmlns:type6="http:///de/tudarmstadt/ukp/dkpro/core/api/structure/type.ecore" xmlns:constituent="http:///de/tudarmstadt/ukp/dkpro/core/api/syntax/type/constituent.ecore" xmlns:chunk="http:///de/tudarmstadt/ukp/dkpro/core/api/syntax/type/chunk.ecore" xmi:version="2.0"> 
<cas:NULL xmi:id="0"/> 
<type2:DocumentMetaData xmi:id="1" sofa="12" begin="0" end="13" language="en" documentTitle="full_tag_format.xml" documentId="full_tag_format.xml" documentUri="file:/C:/Workspace/LunaWS/DkproTrial-2/input/full_tag_format.xml" collectionId="input" documentBaseUri="file:/C:/Workspace/LunaWS/DkproTrial-2/input/" isLastSegment="false"/> 
<type4:Sentence xmi:id="19" sofa="12" begin="0" end="13"/> 
<type4:Token xmi:id="24" sofa="12" begin="0" end="6"/> 
<type4:Token xmi:id="34" sofa="12" begin="7" end="12"/> 
<type4:Token xmi:id="44" sofa="12" begin="12" end="13"/> 
<cas:Sofa xmi:id="12" sofaNum="1" sofaID="_InitialView" mimeType="text" sofaString="[Parse error]"/> 
<cas:View sofa="12" members="1 19 24 34 44"/> 
</xmi:XMI> 
================= 
However, if I use
XmlXPathReader.PARAM_XPATH_EXPRESSION, "/topics/topic",
instead, parsing succeeds and I get a partially correct result:
=================== 
2 
Gender bias and poverty 
Find documents on gender bias and resultant poverty 
                        problems. 
Documents and research reports that look into gender bias 
                        as well as its revelations on poverty problems. 
=================== 
Please note that the result is only partially correct: the file contains two topic tags, not one. I think this is a separate bug.
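
One way to confirm how many nodes an expression should match, independently of the reader, is to evaluate XPath's `count()` function with the JDK API. The sketch below is only an illustration: the class `MatchCountSketch` and its method `countMatches` are hypothetical names, and the XML passed to it would need to be the content of the file under test.

```java
import java.io.StringReader;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.xml.sax.InputSource;

// Hypothetical helper for checking how many nodes an XPath expression selects.
final class MatchCountSketch {

    // Wraps the expression in count(...) and returns the result as an int.
    static int countMatches(String xml, String expression) {
        try {
            Double count = (Double) XPathFactory.newInstance().newXPath()
                    .evaluate("count(" + expression + ")",
                            new InputSource(new StringReader(xml)),
                            XPathConstants.NUMBER);
            return count.intValue();
        } catch (XPathExpressionException e) {
            throw new IllegalArgumentException("Cannot evaluate " + expression, e);
        }
    }
}
```

If this count is 2 for "/topics/topic" while the reader produces a single document, that would support treating the missing topic as a reader bug and not a problem with the expression.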

Thanks
