IR4QA (Embedded CLIR) Output Format

IR4QA participants submit their results in the XML format described below.

Tag            Description
TOPIC_SET      Contains the metadata and a list of topics.
METADATA       Must include the run ID and a system description. If the run is related to a question analysis run, specify that run's ID.
TOPIC          Each TOPIC is associated with one IR4QA_RESULT.
IR4QA_RESULT   Contains a ranked list of DOCUMENT elements.
DOCUMENT       Pointer to a document in the corpus. SCORE is optional, but providing it is recommended (preferably as a value between 0 and 1).
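The format only states that SCORE should preferably lie between 0 and 1; it does not say how to get there. As one possible approach, the sketch below rescales raw retrieval scores with min-max normalization. The function name normalize_scores and the choice of min-max scaling are assumptions made for illustration, not part of the task definition.

```python
def normalize_scores(scores):
    """Rescale raw scores to the range [0, 1] with min-max normalization.

    Hypothetical helper: the task does not prescribe a scaling method.
    """
    if not scores:
        return []
    lowest, highest = min(scores), max(scores)
    if highest == lowest:
        # All documents scored the same; map every score to 1.0.
        return [1.0 for _ in scores]
    return [(s - lowest) / (highest - lowest) for s in scores]


if __name__ == "__main__":
    print(normalize_scores([2.0, 4.0, 6.0]))
```

Min-max scaling keeps the ranking order unchanged, so RANK values computed before or after normalization are the same.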

For the definition of Run ID, refer to RunIDFormat.

XML DTD

<!DOCTYPE TOPIC_SET [
<!ELEMENT TOPIC_SET (METADATA,TOPIC*)>
<!ELEMENT METADATA (RUNID,DESCRIPTION,QUESTION_ANALYSIS_RUN?)>
<!ELEMENT RUNID (#PCDATA)>
<!ELEMENT DESCRIPTION (#PCDATA)>
<!ELEMENT QUESTION_ANALYSIS_RUN (#PCDATA)>
<!ELEMENT TOPIC (IR4QA_RESULT)>
<!ATTLIST TOPIC ID CDATA #REQUIRED>
<!ELEMENT IR4QA_RESULT (DOCUMENT*)>
<!ELEMENT DOCUMENT EMPTY>
<!ATTLIST DOCUMENT RANK CDATA #REQUIRED>
<!ATTLIST DOCUMENT DOCID CDATA #REQUIRED>
<!ATTLIST DOCUMENT SCORE CDATA #IMPLIED>
]>
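
Before submitting, it may help to check a run file against the structure this DTD describes. The following is a minimal sketch using only the Python standard library, which parses XML but does not perform DTD validation, so the rules are re-implemented by hand. The function name check_submission is hypothetical, and the check covers element nesting and required attributes only.

```python
import xml.etree.ElementTree as ET

# Child sequences of METADATA that the DTD allows.
ALLOWED_METADATA = (
    ["RUNID", "DESCRIPTION"],
    ["RUNID", "DESCRIPTION", "QUESTION_ANALYSIS_RUN"],
)


def check_submission(xml_text):
    """Return a list of structural problems; an empty list means none were found."""
    root = ET.fromstring(xml_text)
    if root.tag != "TOPIC_SET":
        return ["root element is %s, expected TOPIC_SET" % root.tag]

    problems = []
    children = list(root)
    if not children or children[0].tag != "METADATA":
        return ["first child of TOPIC_SET must be METADATA"]
    if [c.tag for c in children[0]] not in ALLOWED_METADATA:
        problems.append("METADATA must hold RUNID, DESCRIPTION, optional QUESTION_ANALYSIS_RUN")

    for topic in children[1:]:
        if topic.tag != "TOPIC":
            problems.append("unexpected element %s in TOPIC_SET" % topic.tag)
            continue
        if topic.get("ID") is None:
            problems.append("TOPIC is missing the ID attribute")
        if [c.tag for c in topic] != ["IR4QA_RESULT"]:
            problems.append("TOPIC must contain exactly one IR4QA_RESULT")
            continue
        for doc in topic[0]:
            if doc.tag != "DOCUMENT":
                problems.append("unexpected element %s in IR4QA_RESULT" % doc.tag)
                continue
            if len(doc) != 0:
                problems.append("DOCUMENT must be empty")
            for attr in ("RANK", "DOCID"):  # SCORE is #IMPLIED, so it may be absent
                if doc.get(attr) is None:
                    problems.append("DOCUMENT is missing the %s attribute" % attr)
    return problems
```

A validating parser that reads the DTD directly would be a stricter alternative; this sketch is only meant as a quick sanity check.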

Sample

<TOPIC_SET>

  <METADATA>
    <RUNID>TEAMX-EN-JA-01-T</RUNID>
    <DESCRIPTION>Ranked based on cosine similarity of tf.idf weighted term vectors.</DESCRIPTION>
    <QUESTION_ANALYSIS_RUN>TEAMX-EN-JA-01-T</QUESTION_ANALYSIS_RUN>
  </METADATA>

  <TOPIC ID="ACLIA2-JA-T0001">
    <IR4QA_RESULT>
      <DOCUMENT RANK="1" DOCID="JA-010101032" SCORE="1.00" />
      <DOCUMENT RANK="2" DOCID="JA-001116222" SCORE="0.92" />
      <DOCUMENT RANK="3" DOCID="JA-001110059" SCORE="0.91" />
      <DOCUMENT RANK="4" DOCID="JA-990825062" SCORE="0.91" />
    </IR4QA_RESULT>
  </TOPIC>

  <TOPIC ID="ACLIA2-JA-T0002">
    <IR4QA_RESULT>
      <DOCUMENT RANK="1" DOCID="JA-980512181" SCORE="1.00" />
      <DOCUMENT RANK="2" DOCID="JA-980531170" SCORE="0.99" />
    </IR4QA_RESULT>
  </TOPIC>

</TOPIC_SET>
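
A file like the sample above can be produced programmatically. The sketch below builds the XML with Python's standard xml.etree.ElementTree module. The function name build_run and the shape of its results argument (a dict mapping each topic ID to a list of (DOCID, score) pairs) are assumptions for illustration; scores are assumed to be already scaled to the range 0 to 1.

```python
import xml.etree.ElementTree as ET


def build_run(run_id, description, results, qa_run_id=None):
    """Serialize ranked results as a TOPIC_SET XML string.

    results: dict of topic ID -> list of (doc_id, score) pairs (hypothetical shape).
    """
    root = ET.Element("TOPIC_SET")

    meta = ET.SubElement(root, "METADATA")
    ET.SubElement(meta, "RUNID").text = run_id
    ET.SubElement(meta, "DESCRIPTION").text = description
    if qa_run_id is not None:
        # Optional element: only written when a question analysis run is given.
        ET.SubElement(meta, "QUESTION_ANALYSIS_RUN").text = qa_run_id

    for topic_id, scored_docs in results.items():
        topic = ET.SubElement(root, "TOPIC", ID=topic_id)
        result = ET.SubElement(topic, "IR4QA_RESULT")
        # Highest score first; RANK starts at 1.
        ranked = sorted(scored_docs, key=lambda pair: pair[1], reverse=True)
        for rank, (doc_id, score) in enumerate(ranked, start=1):
            ET.SubElement(result, "DOCUMENT", RANK=str(rank),
                          DOCID=doc_id, SCORE="%.2f" % score)

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_run(
        "TEAMX-EN-JA-01-T",
        "Ranked based on cosine similarity of tf.idf weighted term vectors.",
        {"ACLIA2-JA-T0002": [("JA-980531170", 0.99), ("JA-980512181", 1.00)]},
    ))
```

The output is a single line without indentation, which is equivalent XML; the layout of the sample is not required by the DTD.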

IR4QAFormat (last edited 2010-01-02 19:32:58 by HidekiShima)