SA_MetaMatch: Relevant document discovery through document metadata and indexing

dc.contributor.author Yau, Hiu
dc.contributor.author Hawker, J. Scott
dc.date.accessioned 2008-12-04T21:43:42Z
dc.date.available 2008-12-04T21:43:42Z
dc.date.issued 2004
dc.identifier.isbn 1-58113-870-9
dc.identifier.uri http://hdl.handle.net/1850/7627
dc.description 2004 ACM Southeast conference proceedings. This is a digitized copy derived from an ACM-copyrighted work. ACM did not prepare this copy and does not guarantee that it is an accurate copy of the originally published work.
dc.description.abstract SA_MetaMatch, a component of the Standards Advisor (SA), is designed to find relevant documents by matching indices of metadata and document content. The elements in the metadata schema are mainly adopted from the Dublin Core (DC), and the implementation of the XML metadata schema and coding follows the DC recommended guidelines. Metadata is generated manually for unstructured documents, or extracted automatically from documents with a well-defined layout, and is then stored in metadata files or in a repository. Indices of the descriptive metadata elements and of the document content are generated, then searched and compared to find related documents, based on our observation that if the metadata and the high-frequency index words of the document content are related, then the corresponding documents are likely to be related as well. A ranked list of possibly relevant documents is returned as the result. Several matching algorithms have been explored. We selected a sum-of-word-scores approach, which not only gives relevance scores for the matched documents but also gives an individual score for each matching word, providing hints that help domain experts grasp the concepts in the documents.
dc.publisher Association for Computing Machinery (ACM)
dc.title SA_MetaMatch: Relevant document discovery through document metadata and indexing
dc.type Proceedings
dc.subject.keyword Document matching
dc.subject.keyword Dublin Core
dc.subject.keyword Index
dc.subject.keyword Metadata
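
The matching idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the scoring formula, so everything below is an assumption made for illustration: the function names (build_index, score_match, rank_documents), the stopword list, the top_n cutoff, and the choice to score each shared word as the product of its frequencies in the two indices. It is not the SA_MetaMatch implementation.

```python
import re
from collections import Counter

# Hypothetical stopword list; the paper's actual list is not given in the abstract.
STOPWORDS = frozenset({"a", "an", "and", "the", "of", "to", "in", "is", "for", "on", "with"})


def build_index(text, top_n=10):
    """Return the top_n most frequent non-stopword terms mapped to their counts."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return dict(counts.most_common(top_n))


def score_match(query_index, candidate_index):
    """Score two indices as the sum of per-word scores over their shared words.

    Assumption: a shared word's score is the product of its two frequencies.
    """
    shared = query_index.keys() & candidate_index.keys()
    word_scores = {w: query_index[w] * candidate_index[w] for w in shared}
    return sum(word_scores.values()), word_scores


def rank_documents(query_text, candidates, top_n=10):
    """Return (name, total_score, word_scores) tuples, highest total first.

    Candidates that share no index word with the query are omitted.
    """
    query_index = build_index(query_text, top_n)
    results = []
    for name, text in candidates.items():
        total, word_scores = score_match(query_index, build_index(text, top_n))
        if total > 0:
            results.append((name, total, word_scores))
    return sorted(results, key=lambda r: r[1], reverse=True)
```

Calling rank_documents with the text of a document's metadata (or content) and a dictionary of candidate texts returns a ranked list in which each entry carries its per-word scores, which is the kind of hint the abstract says helps domain experts see why two documents were matched.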

Files in this item

File Size Format
HYauProceedings2004.pdf 429.2Kb PDF
