


Subject: XML Searching Resources

Posted by admin, 2009/5/26 15:44:35

For background information, see the XML and Search page. This page provides annotated links to XML Search, Query Languages, Standards and Document Structure Registries information.

XML Text Search Engines

  • Amberfish - now supports both XML full text search with Boolean and phrase matching, as well as arbitrarily structured queries with hierarchical results.
  • IXIASOFT TEXTML database - XML database with real-time index synchronization and grouping of elements and attributes in indexes. Search features include Boolean, proximity and frequency operators. Results can be sorted, but the site is not clear about the relevance algorithm.
  • Infonyte Query - Implements the XQL Specification for both structured and free-text search.
  • Tamino (from Software AG) - builds indexes based on XML tags for full-text or structured searching, and supports XPath query features.
  • Emily Solutions Framework (MLE) includes a search engine for XML, flat files and relational databases.
  • Ultraseek version 3 and later support searching XML files as full text, and setting up multiple sets of fields for field-specific searching; version 4 adds indexing of attributes.
  • SIM (The Structured Information Manager) uses XML and full-text indexing to provide a powerful combination of database searching and text retrieval.
  • GoXML Search Engine - does XML-specific search by providing a second step for the query with a popup menu of "context": the markup tag for the text. There does not seem to be a way to perform free-text searching as well, but it does seem to index by using a robot and spidering external sites.
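
Several of the engines above advertise full-text search with Boolean and phrase matching over XML content. As a rough illustration of what that means (not the implementation of any listed product), here is a minimal Python sketch using only the standard library. The sample document and the helper names `full_text`, `matches_all` and `matches_phrase` are invented for this example.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for illustration only.
SAMPLE = """<reviews>
  <review id="1"><subject>XML search engines</subject><body>Boolean and phrase matching work well.</body></review>
  <review id="2"><subject>Relational databases</subject><body>No phrase matching for XML text.</body></review>
</reviews>"""

def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"\w+", text.lower())

def full_text(elem):
    """Join all text nodes under elem, separated by spaces so that
    words from adjacent elements do not run together."""
    return " ".join(elem.itertext())

def matches_all(elem, terms):
    """Boolean AND: every term must occur somewhere under elem."""
    words = set(tokens(full_text(elem)))
    return all(term.lower() in words for term in terms)

def matches_phrase(elem, phrase):
    """Phrase match: the words of phrase must occur consecutively."""
    words = tokens(full_text(elem))
    target = tokens(phrase)
    n = len(target)
    return any(words[i:i + n] == target for i in range(len(words) - n + 1))

root = ET.fromstring(SAMPLE)
and_hits = [r.get("id") for r in root.iter("review")
            if matches_all(r, ["xml", "boolean"])]
phrase_hits = [r.get("id") for r in root.iter("review")
               if matches_phrase(r, "phrase matching")]
print(and_hits, phrase_hits)
```

A real engine would build an inverted index instead of scanning every element per query; the scan here only keeps the sketch short.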

XML Structured Query Engines

  • X-Hive - XML database implementing XQuery standard.
  • XDisect - indexes and retrieves arbitrary XML data structures and supports flexible queries. Supports XML standards.
  • fxgrep - parses an XML document for complex queries and finds all matches. Distributed as source code.
  • GMD-IPSI XQL Engine - Java-based persistent storage of XML documents in DOM format, XQL queries into the data store. Free for noncommercial use: see Globit Infonyte for the commercial version.
  • XSet is an XML database and high performance search engine library with a very simple tag-oriented query language, expressed in XML itself.
  • UC Berkeley's Cheshire system indexes and searches structured data including XML, SGML and MARC records, as well as full-text data files.
  • Xtenint information routing and selection agents perform real-time filtering of data streams using various predefined rules.
  • Lore - a database management system for XML using semistructured data, with a special query language called Lorel. This is a research project in the Stanford University computer science department.
  • sgrep search tool provides grep searching for structured documents, including XML. This is a command-line program that lets you limit searches to contents of specified XML tags, such as "subject" or "evaluation". As of January, 2000, it has been updated to version 1.9 and now supports searching in hierarchies.
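
The sgrep entry describes limiting a search to the contents of specified XML tags such as "subject" or "evaluation". The sketch below shows that idea in Python with the standard library; it does not use sgrep or reproduce its query syntax, and the sample data and the function name `search_in_tag` are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for illustration only.
SAMPLE = """<reports>
  <report id="a"><subject>Indexing XML</subject><evaluation>Fast and accurate</evaluation></report>
  <report id="b"><subject>Fast crawlers</subject><evaluation>Needs more indexing tests</evaluation></report>
</reports>"""

def search_in_tag(root, tag, term):
    """Yield (element, text) for each <tag> element whose text
    contains term, ignoring case. Text outside <tag> is never examined."""
    needle = term.lower()
    for elem in root.iter(tag):
        text = "".join(elem.itertext())
        if needle in text.lower():
            yield elem, text

root = ET.fromstring(SAMPLE)
subject_hits = [text for _, text in search_in_tag(root, "subject", "fast")]
evaluation_hits = [text for _, text in search_in_tag(root, "evaluation", "fast")]
print(subject_hits)
print(evaluation_hits)
```

The same word gives different hits depending on the tag chosen, which is the point of tag-restricted searching.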

XML Query Languages

See also "How Text Search Relates to XML Query Languages"

  • W3C XML Query - main page for the World Wide Web Consortium XML Query Working Group.
    • XML Query Requirements (Current Working Document). New release on February 16, 2001. This URL points to the most current version of the XML Query Working Group requirements document, which describes how the standard XML query language should work: general goals, usage scenarios, terminology, data model information, functionality, and how the language should fit with the other XML standards.
    • XQuery: A Query Language for XML. New release on February 15, 2001. Implementation of a query language based on the requirements.
  • XML Query Languages: Experiences and Exemplars. Summer, 1999. Examines four existing XML query languages in order to define the requirements for such a language. Tends to be more SQL-oriented than search-engine-oriented.
  • QL '98: the Quest for an XML Query Standard. Lisa Rein, XML.com, March 2, 1999. Covers the discussion at the W3C Query Language workshop in December 1998. There were more participants than expected, ranging from XML/SGML implementors to database companies, librarians and researchers. Discussion there was about database-style querying providing selection (search), extraction, reduction, restructuring and combination of XML text. Some participants want to make a query language like SQL or OQL, others see a relationship to research in "semistructured" data retrieval, while others hope to use XSL to perform these tasks. There was controversy over what the results should look like, and about whether they should come in XML only or in other formats.
  • QL '98 Position Papers, W3C. 66 papers from the participants in the W3C Query Language workshop in December 1998.
  • W3C XML Query Language mailing list (for the general public), with open archives. To subscribe, send a message to www-ql-request@w3c.org with the word subscribe in the body of the message (remove your signature).

    XQL

    • XQL Proposal submitted to the W3C, proposes a query language for "addressing and filtering the elements and text of XML documents" as an extension of the XSL pattern syntax.
    • XQL FAQ - links and background information for this proposal, list of the applications that have implemented it so far.
    • XQL Tutorial - simple explanation with examples
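
To give a feel for "addressing and filtering the elements and text of XML documents" with path-style patterns, here is a small Python sketch. It uses the limited XPath subset built into `xml.etree.ElementTree` as a stand-in; this is not XQL syntax, and the sample document is invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data, made up for illustration only.
SAMPLE = """<library>
  <book year="1999"><title>Querying XML</title><author>Lee</author></book>
  <book year="2001"><title>Indexing Text</title><author>Kim</author></book>
  <article year="1999"><title>Search Notes</title></article>
</library>"""

root = ET.fromstring(SAMPLE)

# Addressing: every <title> element at any depth.
all_titles = [t.text for t in root.findall(".//title")]

# Filtering on an attribute: titles of books published in 1999.
titles_1999 = [t.text for t in root.findall("./book[@year='1999']/title")]

# Filtering on child text: books whose <author> is exactly "Kim".
years_by_kim = [b.get("year") for b in root.findall("./book[author='Kim']")]

print(all_titles)
print(titles_1999)
print(years_by_kim)
```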

    XML-QL

    • XML-QL, Query Language for XML. Proposal submitted to W3C. Proposes a query language that approaches an XML file much like searching a database with SQL, rather than as a free-text document. The focus of this proposal is EDI (Electronic Data Interchange, business-to-business purchasing) as opposed to a library or information retrieval approach. It provides examples using specific data and element patterns and constructing new result listings.
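
The database-style approach described here (match element patterns, then construct a new result listing) can be sketched in Python as follows. This is only an analogy written with the standard library, not XML-QL syntax; the order data, the element names and the function `expensive_items` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical purchasing data, made up for illustration only.
SAMPLE = """<orders>
  <order><buyer>Acme</buyer><item>Widget</item><price>12.50</price></order>
  <order><buyer>Globex</buyer><item>Gadget</item><price>99.00</price></order>
  <order><buyer>Acme</buyer><item>Gizmo</item><price>150.00</price></order>
</orders>"""

def expensive_items(root, min_price):
    """Select orders priced at or above min_price (the 'where' part)
    and build a new <results> document from them (the 'construct' part)."""
    results = ET.Element("results")
    for order in root.findall("order"):
        if float(order.findtext("price")) >= min_price:
            row = ET.SubElement(results, "purchase",
                                buyer=order.findtext("buyer"))
            row.text = order.findtext("item")
    return results

root = ET.fromstring(SAMPLE)
results = expensive_items(root, 50.0)
output = ET.tostring(results, encoding="unicode")
print(output)
```

Note that the result is itself XML with a different shape from the input, which is what separates this style of query from a search engine returning a ranked list of matching documents.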

XML Standards and Overviews

XML Registries: Standardizing Document Structures

Groups from genealogists to real-estate brokerages want to share data without having to use complex file format converters, and they are banding together to create open XML DTDs and other structures. Today, the only standard way of structuring XML documents is with DTDs (document type definitions). But there are several proposals for richer and more complex systems, known as schemas. In any case, shared structures will improve searching, as all data of a given kind will be in the same format and will not require special field names.

Other Listings

[This post was edited by the author on 2009-5-26 15:45:53]
