Reposted

Lucene 5.3.0 released: Java full-text search framework

Lucene 5.3.0 has been released. The notable changes in this version are:

API improvement: PhraseQuery and BooleanQuery are now immutable
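Immutable queries are typically assembled through a separate builder object and cannot change once built; Lucene 5.3 provides PhraseQuery.Builder and BooleanQuery.Builder for this purpose. The sketch below illustrates only the pattern, using plain Java. The class ImmutablePhraseSketch and its methods are hypothetical stand-ins, not Lucene classes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for an immutable phrase query; NOT a Lucene class.
final class ImmutablePhraseSketch {
    private final List<String> terms; // unmodifiable snapshot taken at build time
    private final int slop;

    private ImmutablePhraseSketch(List<String> terms, int slop) {
        // Copy the builder's list so later builder changes cannot leak in.
        this.terms = Collections.unmodifiableList(new ArrayList<>(terms));
        this.slop = slop;
    }

    List<String> terms() { return terms; }
    int slop() { return slop; }

    // All mutation happens here; the built object exposes no setters.
    static final class Builder {
        private final List<String> terms = new ArrayList<>();
        private int slop = 0;

        Builder add(String term) { terms.add(term); return this; }
        Builder setSlop(int slop) { this.slop = slop; return this; }
        ImmutablePhraseSketch build() { return new ImmutablePhraseSketch(terms, slop); }
    }

    public static void main(String[] args) {
        Builder builder = new Builder().add("quick").add("fox").setSlop(1);
        ImmutablePhraseSketch query = builder.build();
        builder.add("jumps"); // does not affect the query that was already built
        System.out.println(query.terms() + " slop=" + query.slop());
    }
}
```

Because the built object never changes, it can be shared between threads or used as a cache key without defensive copies, which is one common motivation for this kind of API change.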

New features:

  • Added a new org.apache.lucene.search.join.CheckJoinIndex class that can be used to validate that an index has an appropriate structure to run join queries

  • Added a new BlendedTermQuery to blend statistics across several terms

  • New common suggest API for document-based suggesters that mirrors Lucene's Query/IndexSearcher APIs.

  • IndexWriter can now be initialized from an already open near-real-time or non-NRT reader

  • Add experimental range tree doc values format and queries, based on a 1D version of the spatial BKD tree, for a faster and smaller alternative to postings-based numeric and binary term filtering.  Range trees can also handle values larger than 64 bits.

Geo-related features and improvements:

  • Added GeoPointField, GeoPointInBBoxQuery, GeoPointInPolygonQuery for simple "indexed lat/lon point in bbox/shape" searching

  • Added experimental BKD geospatial tree doc values format and queries, for fast "bbox/polygon contains lat/lon points"

  • Use doc values to post-filter GeoPointField hits that fall in boundary cells, resulting in smaller index, faster searches and less heap used for each query
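To make "lat/lon point in bbox" concrete, here is a minimal brute-force sketch of the matching rule in plain Java. It is not Lucene code and does not reflect how GeoPointInBBoxQuery is implemented: an index avoids scanning every point. The class and method names are hypothetical, and the box is assumed not to cross the dateline.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of "point in bounding box" matching; NOT Lucene code.
final class BBoxFilterSketch {
    // Returns the array indices (stand-ins for document IDs) of points inside the box.
    // Assumes minLat <= maxLat and minLon <= maxLon (no dateline crossing).
    static List<Integer> pointsInBBox(double[] lats, double[] lons,
                                      double minLat, double maxLat,
                                      double minLon, double maxLon) {
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < lats.length; doc++) {
            boolean latOk = lats[doc] >= minLat && lats[doc] <= maxLat;
            boolean lonOk = lons[doc] >= minLon && lons[doc] <= maxLon;
            if (latOk && lonOk) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Three sample points: roughly New York, London, Los Angeles.
        double[] lats = {40.7, 51.5, 34.0};
        double[] lons = {-74.0, -0.1, -118.2};
        // A box covering the contiguous United States, approximately.
        System.out.println(pointsInBBox(lats, lons, 30.0, 45.0, -125.0, -70.0));
    }
}
```

A linear scan like this costs time proportional to the number of points, which is the cost that indexed point fields and tree-based formats are meant to avoid.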

Optimizations:

  • Reduce RAM usage of FieldInfos, and speed up lookup by number, by using an array instead of TreeMap except in very sparse cases

  • Faster intersection of the terms dictionary with very finite automata, which can be generated, e.g., by simple regexp queries

  • Various bugfixes and optimizations since the 5.2.0 release.
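The FieldInfos item above trades a TreeMap for an array when field numbers are dense. The following plain-Java sketch illustrates that general idea only. The class name, the method names, and the density threshold (a factor of 16) are assumptions made for illustration and are not taken from Lucene's source.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of "array when dense, TreeMap when sparse"; NOT Lucene's FieldInfos.
final class ByNumberLookupSketch {
    private final String[] dense;                  // non-null when numbering is dense
    private final TreeMap<Integer, String> sparse; // non-null when numbering is sparse

    ByNumberLookupSketch(Map<Integer, String> namesByNumber) {
        int max = -1;
        for (int number : namesByNumber.keySet()) {
            if (number < 0) {
                throw new IllegalArgumentException("negative field number: " + number);
            }
            max = Math.max(max, number);
        }
        // Assumed heuristic: use an array unless it would be mostly empty slots.
        if ((long) max < 16L * namesByNumber.size()) {
            dense = new String[max + 1];
            for (Map.Entry<Integer, String> e : namesByNumber.entrySet()) {
                dense[e.getKey()] = e.getValue();
            }
            sparse = null;
        } else {
            dense = null;
            sparse = new TreeMap<>(namesByNumber);
        }
    }

    boolean usesArray() { return dense != null; }

    // Returns the name for a number, or null if the number is unknown.
    String nameOf(int number) {
        if (number < 0) {
            return null;
        }
        if (dense != null) {
            return number < dense.length ? dense[number] : null;
        }
        return sparse.get(number);
    }

    public static void main(String[] args) {
        Map<Integer, String> fields = new TreeMap<>();
        fields.put(0, "title");
        fields.put(1, "body");
        fields.put(2, "date");
        ByNumberLookupSketch lookup = new ByNumberLookupSketch(fields);
        System.out.println(lookup.usesArray() + " " + lookup.nameOf(1));
    }
}
```

An array lookup is a single indexed read with no per-entry node objects, which is where both the speedup and the RAM saving come from; the TreeMap fallback keeps memory bounded when a few numbers are spread over a very large range.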

Download: http://www.apache.org/dyn/closer.cgi/lucene/java/5.3.0

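Lucene is an open-source full-text search engine toolkit from the Apache Software Foundation. It is a full-text search engine architecture that provides a complete query engine and indexing engine, along with part of a text analysis engine. Its purpose is to give software developers a simple, easy-to-use toolkit for adding full-text search to a target system, or for building a complete full-text search engine on top of it.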

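Lucene was originally written by Doug Cutting, a veteran full-text indexing and retrieval expert. He was a principal developer of the V-Twin search engine, later served as a senior systems architect at Excite, and currently works on research into low-level Internet infrastructure. His goal in contributing Lucene was to bring full-text search to a wide range of small and medium-sized applications.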

OSChina uses Lucene to implement its full-text search.

Online Javadoc: http://tool.oschina.net/apidocs/apidoc?api=lucene-3.6.0
