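Open Search Server (OSS) is an open-source search engine and suite of full-text search algorithms written in Java. It can index documents in multiple languages: a multilingual analyzer splits sentences into words, then applies a lemmatization algorithm to those words according to the document's language. It supports many document formats, including XML, HTML, PDF, Word and PowerPoint. It also comes with an easy-to-use web interface.
Download from the official site: https://cloud.opensearchserver.com/opensearchserver#download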
https://www.opensearchserver.com/
https://www.opensearchserver.com/documentation/README.md
https://www.opensearchserver.com/documentation/installation/linux.md
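To make the analysis steps in the introduction concrete, here is a minimal shell sketch of the same idea: split a sentence into words, then normalize each word with rules chosen by language. It is an illustration only, not OpenSearchServer code. The function names (tokenize, normalize, analyze) are made up for this example, and the rules are crude suffix stripping standing in for real lemmatization.

```shell
#!/bin/sh
# Toy analysis pipeline: tokenize, then normalize each word with
# language-specific rules. The rules are crude suffix stripping, a
# stand-in for real lemmatization; none of this is OpenSearchServer code.

# Split text into lowercase words, one per line (ASCII letters only).
tokenize() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | tr -cs '[:alpha:]' '\n'
}

# Reduce each word read from stdin to a rough base form for language $1.
normalize() {
    case "$1" in
        en) sed -e 's/ies$/y/' -e 's/ing$//' -e 's/s$//' ;;
        fr) sed -e 's/aux$/al/' -e 's/s$//' ;;
        *)  cat ;;   # unknown language: leave the words unchanged
    esac
}

# analyze LANGUAGE TEXT: tokenize the text, then normalize every word.
analyze() {
    tokenize "$2" | normalize "$1"
}

analyze en "The crawlers are indexing documents"
```

The point of the sketch is the ordering: tokenization is the same for every language, while the normalization step is selected by the document's language. A real analyzer would use dictionaries or proper morphological rules instead of three sed expressions.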
-----------------------------------------------------------------------------------------
Open-source Enterprise Grade Search Engine Software
OpenSearchServer
OpenSearchServer is a powerful, enterprise-class search engine based on Lucene. Using the web user interface, the crawlers (web, file, database, ...) and the JSON web service, you can quickly and easily integrate advanced full-text search capabilities into your application. OpenSearchServer runs on Linux/Unix/BSD/Windows.
Quickstart
Go with the interface and/or the API
Useful links
- Download binaries: https://www.opensearchserver.com/#download
- The documentation: https://www.opensearchserver.com/documentation
- Issues (bugs, enhancements): https://github.com/jaeksoft/opensearchserver/issues
Features
Search functions
- Advanced full-text search features
- Phonetic search
- Advanced boolean search with query language
- Clustered results with faceting and collapsing
- Filter search using sub-requests (including negative filters)
- Geolocation
- Spell-checking
- Relevance customization
- Search suggestion facility (auto-completion)
Indexation
- Supports 18 languages
- Fields schema with analyzers in each language
- Several filters: n-gram, lemmatization, shingle, diacritic stripping, …
- Automatic language recognition
- Named entity recognition
- Word synonyms and expression synonyms
- Export indexed terms with frequencies
- Automatic classification
Supported documents
- HTML / XHTML
- MS Office documents (Word, Excel, PowerPoint, Visio, Publisher)
- OpenOffice documents
- Adobe PDF (with OCR)
- RTF, Plaintext
- Audio files metadata (wav, mp3, AIFF, Ogg)
- Torrent files
- OCR over images
Crawlers
- The web crawler for internet, extranet and intranet
- The file systems crawler for local and remote files (NFS, SMB/CIFS, FTP, FTPS, SWIFT)
- The database crawler for all JDBC databases (MySQL, PostgreSQL, Oracle, SQL Server, …)
- Filter inclusion or exclusion with wildcards
- Session parameters removal
- SQL join and linked files support
- Screenshot capture
General
- JSON web service
- Index replication and sharding
- Federated search
from https://github.com/jaeksoft/opensearchserver
----------------------------------------------------------------
How to build OpenSearchServer
Would you like to contribute to OpenSearchServer?
Here is how to compile and build OSS.
Prerequisites
The tools you need to build OpenSearchServer are the ones used in the steps below: Git, Maven and Ant, plus a JDK to compile the Java sources.
Get the source code using Git
The default and currently active branch is 1.5.
git clone https://github.com/jaeksoft/opensearchserver
Go to the opensearchserver directory
cd opensearchserver
Use Maven to build the jar, war, deb and rpm packages
mvn -Dgpg.skip=true package clean package rpm:attached-rpm
Use Ant to build the zip and tar.gz packages
The archive includes Apache Tomcat, as well as the start and stop scripts.
ant clean dist dist-src
The built zip and tar.gz archives are available here:
dist/opensearchserver.tar.gz
dist/opensearchserver.zip
Alternatively, you can download these packages at SourceForge.
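The build steps above can be collected into one small script. This is a sketch only: the DRY_RUN variable and the run and build_oss helpers are assumptions made for this example, not part of OpenSearchServer, and the git, mvn and ant commands are copied as they appear above. By default the script only prints the commands, and it actually runs them when DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch of the build steps above. DRY_RUN, run and build_oss are
# illustrative helpers, not part of OpenSearchServer.
set -eu

DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands, 0 = execute them

# Print a command, and execute it unless we are in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

build_oss() {
    # Fetch the sources (the default branch is used).
    run git clone https://github.com/jaeksoft/opensearchserver
    run cd opensearchserver
    # Build the jar, war, deb and rpm packages (command as given above).
    run mvn -Dgpg.skip=true package clean package rpm:attached-rpm
    # Build the zip and tar.gz archives, which end up under dist/.
    run ant clean dist dist-src
}

build_oss
```

Saved as, for example, build-oss.sh (a hypothetical file name), it can be checked with `sh build-oss.sh` and run for real with `DRY_RUN=0 sh build-oss.sh`. Because of `set -e`, a real run stops at the first failing step.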
from https://www.opensearchserver.com/documentation/building_opensearchserver.md
-------------------------------------------------
A similar program, Elasticsearch:
https://briteming.blogspot.com/2016/06/javaelasticsearch.html
https://briteming.blogspot.com/2022/07/elasticsearch-java.html