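Apache OpenNLP 1.8.0 has been released. OpenNLP is a machine learning toolkit for processing natural language text. It supports the most common NLP tasks, such as tokenization, sentence segmentation, part-of-speech tagging, named entity extraction, chunking, and parsing.
This release brings many new features, improvements, and bug fixes. The API has been reworked for better consistency, and many deprecated methods have been removed.
Changes include: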
- POS Tagger context generator now supports feature generation XML
- Add a Name Finder feature generator that adds POS Tag features
- Add CONLL-U format support
- Improve default Name Finder settings
- TokenNameFinderEvaluator CLI now supports a nameTypes argument
- Stupid backoff is now the default in NGramLanguageModel
- Language codes are now ISO 639-3 compliant
- Add many unit tests
- Distribution package now includes example parameters file
- Prefix and suffix feature generators are now configurable
- Remove API in Document Categorizer for user specified tokenizer
- Learnable lemmatizer now returns all possible lemmas for a given word and POS tag
- Lemmatizer API backward compatibility break: lemmas no longer need to be encoded/decoded; the LemmatizerME lemmatize method now returns the actual lemma
- Add stemmer, detokenizer and sentence detection abbreviations for Irish
- Chunker SequenceValidator signature changed to allow access to both token and POS tag
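
The "stupid backoff" entry above refers to a simple n-gram scoring scheme: use the relative frequency of the full n-gram when it has been seen, and otherwise fall back to a shorter context, multiplying by a fixed factor. The following is a minimal, self-contained sketch of that idea and does not use OpenNLP's NGramLanguageModel. The class name `StupidBackoffSketch`, its methods, and the backoff factor of 0.4 (the value commonly quoted for this technique) are assumptions made for illustration, not details taken from the release.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration class; not part of the OpenNLP API.
final class StupidBackoffSketch {
    // Assumed backoff factor; the value OpenNLP uses is not stated in the article.
    private static final double ALPHA = 0.4;

    private final Map<List<String>, Integer> counts = new HashMap<>();
    private final int maxOrder;
    private int totalTokens = 0;

    StupidBackoffSketch(int maxOrder) {
        this.maxOrder = maxOrder;
    }

    // Count every n-gram of order 1..maxOrder in the token sequence.
    void add(String... tokens) {
        totalTokens += tokens.length;
        for (int start = 0; start < tokens.length; start++) {
            for (int n = 1; n <= maxOrder && start + n <= tokens.length; n++) {
                List<String> ngram =
                        Arrays.asList(Arrays.copyOfRange(tokens, start, start + n));
                counts.merge(ngram, 1, Integer::sum);
            }
        }
    }

    // Score the last token of the n-gram given the preceding tokens.
    // The result is a relative score, not a normalized probability.
    double score(String... ngram) {
        if (ngram.length == 0 || totalTokens == 0) {
            return 0.0;
        }
        List<String> full = Arrays.asList(ngram);
        int fullCount = counts.getOrDefault(full, 0);
        if (ngram.length == 1) {
            // Base case: unigram relative frequency.
            return (double) fullCount / totalTokens;
        }
        int contextCount = counts.getOrDefault(full.subList(0, full.size() - 1), 0);
        if (fullCount > 0 && contextCount > 0) {
            return (double) fullCount / contextCount;
        }
        // Unseen n-gram: drop the oldest context word and discount by ALPHA.
        return ALPHA * score(Arrays.copyOfRange(ngram, 1, ngram.length));
    }
}
```

For example, after `add("the", "cat", "sat", "on", "the", "mat")`, the seen bigram `score("the", "cat")` is the count of "the cat" divided by the count of "the", while the unseen bigram `score("cat", "mat")` falls back to 0.4 times the unigram frequency of "mat". Because the scores are not renormalized, the scheme is cheap to compute, which is its main appeal.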
Download: https://opennlp.apache.org/download.html
Source: https://www.oschina.net/news/85001/apache-opennlp-1-8-0
Article URL: https://www.linuxprobe.com/apache-opennlp-1p8p0.html (editor: 杨鹏飞; reviewer: 逄增宝)