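I've been meaning to write up the NLP algorithms I work with for a long time, but kept being too lazy to do it...
1. Environment setup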
Python >= 3.5
TensorFlow >= 1.10
# What I use: python==3.6
# tensorflow==1.13.1
pip3 install --upgrade tensorflow==1.13.1
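The version requirements above can be checked from Python before going further. The following is a minimal sketch, not part of the original setup: the helper names `parse_version` and `meets_minimum` are made up for illustration, and the comparison only looks at the leading numeric parts of a version string.

```python
import re
import sys


def parse_version(text):
    """Turn '1.13.1' into (1, 13, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def meets_minimum(installed, minimum):
    """Compare versions numerically, so that 1.13 counts as newer than 1.9."""
    return parse_version(installed) >= parse_version(minimum)


if __name__ == '__main__':
    print('Python >= 3.5:', sys.version_info[:2] >= (3, 5))
    try:
        import tensorflow as tf
        print('TensorFlow >= 1.10:', meets_minimum(tf.__version__, '1.10'))
    except ImportError:
        print('TensorFlow is not installed')
```

Comparing tuples of integers rather than raw strings matters here: as strings, '1.9' sorts after '1.10', which would give the wrong answer.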
Installing bert-base:
[Recommended]
pip3 install bert-base==0.0.9 -i https://pypi.python.org/simple
# Or install from source
git clone https://github.com/macanv/BERT-BiLSTM-CRF-NER
cd BERT-BiLSTM-CRF-NER/
python3 setup.py install
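2. Model download
I have packaged a copy. Link: https://pan.baidu.com/s/1xrlxgAZesEaBv3BDRqRpJA Password: y4j1
Directory structure: the model directory contains Bert_NER and chinese_L-12_H-768_A-12, the two folders referenced by the start command in section 3.1.
Alternatively, find the download links on GitHub: https://github.com/macanv/BERT-BiLSTM-CRF-NER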
3. Usage
3.1. Starting the BERT service:
modePath=/Users/Davide/resource_test # the directory where you store the models
bert-base-serving-start -model_dir $modePath/Bert_NER -bert_model_dir $modePath/chinese_L-12_H-768_A-12 -model_pb_dir $modePath/Bert_NER -mode NER -port=7777 -port_out=7778
# Note: I set the ports to 7777 and 7778 here
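Before running a client, it can help to confirm that the two ports are actually accepting connections. The sketch below uses only the Python standard library; `port_is_open` is a hypothetical helper name, and the check assumes the service listens on ordinary TCP ports. It only tells you that something is listening on the port, not that it is the BERT service.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolvable hosts.
        return False


if __name__ == '__main__':
    # 7777 and 7778 are the ports passed to bert-base-serving-start above.
    for port in (7777, 7778):
        state = 'reachable' if port_is_open('localhost', port) else 'not reachable'
        print(port, state)
```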
3.2. Entity recognition:
import time
from bert_base.client import BertClient

# Note: change the default ip and ports here so they match the running service
with BertClient(ip='localhost', port=7777, port_out=7778, show_server_config=False, check_version=False, check_length=False, mode='NER') as bc:
    start_t = time.perf_counter()
    str_list = ['特朗普团队和特朗普基金会', '中英法俄会谈。']
    rst_list = bc.encode(str_list)
    # Print each character next to its predicted tag, one sentence at a time
    for sent, rst in zip(str_list, rst_list):
        for word, ner in zip(list(sent), rst):
            print(word, ner)
        print()
    # Elapsed time in seconds for the request and the printing
    print(time.perf_counter() - start_t)
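Output: one character and its predicted tag per line, a blank line after each sentence, and finally the elapsed time in seconds.
Be sure to check that the ip and ports are correct, otherwise you will get no results! The defaults are 5555 and 5556.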
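The client prints one tag per character, so a common next step is to merge those tags into whole entities. The following is a minimal sketch, assuming the tags follow the usual BIO scheme ('B-PER', 'I-PER', 'O' and so on); `extract_entities` is a made-up helper, and the tags in the example are written by hand for illustration, not real model output.

```python
def extract_entities(chars, tags):
    """Merge per-character BIO tags into a list of (text, type) pairs."""
    entities = []
    current_chars, current_type = [], None
    for ch, tag in zip(chars, tags):
        if tag.startswith('B-'):
            # A new entity starts; close the previous one if there is any.
            if current_chars:
                entities.append((''.join(current_chars), current_type))
            current_chars, current_type = [ch], tag[2:]
        elif tag.startswith('I-') and current_chars and tag[2:] == current_type:
            # Continuation of the entity that is currently open.
            current_chars.append(ch)
        else:
            # 'O', any other tag, or an 'I-' tag with no matching 'B-' before it.
            if current_chars:
                entities.append((''.join(current_chars), current_type))
            current_chars, current_type = [], None
    if current_chars:
        entities.append((''.join(current_chars), current_type))
    return entities


if __name__ == '__main__':
    # Hand-written tags, for illustration only.
    chars = list('特朗普团队')
    tags = ['B-PER', 'I-PER', 'I-PER', 'O', 'O']
    print(extract_entities(chars, tags))
```

In the loop from section 3.2, this could be called as extract_entities(list(sent), rst) for each sentence, provided the returned tags line up one to one with the characters.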
References:
https://github.com/macanv/BERT-BiLSTM-CRF-NER
https://github.com/google-research/bert
https://github.com/kyzhouhzau/BERT-NER
https://github.com/zjy-ucas/ChineseNER