This is a project for later lazy work!
Only Python 3 is supported.
Install directly from the command line:
pip install poros
Install from the source repository:
git clone https://github.com/diqiuzhuanzhuan/poros.git
cd poros
python setup.py install

Some of the code is my own, and some is inspired by other projects, such as allennlp.
It provides a set of small utility functions.
Usage:
- convert Chinese words into Arabic numbers:
>>> from poros.poros_chars import chinese_to_arabic
>>> print(chinese_to_arabic.NumberAdapter.convert("四千三百万"))
43000000
- compute a loss with GravityLoss:
>>> import torch
>>> from poros.poros_loss import GravityLoss
>>> gl = GravityLoss()
# both inputs have shape [1, 2]
>>> input_a = torch.tensor([[1.0, 1]], requires_grad=True)
>>> input_b = torch.tensor([[1.0, 1]], requires_grad=True)
>>> target = torch.tensor([[4.0]])
>>> output = gl(input_a, input_b, target)
>>> torch.testing.assert_close(output, target)
- build an intent clustering model:
>>> from poros.poros_common.params import Params
# first, choose an embedding model
>>> sentence_embedding_params = Params({
...     'type': 'sentence_transformers_model',
...     'model_name_or_path': 'albert-base-v1'
... })
# second, specify the clustering algorithm to use
>>> clustering_algorithm_params = Params({
...     'type': 'graph_based_clustering',
...     'similarity_algorithm_name': 'cosine',
...     'similarity_algorithm_params': None,
...     'community_detection_name': 'louvain',
...     'community_detection_params': {
...         'weight': 'weight',
...         'resolution': 0.95,
...         'randomize': False
...     }
... })
# finally, build the clustering model
>>> intent_clustering_params = Params({
...     'type': 'baseline_intent_clustering_model',
...     'clustering_algorithm': clustering_algorithm_params,
...     'embedding_model': sentence_embedding_params
... })
# assumes IntentClusteringModel has already been imported
>>> intent_clustering_model = IntentClusteringModel.from_params(params=intent_clustering_params)

PyCharm, Microsoft Visual Studio Code
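
For readers curious about what a conversion like the `chinese_to_arabic` example in the usage section involves, here is a minimal standalone sketch. It is not the poros implementation: the function name `chinese_to_int` and the lookup tables are hypothetical, and it only handles well-formed non-negative integers written with the common numerals.

```python
# Hypothetical illustration only; this is NOT how poros implements NumberAdapter.
DIGITS = {"零": 0, "一": 1, "二": 2, "两": 2, "三": 3, "四": 4,
          "五": 5, "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
LARGE_UNITS = {"万": 10_000, "亿": 100_000_000}


def chinese_to_int(text: str) -> int:
    """Convert a well-formed Chinese numeral string to an int."""
    total = 0    # value already closed off by a large unit (万 / 亿)
    section = 0  # value accumulated since the last large unit
    digit = 0    # most recent digit still waiting for its unit
    for ch in text:
        if ch in DIGITS:
            digit = DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare "十" (as in "十五") counts as one ten.
            section += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in LARGE_UNITS:
            section += digit
            digit = 0
            if ch == "亿":
                # 亿 scales everything read so far, e.g. "一万亿".
                total = (total + section) * LARGE_UNITS[ch]
            else:
                total += section * LARGE_UNITS[ch]
            section = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + section + digit


print(chinese_to_int("四千三百万"))  # same input as the usage example
```

The sketch reads the string left to right, keeping a pending digit, a running section, and a total, so no backtracking is needed. Decimals, negative numbers, and the formal (banker's) numerals are out of scope here.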