Hi Antonin
I am working on OpenTapioca and trying to improve its accuracy so that we can use it in our project.
I tried adding the following features to the existing feature vectors:
connection_count: connection_count(tag_i) = sum over j of |tag_i.edges ∩ hrtag_j.edges| / |hrtag_j.edges|, where hrtag_j is the highest-ranked tag among the tags detected for phrase j.
hop_count: hop_count(tag_i) = sum over j of (1 - |tag_i.edges ∩ tag_j.edges| / |tag_i.edges ∪ tag_j.edges|), where tag_j is any tag detected for any phrase in the input sentence.
cosine_similarity: I use S-BERT to generate embeddings of the candidate tags' descriptions and of the input sentence, then take the cosine similarity between the candidate embedding and the sentence embedding.
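To make the definitions concrete, here is a minimal sketch of the three features as I read them. It assumes each tag's edges are available as a Python set of item IDs and that the embeddings are plain lists of floats (in practice they would come from S-BERT, which is not shown here). The function and parameter names are only illustrative and are not part of OpenTapioca.

```python
from math import sqrt


def connection_count(candidate_edges, top_tag_edges_per_phrase):
    """Sum over phrases j of |edges(tag_i) ∩ edges(hrtag_j)| / |edges(hrtag_j)|.

    candidate_edges: set of edge targets of the candidate tag (assumed format).
    top_tag_edges_per_phrase: one edge set per phrase, for its highest-ranked tag.
    """
    total = 0.0
    for top_edges in top_tag_edges_per_phrase:
        if top_edges:  # skip tags without edges to avoid dividing by zero
            total += len(candidate_edges & top_edges) / len(top_edges)
    return total


def hop_count(candidate_edges, other_tag_edges):
    """Sum over detected tags j of the Jaccard distance between edge sets."""
    total = 0.0
    for edges in other_tag_edges:
        union = candidate_edges | edges
        if union:  # two empty edge sets contribute nothing
            total += 1.0 - len(candidate_edges & edges) / len(union)
    return total


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy example with made-up item IDs
candidate = {"Q1", "Q2", "Q3"}
others = [{"Q2", "Q3", "Q4", "Q5"}, {"Q9"}]
print(connection_count(candidate, others))
print(hop_count(candidate, others))
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))
```

Note that, written this way, hop_count is a sum of Jaccard distances, so it grows with the number of detected tags in the sentence; dividing by the number of tags might make it more comparable across sentences of different lengths.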
I also tried an XGBoost ranker (learning to rank) instead of the SVM classifier.
None of these changes improved the F1 score.
Do you have any suggestions?