improvement #37

Hi Antonin,
I am working on OpenTapioca and trying to improve its accuracy to some extent so that we can apply it in our project.
I tried adding the following features to the existing feature vectors:
**connection_count**: connection_count(tag_i) = sum_j |tag_i.edges ∩ hrtag_j.edges| / |hrtag_j.edges|, where hrtag_j is the highest-ranked tag among the tags detected for phrase j.
**hop_count**: hop_count(tag_i) = sum_j (1 - |tag_i.edges ∩ tag_j.edges| / |tag_i.edges ∪ tag_j.edges|), where tag_j ranges over every tag detected for any phrase in the input sentence.
**cosine_similarity**: I use S-BERT to embed the description of each tag candidate and the input sentence, then take the cosine similarity between the two vectors.
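
To make the three definitions above concrete, here is a minimal sketch of how they could be computed. It assumes each tag's edges are available as a Python set of neighbouring item IDs, and that the embeddings are already plain lists of floats (for example, produced by an S-BERT model beforehand). The function and parameter names are illustrative and are not part of OpenTapioca.

```python
from math import sqrt


def connection_count(candidate_edges, top_ranked_edges):
    """Sum over phrases j of |E_i ∩ E_hr_j| / |E_hr_j|.

    candidate_edges: set of neighbour IDs of the candidate tag (assumed layout).
    top_ranked_edges: list of edge sets, one per highest-ranked tag of a phrase.
    """
    total = 0.0
    for top_edges in top_ranked_edges:
        if top_edges:  # skip tags without edges to avoid dividing by zero
            total += len(candidate_edges & top_edges) / len(top_edges)
    return total


def hop_count(candidate_edges, other_tags_edges):
    """Sum over tags j of the Jaccard distance 1 - |E_i ∩ E_j| / |E_i ∪ E_j|."""
    total = 0.0
    for edges in other_tags_edges:
        union = candidate_edges | edges
        if union:  # two empty edge sets contribute nothing
            total += 1.0 - len(candidate_edges & edges) / len(union)
    return total


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Toy data: the item IDs below are made up for illustration.
candidate = {"Q1", "Q2", "Q3"}
others = [{"Q2", "Q3", "Q4", "Q5"}, {"Q9"}]
conn = connection_count(candidate, others)
hops = hop_count(candidate, others)
sim = cosine_similarity([1.0, 0.0], [1.0, 1.0])
```

Note that under this reading hop_count grows with the number of detected tags in the sentence, so it may be worth dividing by the number of tags before feeding it to the classifier.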
I also used an **XGBoost ranker** (learning to rank) instead of the SVM classifier.
None of these changes increased the F1 score.
Do you have any suggestions for me?
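
One difference from the classifier setup is that a ranker only orders the candidates of each mention, so the scores still have to be turned into accept/reject decisions before F1 can be measured. The sketch below shows one possible way to do that in plain Python. It assumes the ranker's output is a list of `(mention_id, candidate_id, score)` rows; the `min_score` cut-off and all names are hypothetical and not taken from OpenTapioca or XGBoost.

```python
def select_top_candidates(rows, min_score=None):
    """Keep the best-scoring candidate per mention.

    rows: iterable of (mention_id, candidate_id, score) tuples (assumed layout).
    min_score: optional cut-off; mentions whose best score is lower are dropped.
    """
    best = {}
    for mention_id, candidate_id, score in rows:
        if mention_id not in best or score > best[mention_id][1]:
            best[mention_id] = (candidate_id, score)
    return {
        mention: candidate
        for mention, (candidate, score) in best.items()
        if min_score is None or score >= min_score
    }


def f1_score(predicted, gold):
    """F1 over mention -> item mappings (both plain dicts)."""
    true_pos = sum(1 for m, c in predicted.items() if gold.get(m) == c)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy scores: "m2" is a spurious mention with no gold annotation.
rows = [("m1", "Q1", 0.9), ("m1", "Q2", 0.4), ("m2", "Q7", 0.2)]
gold = {"m1": "Q1"}
f1_without_cutoff = f1_score(select_top_candidates(rows), gold)
f1_with_cutoff = f1_score(select_top_candidates(rows, min_score=0.5), gold)
```

In this toy case the top-1 choice for `m1` is correct either way, but without a cut-off the spurious `m2` prediction lowers precision, which is one way a better ranking can fail to show up as a better F1.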
