Hello Shuyo,
Thanks for putting this code online! I got it to work with my own data, in which each text carries 5 labels (some unique to one text, others recurring across texts), as in the following start of a line:
[Ponson,criminel,1850s,ExploitsRocambole1,rp169] brick commerce route nœud heure temps brise ...
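To make the input format explicit, here is a minimal sketch of how I understand such a line to be split into labels and words. The function name `parse_line` and the regular expression are my own assumptions for illustration; they are not taken from the repository's code.

```python
import re

def parse_line(line):
    """Split '[label1,label2,...] word word ...' into (labels, words).

    Hypothetical helper for illustration only; not the repository's loader.
    """
    match = re.match(r"\[(.+?)\]\s*(.*)", line)
    if match is None:
        # No bracketed label prefix: treat the whole line as words.
        return [], line.split()
    labels = [label.strip() for label in match.group(1).split(",")]
    words = match.group(2).split()
    return labels, words

# Shortened version of the sample line above.
labels, words = parse_line(
    "[Ponson,criminel,1850s,ExploitsRocambole1,rp169] brick commerce route"
)
print(labels)
print(words)
```

With this reading, every text contributes five entries to the overall label set, and recurring labels such as the author name are shared between texts.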
Purely for testing, I used a small collection of just 20 rather long texts, and used only 5 iterations and 10 topics (of course this is not enough for serious results). But now I'm not sure how to interpret and further use the output. There seem to be two outputs (lines 145 and 146).
For the first one, I get something like this (just zeros with the occasional 1):
someword 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
For the second one, I get something like:
someword,1.98242060772e-08,1.35954869509e-07,3.20370912628e-07,2.61753929319e-07,4.15044189755e-07,3.13934090166e-07,4.24199387286e-07,7.01074466728e-07,2.85357018727e-07,6.1751952288e-07,3.20370912628e-07, ...
What is the difference between the two results? Are these per-word scores for each label? The list of scores has the same length as "labelmap.keys()", the list of labels; how do the positions in the one match up with the labels in the other? Or are these values something else?
Thanks for any hints, and best wishes.