llda.py: question on output #7

@christofs

Hello Shuyo,

Thanks for putting this code online! I got it to work with my own data, in which each text has 5 labels (some unique to one text, others recurring across texts). Here is the start of one line:
[Ponson,criminel,1850s,ExploitsRocambole1,rp169] brick commerce route nœud heure temps brise ...
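To make the input format concrete, here is a minimal sketch of how such a line can be split into its labels and its words. It assumes the labels are comma-separated inside the leading square brackets and the words are whitespace-separated; `parse_labeled_line` is a hypothetical helper written for illustration, not the loader that llda.py itself uses.

```python
import re

# Assumed format: "[label1,label2,...] word word word"
LINE_PATTERN = re.compile(r"^\[(.+?)\]\s*(.*)$")

def parse_labeled_line(line):
    """Split one corpus line into (labels, tokens).

    Hypothetical helper: lines without a leading [..] block are
    treated as unlabeled and get an empty label list.
    """
    line = line.strip()
    match = LINE_PATTERN.match(line)
    if match is None:
        return [], line.split()
    labels = [label.strip() for label in match.group(1).split(",")]
    tokens = match.group(2).split()
    return labels, tokens

example = ("[Ponson,criminel,1850s,ExploitsRocambole1,rp169] "
           "brick commerce route nœud heure temps brise")
labels, tokens = parse_labeled_line(example)
print(labels)
print(tokens)
```

Splitting on whitespace rather than on a word-character regex keeps non-ASCII tokens such as "nœud" intact.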

Purely for testing, I used a small collection of just 20 rather long texts, with only 5 iterations and 10 topics (of course this is not enough for serious results). But now I'm not sure how to interpret and further use the output. There seem to be two outputs (lines 145 and 146 of llda.py).

For the first one, I get something like this (just zeros with the occasional 1):
someword 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0

For the second one, I get something like:
someword,1.98242060772e-08,1.35954869509e-07,3.20370912628e-07,2.61753929319e-07,4.15044189755e-07,3.13934090166e-07,4.24199387286e-07,7.01074466728e-07,2.85357018727e-07,6.1751952288e-07,3.20370912628e-07, ...

What is the difference between the two results? Are these per-word scores for each label? The length of each list of scores is identical to the number of labels in "labelmap.keys()"; how do the positions in the list match up with the labels? Or are these values something else?

Thanks for any hints, and best wishes.
