Tags: ropensci/textreuse

v0.1.5

Version 0.1.5

v0.1.4

Version 0.1.4

v0.1.3

Version 0.1.3

v0.1.2

Version 0.1.2

v0.1.1

Version 0.1.1

v0.1.0

Initial CRAN release

v0.0.1.9006

Store minhashes separately from hashes

Keeping minhashes and hashes in the same element of the list that makes
up a TextReuseTextDocument was a mistake. The two are not at all the
same, conceptually. Furthermore, the entire advantage of hashing the
tokens is that the tokens can be discarded. But if one needs to rehash,
the tokens need to stick around or be retokenized, which is the most
expensive part of the process in terms of memory and computation.

This commit adds an element `x$minhashes` to TextReuseTextDocuments,
provides appropriate accessor and existence functions, makes
functions like `lsh()` use only minhashes instead of hashes, and
rewrites documentation and vignettes as appropriate.
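
The design described above can be illustrated with a minimal conceptual sketch. It is written in Python rather than R and is not the package's implementation: the names `shingles`, `make_minhasher`, `Document`, and `build_document` are hypothetical, and the CRC32 hash and XOR-mask minhash are assumptions chosen for brevity. The point it shows is that token hashes and minhashes live in separate fields, and that the tokens themselves are dropped once both have been computed.

```python
import random
import zlib
from dataclasses import dataclass


def shingles(text, n=3):
    """Split text into overlapping word n-grams (the tokens)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]


def make_minhasher(k, seed):
    """Return a function that maps a list of token hashes to k minhashes.

    Assumption: each minhash is the minimum of the token hashes XORed
    with one of k random 32-bit masks drawn from a seeded generator.
    """
    rng = random.Random(seed)
    masks = [rng.getrandbits(32) for _ in range(k)]

    def minhash(hashes):
        if not hashes:
            raise ValueError("cannot minhash a document with no tokens")
        return [min(h ^ m for h in hashes) for m in masks]

    return minhash


@dataclass
class Document:
    # Hashes and minhashes are stored in separate fields; tokens are not kept.
    hashes: list
    minhashes: list


def build_document(text, minhasher, n=3):
    """Tokenize once, derive both hash sets, then let the tokens go."""
    tokens = shingles(text, n)
    hashes = [zlib.crc32(t.encode("utf-8")) for t in tokens]
    return Document(hashes=hashes, minhashes=minhasher(hashes))
```

For example, `build_document("the quick brown fox jumps over the lazy dog", make_minhasher(8, seed=42))` yields a document whose `minhashes` field has eight entries regardless of document length, which is what a function in the role of `lsh()` would consume without needing the tokens again.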

The README is expanded to describe the three main kinds of analysis.

v0.0.1.9005

Toggle v0.0.1.9005's commit message
Version 0.0.1.9005

v0.0.1.9004

Toggle v0.0.1.9004's commit message
Version 0.0.1.9004