[R] add binding name as class #3993
|
That is good but shouldn't we also provide a few stub member functions for the classes? And/or to make our life easier make this 'one superclass' (with catch-all implementation, maybe one nicer than one Also, what is the list of generated classes? Do we need Adding the one-liner is good but does not move the needle that far. |
We can drop and have the following
# add subclass and class
class(res) <- c("knn", "mlpack")
class(res)
# [1] "knn" "mlpack" |
|
Sure. But we have no methods for classes To make this concrete, run
> class(cl2) <- c("kMeans", "mlpack")
> str(cl2)
List of 2
$ clusters: int 3
$ result : num [1, 1:31] 1 1 1 1 2 2 1 2 2 2 ...
- attr(*, "class")= chr [1:2] "kMeans" "mlpack"
> print(cl2)
$clusters
[1] 3
$result
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20] [,21] [,22] [,23] [,24] [,25] [,26] [,27] [,28] [,29] [,30] [,31]
[1,] 1 1 1 1 2 2 1 2 2 2 2 2 2 1 2 2 2 2 2 1 2 2 2 2 0 0 0 0 0 0 0
attr(,"class")
[1] "kMeans" "mlpack"
>
So I see your point. It behaves like a |
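To make the "no methods" point concrete, here is a minimal sketch of what a catch-all `print` method on the `mlpack` superclass could look like. The method body, the output layout, and the mock `cl2` object are all assumptions for illustration; none of this is part of mlpack.

```r
# Hypothetical catch-all print method for the "mlpack" superclass.
print.mlpack <- function(x, ...) {
  cat("<mlpack binding output:", class(x)[1], ">\n")
  # Summarise each component instead of dumping it in full.
  for (nm in names(x)) {
    cat(" $", nm, ": ", class(x[[nm]])[1], " [", length(x[[nm]]), "]\n", sep = "")
  }
  invisible(x)
}

# Mock object standing in for real binding output (values are made up).
cl2 <- structure(
  list(clusters = 3L, result = matrix(c(1, 1, 2, 0), nrow = 1)),
  class = c("kMeans", "mlpack", "list")
)

# No print.kMeans exists, so dispatch falls through to print.mlpack.
print(cl2)
```

Because the method sits on the shared superclass, every binding would get a compact print without per-binding code, and a binding could still override it later with its own method.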
|
I'm less worried about the hierarchy, e.g. From what I recall, the original issue was we were just returning: By salting the object with properties, downstream packages can take advantage using:
is_knn_mlpack <- function(x) {
inherits(x, "knn") && inherits(x, "mlpack")
}
That said, the current modus operandi makes ample use of
model_serialization_function <-
switch(attributes(model)$type,
"GaussianKernel" = SerializeGaussianKernelPtr, # Populated by a print c++ binding
...
)
https://github.com/cran/mlpack/blob/956bbce2f3f444aa8adff8cd91eab31a3088f0de/R/serialization.R#L9
So, I see this being more beneficial for |
|
Methods will be the next step; the rationale was provided at the opening of #3722. I was looking for a native serialisation method back then (we don't have one), and the specific routines are not exported either; on top of that, the model outputs are mere
# New S3 generic for native serialisation
serialise_mlpack <- function(object, ...) {
UseMethod("serialise_mlpack")
}
# Optional
serialise_mlpack.default <- function(object, ...) {
stop(sprintf("No method for object %s. See ?serialise_mlpack for details.",
sQuote(deparse(substitute(object)))))
}
# method for KNNModel
serialise_mlpack.knn <- function(object, ...) {
mlpack:::SerializeKNNModelPtr(object$output_model)
} |
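A self-contained sketch of how that dispatch would behave, runnable without mlpack installed. The `knn` method here is a stand-in that calls base `serialize()` instead of the unexported C++ routine, and both objects are mocks; these are assumptions for illustration only.

```r
# Generic and default mirror the proposal above.
serialise_mlpack <- function(object, ...) UseMethod("serialise_mlpack")

serialise_mlpack.default <- function(object, ...) {
  stop("No serialise_mlpack() method for class ", sQuote(class(object)[1]))
}

# Stand-in method: a real one would call the unexported C++ serialiser.
serialise_mlpack.knn <- function(object, ...) {
  serialize(object$output_model, connection = NULL)  # returns a raw vector
}

# Mock outputs; "output_model" holds plain data instead of an external pointer.
knn_out <- structure(list(output_model = 1:3), class = c("knn", "mlpack", "list"))
pca_out <- structure(list(output = diag(2)), class = c("pca", "mlpack", "list"))

raw_model <- serialise_mlpack(knn_out)                # dispatches to the knn method
res <- try(serialise_mlpack(pca_out), silent = TRUE)  # no pca method: default errors
```

Bindings without a serialisable model fall through to the default method, so the error message is produced in one place rather than in a hand-maintained `switch()`.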
|
Yes, that could be reduced to:
# Using serialise_mlpack() method internally
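Serialize <- function(object, filename) {
  # Dispatch on the binding class to obtain the serialised model
  model_serialization <- serialise_mlpack(object)
  con <- file(as.character(filename), "wb")
  on.exit(close(con))  # close the connection even if serialize() fails
  serialize(model_serialization, con)
}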
res2 <- mlpack::knn(query = x, reference = x, k = 3)
Serialize(res2, filename = "Hello.rds")
But we still need to auto-generate
serialise_mlpack.knn <- function(object, ...) {
mlpack:::SerializeKNNModelPtr(object$output_model)
}
Unless we do it directly in |
|
@cgiachalis @eddelbuettel @coatless are we happy with the change here? From the mlpack/C++ side it looks good to me, and I agree in principle that returning a class can be useful for exactly the inheritance-based reasons proposed. I have a couple comments code-wise; up a bit higher in So, that's what's used to print the documentation about what the output is. But now we are returning a class, not a list; does it make sense in the R lexicon to change this to something like We should also add a note to |
We're still returning a list structure, but we're adding extra elements as sub-classes; as @coatless wrote the hierarchy will be binding > mlpack > list.
# before
class(res)
# [1] "list"
# after
class(res)
# [1] "knn" "mlpack" "list"
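A short sketch of what that hierarchy means for users, using a mock object in place of real `mlpack::knn()` output (the component names and values are placeholders):

```r
# Mock of a binding result; real output would come from mlpack::knn().
res <- structure(
  list(neighbors = matrix(1:6, nrow = 2), distances = matrix(0, 2, 3)),
  class = c("knn", "mlpack", "list")
)

is.list(res)                                             # still a list underneath
res$neighbors[1, ]                                       # `$` access is unchanged
inherits(res, c("knn", "mlpack", "list"), which = TRUE)  # position of each class
names(unclass(res))                                      # unclass() gives the bare list
```

Existing code that treats the result as a plain list keeps working; the extra class names only matter to S3 dispatch and `inherits()` checks.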
|
|
I agree in the technical sense, as a |
|
Yes definitely, one suggestion is: A list object of class mlpack with several components. |
|
@rcurtin R semantics. While we're now returning several "classes" instead of a base list, users are still interacting with it as a list: accessing components via Given that, I think we could modify the documentation to: This accurately describes both the user-facing list interface and the added class structure. The Agree on the |
We might want to keep mlpack because we're adding classes to all exported objects: So a function like "preprocess_binarize" "mlpack" "list" |
|
(Just wanted to note that for faster iteration / testing we could work all this out in an ad-hoc throw-away package with just R code that just adds S3 wraps around what the real |
|
https://github.com/cgiachalis/mlpack-test That's the package that was generated from the branch https://github.com/cgiachalis/mlpack/tree/3722-r-add-binding-name-as-class |
|
That was fast. I was even thinking way skinnier, i.e.
x <- mlpack::mlpack_some_model_here(some_args) # bootstrap an object
class(x) <- c("foo", "bar", "mlpack", "list") # as needed
But I did not make myself very clear. Sorry 'bout that. |
That's OK. But you can peruse and see what has been generated. |
|
Sounds good, whenever things are ready for a review here I will happily oblige. 👍 The deeper R details will be lost on me. |
PR Summary
This PR modifies Examples:
# Serialisable model
mlpack::adaboost()
# Add binding name as class to the output.
class(out) <- c("adaboost", "mlpack", "list")
# Non-serialisable model
mlpack::pca()
# Add binding name as class to the output.
class(out) <- c("pca", "mlpack", "list")
For more outputs see the generated package from In the above examples, A quick demonstration: it is now easier to create an S3 method to extract the
# model_type generic (needed for dispatch) and method
model_type <- function(object, ...) UseMethod("model_type")
model_type.mlpack <- function(object, ...) {
  attr(object$output_model, "type")
}
# -- -- --
res <- mlpack::knn(query = x, reference = x, k = 3)
model_type(res)
# [1] "KNNModel"
Documentation (todo)
As per @coatless's suggestion with a small change:
Unit test (todo)
res <- mlpack::knn(query = x, reference = x, k = 3)
testthat::expect_s3_class(res, c("knn", "mlpack", "list"))
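One thing worth checking in that test: as far as I understand `expect_s3_class()`, passing a vector of class names succeeds when the object inherits from any one of them, unless `exact = TRUE` is set. A base-R sketch of the difference, using mock objects rather than real binding output:

```r
res <- structure(list(), class = c("knn", "mlpack", "list"))
expected <- c("knn", "mlpack", "list")

# Loose check: TRUE if res inherits from at least one of the names.
loose <- inherits(res, expected)

# Strict check: the full class vector, in order.
strict <- identical(class(res), expected)

# An object missing the binding class passes the loose check only.
partial <- structure(list(), class = c("mlpack", "list"))
c(loose = inherits(partial, expected),
  strict = identical(class(partial), expected))
```

If the intent is to pin down the whole hierarchy, the strict form (or `exact = TRUE`, assuming that argument behaves as described) is the safer assertion.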
|
|
I am losing my marbles. The PR is over file src/mlpack/bindings/R/print_R.cpp but no such file is in the one-off repo we created to look at / extend the PR. Am I forgetting how the code generator works? |
|
Checking at CRAN Is it because they're not listed in mlpack/src/mlpack/bindings/R/CMakeLists.txt Lines 203 to 216 in cb9fa41
|
|
Maybe because CRAN complains about use of |
|
No worries on the marbles, mine are long gone too. Quick refresher:
So, whatever you are doing in the test package, I assume you will get the R files looking like you want, and then following that we will make the appropriate changes here such that the generated R files match the hand-crafted ones. |
|
I think this should largely be fine as-is. The main component is tweaking the underlying class names, e.g.
# Present
class(out) <- c("list")
To:
# Proposed
class(out) <- c("adaboost", "mlpack", "list")
The main change for this is maybe namespacing model name and splitting this to be a specific binding, e.g.
# Alternative Proposal.
class(out) <- c("mlpack_adaboost", "mlpack_model_binding", "list")
Again, to reiterate, it's fine to keep the "Proposed" version that is contained in this PR. |
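For comparison, the two naming schemes can be sketched with a small helper. `mlpack_class()` is a hypothetical name used only for illustration; it is not part of this PR or of mlpack.

```r
# Hypothetical helper: build the class vector for a binding under either scheme.
mlpack_class <- function(binding, namespaced = FALSE) {
  if (namespaced) {
    c(paste0("mlpack_", binding), "mlpack_model_binding", "list")
  } else {
    c(binding, "mlpack", "list")
  }
}

mlpack_class("adaboost")                     # "Proposed" scheme
mlpack_class("adaboost", namespaced = TRUE)  # "Alternative Proposal" scheme
```

The prefixed names reduce the chance that a short class such as `knn` picks up S3 methods registered by another package for a class of the same name.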
|
Namespacing is a good idea! |
|
Is this one ready and waiting for review, or is there more to do? Sorry if I should have reviewed it and dropped the ball. |
I'm promoting @cgiachalis' branch changes discussed in the issue ticket #3722 over to a PR for ease of testing.