
Possible issue with ldaFunction when class variable is not coded as 0/1 #88

@nnmbr

Description

Hello,

I am a beginner in bioinformatics and still learning the standard practices, so I may be missing something, but I wanted to report a behavior I observed when using lefser’s ldaFunction.

When the class variable is a factor with two levels (e.g., "CL_B" and "CL_A") instead of numeric values 0 and 1, the calculation of effect_size appears to fail.
In the code, effect_size is computed as:

effect_size <- abs(mean(LD[data[,"class"] == 1]) - mean(LD[data[,"class"] == 0]))

This seems to assume that the class values are literally 1 and 0.
If they are not (for example, factors or characters), the comparison data[,"class"] == 1 always returns FALSE, resulting in an effect_size of 0 or NA, and thus invalid LDA results.
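As a minimal base-R sketch of the comparison described above: the toy vectors `cls` and `LD` are invented for illustration and are not taken from lefser, whose internal data layout may differ. For a plain factor vector, comparing against `1` tests the labels rather than the underlying integer codes, so nothing is selected and `mean()` of the empty selection is `NaN`.

```r
# Toy data (hypothetical): a two-level factor and matching discriminant scores
cls <- factor(c("CL_A", "CL_A", "CL_B", "CL_B"))
LD  <- c(-1.2, -0.8, 0.9, 1.1)

# Comparing a factor with 1 compares its labels, not its integer codes
hits <- cls == 1
print(hits)

# The mean of an empty selection is NaN, so the difference is NaN as well
naive <- abs(mean(LD[cls == 1]) - mean(LD[cls == 0]))
print(naive)

# Comparing against the factor levels selects the intended groups
by_level <- abs(mean(LD[cls == levels(cls)[2]]) - mean(LD[cls == levels(cls)[1]]))
print(by_level)
```

Whether this is exactly what happens inside `ldaFunction` depends on how the class column is stored at that point, which this sketch does not reproduce.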

The main lefser function does not seem to convert character/factor class labels into numeric values before passing them to ldaFunction. The class is converted to a factor but then added as-is to the data frame, so effect_size is not calculated as intended.
I could fix this locally by recoding the classes to numeric (0 and 1), but I just wanted to check if:

- This behavior is expected (and I missed it in the documentation), or
- It would make sense to update the function to handle factors directly or to issue a warning.
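One way the second option could look, as a rough sketch only: `recode_binary_class` is a hypothetical helper, not an existing lefser function, and it assumes the first factor level should map to 0 and the second to 1.

```r
# Hypothetical helper (not part of lefser): map a two-level class to 0/1
recode_binary_class <- function(x) {
  if (is.numeric(x) && all(x %in% c(0, 1))) {
    return(x)  # already coded as 0/1, nothing to do
  }
  f <- as.factor(x)
  if (nlevels(f) != 2L) {
    stop("class must have exactly two levels, found ", nlevels(f))
  }
  # Tell the user which label became which code
  warning("recoding class: '", levels(f)[1], "' -> 0, '", levels(f)[2], "' -> 1")
  as.integer(f) - 1L  # factor codes are 1 and 2, so subtracting 1 gives 0 and 1
}

# Usage on labels like the ones in this report (warning suppressed for brevity)
recoded <- suppressWarnings(recode_binary_class(factor(c("OCE22", "OCE22", "OCE23"))))
print(recoded)
```

Emitting a warning rather than recoding silently keeps the mapping visible to the user, since the level order decides which group is treated as 0.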

Thank you very much for your work on this package, and apologies if I am overlooking a standard workflow here.

str(relab_sub_t_df$class)
Factor w/ 2 levels "OCE22","OCE23": 1 1 1 1 1 1 1 1 1 1 ...

test_data <- relab_sub_t_df
test_data$class <- ifelse(test_data$class == "OCE22", 0, 1)

str(test_data$class)
num [1:24] 0 0 0 0 0 0 0 0 0 0 ...

A <- ldaFunction(test_data)
B <- ldaFunction(relab_sub_t_df)

mean( max(abs(A - B)) / abs(mean(c(A,B))) )
[1] 0.02602692
