Possible issue with ldaFunction when class variable is not coded as 0/1

Hello,

I am a beginner in bioinformatics and still learning the standard practices, so I may be missing something, but I wanted to report a behavior I observed when using lefser’s ldaFunction.

When the class variable is a factor with two levels (e.g., "CL_B" and "CL_A") instead of numeric values 0 and 1, the calculation of effect_size appears to fail.
In the code, effect_size is computed as:

effect_size <- abs(mean(LD[data[,"class"] == 1]) - mean(LD[data[,"class"] == 0]))

This seems to assume that the class values are literally 1 and 0.
If they are not (for example, factors or characters), the comparison data[,"class"] == 1 is FALSE for every sample, so both subsets are empty and effect_size ends up as NA/NaN (or 0), which invalidates the LDA results.
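
To illustrate what I mean, here is a minimal sketch in plain R. It does not call lefser at all: the LD scores and the class labels are made-up toy values, and only the effect_size expression is copied from the package code quoted above.

```r
# Toy data (assumption: invented for illustration, not taken from lefser).
class_fac <- factor(c("CL_A", "CL_A", "CL_B", "CL_B"))
LD <- c(-1.0, -0.8, 0.9, 1.1)

# Comparing a factor with a number compares against the level labels,
# so no element matches and both logical masks are all FALSE.
hits_one  <- class_fac == 1
hits_zero <- class_fac == 0

# Both subsets are empty; mean() of an empty numeric vector is NaN,
# so the resulting effect size is NaN as well.
effect_size_fac <- abs(mean(LD[hits_one]) - mean(LD[hits_zero]))

# The same expression with a 0/1 numeric coding behaves as intended.
class_num <- as.integer(class_fac) - 1L   # CL_A -> 0, CL_B -> 1
effect_size_num <- abs(mean(LD[class_num == 1]) - mean(LD[class_num == 0]))

print(any(hits_one))      # FALSE
print(effect_size_fac)    # NaN
print(effect_size_num)
```

Whether ldaFunction reaches this expression with the factor still intact depends on how the data frame is handled earlier in the function, which I have not traced in full.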

The main lefser function does not seem to automatically convert character/factor class labels into numeric values before passing them to ldaFunction. The class is converted to a factor but then added as-is to the dataframe, which leads to effect_size not being calculated as intended.
I could fix this locally by recoding the classes to numeric (0 and 1), but I just wanted to check whether (1) this behavior is expected (and I missed it in the documentation), or (2) it would make sense to update the function to handle factors directly or issue a warning.
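
For the second option, something along these lines is what I had in mind. This is only a sketch: recode_class_01 is a hypothetical helper name of my own, not an existing lefser function, and mapping the first factor level to 0 is an assumption about the desired convention.

```r
# Hypothetical helper (not part of lefser): map a two-level class vector to 0/1.
recode_class_01 <- function(x) {
  # Leave the input alone if it is already numeric 0/1.
  if (is.numeric(x) && all(x %in% c(0, 1))) {
    return(x)
  }
  # Treat anything else as a factor and drop unused levels.
  f <- droplevels(as.factor(x))
  if (nlevels(f) != 2L) {
    stop("class must have exactly two levels, found ", nlevels(f))
  }
  # Tell the user which label became which code (assumed: first level -> 0).
  warning("Recoding class: '", levels(f)[1], "' -> 0, '", levels(f)[2], "' -> 1")
  as.integer(f) - 1L
}

# Usage with labels like the ones in my data.
cls <- factor(c("OCE22", "OCE22", "OCE23"))
recoded <- suppressWarnings(recode_class_01(cls))
print(recoded)
```

A warning rather than a silent conversion seemed safer to me, since the choice of which label becomes 0 is arbitrary.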

Thank you very much for your work on this package, and apologies if I am overlooking a standard workflow here.


> str(relab_sub_t_df$class)
 Factor w/ 2 levels "OCE22","OCE23": 1 1 1 1 1 1 1 1 1 1 ...
> 
> test_data <- relab_sub_t_df
> test_data$class <- ifelse(test_data$class == "OCE22", 0, 1)
>
> str(test_data$class)
 num [1:24] 0 0 0 0 0 0 0 0 0 0 ...
> 
> 
> A <- ldaFunction(test_data)
> B <- ldaFunction(relab_sub_t_df)
>
> mean(  max(abs(A - B)) / abs(mean(c(A,B)))  )
[1] 0.02602692
