
bool category not properly parsed #27

@sjanssen2


Assume I have a metadata category such as infection with the values TRUE or FALSE. If I load the data as in your example, metadata = pd.read_table("data/metadata.tsv", sep="\t", index_col=0), the column is of type object and holds proper boolean values, i.e. True and False. If I add dtype=str, the column is still of type object but holds strings, namely 'TRUE' and 'FALSE'.
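The difference between the two loading modes can be reproduced with a small in-memory table. This is only a sketch: the sample IDs and values are made up and stand in for data/metadata.tsv, which is not shown in this report.

```python
import io

import pandas as pd

# Made-up stand-in for data/metadata.tsv (assumption: tab-separated,
# first column holds the sample IDs).
tsv = "sample\tinfection\ns1\tTRUE\ns2\tFALSE\ns3\tTRUE\n"

# Default loading: pandas converts TRUE/FALSE into booleans.
default_md = pd.read_table(io.StringIO(tsv), sep="\t", index_col=0)

# Loading with dtype=str: the values stay exactly as written in the file.
string_md = pd.read_table(io.StringIO(tsv), sep="\t", index_col=0, dtype=str)

print(default_md["infection"].tolist())
print(string_md["infection"].tolist())
```

In a fully populated toy column like this one, the inferred dtype may be reported as bool rather than object, so the exact dtype can depend on the real file (for example, on missing values) and on the pandas version.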

Only the dtype=str variant works for me. Otherwise, evident raises the following error:

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3628             try:
-> 3629                 return self._engine.get_loc(casted_key)
   3630             except KeyError as err:

~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/pandas/_libs/index.pyx in pandas._libs.index.IndexEngine.get_loc()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

pandas/_libs/hashtable_class_helper.pxi in pandas._libs.hashtable.PyObjectHashTable.get_item()

KeyError: 'infection'

The above exception was the direct cause of the following exception:

KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_1855806/1849832882.py in <module>
      1 for cat in ["birth_timestamp","cage","genotype","infection"]:
----> 2     print(adh.calculate_effect_size(column=cat))

~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/evident/data_handler.py in calculate_effect_size(self, column, difference)
    112         :rtype: evident.results.EffectSizeResult
    113         """
--> 114         if self.metadata[column].dtype != np.dtype("object"):
    115             raise exc.NonCategoricalColumnError(self.metadata[column])
    116 

~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/pandas/core/frame.py in __getitem__(self, key)
   3503             if self.columns.nlevels > 1:
   3504                 return self._getitem_multilevel(key)
-> 3505             indexer = self.columns.get_loc(key)
   3506             if is_integer(indexer):
   3507                 indexer = [indexer]

~/miniconda3/envs/qiime2-2022.8/lib/python3.8/site-packages/pandas/core/indexes/base.py in get_loc(self, key, method, tolerance)
   3629                 return self._engine.get_loc(casted_key)
   3630             except KeyError as err:
-> 3631                 raise KeyError(key) from err
   3632             except TypeError:
   3633                 # If we have a listlike key, _check_indexing_error will raise

KeyError: 'infection'

You might want to raise a more explicit error message in those cases.
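As a rough illustration of what a more explicit message could look like, here is a user-side pre-flight check. The helper name check_categorical_column is hypothetical and not part of evident's API; it only inspects a plain pandas DataFrame before the data is handed to evident.

```python
import pandas as pd


def check_categorical_column(metadata: pd.DataFrame, column: str) -> None:
    """Hypothetical helper (not part of evident): fail early with a clear message."""
    if column not in metadata.columns:
        raise ValueError(f"Column '{column}' not found in metadata.")
    # Collect the type names of all non-missing values that are not strings.
    values = metadata[column].dropna()
    non_strings = sorted({type(v).__name__ for v in values if not isinstance(v, str)})
    if non_strings:
        raise ValueError(
            f"Column '{column}' contains non-string values of type(s) "
            f"{non_strings}; consider loading the metadata with dtype=str."
        )


# Made-up metadata: one boolean column and one string column.
metadata = pd.DataFrame(
    {"infection": [True, False, True], "cage": ["A", "B", "A"]},
    index=["s1", "s2", "s3"],
)

check_categorical_column(metadata, "cage")  # string column: passes silently

try:
    check_categorical_column(metadata, "infection")
except ValueError as err:
    message = str(err)
    print(message)
```

A check of this kind names the offending column and the suggested workaround directly, instead of surfacing a bare KeyError from deep inside pandas.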
