Description
I am trying to download a set of samples selected by metadata. A metadata search with my parameters returns a certain number of samples, but when I pipe those results into 'redbiom fetch' (with a particular context), a different number of samples is downloaded. I see a similar problem when I pipe the search results into 'redbiom summarize contexts': it lists contexts, some of which are associated with my samples and some of which are not, so I have to guess which one to use for fetching. So I have two questions: 1) How can I see only the contexts associated with my searched samples? 2) How can I fetch only the samples matched by my metadata search? The steps below illustrate the problem behind question 2.
Looking for marine water samples within the EMP
% redbiom search metadata "where qiita_study_id == 13114 and empo_4 == 'Water (saline)'" | wc -l
39
Defining a context based on previous search results (it took several attempts to find one that worked)
% echo $CTX
Deblur_2021.09-Illumina-16S-V4-150nt-ac8c0b
Fetching samples based on metadata and context
% redbiom search metadata "where qiita_study_id == 13114 and empo_4 == 'Water (saline)'" | redbiom fetch samples --context $CTX --output EMP_marine_samples.biom
38 sample ambiguities observed. Writing ambiguity mappings to: EMP_marine_samples.biom.ambiguities
Data summary shows many more samples than metadata search originally found
% biom summarize-table -i EMP_marine_samples.biom | head
Num samples: 97
Num observations: 16,547
Total count: 1,354,853
Table density (fraction of non-zero values): 0.030
Counts/sample summary:
Min: 4,111.000
Max: 38,769.000
Median: 12,268.000
Mean: 13,967.557
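To narrow down where the extra samples come from, the two ID lists could be compared directly. The following is only a minimal sketch: `extra_ids` is a hypothetical helper, the file names are placeholders, and the commented-out `biom table-ids` call is an assumption about how the sample IDs might be dumped from the table.

```shell
# extra_ids WANTED GOT
# Hypothetical helper: prints every ID that appears in GOT but not in WANTED.
# Both arguments are plain-text files with one sample ID per line.
extra_ids() {
    wanted_sorted=$(mktemp)
    got_sorted=$(mktemp)
    # comm requires sorted input; sort -u also collapses duplicate IDs
    sort -u "$1" > "$wanted_sorted"
    sort -u "$2" > "$got_sorted"
    # -1 drops lines unique to WANTED, -3 drops lines common to both,
    # leaving only the lines unique to GOT
    comm -13 "$wanted_sorted" "$got_sorted"
    rm -f "$wanted_sorted" "$got_sorted"
}

# Possible usage (assumed commands and placeholder file names):
#   redbiom search metadata "where ..." > searched_ids.txt
#   biom table-ids -i EMP_marine_samples.biom > fetched_ids.txt
#   extra_ids searched_ids.txt fetched_ids.txt | wc -l
```

If the IDs in the fetched table are written differently from the searched IDs (which may be the case given the reported ambiguities), an exact comparison like this would list every fetched sample as extra; the EMP_marine_samples.biom.ambiguities file might then be needed to map them back.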