Fig. 5
From: Chimeric mis-annotations of genes remain pervasive in eukaryotic non-model organisms
Some ‘uncharacterised’ proteins in the UniRef50 database show characteristic patterns of mis-annotation. A sequence search was run with mmseqs easy-search, using the UniRef50 database as the ‘query’ and the SwissProt database as the ‘target’. The UniRef50 database was then filtered to keep only sequences with ‘uncharacterized’ in their name, and the same search was repeated with SwissProt as the ‘target’ database. The query coverage and target coverage of each hit are shown as 2D binned histograms. A and B show all hits above the threshold for the analysed genes. C and D show the mean values of the hits summarised for each gene.
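The binning described in this caption can be illustrated with a minimal sketch. It is not the authors' code. It assumes the search results were exported as a tab-separated table with four columns (query, target, query coverage, target coverage, with coverage as a fraction between 0 and 1), and it treats each query sequence as one 'gene'. The function names, the bin count and the three example rows are hypothetical and chosen only for illustration.

```python
import csv
import io
from collections import defaultdict


def read_hits(handle):
    """Parse tab-separated rows of: query, target, qcov, tcov (assumed layout)."""
    reader = csv.reader(handle, delimiter="\t")
    return [(q, t, float(qc), float(tc)) for q, t, qc, tc in reader]


def bin_index(value, n_bins):
    # Coverage lies in [0, 1]; a value of exactly 1.0 falls in the last bin.
    return min(int(value * n_bins), n_bins - 1)


def histogram_2d(pairs, n_bins=10):
    """Count (query coverage, target coverage) pairs in an n_bins x n_bins grid."""
    counts = [[0] * n_bins for _ in range(n_bins)]
    for qcov, tcov in pairs:
        counts[bin_index(qcov, n_bins)][bin_index(tcov, n_bins)] += 1
    return counts


def mean_per_query(hits):
    """Average query and target coverage over all hits of each query."""
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for query, _target, qcov, tcov in hits:
        entry = sums[query]
        entry[0] += qcov
        entry[1] += tcov
        entry[2] += 1
    return {q: (s[0] / s[2], s[1] / s[2]) for q, s in sums.items()}


# Hypothetical example rows, not data from the study.
EXAMPLE_TSV = (
    "geneA\tP1\t0.95\t0.40\n"
    "geneA\tP2\t0.85\t0.20\n"
    "geneB\tP3\t1.00\t1.00\n"
)

hits = read_hits(io.StringIO(EXAMPLE_TSV))

# One count per hit (the 'all hits' view).
all_hits_hist = histogram_2d([(qc, tc) for _, _, qc, tc in hits], n_bins=4)

# One count per query, using the mean coverage of its hits (the 'per gene' view).
per_gene_means = mean_per_query(hits)
per_gene_hist = histogram_2d(per_gene_means.values(), n_bins=4)
```

In this sketch, a query that is almost fully covered by hits spanning only a small part of each target lands in the high query coverage, low target coverage corner of the grid, while a full-length match lands in the opposite corner. Averaging per query gives each gene a single count regardless of how many hits it has.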