Hello,
I am having an issue while running unicore:
```
unicore createdb -g 012OMARK/AllProteomes/ 013UNICORE/proteome_db prostt5/weights/
Using device: cuda:0
Loading T5 from: prostt5/weights/
You are using the default legacy behaviour of the <class 'transformers.models.t5.tokenization_t5.T5Tokenizer'>. This is expected, and simply means that the `legacy` (previous) behavior will be used so nothing changes for you. If you want to use the new behaviour, set `legacy=False`. This should only be set if you understand what it means, and thoroughly read the reason why this was added as explained in https://github.com/huggingface/transformers/pull/24565
/home/jg2070/miniforge3/envs/unicore/etc/predict_3Di_encoderOnly.py:175: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
  state = torch.load(checkpoint_p, map_location=device)
Using models in full-precision.
########################################
Example sequence: unicore_2880843
MQNNCFKIATLCMEPPKFDFEMVLERKRLKDKQKLLKQYRLLEGFVGPTVGTTVTGTNTDIGEADADGGPQEGTTAESDASTQETTEKFTVEEFKDLRRAEGVEDYDDYDFSGELTDDDYIEN
########################################
Total number of sequences: 4952781
Average sequence length: 454.02570131810796
Number of sequences >1000: 397046
  0%| | 0/4952781 [00:00<?, ?it/s]
RuntimeError during embedding for unicore_2793106 (L=315894)
  0%| | 1/4952781 [00:02<2867:45:30, 2.08s/it]
RuntimeError during embedding for unicore_2022362 (L=117904)
  0%| | 2/4952781 [00:02<1664:13:29, 1.21s/it]
RuntimeError during embedding for unicore_263579 (L=113479)
  0%| | 3/4952781 [00:02<1047:25:31, 1.31it/s]
```
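The failing sequences are all extremely long (over 100k residues), so as a stopgap I was considering pre-filtering the FASTA input by length before running createdb. A minimal sketch of what I mean (the parser, file handling, and cutoff below are my own assumptions, not part of unicore):

```python
# Sketch: keep only sequences up to a length cutoff before embedding.
# MAX_LEN is a hypothetical threshold; the sequences that fail here
# are all >100k residues, far above typical protein lengths.
MAX_LEN = 10000

def read_fasta(lines):
    """Minimal FASTA parser: yields (header, sequence) pairs."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def filter_fasta(records, max_len=MAX_LEN):
    """Yield only records whose sequence is at most max_len residues."""
    for header, seq in records:
        if len(seq) <= max_len:
            yield header, seq
```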
Some details of the GPU I am using:
```python
>>> torch.cuda.get_device_name(0)
'NVIDIA A100-SXM4-80GB'
>>> device = torch.device("cuda:0")
>>> total_memory = torch.cuda.get_device_properties(device).total_memory
>>> total_memory
84974239744
>>> print(f"Total GPU Memory: {total_memory / (1024 ** 3):.2f} GB")
Total GPU Memory: 79.14 GB
```
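For context on why these lengths might fail even with 80 GB: if the encoder materializes a full L × L attention score matrix per head, memory grows quadratically with sequence length. A rough back-of-envelope calculation (the head count and dtype size are illustrative assumptions, not ProstT5's actual configuration):

```python
# Back-of-envelope estimate of the per-layer attention score memory for a
# sequence of length L. n_heads and bytes_per_float are assumed values
# for illustration only.
def attn_matrix_bytes(seq_len, n_heads=32, bytes_per_float=4):
    """Bytes needed for the L x L attention scores across all heads of one layer."""
    return n_heads * seq_len * seq_len * bytes_per_float

GIB = 1024 ** 3
for L in (315894, 117904, 1000):
    # The two failing lengths exceed the 80 GB card by orders of magnitude;
    # a typical ~1000-residue protein is tiny by comparison.
    print(f"L={L}: {attn_matrix_bytes(L) / GIB:.1f} GiB")
```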
Should I perhaps try to compile foldseek with CUDA support and run it with `--use-foldseek` instead?
All the best