Molly is a large language model (LLM) that integrates multiple encoders and can understand multi-omics sequence data (DNA, RNA, and protein).
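Omics-Specific Models (OSMs) are the best-performing specialized models in their respective omics domains. Enc-Head is a simple "omics encoder + classification head" architecture that connects a pretrained encoder directly to a task-specific classification head.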
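The Enc-Head idea can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: `toy_encoder` is a hypothetical stand-in (normalized k-mer frequencies) for a real pretrained encoder such as nucleotide-transformer or ESM-2, and `ClassificationHead` and `EncHead` are illustrative names, not classes from this repository.

```python
import math
from itertools import product

NUCLEOTIDES = "ACGT"


def toy_encoder(sequence, k=2):
    """Hypothetical stand-in for a pretrained omics encoder.

    Maps a DNA sequence to a fixed-size vector of normalized k-mer frequencies.
    """
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        if kmer in counts:
            counts[kmer] += 1
    total = sum(counts.values()) or 1  # avoid division by zero on short input
    return [counts[kmer] / total for kmer in kmers]


class ClassificationHead:
    """A single linear layer followed by a softmax (illustrative only)."""

    def __init__(self, in_dim, num_classes):
        # Deterministic small weights so the example is reproducible;
        # a real head would be trained on the downstream task.
        self.weights = [
            [0.01 * ((i + j) % 5 - 2) for i in range(in_dim)]
            for j in range(num_classes)
        ]
        self.bias = [0.0] * num_classes

    def __call__(self, features):
        logits = [
            sum(w * x for w, x in zip(row, features)) + b
            for row, b in zip(self.weights, self.bias)
        ]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(z - peak) for z in logits]
        norm = sum(exps)
        return [e / norm for e in exps]


class EncHead:
    """Encoder output is fed directly into the classification head."""

    def __init__(self, encoder, head):
        self.encoder = encoder
        self.head = head

    def predict_proba(self, sequence):
        return self.head(self.encoder(sequence))


model = EncHead(toy_encoder, ClassificationHead(in_dim=16, num_classes=2))
probs = model.predict_proba("ACGTACGTTAGC")
```

In a real setup the encoder would be a pretrained network and the head would be trained on labeled task data; the point here is only that nothing sits between the two components.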
- Base Model: Enhanced Qwen3 with nucleotide-transformer and ESM-2 encoders
- Optimization: Supports Liger-Kernel and FlashAttention for a 100% training speedup; see the example script:
```bash
./scripts/infer/inference_nt_lora.sh
```
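For readers unfamiliar with these two optimizations, the following is a rough sketch of how they are commonly enabled for a plain Qwen3 checkpoint in Python. It is not taken from this repository's scripts: `load_optimized_model` is a hypothetical helper, `apply_liger_kernel_to_qwen3` is assumed to be available in the installed Liger-Kernel version (check its documentation), and Molly's own loading code may differ because of its additional encoders.

```python
def load_optimized_model(model_path):
    """Hypothetical helper: load a Qwen3 causal LM with Liger-Kernel and FlashAttention-2.

    Imports are kept inside the function so that defining it does not
    require the GPU libraries to be installed.
    """
    import torch
    from transformers import AutoModelForCausalLM
    # Assumption: this patching function exists in the installed Liger-Kernel release.
    from liger_kernel.transformers import apply_liger_kernel_to_qwen3

    # Patch the Qwen3 modeling code before the model is instantiated.
    apply_liger_kernel_to_qwen3()

    # FlashAttention-2 requires half-precision weights (fp16 or bf16).
    return AutoModelForCausalLM.from_pretrained(
        model_path,
        torch_dtype=torch.bfloat16,
        attn_implementation="flash_attention_2",
    )
```

Calling `load_optimized_model("<path-to-checkpoint>")` needs a CUDA GPU with the `flash-attn` and `liger-kernel` packages installed.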
- Hotfix transformers source code

```python
# transformers/modeling_utils.py
# add 4 lines
if not model._tp_plan:
    model_tp_plan = {}
else:
    model_tp_plan = model._tp_plan
# old code
tp_plan_regex = (
    re.compile("|".join([re.escape(plan) for plan in model_tp_plan]))
    if _torch_distributed_available and torch.distributed.is_initialized()
    else None
)
```
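The effect of this guard can be reproduced outside of transformers. The sketch below assumes the hotfix is needed because `model._tp_plan` can be `None` (the README does not state the exact error); `build_tp_plan_regex` and `distributed_ready` are hypothetical names, the latter standing in for `_torch_distributed_available and torch.distributed.is_initialized()`.

```python
import re


def build_tp_plan_regex(tp_plan, distributed_ready):
    """Hypothetical mirror of the patched logic in modeling_utils.py."""
    # Fall back to an empty plan when the model defines none (None or empty).
    model_tp_plan = tp_plan if tp_plan else {}
    if not distributed_ready:
        return None
    return re.compile("|".join(re.escape(plan) for plan in model_tp_plan))


# Without the guard, iterating over a None plan raises TypeError.
try:
    "|".join(re.escape(plan) for plan in None)
    unguarded_failed = False
except TypeError:
    unguarded_failed = True

# With the guard, a missing plan yields an empty pattern instead of an error.
regex_none = build_tp_plan_regex(None, distributed_ready=True)
regex_plan = build_tp_plan_regex(
    {"layers.*.self_attn.q_proj": "colwise"}, distributed_ready=True
)
```

Here the plan keys are escaped literally, matching what the original expression does with `re.escape`.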
- Run training script

```bash
swanlab login
./scripts/train/run_train.sh
# or for test
./scripts/train/run_train_mini.sh
```

- Fix qwen3_8B + DeepSpeed training getting stuck

Open `/usr/local/lib/python3.10/dist-packages/deepspeed/runtime/bf16_optimizer.py` and edit the code around line 294:

```python
if all_groups_norm <= 0.:
    # lines 295-298 are not shown in the source
    dist.barrier()

if self.clip_grad > 0.:
    clip_tensors_by_global_norm(input_tensors=self.get_grads_for_norm(for_clipping=True),
                                max_norm=self.clip_grad,
                                global_norm=all_groups_norm,
                                mpu=self.mpu,
                                use_graph=self.graph_harvesting)
```
This project is released under the Apache License.