Commit aae01a2

Merge pull request #356 from ruvnet/fix/large-dataset-training
fix: skip triplet JSON export for large datasets (>100K)
2 parents 21fd7c8 + 828d059 commit aae01a2

1 file changed

Lines changed: 7 additions & 3 deletions

scripts/train-ruvllm.js

@@ -1257,9 +1257,13 @@ async function main() {
   contrastiveResult.finalLoss = finalContrastiveLoss;
   contrastiveResult.improvement = contrastiveImprovement;
 
-  // Export contrastive training data
-  const contrastiveOutDir = contrastiveTrainer.exportTrainingData();
-  console.log(` Training data exported to: ${contrastiveOutDir}`);
+  // Export contrastive training data (skip for large datasets to avoid JSON string limit)
+  if (contrastiveTrainer.getTripletCount() < 100000) {
+    const contrastiveOutDir = contrastiveTrainer.exportTrainingData();
+    console.log(` Training data exported to: ${contrastiveOutDir}`);
+  } else {
+    console.log(` Skipping triplet export (${contrastiveTrainer.getTripletCount()} triplets too large for JSON)`);
+  }
 
   // -----------------------------------------------------------------------
   // Phase 2: Task head training via TrainingPipeline
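
The change above skips the export entirely once the triplet count reaches 100000. The likely cause is that serializing the whole dataset with a single `JSON.stringify` call can exceed V8's maximum string length and throw `RangeError: Invalid string length`. One possible alternative, shown here only as a minimal sketch and not as part of this commit, is to write one JSON document per line (JSON Lines) so that no single string has to hold the full dataset. The helper names `exportTripletsAsJsonl` and `shouldExportAsSingleJson`, the constant `MAX_JSON_EXPORT_TRIPLETS`, and the `{ anchor, positive, negative }` triplet shape are all hypothetical; the real `contrastiveTrainer` API may differ.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical constant mirroring the 100000 cutoff used in the diff.
const MAX_JSON_EXPORT_TRIPLETS = 100000;

// Same guard as in the diff: a single-file JSON export is attempted
// only when the triplet count is strictly below the limit.
function shouldExportAsSingleJson(tripletCount, limit = MAX_JSON_EXPORT_TRIPLETS) {
  return tripletCount < limit;
}

// Hypothetical helper: writes each triplet on its own line, so
// JSON.stringify only ever sees one triplet at a time.
// Returns the number of triplets written.
function exportTripletsAsJsonl(triplets, outPath) {
  const fd = fs.openSync(outPath, 'w');
  let written = 0;
  try {
    for (const triplet of triplets) {
      fs.writeSync(fd, JSON.stringify(triplet) + '\n');
      written += 1;
    }
  } finally {
    // Always release the file descriptor, even if serialization throws.
    fs.closeSync(fd);
  }
  return written;
}
```

A caller could keep the existing single-file export when `shouldExportAsSingleJson(count)` is true and fall back to `exportTripletsAsJsonl` otherwise, at the cost of consumers having to read the output line by line instead of with one `JSON.parse`.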