Morning,

I used your Speech Emotion Recognition (Wav2Vec 2.0) notebook with another dataset and got an error during training. Could you help me, please? The code and the error are below.
```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir=finetune_output_dir,
    per_device_train_batch_size=4,
    per_device_eval_batch_size=4,
    evaluation_strategy="steps",  # "epoch"
    gradient_accumulation_steps=1,
    num_train_epochs=50,
    fp16=True,
    save_steps=10,   # n_steps
    eval_steps=10,   # n_steps
    logging_steps=10,
    learning_rate=1e-4,
    save_total_limit=10,
)

trainer = CTCTrainer(
    model=model,
    data_collator=data_collator,
    args=training_args,
    compute_metrics=compute_metrics,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
    tokenizer=processor.feature_extractor,
)

trainer.train()
```

```
The following columns in the training set don't have a corresponding argument in `Wav2Vec2ForSpeechClassification.forward` and have been ignored: language, audio_name, path.
***** Running training *****
  Num examples = 10769
  Num Epochs = 50
  Instantaneous batch size per device = 4
  Total train batch size (w. parallel, distributed & accumulation) = 4
  Gradient Accumulation steps = 1
  Total optimization steps = 134650
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/torch/_tensor.py:575: UserWarning: floor_divide is deprecated, and will be removed in a future version of pytorch. It currently rounds toward 0 (like the 'trunc' function NOT 'floor'). This results in incorrect rounding for negative values. To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at /opt/conda/conda-bld/pytorch_1623448278899/work/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)
```
[ 142/134650 1:17:58 < 1248:33:33, 0.03 it/s, Epoch 0.05/50]
| Step | Training Loss | Validation Loss | Accuracy |
|---|---|---|---|
| 10 | 0.698400 | 0.497485 | 0.813416 |
| 20 | 0.394700 | 0.291701 | 0.913778 |
| 30 | 0.225200 | 0.138921 | 0.951371 |
| 40 | 0.389500 | 0.137598 | 0.962752 |
| 50 | 0.373600 | 0.469463 | 0.878255 |
| 60 | 0.079500 | 0.144742 | 0.972237 |
| 70 | 0.213000 | 0.185833 | 0.969822 |
| 80 | 0.046400 | 0.295700 | 0.947405 |
| 90 | 0.003300 | 0.149647 | 0.979134 |
| 100 | 0.000800 | 0.124717 | 0.978617 |
| 110 | 0.313800 | 0.237750 | 0.958441 |
| 120 | 0.251000 | 0.166465 | 0.965166 |
| 130 | 0.032900 | 0.044269 | 0.989826 |
| 140 | 0.051600 | 0.061006 | 0.989826 |
```
Attempted to log scalar metric loss: 0.6984
Attempted to log scalar metric learning_rate: 9.999257333828444e-05
Attempted to log scalar metric epoch: 0.0
The following columns in the evaluation set don't have a corresponding argument in `Wav2Vec2ForSpeechClassification.forward` and have been ignored: language, audio_name, path.
***** Running Evaluation *****
  Num examples = 5799
  Batch size = 4
Attempted to log scalar metric eval_loss: 0.4974852204322815
Attempted to log scalar metric eval_accuracy: 0.8134161233901978
Attempted to log scalar metric eval_runtime: 296.3331
Attempted to log scalar metric eval_samples_per_second: 19.569
Attempted to log scalar metric eval_steps_per_second: 4.893
Attempted to log scalar metric epoch: 0.0
Saving model checkpoint to MODEL/wav2vec2-xlsr-speech-emotion-recognition_dropout-0.5_3/checkpoint-10
Configuration saved in MODEL/wav2vec2-xlsr-speech-emotion-recognition_dropout-0.5_3/checkpoint-10/config.json
Model weights saved in MODEL/wav2vec2-xlsr-speech-emotion-recognition_dropout-0.5_3/checkpoint-10/pytorch_model.bin
Configuration saved in MODEL/wav2vec2-xlsr-speech-emotion-recognition_dropout-0.5_3/checkpoint-10/preprocessor_config.json
```
```
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-32-3435b262f1ae> in <module>
----> 1 trainer.train()
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/transformers/trainer.py in train(self, resume_from_checkpoint, trial, ignore_keys_for_eval, **kwargs)
1330 tr_loss_step = self.training_step(model, inputs)
1331 else:
-> 1332 tr_loss_step = self.training_step(model, inputs)
1333
1334 if (
<ipython-input-29-878b4353167f> in training_step(self, model, inputs)
43 if self.use_amp:
44 with autocast():
---> 45 loss = self.compute_loss(model, inputs)
46 else:
47 loss = self.compute_loss(model, inputs)
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/transformers/trainer.py in compute_loss(self, model, inputs, return_outputs)
1921 else:
1922 labels = None
-> 1923 outputs = model(**inputs)
1924 # Save past state if it exists
1925 # TODO: this needs to be fixed and made cleaner later.
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
<ipython-input-16-dd9fe3ea0f13> in forward(self, input_values, attention_mask, output_attentions, output_hidden_states, return_dict, labels)
70 ):
71 return_dict = return_dict if return_dict is not None else self.config.use_return_dict
---> 72 outputs = self.wav2vec2(
73 input_values,
74 attention_mask=attention_mask,
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/torch/nn/modules/module.py in _call_impl(self, *input, **kwargs)
1049 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
1050 or _global_forward_hooks or _global_forward_pre_hooks):
-> 1051 return forward_call(*input, **kwargs)
1052 # Do not call functions when jit is used
1053 full_backward_hooks, non_full_backward_hooks = [], []
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py in forward(self, input_values, attention_mask, mask_time_indices, output_attentions, output_hidden_states, return_dict)
1285
1286 hidden_states, extract_features = self.feature_projection(extract_features)
-> 1287 hidden_states = self._mask_hidden_states(
1288 hidden_states, mask_time_indices=mask_time_indices, attention_mask=attention_mask
1289 )
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py in _mask_hidden_states(self, hidden_states, mask_time_indices, attention_mask)
1228 hidden_states[mask_time_indices] = self.masked_spec_embed.to(hidden_states.dtype)
1229 elif self.config.mask_time_prob > 0 and self.training:
-> 1230 mask_time_indices = _compute_mask_indices(
1231 (batch_size, sequence_length),
1232 mask_prob=self.config.mask_time_prob,
/anaconda/envs/azureml_py38_pytorch/lib/python3.8/site-packages/transformers/models/wav2vec2/modeling_wav2vec2.py in _compute_mask_indices(shape, mask_prob, mask_length, attention_mask, min_masks)
240
241 # get random indices to mask
--> 242 spec_aug_mask_idx = np.random.choice(
243 np.arange(input_length - (mask_length - 1)), num_masked_span, replace=False
244 )
mtrand.pyx in numpy.random.mtrand.RandomState.choice()
ValueError: Cannot take a larger sample than population when 'replace=False'
```
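In case it helps with the diagnosis: judging from the traceback, the crash happens in `_compute_mask_indices`, which samples SpecAugment time-mask start positions without replacement. As far as I understand, `np.random.choice(..., replace=False)` can only fail this way when a clip's extracted feature sequence is so short that `input_length - (mask_length - 1)` is smaller than the number of spans to mask, i.e. when the new dataset contains very short audio clips. Below is a minimal sketch of how such clips could be flagged; the 16 kHz sampling rate, the use of the `path` column, and the ~320-sample hop of the feature encoder are my assumptions, not something taken from the notebook.

```python
# Minimal sketch (my assumption, not code from the notebook): flag clips whose
# extracted feature sequence would be shorter than the time-mask length.
# Assumed: 16 kHz audio, a "path" column in the dataset (as in the log above),
# and roughly 320 raw samples per feature frame for the wav2vec2 encoder.
import torchaudio

MASK_TIME_LENGTH = 10   # default wav2vec2 value (model.config.mask_time_length)
FRAME_HOP = 320         # approx. raw samples per extracted feature frame at 16 kHz

def approx_num_frames(path):
    """Rough number of feature frames the wav2vec2 encoder would produce."""
    return torchaudio.info(path).num_frames // FRAME_HOP

too_short = [
    row["path"] for row in train_dataset
    if approx_num_frames(row["path"]) <= MASK_TIME_LENGTH
]
print(f"{len(too_short)} clips with <= {MASK_TIME_LENGTH} feature frames")
```

If short clips turn out to be the cause, filtering them out before training, or disabling time masking by setting `model.config.mask_time_prob = 0.0`, would presumably avoid the error, but I am not sure which fix you would recommend.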