[clap] Add clap #6028

Merged
w5688414 merged 13 commits into PaddlePaddle:develop from w5688414:pp1
Jun 1, 2023
Conversation

@w5688414 commented May 25, 2023
Contributor

PR types

  • New features

PR changes

  • Models

Description

TODO:

  • Add unit test
  • paddle
from datasets import load_dataset
from paddlenlp.transformers import ClapModel, ClapProcessor

import paddle.nn.functional as F

# Load ESC-50 and take the raw waveform of the first training clip.
dataset = load_dataset("ashraq/esc50")
audio_sample = dataset["train"]["audio"][0]["array"]

model = ClapModel.from_pretrained("laion/clap-htsat-unfused")
processor = ClapProcessor.from_pretrained("laion/clap-htsat-unfused")
model.eval()

input_text = ["Sound of a dog", "Sound of vaccum cleaner"]
inputs = processor(
    text=input_text,
    audios=audio_sample,
    return_tensors="pd",
    return_attention_mask=True,
    return_token_type_ids=False,
    padding=True,
)
outputs = model(**inputs, return_dict=True)
logits_per_audio = outputs.logits_per_audio  # audio-text similarity scores
probs = F.softmax(logits_per_audio, axis=-1)  # probabilities over the text prompts
print(probs)
Tensor(shape=[1, 2], dtype=float32, place=Place(gpu:0), stop_gradient=False,
       [[0.99429250, 0.00570753]])
  • torch
from datasets import load_dataset
from transformers import AutoProcessor, ClapModel

dataset = load_dataset("ashraq/esc50")
audio_sample = dataset["train"]["audio"][0]["array"]

model = ClapModel.from_pretrained("laion/clap-htsat-unfused")
processor = AutoProcessor.from_pretrained("laion/clap-htsat-unfused")
model = model.eval()

input_text = ["Sound of a dog", "Sound of vaccum cleaner"]
inputs = processor(text=input_text, audios=audio_sample, return_tensors="pt", padding=True)
print(inputs)
outputs = model(**inputs)
logits_per_audio = outputs.logits_per_audio  # audio-text similarity scores
probs = logits_per_audio.softmax(dim=-1)  # probabilities over the text prompts
print(probs)
tensor([[0.9943, 0.0057]], grad_fn=<SoftmaxBackward0>)
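
The two printed results can be compared numerically as a quick sanity check. The sketch below hard-codes the probabilities copied from the Paddle and torch outputs above and checks that they agree to the four decimals torch prints. The tolerance and the helper name `outputs_match` are assumptions made for illustration, not part of the PR.

```python
import math

# Probabilities copied from the two printed outputs above.
paddle_probs = [0.99429250, 0.00570753]
torch_probs = [0.9943, 0.0057]

# torch prints 4 decimals, so allow half a unit in the last printed place
# (assumed tolerance, not part of the PR).
TOLERANCE = 5e-5


def outputs_match(a, b, tol=TOLERANCE):
    """Return True when both lists have equal length and agree element-wise within tol."""
    if len(a) != len(b):
        return False
    return all(math.isclose(x, y, rel_tol=0.0, abs_tol=tol) for x, y in zip(a, b))


print(outputs_match(paddle_probs, torch_probs))
# A softmax row should sum to 1 up to float32 rounding.
print(math.isclose(sum(paddle_probs), 1.0, abs_tol=1e-6))
```

In a real unit test the reference values would more likely be loaded from the torch model's raw logits than typed in by hand.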

paddle-bot Bot commented May 25, 2023

Thanks for your contribution!

@w5688414 w5688414 requested review from JunnYu and guoshengCS May 25, 2023 13:09
codecov Bot commented May 25, 2023

Codecov Report

Merging #6028 (c71ecc2) into develop (3a62bf0) will increase coverage by 0.44%.
The diff coverage is 78.76%.

@@             Coverage Diff             @@
##           develop    #6028      +/-   ##
===========================================
+ Coverage    62.52%   62.97%   +0.44%     
===========================================
  Files          492      505      +13     
  Lines        69317    70949    +1632     
===========================================
+ Hits         43342    44680    +1338     
- Misses       25975    26269     +294     
| Impacted Files | Coverage Δ |
|---|---|
| paddlenlp/transformers/auto/processing.py | 67.08% <ø> (ø) |
| paddlenlp/transformers/clap/modeling.py | 78.57% <ø> (ø) |
| paddlenlp/transformers/audio_utils.py | 56.92% <56.92%> (ø) |
| .../transformers/feature_extraction_sequence_utils.py | 82.05% <82.05%> (ø) |
| paddlenlp/transformers/clap/processing.py | 87.87% <87.87%> (ø) |
| paddlenlp/transformers/clap/feature_extraction.py | 92.56% <92.56%> (ø) |
| paddlenlp/transformers/clap/configuration.py | 94.69% <94.69%> (ø) |
| paddlenlp/transformers/__init__.py | 100.00% <100.00%> (ø) |
| paddlenlp/transformers/blip_2/configuration.py | 91.34% <100.00%> (+0.08%) ⬆️ |

... and 25 files with indirect coverage changes
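
The headline percentages can be re-derived from the hit and line totals in the diff above. The sketch below does that with integer arithmetic. The observation that the printed figures match truncation (rather than rounding) to two decimals is drawn from these numbers only and is not a statement about Codecov's documented behavior. The helper names are hypothetical.

```python
# Totals copied from the Codecov diff above.
base = {"lines": 69317, "hits": 43342, "misses": 25975}
head = {"lines": 70949, "hits": 44680, "misses": 26269}


def coverage_bp(report):
    """Coverage in hundredths of a percent, truncated (integer arithmetic)."""
    return report["hits"] * 10000 // report["lines"]


def delta_bp(old, new):
    """Coverage change in hundredths of a percent, truncated after subtracting."""
    change = new["hits"] / new["lines"] - old["hits"] / old["lines"]
    return int(change * 10000)


for name, report in (("develop", base), ("#6028", head)):
    # Every counted line is either hit or missed.
    assert report["hits"] + report["misses"] == report["lines"]
    print(f"{name}: {coverage_bp(report) / 100:.2f}%")

print(f"delta: {delta_bp(base, head) / 100:+.2f}%")
```

With these totals the script reproduces the 62.52%, 62.97% and +0.44% shown in the report.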

Comment thread paddlenlp/transformers/clap/modeling.py Outdated
self.local_att = nn.Sequential(
    nn.Conv2D(channels, inter_channels, kernel_size=1, stride=1, padding=0),
    nn.BatchNorm2D(inter_channels),
    nn.ReLU(inplace=True),

Member

There is no `inplace` argument here (Paddle's `nn.ReLU` does not take one).

Contributor Author

Fixed.

def _get_arguments_from_pretrained(cls, pretrained_model_name_or_path, **kwargs):
    args = []
    for attribute_name in cls.attributes:
        # breakpoint()

Member

Remove this.

Contributor Author

Fixed.

Comment thread paddlenlp/transformers/clap/modeling.py Outdated
# Copied from paddlenlp.transformers.models.bert.modeling_bert.BertEmbeddings.__init__
def __init__(self, config):
    super().__init__()
    self.word_embeddings = nn.Embedding(config.vocab_size, config.hidden_size, padding_idx=config.pad_token_id)

Member

No need to specify the padding index.

Contributor Author

Fixed.

Comment thread paddlenlp/transformers/clap/modeling.py Outdated
# End copy
self.padding_idx = config.pad_token_id
self.position_embeddings = nn.Embedding(
    config.max_position_embeddings, config.hidden_size, padding_idx=self.padding_idx

Member

Same here.

Contributor Author

Fixed.

Comment thread paddlenlp/transformers/clap/modeling.py Outdated
if token_type_ids is None:
    if hasattr(self, "token_type_ids"):
        buffered_token_type_ids = self.token_type_ids[:, :seq_length]
        buffered_token_type_ids_expanded = buffered_token_type_ids.expand(input_shape[0], seq_length)

Member

`expand` is missing parentheses; pass the shape as `(a, b)`.

Contributor Author

Fixed.

self.projection_hidden_act = projection_hidden_act

@classmethod
def from_pretrained(cls, pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs) -> "PretrainedConfig":

Member

@classmethod
def from_pretrained(
    cls,
    pretrained_model_name_or_path: Union[str, os.PathLike],
    from_hf_hub: bool = False,
    cache_dir: Optional[str] = None,
    **kwargs
) -> PretrainedConfig:
    kwargs.update({"from_hf_hub": from_hf_hub, "cache_dir": cache_dir})

Like this.

Contributor Author

Fixed.
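
The suggested signature makes `from_hf_hub` and `cache_dir` explicit parameters and then folds them back into `kwargs` before delegating. A minimal, framework-free sketch of that pattern follows. `BaseConfigSketch` and `TextConfigSketch` are hypothetical stand-ins, not PaddleNLP classes, and the base class here only records what it receives instead of loading anything.

```python
import os
from typing import Any, Dict, Optional, Union


class BaseConfigSketch:
    """Hypothetical stand-in for the real base config class; it only records its inputs."""

    def __init__(self, source: str, options: Dict[str, Any]):
        self.source = source
        self.options = options

    @classmethod
    def from_pretrained(cls, pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs):
        return cls(os.fspath(pretrained_model_name_or_path), dict(kwargs))


class TextConfigSketch(BaseConfigSketch):
    """Hypothetical sub-config that follows the signature suggested in the review."""

    @classmethod
    def from_pretrained(
        cls,
        pretrained_model_name_or_path: Union[str, os.PathLike],
        from_hf_hub: bool = False,
        cache_dir: Optional[str] = None,
        **kwargs,
    ) -> "TextConfigSketch":
        # Fold the explicit arguments back into kwargs so the delegated call sees them.
        kwargs.update({"from_hf_hub": from_hf_hub, "cache_dir": cache_dir})
        return super().from_pretrained(pretrained_model_name_or_path, **kwargs)


config = TextConfigSketch.from_pretrained("laion/clap-htsat-unfused", from_hf_hub=True)
print(config.options)
```

Naming the two arguments in the signature documents them and gives them defaults, while the `kwargs.update(...)` line keeps the downstream call unchanged.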

self.projection_dim = projection_dim

@classmethod
def from_pretrained(cls, pretrained_model_name_or_path: Union[str, os.PathLike], **kwargs) -> "PretrainedConfig":

Member

Same as the other `from_pretrained` comment.

Contributor Author

Fixed.

Comment thread paddlenlp/transformers/clap/modeling.py Outdated
The ratio of the length of the output to the length of the input.
"""
(batch_size, time_length, classes_num) = hidden_states.shape
upsampled = hidden_states[:, :, None, :].repeat(1, 1, ratio, 1)

Member

Replace `repeat` with `tile([...])`.

Contributor Author

Fixed.
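
For readers unfamiliar with the mapping: torch's `Tensor.repeat` takes the repeat counts as separate arguments, while Paddle's `tile` takes them as one list. The sketch below uses NumPy's `np.tile`, which follows the same one-sequence convention, purely as a stand-in so that it runs without Paddle. The function name `interpolate_sketch` and the final reshape are assumptions, since the excerpt under review shows only the first two statements.

```python
import numpy as np


def interpolate_sketch(hidden_states: np.ndarray, ratio: int) -> np.ndarray:
    """Repeat every time step `ratio` times along the time axis."""
    batch_size, time_length, classes_num = hidden_states.shape
    # Insert a length-1 axis after time, then tile that axis `ratio` times.
    upsampled = np.tile(hidden_states[:, :, None, :], (1, 1, ratio, 1))
    # Merge the new axis into time (assumed follow-up step, not shown in the excerpt).
    return upsampled.reshape(batch_size, time_length * ratio, classes_num)


x = np.arange(6, dtype=np.float32).reshape(1, 3, 2)
y = interpolate_sketch(x, ratio=2)
print(y.shape)
```

The result is the same as repeating each time step in place, which is what `np.repeat(x, 2, axis=1)` computes directly.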

Comment thread paddlenlp/transformers/clap/modeling.py Outdated

random_tensor = keep_prob + paddle.rand(shape, dtype=hidden_states.dtype)
random_tensor.floor_() # binarize
output = hidden_states.div(keep_prob) * random_tensor

Member

I don't think `div` exists, does it?

Member

Use `(a / b)` instead.

Contributor Author

Fixed.
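
The three lines under review are the core of a stochastic-depth ("drop path") step: draw one random value per sample, binarize it with `floor`, and rescale the surviving samples by `1 / keep_prob`. A NumPy sketch of that logic, using plain `/` as the review suggests, is given below. The early return, the per-sample mask shape, and the name `drop_path_sketch` are assumptions, since the excerpt does not show them.

```python
import numpy as np


def drop_path_sketch(hidden_states, drop_prob, rng, training=True):
    """Zero whole samples with probability drop_prob and rescale the survivors."""
    if drop_prob == 0.0 or not training:
        return hidden_states
    keep_prob = 1.0 - drop_prob
    # One random draw per sample, broadcast over the remaining axes (assumed shape).
    shape = (hidden_states.shape[0],) + (1,) * (hidden_states.ndim - 1)
    random_tensor = np.floor(keep_prob + rng.random(shape))  # binarize to 0.0 or 1.0
    # Plain `/` instead of a `.div(...)` method call.
    return (hidden_states / keep_prob) * random_tensor


rng = np.random.default_rng(0)
x = np.ones((4, 2, 3))
out = drop_path_sketch(x, drop_prob=0.25, rng=rng)
print(np.unique(out))
```

Dividing by `keep_prob` keeps the expected value of the output equal to the input, so no rescaling is needed at inference time.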

Comment thread paddlenlp/transformers/clap/modeling.py Outdated

config_class = ClapConfig
base_model_prefix = "clap"
supports_gradient_checkpointing = False

Member

supports_gradient_checkpointing = True

Contributor Author

Fixed.

Comment thread paddlenlp/transformers/clap/modeling.py Outdated

>>> outputs = model(**inputs)
>>> logits_per_audio = outputs.logits_per_audio # this is the audio-text similarity score
>>> probs = logits_per_audio.softmax(dim=-1) # we can take the softmax to get the label probabilities

Member

Update this call.

Contributor Author

Fixed.

@w5688414 w5688414 self-assigned this May 31, 2023
@w5688414 w5688414 merged commit b927cf7 into PaddlePaddle:develop Jun 1, 2023
2 participants