Hi, I'm trying to use an updated wespeaker voice model like the one shown in the picture, but I can't adapt it by following the file pyannote/audio/models/embedding/wespeaker/convert.py. It shows the following error. What do I need to change?
#1772
Open
LiLiWangzz opened this issue
Oct 13, 2024
· 1 comment
Hey @LiLiWangzz, pyannote/audio/models/embedding/wespeaker/convert.py is not meant for that. Furthermore, SimAMResNetxx is not currently supported by pyannote, but feel free to open a pull request.
@hbredin
raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for ResNet:
Missing key(s) in state_dict: "conv1.weight", "bn1.weight", "bn1.bias", "bn1.running_mean", "bn1.running_var", "layer1.0.conv1.weight", "layer1.0.bn1.weight", "layer1.0.bn1.bias", "layer1.0.bn1.running_mean", "layer1.0.bn1.running_var", "layer1.0.conv2.weight", "layer1.0.bn2.weight", "layer1.0.bn2.bias", "layer1.0.bn2.running_mean", "layer1.0.bn2.running_var", "layer1.1.conv1.weight", "layer1.1.bn1.weight", "layer1.1.bn1.bias", "layer1.1.bn1.running_mean", "layer1.1.bn1.running_var", "layer1.1.conv2.weight", "layer1.1.bn2.weight", "layer1.1.bn2.bias", "layer1.1.bn2.running_mean", "layer1.1.bn2.running_var", "layer1.2.conv1.weight", "layer1.2.bn1.weight", "layer1.2.bn1.bias", "layer1.2.bn1.running_mean", "layer1.2.bn1.running_var", "layer1.2.conv2.weight", "layer1.2.bn2.weight", "layer1.2.bn2.bias", "layer1.2.bn2.running_mean", "layer1.2.bn2.running_var", "layer2.0.conv1.weight", "layer2.0.bn1.weight", "layer2.0.bn1.bias", "layer2.0.bn1.running_mean", "layer2.0.bn1.running_var", "layer2.0.conv2.weight", "layer2.0.bn2.weight", "layer2.0.bn2.bias", "layer2.0.bn2.running_mean", "layer2.0.bn2.running_var", "layer2.0.shortcut.0.weight", "layer2.0.shortcut.1.weight", "layer2.0.shortcut.1.bias", "layer2.0.shortcut.1.running_mean", "layer2.0.shortcut.1.running_var", "layer2.1.conv1.weight", "layer2.1.bn1.weight", "layer2.1.bn1.bias", "layer2.1.bn1.running_mean", "layer2.1.bn1.running_var", "layer2.1.conv2.weight", "layer2.1.bn2.weight", "layer2.1.bn2.bias", "layer2.1.bn2.running_mean", "layer2.1.bn2.running_var", "layer2.2.conv1.weight", "layer2.2.bn1.weight", "layer2.2.bn1.bias", "layer2.2.bn1.running_mean", "layer2.2.bn1.running_var", "layer2.2.conv2.weight", "layer2.2.bn2.weight", "layer2.2.bn2.bias", "layer2.2.bn2.running_mean", "layer2.2.bn2.running_var", "layer2.3.conv1.weight", "layer2.3.bn1.weight", "layer2.3.bn1.bias", "layer2.3.bn1.running_mean", "layer2.3.bn1.running_var", "layer2.3.conv2.weight", "layer2.3.bn2.weight", "layer2.3.bn2.bias", 
"layer2.3.bn2.running_mean", "layer2.3.bn2.running_var", "layer3.0.conv1.weight", "layer3.0.bn1.weight", "layer3.0.bn1.bias", "layer3.0.bn1.running_mean", "layer3.0.bn1.running_var", "layer3.0.conv2.weight", "layer3.0.bn2.weight", "layer3.0.bn2.bias", "layer3.0.bn2.running_mean", "layer3.0.bn2.running_var", "layer3.0.shortcut.0.weight", "layer3.0.shortcut.1.weight", "layer3.0.shortcut.1.bias", "layer3.0.shortcut.1.running_mean", "layer3.0.shortcut.1.running_var", "layer3.1.conv1.weight", "layer3.1.bn1.weight", "layer3.1.bn1.bias", "layer3.1.bn1.running_mean", "layer3.1.bn1.running_var", "layer3.1.conv2.weight", "layer3.1.bn2.weight", "layer3.1.bn2.bias", "layer3.1.bn2.running_mean", "layer3.1.bn2.running_var", "layer3.2.conv1.weight", "layer3.2.bn1.weight", "layer3.2.bn1.bias", "layer3.2.bn1.running_mean", "layer3.2.bn1.running_var", "layer3.2.conv2.weight", "layer3.2.bn2.weight", "layer3.2.bn2.bias", "layer3.2.bn2.running_mean", "layer3.2.bn2.running_var", "layer3.3.conv1.weight", "layer3.3.bn1.weight", "layer3.3.bn1.bias", "layer3.3.bn1.running_mean", "layer3.3.bn1.running_var", "layer3.3.conv2.weight", "layer3.3.bn2.weight", "layer3.3.bn2.bias", "layer3.3.bn2.running_mean", "layer3.3.bn2.running_var", "layer3.4.conv1.weight", "layer3.4.bn1.weight", "layer3.4.bn1.bias", "layer3.4.bn1.running_mean", "layer3.4.bn1.running_var", "layer3.4.conv2.weight", "layer3.4.bn2.weight", "layer3.4.bn2.bias", "layer3.4.bn2.running_mean", "layer3.4.bn2.running_var", "layer3.5.conv1.weight", "layer3.5.bn1.weight", "layer3.5.bn1.bias", "layer3.5.bn1.running_mean", "layer3.5.bn1.running_var", "layer3.5.conv2.weight", "layer3.5.bn2.weight", "layer3.5.bn2.bias", "layer3.5.bn2.running_mean", "layer3.5.bn2.running_var", "layer4.0.conv1.weight", "layer4.0.bn1.weight", "layer4.0.bn1.bias", "layer4.0.bn1.running_mean", "layer4.0.bn1.running_var", "layer4.0.conv2.weight", "layer4.0.bn2.weight", "layer4.0.bn2.bias", "layer4.0.bn2.running_mean", "layer4.0.bn2.running_var", 
"layer4.0.shortcut.0.weight", "layer4.0.shortcut.1.weight", "layer4.0.shortcut.1.bias", "layer4.0.shortcut.1.running_mean", "layer4.0.shortcut.1.running_var", "layer4.1.conv1.weight", "layer4.1.bn1.weight", "layer4.1.bn1.bias", "layer4.1.bn1.running_mean", "layer4.1.bn1.running_var", "layer4.1.conv2.weight", "layer4.1.bn2.weight", "layer4.1.bn2.bias", "layer4.1.bn2.running_mean", "layer4.1.bn2.running_var", "layer4.2.conv1.weight", "layer4.2.bn1.weight", "layer4.2.bn1.bias", "layer4.2.bn1.running_mean", "layer4.2.bn1.running_var", "layer4.2.conv2.weight", "layer4.2.bn2.weight", "layer4.2.bn2.bias", "layer4.2.bn2.running_mean", "layer4.2.bn2.running_var", "seg_1.weight", "seg_1.bias".
Unexpected key(s) in state_dict: "front.conv1.weight", "front.bn1.weight", "front.bn1.bias", "front.bn1.running_mean", "front.bn1.running_var", "front.bn1.num_batches_tracked", "front.layer1.0.conv1.weight", "front.layer1.0.bn1.weight", "front.layer1.0.bn1.bias", "front.layer1.0.bn1.running_mean", "front.layer1.0.bn1.running_var", "front.layer1.0.bn1.num_batches_tracked", "front.layer1.0.conv2.weight", "front.layer1.0.bn2.weight", "front.layer1.0.bn2.bias", "front.layer1.0.bn2.running_mean", "front.layer1.0.bn2.running_var", "front.layer1.0.bn2.num_batches_tracked", "front.layer1.1.conv1.weight", "front.layer1.1.bn1.weight", "front.layer1.1.bn1.bias", "front.layer1.1.bn1.running_mean", "front.layer1.1.bn1.running_var", "front.layer1.1.bn1.num_batches_tracked", "front.layer1.1.conv2.weight", "front.layer1.1.bn2.weight", "front.layer1.1.bn2.bias", "front.layer1.1.bn2.running_mean", "front.layer1.1.bn2.running_var", "front.layer1.1.bn2.num_batches_tracked", "front.layer1.2.conv1.weight", "front.layer1.2.bn1.weight", "front.layer1.2.bn1.bias", "front.layer1.2.bn1.running_mean", "front.layer1.2.bn1.running_var", "front.layer1.2.bn1.num_batches_tracked", "front.layer1.2.conv2.weight", "front.layer1.2.bn2.weight", "front.layer1.2.bn2.bias", "front.layer1.2.bn2.running_mean", "front.layer1.2.bn2.running_var", "front.layer1.2.bn2.num_batches_tracked", "front.layer2.0.conv1.weight", "front.layer2.0.bn1.weight", "front.layer2.0.bn1.bias", "front.layer2.0.bn1.running_mean", "front.layer2.0.bn1.running_var", "front.layer2.0.bn1.num_batches_tracked", "front.layer2.0.conv2.weight", "front.layer2.0.bn2.weight", "front.layer2.0.bn2.bias", "front.layer2.0.bn2.running_mean", "front.layer2.0.bn2.running_var", "front.layer2.0.bn2.num_batches_tracked", "front.layer2.0.downsample.0.weight", "front.layer2.0.downsample.1.weight", "front.layer2.0.downsample.1.bias", "front.layer2.0.downsample.1.running_mean", "front.layer2.0.downsample.1.running_var", 
"front.layer2.0.downsample.1.num_batches_tracked", "front.layer2.1.conv1.weight", "front.layer2.1.bn1.weight", "front.layer2.1.bn1.bias", "front.layer2.1.bn1.running_mean", "front.layer2.1.bn1.running_var", "front.layer2.1.bn1.num_batches_tracked", "front.layer2.1.conv2.weight", "front.layer2.1.bn2.weight", "front.layer2.1.bn2.bias", "front.layer2.1.bn2.running_mean", "front.layer2.1.bn2.running_var", "front.layer2.1.bn2.num_batches_tracked", "front.layer2.2.conv1.weight", "front.layer2.2.bn1.weight", "front.layer2.2.bn1.bias", "front.layer2.2.bn1.running_mean", "front.layer2.2.bn1.running_var", "front.layer2.2.bn1.num_batches_tracked", "front.layer2.2.conv2.weight", "front.layer2.2.bn2.weight", "front.layer2.2.bn2.bias", "front.layer2.2.bn2.running_mean", "front.layer2.2.bn2.running_var", "front.layer2.2.bn2.num_batches_tracked", "front.layer2.3.conv1.weight", "front.layer2.3.bn1.weight", "front.layer2.3.bn1.bias", "front.layer2.3.bn1.running_mean", "front.layer2.3.bn1.running_var", "front.layer2.3.bn1.num_batches_tracked", "front.layer2.3.conv2.weight", "front.layer2.3.bn2.weight", "front.layer2.3.bn2.bias", "front.layer2.3.bn2.running_mean", "front.layer2.3.bn2.running_var", "front.layer2.3.bn2.num_batches_tracked", "front.layer3.0.conv1.weight", "front.layer3.0.bn1.weight", "front.layer3.0.bn1.bias", "front.layer3.0.bn1.running_mean", "front.layer3.0.bn1.running_var", "front.layer3.0.bn1.num_batches_tracked", "front.layer3.0.conv2.weight", "front.layer3.0.bn2.weight", "front.layer3.0.bn2.bias", "front.layer3.0.bn2.running_mean", "front.layer3.0.bn2.running_var", "front.layer3.0.bn2.num_batches_tracked", "front.layer3.0.downsample.0.weight", "front.layer3.0.downsample.1.weight", "front.layer3.0.downsample.1.bias", "front.layer3.0.downsample.1.running_mean", "front.layer3.0.downsample.1.running_var", "front.layer3.0.downsample.1.num_batches_tracked", "front.layer3.1.conv1.weight", "front.layer3.1.bn1.weight", "front.layer3.1.bn1.bias", 
"front.layer3.1.bn1.running_mean", "front.layer3.1.bn1.running_var", "front.layer3.1.bn1.num_batches_tracked", "front.layer3.1.conv2.weight", "front.layer3.1.bn2.weight", "front.layer3.1.bn2.bias", "front.layer3.1.bn2.running_mean", "front.layer3.1.bn2.running_var", "front.layer3.1.bn2.num_batches_tracked", "front.layer3.2.conv1.weight", "front.layer3.2.bn1.weight", "front.layer3.2.bn1.bias", "front.layer3.2.bn1.running_mean", "front.layer3.2.bn1.running_var", "front.layer3.2.bn1.num_batches_tracked", "front.layer3.2.conv2.weight", "front.layer3.2.bn2.weight", "front.layer3.2.bn2.bias", "front.layer3.2.bn2.running_mean", "front.layer3.2.bn2.running_var", "front.layer3.2.bn2.num_batches_tracked", "front.layer3.3.conv1.weight", "front.layer3.3.bn1.weight", "front.layer3.3.bn1.bias", "front.layer3.3.bn1.running_mean", "front.layer3.3.bn1.running_var", "front.layer3.3.bn1.num_batches_tracked", "front.layer3.3.conv2.weight", "front.layer3.3.bn2.weight", "front.layer3.3.bn2.bias", "front.layer3.3.bn2.running_mean", "front.layer3.3.bn2.running_var", "front.layer3.3.bn2.num_batches_tracked", "front.layer3.4.conv1.weight", "front.layer3.4.bn1.weight", "front.layer3.4.bn1.bias", "front.layer3.4.bn1.running_mean", "front.layer3.4.bn1.running_var", "front.layer3.4.bn1.num_batches_tracked", "front.layer3.4.conv2.weight", "front.layer3.4.bn2.weight", "front.layer3.4.bn2.bias", "front.layer3.4.bn2.running_mean", "front.layer3.4.bn2.running_var", "front.layer3.4.bn2.num_batches_tracked", "front.layer3.5.conv1.weight", "front.layer3.5.bn1.weight", "front.layer3.5.bn1.bias", "front.layer3.5.bn1.running_mean", "front.layer3.5.bn1.running_var", "front.layer3.5.bn1.num_batches_tracked", "front.layer3.5.conv2.weight", "front.layer3.5.bn2.weight", "front.layer3.5.bn2.bias", "front.layer3.5.bn2.running_mean", "front.layer3.5.bn2.running_var", "front.layer3.5.bn2.num_batches_tracked", "front.layer4.0.conv1.weight", "front.layer4.0.bn1.weight", "front.layer4.0.bn1.bias", 
"front.layer4.0.bn1.running_mean", "front.layer4.0.bn1.running_var", "front.layer4.0.bn1.num_batches_tracked", "front.layer4.0.conv2.weight", "front.layer4.0.bn2.weight", "front.layer4.0.bn2.bias", "front.layer4.0.bn2.running_mean", "front.layer4.0.bn2.running_var", "front.layer4.0.bn2.num_batches_tracked", "front.layer4.0.downsample.0.weight", "front.layer4.0.downsample.1.weight", "front.layer4.0.downsample.1.bias", "front.layer4.0.downsample.1.running_mean", "front.layer4.0.downsample.1.running_var", "front.layer4.0.downsample.1.num_batches_tracked", "front.layer4.1.conv1.weight", "front.layer4.1.bn1.weight", "front.layer4.1.bn1.bias", "front.layer4.1.bn1.running_mean", "front.layer4.1.bn1.running_var", "front.layer4.1.bn1.num_batches_tracked", "front.layer4.1.conv2.weight", "front.layer4.1.bn2.weight", "front.layer4.1.bn2.bias", "front.layer4.1.bn2.running_mean", "front.layer4.1.bn2.running_var", "front.layer4.1.bn2.num_batches_tracked", "front.layer4.2.conv1.weight", "front.layer4.2.bn1.weight", "front.layer4.2.bn1.bias", "front.layer4.2.bn1.running_mean", "front.layer4.2.bn1.running_var", "front.layer4.2.bn1.num_batches_tracked", "front.layer4.2.conv2.weight", "front.layer4.2.bn2.weight", "front.layer4.2.bn2.bias", "front.layer4.2.bn2.running_mean", "front.layer4.2.bn2.running_var", "front.layer4.2.bn2.num_batches_tracked", "pooling.attention.0.weight", "pooling.attention.0.bias", "pooling.attention.2.weight", "pooling.attention.2.bias", "pooling.attention.2.running_mean", "pooling.attention.2.running_var", "pooling.attention.2.num_batches_tracked", "pooling.attention.3.weight", "pooling.attention.3.bias", "bottleneck.weight", "bottleneck.bias".
Originally posted by @LiLiWangzz in #1590 (comment)
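For anyone reading the error above: the missing keys are what the pyannote ResNet expects (for example `shortcut` and `seg_1`), while the unexpected keys are what the checkpoint contains (a `front.` prefix, `downsample`, `pooling.attention`, `bottleneck`). A minimal sketch of how the two key sets could be compared is shown below. The helper name `summarize_key_mismatch` is hypothetical and not part of pyannote or wespeaker, and the key lists are just a few names copied from the error. It only helps with diagnosis and does not make the checkpoint loadable.

```python
from collections import Counter


def summarize_key_mismatch(model_keys, checkpoint_keys):
    """Count missing and unexpected state_dict keys by top-level prefix.

    Hypothetical helper for illustration; works on any iterables of key
    names, e.g. model.state_dict().keys() and checkpoint.keys().
    """
    model_keys, checkpoint_keys = set(model_keys), set(checkpoint_keys)
    # Expected by the model but absent from the checkpoint
    missing = model_keys - checkpoint_keys
    # Present in the checkpoint but unknown to the model
    unexpected = checkpoint_keys - model_keys
    return {
        "missing": dict(Counter(k.split(".")[0] for k in missing)),
        "unexpected": dict(Counter(k.split(".")[0] for k in unexpected)),
    }


# A few key names copied from the error message in this issue
model_keys = [
    "conv1.weight",
    "layer2.0.shortcut.0.weight",
    "seg_1.weight",
    "seg_1.bias",
]
checkpoint_keys = [
    "front.conv1.weight",
    "front.layer2.0.downsample.0.weight",
    "pooling.attention.0.weight",
    "bottleneck.weight",
    "bottleneck.bias",
]

summary = summarize_key_mismatch(model_keys, checkpoint_keys)
print(summary)
```

With real models, the same function could be fed the keys of the instantiated model and of the loaded checkpoint. Differences that go beyond a shared prefix, such as the `pooling.attention` and `bottleneck` entries here, suggest the two architectures do not match, which is consistent with the reply above.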