This project is an introductory tutorial on audio classification built on the Paddle API. It first covers audio fundamentals, including the nature of sound, its three perceptual attributes, file formats, and common processing concepts; it then introduces feature-extraction methods such as the short-time Fourier transform (STFT) and LogFBank; finally, using audio clips of Jayce (杰斯), Jinx (金克斯), and the Wolf Mother (狼母) from Arcane Season 2, it builds an LSTM model for speaker classification, covering data loading, model training, and testing, with good test results.
![Audio classification intro: Arcane character voice classification with paddlespeech and an LSTM network](https://img.php.cn/upload/article/202507/31/2025073109422430796.jpg)
An audio signal is typically split into frames, each holding a short stretch of samples; a frame length of 25 ms is common, and the shift between consecutive frame starts, called the frame shift or hop, is usually 10 ms. Each frame is then windowed and passed through a discrete Fourier transform (DFT) to obtain its spectrum.

After framing a recording this way, the Fourier transform reveals the frequency content of each frame. Concatenating the per-frame spectra along time yields the signal's time-frequency representation at each moment — the spectrogram (语谱图).
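As a rough sketch of the framing step (assuming a hypothetical 1-second, 16 kHz signal; windowing and the DFT are omitted here):

```python
import numpy as np

# Hypothetical 1-second signal at 16 kHz (values are assumptions for illustration).
sr = 16000
signal = np.zeros(sr, dtype=np.float32)

frame_len = int(0.025 * sr)  # 25 ms -> 400 samples per frame
hop_len = int(0.010 * sr)    # 10 ms -> 160 samples between frame starts

# Number of full frames that fit in the signal.
n_frames = 1 + (len(signal) - frame_len) // hop_len

# Slice the signal into overlapping frames: shape [n_frames, frame_len].
frames = np.stack([signal[i * hop_len : i * hop_len + frame_len]
                   for i in range(n_frames)])
print(frames.shape)  # (98, 400)
```

Each row of `frames` would then be windowed (e.g. with a Hann window) and transformed by the DFT to produce one column of the spectrogram.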
The example below uses paddle.signal.stft to extract and visualize the spectrogram of a sample audio clip:
```python
# Install paddlespeech and related dependencies
!pip install paddlespeech==1.2.0
!pip install paddleaudio==1.0.1
!pip install typeguard==2.13.3
```
```python
import paddle
import numpy as np
from paddleaudio import load
import matplotlib.pyplot as plt

data, sr = load(file='/home/aistudio/Arcane_3class/test/杰斯_audio60.wav', sr=32000, mono=True, dtype='float32')
x = paddle.to_tensor(data)

n_fft = 1024
win_length = 1024
hop_length = 320

# [D, T]
spectrogram = paddle.signal.stft(x, n_fft=n_fft, win_length=win_length, hop_length=hop_length, onesided=True)
print('spectrogram.shape: {}'.format(spectrogram.shape))
print('spectrogram.dtype: {}'.format(spectrogram.dtype))

# Log power spectrum of the complex STFT output
spec = np.log(np.abs(spectrogram.numpy())**2)
plt.figure()
plt.title("Log Power Spectrogram")
plt.imshow(spec[:100, :], origin='lower')
plt.show()
```

spectrogram.shape: [513, 141]
spectrogram.dtype: paddle.complex64
Research shows that human perception of sound is nonlinear: as frequency rises, our ability to distinguish between nearby frequencies steadily declines.

For example, for the same 500 Hz gap, most people can easily tell 500 Hz apart from 1,000 Hz, yet struggle to distinguish 10,000 Hz from 10,500 Hz.

To account for this, researchers proposed the mel scale, on which equal numerical steps correspond to equal steps in perceived pitch. The mel mapping samples the low-frequency range densely (assigning it more mel units) and the high-frequency range sparsely (assigning it fewer), so that low and high frequencies become equally distinguishable on the mel axis.
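This perceptual claim can be checked directly with the common HTK mel-scale formula m = 2595·log10(1 + f/700) (these constants are one widely used convention; other variants exist):

```python
import math

# HTK-style mel-scale conversion: m = 2595 * log10(1 + f / 700)
def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

# The same 500 Hz gap shrinks dramatically on the mel scale at high frequencies:
low_gap = hz_to_mel(1000) - hz_to_mel(500)      # ≈ 392.5 mel
high_gap = hz_to_mel(10500) - hz_to_mel(10000)  # ≈ 51.5 mel
print(low_gap, high_gap)
```

The 500 Hz difference near 1 kHz spans roughly seven times as many mel units as the same difference near 10 kHz, matching the 500/1,000 Hz vs 10,000/10,500 Hz example above.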
The Mel FBank is computed by applying a bank of mel-spaced filters to each frame's power spectrum; in practice, its logarithm — the LogFBank — is what we generally use as the recognition feature.
The example below uses paddleaudio.features.LogMelSpectrogram to extract the LogFBank of the sample audio:
```python
from paddleaudio.features import LogMelSpectrogram

f_min = 50.0
f_max = 14000.0
# - sr: sample rate of the audio file.
# - n_fft: number of FFT points.
# - hop_length: hop between adjacent frames.
# - win_length: window length.
# - window: window function type.
# - n_mels: number of mel bins.
feature_extractor = LogMelSpectrogram(
    sr=sr,
    n_fft=n_fft,
    hop_length=hop_length,
    win_length=win_length,
    window='hann',
    f_min=f_min,
    f_max=f_max,
    n_mels=64)

x = paddle.to_tensor(data).unsqueeze(0)  # [B, L]
log_fbank = feature_extractor(x)         # [B, D, T]
log_fbank = log_fbank.squeeze(0)         # [D, T]
print('log_fbank.shape: {}'.format(log_fbank.shape))

plt.figure()
plt.imshow(log_fbank.numpy(), origin='lower')
plt.show()
```

log_fbank.shape: [64, 141]
```python
!unzip /home/aistudio/data/data310325/Arcane_3class.zip -d ~  # Unpack the dataset
!tree -L 1 Arcane_3class/*/  # The dataset has a train split and a test split; train holds audio for the three main characters: Jayce (杰斯), the Wolf Mother (狼母), and Jinx (金克斯)
```
Arcane_3class/test/
├── 杰斯_audio60.wav
├── 杰斯_audio62.wav
├── 杰斯_audio63.wav
├── 狼母_audio43.wav
├── 狼母_audio44.wav
├── 狼母_audio45.wav
├── 金克斯_audio42.wav
├── 金克斯_audio44.wav
└── 金克斯_audio45.wav
Arcane_3class/train/
├── label.txt
├── 杰斯
├── 狼母
└── 金克斯

3 directories, 10 files
```python
# Initialize the LogMelSpectrogram audio feature extractor
import paddle
from paddleaudio.features import LogMelSpectrogram

n_fft = 1024
win_length = 1024
hop_length = 320
sr = 16000
f_min = 50.0
f_max = 14000.0
# - sr: sample rate of the audio file.
# - n_fft: number of FFT points.
# - hop_length: hop between adjacent frames.
# - win_length: window length.
# - window: window function type.
# - n_mels: number of mel bins.
feature_extractor = LogMelSpectrogram(
    sr=sr,
    n_fft=n_fft,
    hop_length=hop_length,
    win_length=win_length,
    window='hann',
    f_min=f_min,
    f_max=f_max,
    n_mels=40)

# Generate the classification label file label.txt
import os
import glob

label_list = ["杰斯", "金克斯", "狼母"]
with open("/home/aistudio/Arcane_3class/train/label.txt", "w") as f:
    audio_list = glob.glob("/home/aistudio/Arcane_3class/train/*/*.wav")
    for audio in audio_list:
        audio_name = os.path.basename(audio)
        label_name = audio_name.split("_")[0]  # character-name prefix of the file name
        label = label_list.index(label_name)
        print("audio:", audio)
        print("label:", label)
        f.write(f"{audio}\t{label}\n")
```
audio: /home/aistudio/Arcane_3class/train/杰斯/杰斯_audio14.wav
label: 0
audio: /home/aistudio/Arcane_3class/train/杰斯/杰斯_audio18.wav
label: 0
...
audio: /home/aistudio/Arcane_3class/train/金克斯/金克斯_audio38.wav
label: 1
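The data loader below pads or truncates each LogFBank along the time axis to a fixed seq_len before feeding the LSTM. A minimal numpy sketch of that step (shapes assume the tutorial's seq_len=64 and n_mels=40; the dummy features stand in for real LogMelSpectrogram output):

```python
import numpy as np

def fix_length(feat, seq_len):
    """feat: [D, T] feature matrix -> [seq_len, D] fixed-size input."""
    d, t = feat.shape
    if t < seq_len:
        # Zero-pad on the right of the time axis
        feat = np.pad(feat, ((0, 0), (0, seq_len - t)), mode='constant')
    else:
        # Truncate the time axis
        feat = feat[:, :seq_len]
    return feat.T  # [T, D], the layout the LSTM expects

short = fix_length(np.ones((40, 30), dtype=np.float32), 64)   # padded clip
long_ = fix_length(np.ones((40, 141), dtype=np.float32), 64)  # truncated clip
print(short.shape, long_.shape)  # (64, 40) (64, 40)
```

Either way, every clip reaches the model as a [64, 40] matrix, so clips of different durations can be batched together.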
```python
# Build the audio data loader with the Paddle API
import numpy as np
import paddle
from paddle.io import Dataset
from paddleaudio import load

class CustomDataset(Dataset):
    def __init__(self, data, seq_len):
        super(CustomDataset, self).__init__()
        self.all_audio_list = data
        self.seq_len = seq_len

    def __getitem__(self, index):
        audio_path = self.all_audio_list[index][0]
        data, sr = load(file=audio_path, mono=True, dtype='float32')  # mono, float32 samples
        x = paddle.to_tensor(data).unsqueeze(0)  # [B, L]
        log_fbank = feature_extractor(x)         # [B, D, T]
        log_fbank = log_fbank.squeeze(0)         # [D, T]
        if log_fbank.shape[1] < self.seq_len:
            # Zero-pad the time axis up to seq_len
            pad = [0, 0, 0, self.seq_len - log_fbank.shape[1]]
            vector = paddle.nn.functional.pad(log_fbank, pad, mode='constant', value=0.0)
        else:
            # Truncate the time axis to seq_len
            vector = log_fbank[:, :self.seq_len]
        vector = paddle.transpose(vector, perm=[1, 0])  # [T, D]
        label = np.array(self.all_audio_list[index][1], dtype='int64')
        # Return the sample and its label
        return vector, label

    def __len__(self):
        return len(self.all_audio_list)

# Read (path, label) pairs from label.txt
all_audio_list = []
with open("/home/aistudio/Arcane_3class/train/label.txt", "r") as f:
    for line in f:
        all_audio_list.append(line.split())

# Define the model: an LSTM followed by fully connected layers
import paddle.nn as nn

class MyLSTMModel2(nn.Layer):
    def __init__(self):
        super(MyLSTMModel2, self).__init__()
        self.rnn = paddle.nn.LSTM(input_size=40, hidden_size=128, num_layers=1)
        self.bn = paddle.nn.BatchNorm1D(128)
        self.fc = nn.Sequential(
            nn.Linear(128, 64),
            nn.ReLU(),
            nn.Linear(64, 3),
        )

    def forward(self, input):
        # forward defines how the network runs at execution time
        out, (h, c) = self.rnn(input)   # h: [num_layers, B, 128]
        h = self.bn(h.squeeze(axis=0))  # last hidden state: [B, 128]
        h = self.fc(h)                  # logits for the 3 classes
        return h

# Instantiate the model
model = MyLSTMModel2()

import paddle.optimizer as optim

# Hyperparameters
batch_size = 8
learning_rate = 0.001
epochs = 200

# Instantiate the custom dataset and create the data loader
dataset = CustomDataset(data=all_audio_list, seq_len=64)
train_loader = paddle.io.DataLoader(dataset, batch_size=batch_size, shuffle=True)

# Loss function and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(parameters=model.parameters(), learning_rate=learning_rate)

# Training loop
for epoch in range(epochs):
    for batch_id, data in enumerate(train_loader()):
        audio, labels = data
        # Forward pass
        preds = model(audio)
        loss = loss_fn(preds, labels)
        # Backward pass
        loss.backward()
        optimizer.step()
        optimizer.clear_grad()
        if batch_id % 10 == 0:
            print(f"Epoch [{epoch+1}/{epochs}], Step [{batch_id+1}/{len(train_loader)}], Loss: {loss.numpy()}")
    # Save checkpoints every epoch
    paddle.save(model.state_dict(), "save_model/epoch_{}.pdparams".format(epoch))
    paddle.save(optimizer.state_dict(), "save_model/epoch_{}.pdopt".format(epoch))
```
Epoch [1/200], Step [1/18], Loss: 0.4725416898727417
Epoch [1/200], Step [11/18], Loss: 0.0440545529127121
Epoch [2/200], Step [1/18], Loss: 0.470591276884079
Epoch [2/200], Step [11/18], Loss: 0.09851406514644623
Epoch [3/200], Step [1/18], Loss: 0.08877842873334885
Epoch [3/200], Step [11/18], Loss: 0.0233285091817379
Epoch [4/200], Step [1/18], Loss: 0.009626179933547974
Epoch [4/200], Step [11/18], Loss: 0.019429411739110947
...
Epoch [177/200], Step [1/18], Loss: 0.002200994174927473
Epoch [177/200], Step [11/18], Loss: 0.09669584780931473
0.013506107032299042 Epoch [178/200], Step [11/18], Loss: 0.021688269451260567 Epoch [179/200], Step [1/18], Loss: 0.005644794087857008 Epoch [179/200], Step [11/18], Loss: 0.24360783398151398 Epoch [180/200], Step [1/18], Loss: 0.5215405225753784 Epoch [180/200], Step [11/18], Loss: 0.03189065679907799 Epoch [181/200], Step [1/18], Loss: 0.04039095342159271 Epoch [181/200], Step [11/18], Loss: 0.04796888679265976 Epoch [182/200], Step [1/18], Loss: 0.02029312402009964 Epoch [182/200], Step [11/18], Loss: 0.027300354093313217 Epoch [183/200], Step [1/18], Loss: 0.00514404708519578 Epoch [183/200], Step [11/18], Loss: 0.014119967818260193 Epoch [184/200], Step [1/18], Loss: 0.03561864793300629 Epoch [184/200], Step [11/18], Loss: 0.004448604304343462 Epoch [185/200], Step [1/18], Loss: 0.19735635817050934 Epoch [185/200], Step [11/18], Loss: 0.03777945786714554 Epoch [186/200], Step [1/18], Loss: 0.05664841830730438 Epoch [186/200], Step [11/18], Loss: 0.026479505002498627 Epoch [187/200], Step [1/18], Loss: 0.005472094751894474 Epoch [187/200], Step [11/18], Loss: 0.0316663533449173 Epoch [188/200], Step [1/18], Loss: 0.007353189866989851 Epoch [188/200], Step [11/18], Loss: 0.0011719940230250359 Epoch [189/200], Step [1/18], Loss: 0.007917560636997223 Epoch [189/200], Step [11/18], Loss: 0.0023582070134580135 Epoch [190/200], Step [1/18], Loss: 0.035708069801330566 Epoch [190/200], Step [11/18], Loss: 0.05112633854150772 Epoch [191/200], Step [1/18], Loss: 0.002874162746593356 Epoch [191/200], Step [11/18], Loss: 0.009168487042188644 Epoch [192/200], Step [1/18], Loss: 0.003728349693119526 Epoch [192/200], Step [11/18], Loss: 0.01515199150890112 Epoch [193/200], Step [1/18], Loss: 0.008820852264761925 Epoch [193/200], Step [11/18], Loss: 0.0008103932486847043 Epoch [194/200], Step [1/18], Loss: 0.001101092784665525 Epoch [194/200], Step [11/18], Loss: 0.0012981765903532505 Epoch [195/200], Step [1/18], Loss: 0.004354500211775303 Epoch [195/200], Step [11/18], 
Loss: 0.018854297697544098 Epoch [196/200], Step [1/18], Loss: 0.6042110323905945 Epoch [196/200], Step [11/18], Loss: 0.0017237844876945019 Epoch [197/200], Step [1/18], Loss: 0.008842434734106064 Epoch [197/200], Step [11/18], Loss: 0.0016365956980735064 Epoch [198/200], Step [1/18], Loss: 0.15027210116386414 Epoch [198/200], Step [11/18], Loss: 0.024806607514619827 Epoch [199/200], Step [1/18], Loss: 0.34135231375694275 Epoch [199/200], Step [11/18], Loss: 0.15495596826076508 Epoch [200/200], Step [1/18], Loss: 0.7455297112464905 Epoch [200/200], Step [11/18], Loss: 0.0043668486177921295
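The log above interleaves epoch, step, and loss. A minimal sketch of this logging pattern (the loss here is a dummy function standing in for the cross-entropy computed on real LogFBank batches; all names are hypothetical):

```python
# Sketch of the "Epoch [e/E], Step [s/S], Loss: ..." logging pattern seen above.
num_epochs, num_steps = 200, 18

def dummy_loss(epoch, step):
    # Stand-in for the cross-entropy loss of one training batch.
    return 1.0 / (epoch * num_steps + step)

for epoch in range(1, num_epochs + 1):
    for step in range(1, num_steps + 1):
        loss = dummy_loss(epoch, step)
        if step % 10 == 1:  # print at steps 1, 11, ... as in the log
            print(f"Epoch [{epoch}/{num_epochs}], Step [{step}/{num_steps}], Loss: {loss}")
```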
```python
# Save the trained model for inference
import paddle
from paddle.static import InputSpec

path = "./export_model/audionet"
paddle.jit.save(
    layer=model,
    path=path,
    # Input layout: [batch=1, seq_len=64, n_mels=40]
    input_spec=[InputSpec(shape=[1, 64, 40])])
```

/opt/conda/envs/python35-paddle120-env/lib/python3.10/site-packages/paddle/nn/layer/norm.py:818: UserWarning: When training, we now always track global mean and variance.
  warnings.warn(
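`paddle.jit.save` writes an inference program plus parameters next to the given path prefix. By Paddle's naming convention (assumed here; the file listing is not shown in the original output), the export directory should contain:

```python
import os

path = "./export_model/audionet"
# Paddle's jit.save convention: program, parameters, and parameter metadata
expected_files = [path + ext for ext in (".pdmodel", ".pdiparams", ".pdiparams.info")]
for f in expected_files:
    print(os.path.basename(f))
```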
```python
# Load the exported model
import paddle
import numpy as np

path = "./export_model/audionet"
loaded_model = paddle.jit.load(path)
loaded_model.eval()
```
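In the test loop below, each clip's LogFBank features are padded or truncated to a fixed `seq_len` frames before entering the LSTM, and the logits are turned into probabilities with softmax. Both steps can be sketched with NumPy (shapes assumed from the export spec `[1, 64, 40]`; feature values and logits are random stand-ins):

```python
import numpy as np

seq_len, n_mels = 64, 40

# Hypothetical [D, T] feature matrix with fewer than seq_len frames
log_fbank = np.random.rand(n_mels, 50).astype("float32")

T = log_fbank.shape[1]
if T < seq_len:
    # Right-pad the time axis with zeros, mirroring
    # paddle.nn.functional.pad(log_fbank, [0, 0, 0, seq_len - T])
    vector = np.pad(log_fbank, [(0, 0), (0, seq_len - T)], mode="constant")
else:
    vector = log_fbank[:, :seq_len]

batch = vector.T[np.newaxis, ...]   # [1, seq_len, n_mels], the model's input layout
print(batch.shape)                  # (1, 64, 40)

# Softmax over hypothetical 3-class logits, as applied to the model output
logits = np.array([[2.0, 0.1, -1.0]])
exp = np.exp(logits - logits.max(axis=1, keepdims=True))
probs = exp / exp.sum(axis=1, keepdims=True)
print(int(probs.argmax()))          # index into speaker = ["杰斯", "金克斯", "狼母"]
```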
```python
# Test the model on the held-out clips
import os
import glob
import paddle
import numpy as np
from paddleaudio import load
from paddleaudio.features import LogMelSpectrogram

seq_len = 64
speaker = ["杰斯", "金克斯", "狼母"]

test_audio_list = glob.glob("/home/aistudio/Arcane_3class/test/*.wav")
for audio in test_audio_list:
    file_name = os.path.basename(audio)
    # Mono, float32 audio samples
    data, sr = load(file=audio, mono=True, dtype='float32')
    x = paddle.to_tensor(data).unsqueeze(0)   # [B, L]
    log_fbank = feature_extractor(x)          # [B, D, T]; the LogMelSpectrogram defined earlier
    log_fbank = log_fbank.squeeze(0)          # [D, T]
    if log_fbank.shape[1] < seq_len:
        # Zero-pad the time axis on the right up to seq_len frames
        pad_value = 0.0
        pad = [0, 0, 0, seq_len - log_fbank.shape[1]]
        vector = paddle.nn.functional.pad(log_fbank, pad, mode='constant', value=pad_value)
    else:
        # Truncate to the first seq_len frames
        vector = log_fbank[:, :seq_len]
    vector = np.transpose(vector, (1, 0))      # [T, D]
    vector = paddle.to_tensor(vector)
    vector = paddle.unsqueeze(vector, axis=0)  # [1, T, D]
    preds = loaded_model(vector)
    preds = paddle.nn.functional.softmax(preds, axis=1)
    print(preds)
    index = int(preds.argmax())
    print(f"{file_name}识别结果为 {speaker[index]}")
```

Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00001624, 0.00006163, 0.99992216]])
狼母_audio44.wav识别结果为 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00002558, 0.00004835, 0.99992597]])
狼母_audio45.wav识别结果为 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00970634, 0.98993659, 0.00035706]])
金克斯_audio42.wav识别结果为 金克斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.03899606, 0.94868642, 0.01231745]])
金克斯_audio45.wav识别结果为 金克斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.99920183, 0.00043954, 0.00035864]])
杰斯_audio60.wav识别结果为 杰斯
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00002513, 0.00004669, 0.99992812]])
狼母_audio43.wav识别结果为 狼母
Tensor(shape=[1, 3], dtype=float32, place=Place(gpu:0), stop_gradient=False,
[[0.00015131, 0.99914122, 0.00070748]])
金克斯_audio44.wav识别结果为 金克斯

That concludes [Getting Started with Voice Classification]: classifying character voices from Arcane Season 2 with PaddleSpeech and an LSTM network.