Unable to use multithreading with librosa melspectrogram

Updated: 2022-05-16 23:02:28

I recommend using joblib to parallelize the processing with librosa. I believe librosa uses joblib internally, so this might avoid some conflicts. Below is a working example, based on code that I regularly use to process around 10,000 files.

import os.path
import joblib
import librosa
import numpy

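def compute(inpath, outpath):
    # load the audio (librosa resamples to 22050 Hz by default)
    y, sr = librosa.load(inpath)
    # 128-band mel spectrogram, limited to frequencies up to 8 kHz
    S = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128, fmax=8000)
    # store the result as a .npy file and report where it was written
    numpy.save(outpath, S)
    return outpath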

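out_dir = 'temp/'
n_jobs = 8
verbose = 1

# numpy.save does not create missing directories, so make sure out_dir exists
os.makedirs(out_dir, exist_ok=True)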

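# as a reproducible example, just process the same input file repeatedly,
# making sure to give each job a unique output name
# (note: in librosa 0.8 and later, librosa.example('trumpet') takes the place
# of librosa.util.example_audio_file(), which was removed in later releases)
inputs = [ librosa.util.example_audio_file() ] * 10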
outputs = [ os.path.join(out_dir, '{}.npy'.format(n)) for n in range(len(inputs)) ]

# wrap each call lazily, then let joblib run the jobs on n_jobs workers
jobs = [ joblib.delayed(compute)(i, o) for i, o in zip(inputs, outputs) ]
out = joblib.Parallel(n_jobs=n_jobs, verbose=verbose)(jobs)

print(out)

Output

[Parallel(n_jobs=8)]: Using backend LokyBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done   6 out of  10 | elapsed:   10.4s remaining:    6.9s
[Parallel(n_jobs=8)]: Done  10 out of  10 | elapsed:   13.2s finished
['temp/0.npy', 'temp/1.npy', 'temp/2.npy', 'temp/3.npy', 'temp/4.npy', 'temp/5.npy', 'temp/6.npy', 'temp/7.npy', 'temp/8.npy', 'temp/9.npy']
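
The example above numbers its outputs because every input is the same file. With a real batch of distinct files, it may be more convenient to derive each output name from its input name while still keeping the names unique. The following is only a minimal sketch of that idea: `make_output_paths` is a hypothetical helper introduced here for illustration, not part of the original answer, librosa, or joblib.

```python
import os.path

def make_output_paths(inputs, out_dir, ext='.npy'):
    """Map each input path to a unique output path inside out_dir.

    Hypothetical helper: the output name is the input's base name with
    its extension replaced by `ext`. If two inputs share a base name,
    a numeric suffix (_1, _2, ...) is appended to keep names unique.
    """
    used = set()
    outputs = []
    for path in inputs:
        stem = os.path.splitext(os.path.basename(path))[0]
        name = stem
        n = 1
        # keep trying suffixes until the name has not been used yet
        while name in used:
            name = '{}_{}'.format(stem, n)
            n += 1
        used.add(name)
        outputs.append(os.path.join(out_dir, name + ext))
    return outputs
```

The returned list could then replace the numbered `outputs` list before the `zip(inputs, outputs)` call, so that each worker still writes to its own file.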