audioflux.vqt

audioflux.vqt(X, num=84, samplate=32000, low_fre=32.70319566257483, bin_per_octave=12, factor=1.0, beta=0.5, thresh=0.01, window_type=WindowType.HANN, slide_length=None, normal_type=SpectralFilterBankNormalType.AREA, is_scale=True)

Variable-Q transform (VQT)

Note

We recommend using the CQT class instead; it is more flexible and efficient.

Parameters
X: np.ndarray [shape=(…, n)]

audio time series.

num: int

Number of frequency bins to generate, starting at low_fre.

Usually num = octave * bin_per_octave. Default: 84 (7 * 12).
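To illustrate how num relates to the covered frequency range, the sketch below assumes the usual geometric bin spacing, f_k = low_fre * 2 ** (k / bin_per_octave). That spacing is an assumption about how the bins are laid out, not something this page states, and bin_frequencies is a hypothetical helper, not part of audioflux.

```python
def bin_frequencies(num=84, low_fre=32.70319566257483, bin_per_octave=12):
    """Hypothetical helper: bin frequencies under an assumed geometric spacing."""
    return [low_fre * 2.0 ** (k / bin_per_octave) for k in range(num)]


octaves = 7
bin_per_octave = 12
num = octaves * bin_per_octave  # 7 * 12 = 84, the documented default

freqs = bin_frequencies(num=num, bin_per_octave=bin_per_octave)
# Under this assumption, every bin_per_octave bins the frequency doubles,
# so freqs[12] is one octave above freqs[0].
```

With these defaults the bins would span seven octaves upward from low_fre.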

samplate: int

Sampling rate of the incoming audio.

low_fre: float

Lowest frequency in Hz. Default: 32.703 (C1).
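The default value corresponds to the note C1 in twelve-tone equal temperament with A4 tuned to 440 Hz. The sketch below reproduces it from MIDI note numbers (C1 = 24, A4 = 69); midi_to_hz is a hypothetical helper written for illustration, and the tuning convention is an assumption.

```python
def midi_to_hz(note, a4=440.0):
    """Hypothetical helper: equal-temperament frequency of a MIDI note number."""
    return a4 * 2.0 ** ((note - 69) / 12)


# C1 is MIDI note 24, i.e. 45 semitones below A4 (MIDI note 69).
c1_hz = midi_to_hz(24)
```

In practice, af.utils.note_to_hz('C1'), as used in the Examples section, is the simpler way to obtain this value.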

bin_per_octave: int

Number of bins per octave.

factor: float

Factor value

beta: float

Beta value

thresh: float

Thresh value

window_type: WindowType

Window type for each frame.

See: type.WindowType

slide_length: int or None

Window sliding length.
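As a rough guide to how slide_length affects time resolution, the sketch below assumes that slide_length is measured in samples and that frame i starts i * slide_length samples into the signal. Both points are assumptions, the value 512 is arbitrary, and frame_times is a hypothetical helper.

```python
def frame_times(n_frames, slide_length, samplate=32000):
    """Hypothetical helper: start time in seconds of each frame (assumed layout)."""
    return [i * slide_length / samplate for i in range(n_frames)]


# With an arbitrary slide_length of 512 samples at 32 kHz,
# consecutive frames would be 512 / 32000 = 0.016 s apart.
times = frame_times(4, slide_length=512)
```

A smaller slide_length would give more frames per second at the cost of more computation.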

normal_type: SpectralFilterBankNormalType

Spectral filter normal type. It determines the type of normalization.

See: type.SpectralFilterBankNormalType

is_scale: bool

Whether to use scale.

Returns
out: np.ndarray [shape=(…, fre, time)]

The VQT matrix.

fre_band_arr: np.ndarray [shape=(fre,)]

The array of frequency bands

See also

cqcc
vqt
CQT

Examples

Read the 220 Hz sample audio

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)

Extract the spectrogram in dB

>>> low_fre = af.utils.note_to_hz('C1')
>>> spec_arr, fre_band_arr = af.vqt(audio_arr, samplate=sr, low_fre=low_fre)
>>> spec_dB_arr = af.utils.power_to_db(spec_arr ** 2)
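The squaring step above turns magnitudes into power before the dB conversion. The sketch below shows the basic relation, 10 * log10(power), for a single value, assuming spec_arr holds magnitudes. power_to_db_sketch is a hypothetical stand-in; af.utils.power_to_db may use a different reference level, floor, or clipping.

```python
import math


def power_to_db_sketch(power, ref=1.0, amin=1e-10):
    """Hypothetical stand-in: 10 * log10(power / ref), floored at amin."""
    return 10.0 * math.log10(max(power, amin) / ref)


amplitude = 0.1
# Squaring the magnitude gives power; a magnitude of 0.1 is about -20 dB.
db = power_to_db_sketch(amplitude ** 2)
```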

Plot the spectrogram

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> import numpy as np
>>>
>>> # calculate x/y-coords
>>> audio_len = audio_arr.shape[-1]
>>> x_coords = np.linspace(0, audio_len/sr, spec_arr.shape[-1] + 1)
>>> y_coords = np.insert(fre_band_arr, 0, low_fre)
>>>
>>> fig, ax = plt.subplots()
>>> img = fill_spec(spec_dB_arr, axes=ax,
...                 x_coords=x_coords,
...                 y_coords=y_coords,
...                 x_axis='time', y_axis='log',
...                 title='VQT')
>>> fig.colorbar(img, ax=ax, format="%+2.0f dB")
[Figure: VQT spectrogram plot, time on the x-axis, log-scaled frequency on the y-axis, color scale in dB]
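As a sanity check on the plot, one can estimate which frequency bin the 220 Hz tone should fall into. The sketch below assumes geometric bin spacing, f_k = low_fre * 2 ** (k / bin_per_octave), which this page does not state explicitly; the variable names are illustrative.

```python
import math

low_fre = 32.70319566257483  # C1, the default lowest frequency
bin_per_octave = 12
tone_hz = 220.0              # A3, the pitch of the sample audio

# Number of bins between low_fre and the tone under the assumed spacing.
expected_bin = round(bin_per_octave * math.log2(tone_hz / low_fre))
```

Under that assumption the strongest energy would be expected near bin index 33 along the frequency axis, since A3 lies 33 semitones above C1.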