NSGT - Non-Stationary Gabor Transform
- class audioflux.NSGT(num=84, radix2_exp=12, samplate=32000, low_fre=None, high_fre=None, bin_per_octave=12, min_len=3, nsgt_filter_bank_type=NSGTFilterBankType.EFFICIENT, scale_type=SpectralFilterBankScaleType.OCTAVE, style_type=SpectralFilterBankStyleType.SLANEY, normal_type=SpectralFilterBankNormalType.BAND_WIDTH)
Non-Stationary Gabor Transform (NSGT)
- Parameters
- num: int
Number of frequency bins to generate, starting at low_fre.
- radix2_exp: int
fft_length=2**radix2_exp
- samplate: int
Sampling rate of the incoming audio.
- low_fre: float or None
Lowest frequency.
Linear/Linspace/Mel/Bark/Erb, low_fre>=0. default: 0.0
Octave/Log, low_fre>=32.703. default: 32.703(C1)
- high_fre: float or None
Highest frequency. Default: 16000 (samplate/2).
Linear: not required; it is derived from samplate / (2 ** radix2_exp).
Octave: not required; it is derived from musical pitch.
- bin_per_octave: int
Number of bins per octave.
Required only when scale_type is Octave.
- min_len: int
Minimum length.
- nsgt_filter_bank_type: NSGTFilterBankType
NSGT filter bank type.
- scale_type: SpectralFilterBankScaleType
Spectral filter bank type. It determines the type of spectrogram.
- style_type: SpectralFilterBankStyleType
Spectral filter bank style type. It determines the window type of the filter bank.
GAMMATONE is not supported.
- normal_type: SpectralFilterBankNormalType
Spectral filter normal type. It determines the type of normalization.
Must be set to NONE or BAND_WIDTH; AREA is not supported.
Linear is not provided.
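The relationships between these parameters can be checked with plain NumPy. The sketch below is illustrative only and does not call audioflux: it uses the default values from the signature above, and it assumes that the Octave scale places bin centres at equal-tempered spacing, low_fre * 2 ** (k / bin_per_octave). That spacing is an assumption made for this example, not something this page states; get_fre_band_arr() reports the frequencies the object actually uses.

```python
import numpy as np

# Default values taken from the class signature.
samplate = 32000
radix2_exp = 12
num = 84
bin_per_octave = 12
low_fre = 32.703  # C1, the documented Octave default

# fft_length is fixed by radix2_exp.
fft_length = 2 ** radix2_exp                # 4096 samples
linear_spacing = samplate / fft_length      # 7.8125 Hz, the Linear-scale step

# ASSUMPTION: equal-tempered bin centres, bin_per_octave bins per doubling.
octave_freqs = low_fre * 2.0 ** (np.arange(num) / bin_per_octave)

print(fft_length, linear_spacing)
print(octave_freqs[0], octave_freqs[bin_per_octave], octave_freqs[-1])
```

Under this assumption, 84 bins at 12 bins per octave span seven octaves above C1, and the highest bin stays below samplate/2.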
Examples
Read 220Hz audio data
>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)
>>> # NSGT can only input fft_length data
>>> # For radix2_exp=15, then fft_length=2**15=32768
>>> audio_arr = audio_arr[..., :32768]
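Because the input must contain exactly fft_length samples, slicing alone is not enough when a recording is shorter than 2**radix2_exp. The following is a minimal sketch of one way to handle both cases with NumPy; fit_to_fft_length is a hypothetical helper written for this example and is not part of audioflux, and zero-padding the tail is an assumption about what is acceptable for your signal.

```python
import numpy as np

def fit_to_fft_length(data_arr, radix2_exp):
    """Hypothetical helper: truncate or zero-pad the last axis to 2**radix2_exp."""
    fft_length = 2 ** radix2_exp
    cur_len = data_arr.shape[-1]
    if cur_len >= fft_length:
        # Long enough: keep the first fft_length samples.
        return data_arr[..., :fft_length]
    # Too short: pad zeros at the end of the last axis only.
    pad_width = [(0, 0)] * (data_arr.ndim - 1) + [(0, fft_length - cur_len)]
    return np.pad(data_arr, pad_width, mode="constant")

# Synthetic stand-in for real audio: one second of a 220 Hz sine at 32 kHz.
sr = 32000
t = np.arange(sr) / sr
short_arr = np.sin(2 * np.pi * 220 * t).astype(np.float32)

fitted = fit_to_fft_length(short_arr, radix2_exp=15)
print(short_arr.shape, fitted.shape)
```

The helper only touches the last axis, so multi-channel arrays of shape (channels, samples) keep their leading dimensions.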
Create NSGT object of Octave
>>> from audioflux.type import (SpectralFilterBankScaleType, SpectralFilterBankStyleType,
...                             SpectralFilterBankNormalType)
>>> from audioflux.utils import note_to_hz
>>> obj = af.NSGT(num=84, radix2_exp=15, samplate=sr,
...               low_fre=note_to_hz('C1'), bin_per_octave=12,
...               scale_type=SpectralFilterBankScaleType.OCTAVE,
...               style_type=SpectralFilterBankStyleType.SLANEY,
...               normal_type=SpectralFilterBankNormalType.NONE)
Extract spectrogram
>>> import numpy as np
>>> spec_arr = obj.nsgt(audio_arr)
>>> spec_arr = np.abs(spec_arr)
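The transform returns complex values, and np.abs reduces them to magnitudes. For plotting it is often convenient to view magnitudes in decibels. The sketch below shows that conversion on a small hand-made complex array rather than real NSGT output; magnitude_db and its floor parameter are hypothetical names introduced for this example, not audioflux functions.

```python
import numpy as np

def magnitude_db(spec_arr, floor=1e-10):
    """Hypothetical helper: complex spectrogram -> magnitude in dB."""
    mag = np.abs(spec_arr)
    # Clamp to a small floor so that silent bins do not produce -inf.
    return 20.0 * np.log10(np.maximum(mag, floor))

# Tiny stand-in for a (fre, time) complex matrix.
demo = np.array([[1 + 0j, 0 + 10j],
                 [0 + 0j, 3 + 4j]])
demo_db = magnitude_db(demo)
print(demo_db)
```

A magnitude of 1 maps to 0 dB and a magnitude of 10 maps to 20 dB; the exact zero is clamped to the floor instead of becoming negative infinity.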
Show spectrogram plot
>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> audio_len = audio_arr.shape[-1]
>>> fig, ax = plt.subplots()
>>> img = fill_spec(spec_arr, axes=ax,
...                 x_coords=obj.x_coords(audio_len),
...                 y_coords=obj.y_coords(),
...                 x_axis='time', y_axis='log',
...                 title='NSGT-Octave Spectrogram')
>>> fig.colorbar(img, ax=ax)
Methods
- get_bin_band_arr(): Get bin band array
- get_fre_band_arr(): Get an array of frequency bands of different scales.
- get_max_time_length(): Get max time length
- get_time_length_arr(): Get time length array
- get_total_time_length(): Get total time length
- nsgt(data_arr): Get spectrogram data
- set_min_length([min_length]): Set min length
- x_coords(data_length): Get the X-axis coordinate
- y_coords(): Get the Y-axis coordinate.
- get_max_time_length()
Get max time length
- Returns
- out: int
- get_total_time_length()
Get total time length
- Returns
- out: int
- get_time_length_arr()
Get time length array
- Returns
- out: np.ndarray [shape=(time,)]
- get_fre_band_arr()
Get an array of frequency bands of different scales. The scale is determined by the scale_type given at initialization.
- Returns
- out: np.ndarray [shape=(fre,)]
- get_bin_band_arr()
Get bin band array
- Returns
- out: np.ndarray [shape=(n,)]
- set_min_length(min_length=3)
Set min length
- Parameters
- min_length: int
- nsgt(data_arr)
Get spectrogram data
- Parameters
- data_arr: np.ndarray [shape=(…, 2**radix2_exp)]
Input audio data
- Returns
- m_data_arr: np.ndarray [shape=(…, fre, time), dtype=np.complex]
The matrix of NSGT
- y_coords()
Get the Y-axis coordinate.
- Returns
- out: np.ndarray [shape=(fre,)]
- x_coords(data_length)
Get the X-axis coordinate
- Parameters
- data_length: int
The length of the data to be calculated.
- Returns
- out: np.ndarray [shape=(time,)]