NSGT - Non-Stationary Gabor Transform

class audioflux.NSGT(num=84, radix2_exp=12, samplate=32000, low_fre=None, high_fre=None, bin_per_octave=12, min_len=3, nsgt_filter_bank_type=NSGTFilterBankType.EFFICIENT, scale_type=SpectralFilterBankScaleType.OCTAVE, style_type=SpectralFilterBankStyleType.SLANEY, normal_type=SpectralFilterBankNormalType.BAND_WIDTH)

Non-Stationary Gabor Transform (NSGT)

Parameters

num: int

Number of frequency bins to generate, starting at low_fre.

radix2_exp: int

fft_length=2**radix2_exp
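
As a quick illustration of the relation above, this minimal sketch computes the FFT length and the duration of audio it covers, using the default radix2_exp=12 and samplate=32000 from the class signature. The variable names are illustrative only and are not part of the audioflux API.

```python
# Defaults taken from the NSGT signature above.
radix2_exp = 12
samplate = 32000

# Number of samples NSGT expects per input.
fft_length = 2 ** radix2_exp

# Duration (in seconds) covered by one input of that length.
duration_sec = fft_length / samplate

print(fft_length, duration_sec)
```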

samplate: int

Sampling rate of the incoming audio.

low_fre: float or None

Lowest frequency.

Linear/Linspace/Mel/Bark/Erb: low_fre>=0. Default: 0.0
Octave/Log: low_fre>=32.703. Default: 32.703 (C1)
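
The value 32.703 corresponds to the pitch C1 in twelve-tone equal temperament. The sketch below reproduces it, assuming A4 = 440 Hz and the common MIDI numbering in which C4 is note 60 (so C1 is note 24). `midi_to_hz` is a hypothetical helper written for this illustration, not an audioflux function.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Convert a MIDI note number to Hz (equal temperament, A4 = MIDI 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


# C1 is MIDI note 24 under the C4 = 60 convention (an assumption of this sketch).
c1_hz = midi_to_hz(24)
print(round(c1_hz, 3))
```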

high_fre: float or None

Highest frequency. Default is 16000 (samplate/2).

For the Linear scale, high_fre is not provided; it is derived from samplate / (2 ** radix2_exp).
For the Octave scale, high_fre is not provided; it is derived from musical pitch.

bin_per_octave: int

Number of bins per octave.

Must be provided only when scale_type is OCTAVE.
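
To show how num, low_fre and bin_per_octave relate on an octave scale, the sketch below assumes geometrically spaced bins, where each bin is a factor of 2 ** (1 / bin_per_octave) above the previous one. This spacing is an assumption made for illustration; the frequencies actually used by an NSGT object are the ones returned by get_fre_band_arr().

```python
# Defaults taken from the NSGT signature above.
low_fre = 32.703       # C1
bin_per_octave = 12
num = 84

# Assumed geometric spacing: frequency doubles every bin_per_octave bins.
centers = [low_fre * 2.0 ** (k / bin_per_octave) for k in range(num)]

# Number of octaves spanned by num bins.
octaves = num / bin_per_octave

print(octaves, centers[0], centers[12], centers[-1])
```

With these defaults the 84 bins span 7 octaves, and the highest bin stays below the default high_fre of 16000.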

min_len: int

Minimum length. Can be changed later with set_min_length().

nsgt_filter_bank_type: NSGTFilterBankType

NSGT filter bank type.

scale_type: SpectralFilterBankScaleType

Spectral filter bank scale type. It determines the frequency scale of the spectrogram.

See: type.SpectralFilterBankScaleType

style_type: SpectralFilterBankStyleType

Spectral filter bank style type. It determines the window shape of the filter bank.

GAMMATONE is not supported.

See: type.SpectralFilterBankStyleType

normal_type: SpectralFilterBankNormalType

Spectral filter bank normalization type. It determines the type of normalization.

Must be NONE or BAND_WIDTH; AREA is not supported.
For the Linear scale, this parameter is not provided.

See: type.SpectralFilterBankNormalType

See also

BFT
CWT
PWT

Examples

Read 220 Hz audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)
>>> # NSGT only accepts input whose length equals fft_length
>>> # For radix2_exp=15, then fft_length=2**15=32768
>>> audio_arr = audio_arr[..., :32768]

Create an NSGT object with the Octave scale

>>> from audioflux.type import (SpectralFilterBankScaleType, SpectralFilterBankStyleType,
...                             SpectralFilterBankNormalType)
>>> from audioflux.utils import note_to_hz
>>> obj = af.NSGT(num=84, radix2_exp=15, samplate=sr,
...               low_fre=note_to_hz('C1'), bin_per_octave=12,
...               scale_type=SpectralFilterBankScaleType.OCTAVE,
...               style_type=SpectralFilterBankStyleType.SLANEY,
...               normal_type=SpectralFilterBankNormalType.NONE)

Extract spectrogram

>>> import numpy as np
>>> spec_arr = obj.nsgt(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Show spectrogram plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> audio_len = audio_arr.shape[-1]
>>> fig, ax = plt.subplots()
>>> img = fill_spec(spec_arr, axes=ax,
...                 x_coords=obj.x_coords(audio_len),
...                 y_coords=obj.y_coords(),
...                 x_axis='time', y_axis='log',
...                 title='NSGT-Octave Spectrogram')
>>> fig.colorbar(img, ax=ax)

Methods

`get_bin_band_arr`()	Get bin band array
`get_fre_band_arr`()	Get an array of frequency bands of different scales.
`get_max_time_length`()	Get max time length
`get_time_length_arr`()	Get time length array
`get_total_time_length`()	Get total time length
`nsgt`(data_arr)	Get spectrogram data
`set_min_length`([min_length])	Set min length
`x_coords`(data_length)	Get the X-axis coordinate
`y_coords`()	Get the Y-axis coordinate.

get_max_time_length()

Get max time length

Returns

out: int

get_total_time_length()

Get total time length

Returns

out: int

get_time_length_arr()

Get time length array

Returns

out: np.ndarray [shape=(time,)]

get_fre_band_arr()

Get an array of frequency bands. The scale of the bands is determined by the scale_type given at initialization.

Returns

out: np.ndarray [shape=(fre,)]

get_bin_band_arr()

Get bin band array

Returns

out: np.ndarray [shape=(n,)]

set_min_length(min_length=3)

Set min length

Parameters

min_length: int

nsgt(data_arr)

Get spectrogram data

Parameters

data_arr: np.ndarray [shape=(…, 2**radix2_exp)]: Input audio data

Returns

m_data_arr: np.ndarray [shape=(…, fre, time), dtype=np.complex]: The matrix of NSGT
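
Because the last axis of data_arr must have exactly 2**radix2_exp samples, the input usually has to be trimmed or extended first. The sketch below is one possible way to do that with NumPy alone. `fit_to_fft_length` is a hypothetical helper, not part of audioflux, and zero-padding short inputs is an assumption of this example rather than documented library behavior.

```python
import numpy as np


def fit_to_fft_length(data_arr, radix2_exp):
    """Truncate or zero-pad the last axis to exactly 2**radix2_exp samples."""
    fft_length = 2 ** radix2_exp
    cur_len = data_arr.shape[-1]
    if cur_len >= fft_length:
        # Keep only the first fft_length samples along the last axis.
        return data_arr[..., :fft_length]
    # Pad zeros at the end of the last axis only; leading axes are untouched.
    pad_width = [(0, 0)] * (data_arr.ndim - 1) + [(0, fft_length - cur_len)]
    return np.pad(data_arr, pad_width, mode="constant")


# Usage with synthetic data: a long mono signal and a short two-channel signal.
long_arr = np.ones(40000, dtype=np.float32)
short_arr = np.ones((2, 1000), dtype=np.float32)
fitted_long = fit_to_fft_length(long_arr, 15)
fitted_short = fit_to_fft_length(short_arr, 12)
print(fitted_long.shape, fitted_short.shape)
```

The returned array can then be passed as data_arr, provided radix2_exp matches the value used to create the NSGT object.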

y_coords()

Get the Y-axis coordinate.

Returns

out: np.ndarray [shape=(fre,)]

x_coords(data_length)

Get the X-axis coordinate

Parameters

data_length: int: The length of the data to be calculated.

Returns

out: np.ndarray [shape=(time,)]