WSST - reassign transform for CWT

class audioflux.WSST(num=84, radix2_exp=12, samplate=32000, low_fre=None, high_fre=None, bin_per_octave=12, wavelet_type=WaveletContinueType.MORLET, scale_type=SpectralFilterBankScaleType.OCTAVE, gamma=None, beta=None, thresh=0.001, is_padding=True)

Wavelet synchrosqueezed transform (WSST)

Parameters

num: int

Number of frequency bins to generate

radix2_exp: int

fft_length=2**radix2_exp

samplate: int

Sampling rate of the incoming audio

low_fre: float

Lowest frequency

Linear/Linspace/Mel/Bark/Erb: low_fre>=0, default 0.0
Octave/Log: low_fre>=32.703, default 32.703 (C1)

high_fre: float

Highest frequency.

Default is 16000 (samplate/2). For the Octave scale type it is not provided; the range is based on musical pitch.

bin_per_octave: int

Number of bins per octave.

scale_type: SpectralFilterBankScaleType

Spectral filter bank type. It determines the type of spectrogram.

See: type.SpectralFilterBankScaleType

wavelet_type: WaveletContinueType

Continuous wavelet type

See: type.WaveletContinueType

Note

Default gamma/beta values for different wavelet_types:

morse: gamma=3 beta=20
morlet: gamma=6 beta=2
bump: gamma=5 beta=0.6
paul: gamma=4
dog: gamma=2 beta=2; must be even
mexican: beta=2
hermit: gamma=5 beta=2
ricker: gamma=4
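
The defaults in this note can be read as a lookup that fills in `gamma` and `beta` when they are left as `None`. The sketch below is illustrative only: `DEFAULTS` and `resolve_params` are hypothetical names, not part of audioflux, and `None` marks a value the note does not list.

```python
# Hypothetical lookup of the defaults listed in the note above.
# Each entry is (gamma, beta); None means the note gives no value.
DEFAULTS = {
    "morse": (3, 20),
    "morlet": (6, 2),
    "bump": (5, 0.6),
    "paul": (4, None),
    "dog": (2, 2),
    "mexican": (None, 2),
    "hermit": (5, 2),
    "ricker": (4, None),
}


def resolve_params(wavelet, gamma=None, beta=None):
    """Return (gamma, beta), filling unset values from DEFAULTS."""
    default_gamma, default_beta = DEFAULTS[wavelet]
    if gamma is None:
        gamma = default_gamma
    if beta is None:
        beta = default_beta
    return gamma, beta


print(resolve_params("morlet"))          # (6, 2)
print(resolve_params("morse", beta=30))  # (3, 30)
```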

gamma: float or None

gamma value

beta: float or None

beta value

thresh: float

Threshold value

is_padding: bool

Whether to use padding.
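
How the size and frequency parameters relate can be sketched in plain Python. The geometric spacing used here for the Octave scale (each bin a factor of `2**(1/bin_per_octave)` above the previous one, starting at `low_fre`) is the usual musical convention and an assumption about the layout, not a statement of how audioflux computes it; `octave_band_freqs` is a hypothetical helper.

```python
def octave_band_freqs(num=84, low_fre=32.703, bin_per_octave=12):
    """Assumed geometric spacing: bin k sits at low_fre * 2**(k / bin_per_octave)."""
    return [low_fre * 2 ** (k / bin_per_octave) for k in range(num)]


radix2_exp = 12
samplate = 32000

fft_length = 2 ** radix2_exp   # required input length in samples
nyquist = samplate / 2         # default high_fre for non-Octave scales

freqs = octave_band_freqs()
print(fft_length)              # 4096
print(nyquist)                 # 16000.0
print(round(freqs[0], 3))      # 32.703
print(round(freqs[12], 3))     # 65.406, one octave above low_fre
```

Under this assumption the default `num=84` with `bin_per_octave=12` covers seven octaves starting at C1.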

See also

Reassign
Synsq

Examples

Read 220Hz audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)
>>> # WSST only accepts input of length fft_length
>>> # With radix2_exp=12, fft_length=4096
>>> audio_arr = audio_arr[..., :4096]
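
Because the input must be exactly fft_length samples long, a signal of any other length has to be trimmed or padded first. A minimal NumPy sketch follows; `fit_to_fft_length` is a hypothetical helper (not an audioflux function), and zero-padding at the end is an assumed choice rather than one the library prescribes.

```python
import numpy as np


def fit_to_fft_length(data_arr, radix2_exp=12):
    """Trim or zero-pad the last axis to exactly 2**radix2_exp samples."""
    fft_length = 2 ** radix2_exp
    n = data_arr.shape[-1]
    if n >= fft_length:
        return data_arr[..., :fft_length]
    # Pad only the last (time) axis, at the end, with zeros.
    pad_width = [(0, 0)] * (data_arr.ndim - 1) + [(0, fft_length - n)]
    return np.pad(data_arr, pad_width)


long_sig = np.ones(5000, dtype=np.float32)
short_sig = np.ones((2, 1000), dtype=np.float32)
print(fit_to_fft_length(long_sig).shape)   # (4096,)
print(fit_to_fft_length(short_sig).shape)  # (2, 4096)
```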

Create a WSST object with the mel scale

>>> from audioflux.type import SpectralFilterBankScaleType, WaveletContinueType
>>> obj = af.WSST(num=128, radix2_exp=12, samplate=sr,
...               bin_per_octave=12, wavelet_type=WaveletContinueType.MORSE,
...               scale_type=SpectralFilterBankScaleType.MEL, is_padding=False)

Extract spectrogram

>>> import numpy as np
>>> wsst_spec_arr, cwt_spec_arr = obj.wsst(audio_arr)
>>> wsst_spec_arr = np.abs(wsst_spec_arr)
>>> cwt_spec_arr = np.abs(cwt_spec_arr)

Show spectrogram plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> # Show CWT
>>> fig, ax = plt.subplots()
>>> img = fill_spec(cwt_spec_arr, axes=ax,
...                 x_coords=obj.x_coords(),
...                 y_coords=obj.y_coords(),
...                 x_axis='time', y_axis='log',
...                 title='CWT-Mel')
>>> fig.colorbar(img, ax=ax)
>>>
>>> # Show WSST
>>> fig, ax = plt.subplots()
>>> img = fill_spec(wsst_spec_arr, axes=ax,
...                 x_coords=obj.x_coords(),
...                 y_coords=obj.y_coords(),
...                 x_axis='time', y_axis='log',
...                 title='WSST-Mel')
>>> fig.colorbar(img, ax=ax)

Methods

`get_bin_band_arr`()	Get bin band array
`get_fre_band_arr`()	Get an array of frequency bands of different scales.
`set_order`(order)	Set order
`wsst`(data_arr)	Get spectrogram data
`x_coords`()	Get the X-axis coordinate
`y_coords`()	Get the Y-axis coordinate

get_fre_band_arr()

Get an array of frequency bands for the configured scale, as determined by the scale_type set at initialization.

Returns

out: np.ndarray [shape=(num,)]

get_bin_band_arr()

Get bin band array

Returns

out: np.ndarray [shape=(num,)]

set_order(order)

Set order

Parameters

order: int, >= 1

Returns

wsst(data_arr)

Get spectrogram data

Parameters

data_arr: np.ndarray [shape=(…, 2**radix2_exp)]: Input audio data

Returns

m_arr1: np.ndarray [shape=(…, fre, time), dtype=np.complex]: The WSST matrix
m_arr2: np.ndarray [shape=(…, fre, time), dtype=np.complex]: The original CWT matrix
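
Both returned matrices are complex, so a typical next step is to take magnitudes and, for example, find the strongest frequency bin in each time frame. The sketch below uses a synthetic complex array in place of real `wsst` output, and `fre_band_arr` stands in for the array returned by `get_fre_band_arr()`; all values are made up for illustration.

```python
import numpy as np

# Synthetic stand-in for wsst() output: shape (fre, time), complex.
num, n_time = 4, 6
spec = np.zeros((num, n_time), dtype=np.complex64)
spec[2, :] = 3 + 4j                   # put all energy in frequency bin 2

# Made-up band frequencies standing in for get_fre_band_arr().
fre_band_arr = np.array([110.0, 165.0, 220.0, 330.0])

mag = np.abs(spec)                    # magnitude, shape (fre, time)
peak_bins = mag.argmax(axis=-2)       # strongest bin in each time frame
peak_freqs = fre_band_arr[peak_bins]  # frequency of that bin

print(mag[2, 0])                      # 5.0
print(peak_freqs)                     # [220. 220. 220. 220. 220. 220.]
```

Taking the argmax over axis `-2` keeps the code valid for batched input, where leading dimensions precede `(fre, time)`.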

y_coords()

Get the Y-axis coordinate

Returns

out: np.ndarray

x_coords()

Get the X-axis coordinate

Returns

out: np.ndarray