WSST - reassign transform for CWT

class audioflux.WSST(num=84, radix2_exp=12, samplate=32000, low_fre=None, high_fre=None, bin_per_octave=12, wavelet_type=WaveletContinueType.MORLET, scale_type=SpectralFilterBankScaleType.OCTAVE, gamma=None, beta=None, thresh=0.001, is_padding=True)

Wavelet synchrosqueezed transform (WSST)

Parameters
num: int

Number of frequency bins to generate

radix2_exp: int

fft_length=2**radix2_exp

samplate: int

Sampling rate of the incoming audio

low_fre: float

Lowest frequency

  • Linear/Linspace/Mel/Bark/Erb: low_fre>=0. Default: 0.0

  • Octave/Log: low_fre>=32.703. Default: 32.703 (C1)

high_fre: float

Highest frequency.

Default: 16000.0 (samplate/2). Not used for the Octave scale type, whose frequency range is determined by musical pitch.

bin_per_octave: int

Number of bins per octave.

scale_type: SpectralFilterBankScaleType

Spectral filter bank type. It determines the type of spectrogram.

See: type.SpectralFilterBankScaleType

wavelet_type: WaveletContinueType

Continuous wavelet type

See: type.WaveletContinueType

Note

Default gamma/beta values for different wavelet_types:

  • morse: gamma=3, beta=20

  • morlet: gamma=6, beta=2

  • bump: gamma=5, beta=0.6

  • paul: gamma=4

  • dog: gamma=2, beta=2; beta must be even

  • mexican: beta=2

  • hermit: gamma=5, beta=2

  • ricker: gamma=4

gamma: float or None

Wavelet gamma value. If None, the default for the chosen wavelet_type is used (see the Note above and the constructor sketch after this parameter list).

beta: float or None

Wavelet beta value. If None, the default for the chosen wavelet_type is used.

thresh: float

Threshold value.

is_padding: bool

Whether to use padding.
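A minimal constructor sketch (values are illustrative; gamma=3 and beta=20 are the MORSE defaults listed in the Note above, passed explicitly here only to show the override):

>>> import audioflux as af
>>> from audioflux.type import SpectralFilterBankScaleType, WaveletContinueType
>>> obj = af.WSST(num=84, radix2_exp=12, samplate=32000,
>>>               wavelet_type=WaveletContinueType.MORSE,
>>>               scale_type=SpectralFilterBankScaleType.OCTAVE,
>>>               gamma=3., beta=20.)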

See also

Reassign
Synsq

Examples

Read 220Hz audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)
>>> # WSST only accepts input of exactly fft_length samples
>>> # For radix2_exp=12, fft_length=2**12=4096
>>> audio_arr = audio_arr[..., :4096]
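>>> # If a clip were shorter than fft_length, zero-padding is one way to
>>> # reach the required size (plain-NumPy sketch for mono 1-D input,
>>> # not an audioflux helper):
>>> import numpy as np
>>> fft_length = 4096
>>> pad = max(0, fft_length - audio_arr.shape[-1])
>>> audio_arr = np.pad(audio_arr, (0, pad))[..., :fft_length]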

Create a WSST object using the Mel filter-bank scale

>>> from audioflux.type import SpectralFilterBankScaleType, WaveletContinueType
>>> from audioflux.utils import note_to_hz
>>> obj = af.WSST(num=128, radix2_exp=12, samplate=sr,
>>>               bin_per_octave=12, wavelet_type=WaveletContinueType.MORSE,
>>>               scale_type=SpectralFilterBankScaleType.MEL, is_padding=False)

Extract spectrogram

>>> import numpy as np
>>> wsst_spec_arr, cwt_spec_arr = obj.wsst(audio_arr)
>>> wsst_spec_arr = np.abs(wsst_spec_arr)
>>> cwt_spec_arr = np.abs(cwt_spec_arr)
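>>> # Optional: convert magnitude to dB for display. This is plain-NumPy
>>> # scaling (not a specific audioflux helper); the 1e-10 floor avoids log(0).
>>> wsst_db_arr = 20 * np.log10(np.maximum(wsst_spec_arr, 1e-10))
>>> cwt_db_arr = 20 * np.log10(np.maximum(cwt_spec_arr, 1e-10))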

Show spectrogram plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> # Show CWT
>>> fig, ax = plt.subplots()
>>> img = fill_spec(cwt_spec_arr, axes=ax,
>>>                 x_coords=obj.x_coords(),
>>>                 y_coords=obj.y_coords(),
>>>                 x_axis='time', y_axis='log',
>>>                 title='CWT-Mel')
>>> fig.colorbar(img, ax=ax)
>>>
>>> # Show WSST
>>> fig, ax = plt.subplots()
>>> img = fill_spec(wsst_spec_arr, axes=ax,
>>>                 x_coords=obj.x_coords(),
>>>                 y_coords=obj.y_coords(),
>>>                 x_axis='time', y_axis='log',
>>>                 title='WSST-Mel')
>>> fig.colorbar(img, ax=ax)
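>>> # Optionally render the figures (with an interactive matplotlib backend)
>>> plt.show()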
[Figures: CWT-Mel and WSST-Mel spectrograms]

Methods

get_bin_band_arr()

Get bin band array

get_fre_band_arr()

Get an array of frequency bands of different scales.

set_order(order)

Set order

wsst(data_arr)

Get spectrogram data

x_coords()

Get the X-axis coordinate

y_coords()

Get the Y-axis coordinate

get_fre_band_arr()

Get an array of frequency bands of different scales. The scales depend on the scale_type set at initialization. (A usage sketch appears at the end of this section.)

Returns
out: np.ndarray [shape=(num,)]

get_bin_band_arr()

Get bin band array

Returns
out: np.ndarray [shape=(num,)]

set_order(order)

Set order

Parameters
order: int, >= 1

wsst(data_arr)

Compute the WSST spectrogram data, together with the original CWT

Parameters
data_arr: np.ndarray [shape=(…, 2**radix2_exp)]

Input audio data

Returns
m_arr1: np.ndarray [shape=(…, fre, time), dtype=complex]

The WSST (synchrosqueezed) spectrogram matrix

m_arr2: np.ndarray [shape=(…, fre, time), dtype=complex]

The original CWT spectrogram matrix
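A short sketch of the call pattern, reusing obj and audio_arr from the Examples above (the order value 2 is illustrative; any int >= 1 is accepted):

>>> obj.set_order(2)
>>> wsst_arr, cwt_arr = obj.wsst(audio_arr)
>>> # Both arrays are complex with shape (fre, time) for 1-D input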

y_coords()

Get the Y-axis coordinate

Returns
out: np.ndarray

x_coords()

Get the X-axis coordinate

Returns
out: np.ndarray
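For instance (a sketch reusing obj from the Examples; the nearest-bin lookup is plain NumPy, not an audioflux helper), the frequency band array can locate the bin closest to the 220Hz tone:

>>> import numpy as np
>>> fre_band_arr = obj.get_fre_band_arr()
>>> idx = int(np.argmin(np.abs(fre_band_arr - 220.0)))
>>> fre_band_arr[idx]  # center frequency of the bin nearest 220 Hz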