STFT - Short Time Fourier Transform

class audioflux.STFT(radix2_exp=12, window_type=WindowType.RECT, slide_length=1024)

Short-time Fourier transform (STFT).

Parameters

radix2_exp: int

fft_length=2**radix2_exp

window_type: WindowType

Window type for each frame.

See: type.WindowType

slide_length: int

Window sliding length.
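
As a quick illustration of how the constructor parameters relate, the sketch below derives the FFT length, the number of one-sided frequency bins, and the overlap between consecutive frames for the default values. It is plain Python arithmetic and does not call audioflux; the names `n_bins`, `overlap` and `overlap_ratio` are illustrative only.

```python
# Default constructor arguments from the signature above.
radix2_exp = 12
slide_length = 1024

fft_length = 2 ** radix2_exp           # samples per frame
n_bins = fft_length // 2 + 1           # bins in the one-sided spectrum
overlap = fft_length - slide_length    # samples shared by consecutive frames
overlap_ratio = overlap / fft_length   # fraction of a frame that is reused

print(fft_length, n_bins, overlap, overlap_ratio)
```

With the defaults, each frame therefore shares three quarters of its samples with the next one.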

See also

BFT

Examples

Read 220Hz audio data

>>> import numpy as np
>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)

Compute stft and istft

>>> stft_obj = af.STFT(radix2_exp=12, window_type=af.type.WindowType.RECT, slide_length=1024)
>>> spec_arr = stft_obj.stft(audio_arr)
>>> new_audio_arr = stft_obj.istft(spec_arr)

Show plot

>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> ax.set_title('STFT Spectrogram')
>>> img = af.display.fill_spec(np.abs(spec_arr), axes=ax,
...                            y_coords=stft_obj.y_coords(sr),
...                            x_coords=stft_obj.x_coords(audio_arr.shape[-1], sr),
...                            y_axis='log', x_axis='time')
>>> fig.colorbar(img, ax=ax)

>>> fig, axes = plt.subplots(nrows=2, sharex=True, sharey=True)
>>> ax = af.display.fill_wave(audio_arr, axes=axes[0])
>>> ax.set_title('Original')
>>> ax = af.display.fill_wave(new_audio_arr, axes=axes[1])
>>> ax.set_title('ISTFT Result')

Methods

`cal_data_length`(time_length)	Calculate the audio data length from the number of frames.
`cal_time_length`(data_length)	Calculate the number of frames from the audio data length.
`enable_padding`([flag])	Whether to enable padding.
`get_window_data_arr`()	Get window data array.
`istft`(m_data_arr[, method_type])	Calculate ISTFT (Inverse Short-Time Fourier Transform) data.
`set_padding`([position_type, mode_type, ...])	Set padding parameters.
`set_slide_length`(slide_length)	Set the slide length.
`stft`(data_arr)	Calculate STFT (Short-Time Fourier Transform) data.
`use_window_data_arr`(data_arr)	Use a custom window data array.
`x_coords`(data_length[, samplate])	Get the X-axis (time) coordinates.
`y_coords`([samplate])	Get the Y-axis (frequency) coordinates.

set_slide_length(slide_length)

Set the slide length.

Parameters

slide_length: int: Window sliding length.

set_padding(position_type=PaddingPositionType.CENTER, mode_type=PaddingModeType.CONSTANT, value1=0.0, value2=0.0)

Set padding parameters.

Before calling this function, you must enable padding by calling enable_padding with flag set to True.

Parameters

position_type: PaddingPositionType

Padding position type.

See: type.PaddingPositionType

mode_type: PaddingModeType

Padding mode type.

See: type.PaddingModeType

value1: float

Padding value1.

Used only when mode_type is CONSTANT. If position_type is CENTER or LEFT, value1 is the left padding value; if position_type is RIGHT, value1 is the right padding value. In each case the padding length is fft_length // 2. For other modes, value1 is not used.

value2: float

Padding value2.

If mode_type is CONSTANT and position_type is CENTER, value2 is the right padding value, with length fft_length // 2. For other modes, value2 is not used.
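
The CENTER and CONSTANT case described above can be pictured with NumPy's `np.pad`. This is only a sketch of the documented layout (value1 on the left, value2 on the right, fft_length // 2 samples each); it does not call audioflux, and the helper name `pad_center_constant` is hypothetical.

```python
import numpy as np

fft_length = 2 ** 12
half = fft_length // 2   # documented padding length on each side

def pad_center_constant(data, value1=0.0, value2=0.0):
    """Pad `half` samples per side: value1 on the left, value2 on the right."""
    return np.pad(data, (half, half), mode="constant",
                  constant_values=(value1, value2))

audio = np.ones(8, dtype=np.float32)
padded = pad_center_constant(audio, value1=0.0, value2=0.5)
print(padded.shape)   # original length plus fft_length
```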

Returns

use_window_data_arr(data_arr)

Use a custom window data array.

By default, the window data array is generated from window_type.

Parameters

data_arr: np.ndarray [shape=(fft_length,)]: Window data array.
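
A custom window only has to satisfy the shape requirement above. The sketch below builds a periodic Hann window with NumPy; the `float32` dtype is an assumption, chosen to match the `float32` audio data used elsewhere on this page, and the name `window_arr` is illustrative. The resulting array is the kind of value one would pass as `data_arr`.

```python
import numpy as np

radix2_exp = 12
fft_length = 2 ** radix2_exp

# Periodic Hann window of exactly fft_length samples.
n = np.arange(fft_length)
window_arr = 0.5 - 0.5 * np.cos(2.0 * np.pi * n / fft_length)
window_arr = window_arr.astype(np.float32)   # assumed dtype, see lead-in

print(window_arr.shape, window_arr.dtype)
```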

Returns

get_window_data_arr()

Get window data array.

Returns

out: np.ndarray [shape=(fft_length,)]: Window data array.

enable_padding(flag=False)

Whether to enable padding.

Default is False.

Parameters

flag: bool: Whether to enable padding.

Returns

cal_time_length(data_length)

Calculate the number of frames (time_length) from the audio data length.

fft_length = 2 ** radix2_exp
time_length = (data_length - fft_length) // slide_length + 1

Parameters

data_length: int: The length of the data to be calculated.

Returns

out: int
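
The formula above is easy to check by hand. The following sketch restates it as a standalone function (the name `frame_count` is hypothetical and this is not the library code); it applies the formula exactly as written and says nothing about how padding might change the result.

```python
def frame_count(data_length, radix2_exp=12, slide_length=1024):
    """Number of frames according to the formula documented above."""
    fft_length = 2 ** radix2_exp
    return (data_length - fft_length) // slide_length + 1

print(frame_count(4096))    # 1: exactly one full frame fits
print(frame_count(32000))   # 28: one second of audio at 32 kHz
```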

cal_data_length(time_length)

Calculate the audio data length from the number of frames.

Parameters

time_length: int: The number of frames.

Returns

out: int
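
No formula is given for this direction. A natural candidate, assumed here rather than taken from the library, is the inverse of the frame-count formula: the shortest data length that yields exactly `time_length` frames. Both function names below are hypothetical.

```python
def frame_count(data_length, fft_length, slide_length):
    """Frame-count formula documented for cal_time_length."""
    return (data_length - fft_length) // slide_length + 1

def min_data_length(time_length, fft_length, slide_length):
    """Assumed inverse: shortest data length that produces time_length frames."""
    return (time_length - 1) * slide_length + fft_length

fft_length, slide_length = 4096, 1024
for frames in (1, 2, 28):
    n = min_data_length(frames, fft_length, slide_length)
    # Round trip: the computed length gives back the same frame count.
    print(frames, n, frame_count(n, fft_length, slide_length))
```

Any data length up to slide_length - 1 samples longer gives the same frame count, so the inverse is not unique; the sketch returns the smallest one.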

stft(data_arr)

Calculate STFT (Short-Time Fourier Transform) data.

Parameters

data_arr: np.ndarray [shape=(…, n)]: Input audio data

Returns

out: np.ndarray [shape=(…, fft_length // 2 + 1, time_length), dtype=np.complex64]: STFT data
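
To make the output shape concrete, here is a minimal NumPy sketch of a textbook STFT: slice the signal into overlapping frames, apply a window, and take the one-sided FFT of each frame. It is not the audioflux implementation and ignores padding; the function name `stft_sketch` and the 250 Hz test tone are illustrative assumptions.

```python
import numpy as np

def stft_sketch(data_arr, fft_length=4096, slide_length=1024, window=None):
    """Frame a 1-D signal, window each frame, and take its one-sided FFT."""
    if window is None:
        window = np.ones(fft_length)              # rectangular window
    n_frames = (len(data_arr) - fft_length) // slide_length + 1
    # Row i holds the sample indices that belong to frame i.
    idx = (slide_length * np.arange(n_frames)[:, None]
           + np.arange(fft_length)[None, :])
    frames = data_arr[idx] * window
    spec = np.fft.rfft(frames, axis=-1)           # (n_frames, fft_length // 2 + 1)
    return spec.T.astype(np.complex64)            # (fft_length // 2 + 1, n_frames)

sr = 32000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 250.0 * t).astype(np.float32)   # 1 second at 250 Hz
spec_arr = stft_sketch(tone)
print(spec_arr.shape, spec_arr.dtype)
```

The first axis indexes frequency bins and the last axis indexes frames, which matches the layout documented for the return value.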

istft(m_data_arr, method_type=0)

Calculate ISTFT (Inverse Short-Time Fourier Transform) data.

Parameters

m_data_arr: np.ndarray [shape=(…, fft_length // 2 + 1, time_length), dtype=np.complex64]: Input STFT data
method_type: int: Reconstruction method. 0: weight (default); 1: overlap-add.

Returns

out: np.ndarray [shape=(…, data_length), dtype=np.float32]: ISTFT data
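
For intuition about inversion, the sketch below implements a generic overlap-add reconstruction in NumPy: each frame is inverse-transformed, added back at its position, and the summed window weight is divided out. Whether this corresponds to either `method_type` value is not confirmed by this page; all function names are hypothetical and the forward transform is included only to make the round trip self-contained.

```python
import numpy as np

def stft_sketch(x, fft_length, slide_length, window):
    """Textbook forward STFT, returned as (bins, frames)."""
    n_frames = (len(x) - fft_length) // slide_length + 1
    idx = (slide_length * np.arange(n_frames)[:, None]
           + np.arange(fft_length)[None, :])
    return np.fft.rfft(x[idx] * window, axis=-1).T

def istft_overlap_add(spec, fft_length, slide_length, window):
    """Overlap-add the inverse FFT of each frame, then undo the window sum."""
    n_frames = spec.shape[-1]
    data_length = (n_frames - 1) * slide_length + fft_length
    out = np.zeros(data_length)
    norm = np.zeros(data_length)
    frames = np.fft.irfft(spec.T, n=fft_length, axis=-1)   # windowed frames
    for i in range(n_frames):
        start = i * slide_length
        out[start:start + fft_length] += frames[i]
        norm[start:start + fft_length] += window
    # Divide by the accumulated window weight wherever it is non-zero.
    nonzero = norm > 1e-12
    out[nonzero] /= norm[nonzero]
    return out.astype(np.float32)

fft_length, slide_length = 64, 16
window = np.ones(fft_length)                     # rectangular window
rng = np.random.default_rng(0)
x = rng.standard_normal(fft_length + 10 * slide_length)
spec = stft_sketch(x, fft_length, slide_length, window)
x_rec = istft_overlap_add(spec, fft_length, slide_length, window)
print(x_rec.shape, float(np.max(np.abs(x_rec - x))))
```

The reconstructed length is (time_length - 1) * slide_length + fft_length in this sketch, so trailing samples that did not fill a complete frame are not recovered.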

y_coords(samplate=32000)

Get the Y-axis (frequency) coordinates.

Parameters

samplate: int: Sampling rate of the incoming audio.

Returns

out: np.ndarray [shape=(fre,)]
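
The exact values returned are not specified here. For a linear-frequency STFT, the conventional choice is the centre frequency of each bin, k * samplate / fft_length, which NumPy exposes as `np.fft.rfftfreq`. The sketch below shows that convention as an assumption; the array returned by `y_coords` may differ, for example if it holds bin edges for plotting.

```python
import numpy as np

samplate = 32000
fft_length = 2 ** 12

# Assumed convention: centre frequency of bin k is k * samplate / fft_length.
freq_arr = np.fft.rfftfreq(fft_length, d=1.0 / samplate)
print(freq_arr.shape, freq_arr[1], freq_arr[-1])
```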

x_coords(data_length, samplate=32000)

Get the X-axis (time) coordinates.

Parameters

data_length: int: The length of the data to be calculated.
samplate: int: Sampling rate of the incoming audio.

Returns

out: np.ndarray [shape=(time,)]
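
As with the frequency axis, the exact values are not specified here. A common convention, assumed in the sketch below, is to stamp each frame with the time of its first sample, i * slide_length / samplate. The array returned by `x_coords` may use a different reference point or include an extra edge value for plotting.

```python
import numpy as np

samplate = 32000
fft_length, slide_length = 2 ** 12, 1024
data_length = 32000                      # one second of audio

# Frame count from the formula documented for cal_time_length.
time_length = (data_length - fft_length) // slide_length + 1
# Assumed convention: each frame is stamped with the time of its first sample.
time_arr = np.arange(time_length) * slide_length / samplate
print(time_arr.shape, time_arr[:3])
```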