STFT - Short Time Fourier Transform

class audioflux.STFT(radix2_exp=12, window_type=WindowType.RECT, slide_length=1024)

Short-time Fourier transform (STFT).

Parameters
radix2_exp: int

Base-2 exponent of the FFT length: fft_length = 2 ** radix2_exp.

window_type: WindowType

Window type for each frame.

See: type.WindowType

slide_length: int

Window sliding length (hop size), in samples: the distance between the starts of consecutive frames.
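The relationship between these parameters can be sketched in plain Python. The helper name stft_geometry is hypothetical and not part of audioflux; the bin count fft_length // 2 + 1 is taken from the output shape documented for stft.

```python
def stft_geometry(radix2_exp, slide_length):
    """Derive frame geometry from the constructor parameters.

    Hypothetical helper for illustration only; not an audioflux function.
    """
    fft_length = 2 ** radix2_exp          # samples per frame
    overlap = fft_length - slide_length   # samples shared by adjacent frames
    return {
        "fft_length": fft_length,
        "num_bins": fft_length // 2 + 1,  # rows of the STFT output
        "overlap": overlap,
        "overlap_ratio": overlap / fft_length,
    }


# Default constructor values: radix2_exp=12, slide_length=1024
geometry = stft_geometry(12, 1024)
print(geometry["fft_length"], geometry["num_bins"], geometry["overlap_ratio"])
# 4096 2049 0.75
```

With the defaults, each frame spans 4096 samples and consecutive frames overlap by 75%.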

See also

BFT

Examples

Read the 220 Hz sample audio

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)

Compute the STFT and the ISTFT

>>> stft_obj = af.STFT(radix2_exp=12, window_type=af.type.WindowType.RECT, slide_length=1024)
>>> spec_arr = stft_obj.stft(audio_arr)
>>> new_audio_arr = stft_obj.istft(spec_arr)

Show plot

>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> ax.set_title('STFT Spectrogram')
>>> img = af.display.fill_spec(np.abs(spec_arr), axes=ax,
...                            y_coords=stft_obj.y_coords(sr),
...                            x_coords=stft_obj.x_coords(audio_arr.shape[-1], sr),
...                            y_axis='log', x_axis='time')
>>> fig.colorbar(img, ax=ax)
>>> fig, axes = plt.subplots(nrows=2, sharex=True, sharey=True)
>>> ax = af.display.fill_wave(audio_arr, axes=axes[0])
>>> ax.set_title('Original')
>>> ax = af.display.fill_wave(new_audio_arr, axes=axes[1])
>>> ax.set_title('ISTFT Result')
[Figure: STFT spectrogram]
[Figure: original waveform (top) and ISTFT result (bottom)]

Methods

cal_data_length(time_length)

Calculate the audio data length from the number of frames.

cal_time_length(data_length)

Calculate the number of frames from the audio data length.

enable_padding([flag])

Whether to enable padding.

get_window_data_arr()

Get window data array.

istft(m_data_arr[, method_type])

Calculate ISTFT (Inverse Short-Time Fourier Transform) data.

set_padding([position_type, mode_type, ...])

Set padding parameters.

set_slide_length(slide_length)

Set the slide length.

stft(data_arr)

Calculate STFT (Short-Time Fourier Transform) data.

use_window_data_arr(data_arr)

Use a custom window data array.

x_coords(data_length[, samplate])

Get the X-axis (time) coordinates.

y_coords([samplate])

Get the Y-axis (frequency) coordinates.

set_slide_length(slide_length)

Set the slide length.

Parameters
slide_length: int

Window sliding length (hop size), in samples.

set_padding(position_type=PaddingPositionType.CENTER, mode_type=PaddingModeType.CONSTANT, value1=0.0, value2=0.0)

Set padding parameters.

Before calling this function, you must first call enable_padding with its flag set to True.

Parameters
position_type: PaddingPositionType

Padding position type.

See: type.PaddingPositionType

mode_type: PaddingModeType

Padding mode type.

See: type.PaddingModeType

value1: float

Padding value1.

When mode_type is CONSTANT, value1 is used as follows; the padded length is fft_length // 2 in every case:

  • position_type is CENTER: value1 is the left padding value.

  • position_type is LEFT: value1 is the left padding value.

  • position_type is RIGHT: value1 is the right padding value.

value1 is not used in other modes.

value2: float

Padding value2.

If mode_type is CONSTANT and position_type is CENTER, value2 is the right padding value, with length fft_length // 2. value2 is not used in other cases.
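The CONSTANT-mode rules for value1 and value2 can be sketched in plain Python. The function pad_constant and its string position names are hypothetical stand-ins for PaddingPositionType; the sketch only mirrors the rules stated above and makes no claim about how audioflux pads internally.

```python
def pad_constant(samples, fft_length, position, value1=0.0, value2=0.0):
    """Pad a sequence with constants, following the value1/value2 rules.

    Hypothetical illustration; "center", "left" and "right" stand in for
    the PaddingPositionType members.
    """
    pad = fft_length // 2  # padded length on each padded side
    if position == "center":
        # value1 fills the left side, value2 fills the right side
        return [value1] * pad + list(samples) + [value2] * pad
    if position == "left":
        return [value1] * pad + list(samples)
    if position == "right":
        return list(samples) + [value1] * pad
    raise ValueError(f"unknown position: {position!r}")


centered = pad_constant([1.0, 2.0, 3.0], fft_length=4, position="center",
                        value1=0.0, value2=9.0)
print(centered)  # [0.0, 0.0, 1.0, 2.0, 3.0, 9.0, 9.0]
```

Note that value2 only matters for the "center" case, which matches the parameter description.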

use_window_data_arr(data_arr)

Use a custom window data array.

By default, the window data array is generated from window_type.

Parameters
data_arr: np.ndarray [shape=(fft_length,)]

Window data array.
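A custom window must contain exactly fft_length samples. As a minimal sketch, the helper below (hann_window is a hypothetical name, not an audioflux function) builds a periodic Hann window as a plain Python list. Converting it to an np.ndarray of the dtype the library expects before passing it to use_window_data_arr is left to the caller, since the source does not state the required dtype.

```python
import math


def hann_window(radix2_exp):
    """Return a periodic Hann window with 2 ** radix2_exp samples.

    Hypothetical helper for illustration only.
    """
    fft_length = 2 ** radix2_exp
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * n / fft_length)
            for n in range(fft_length)]


window = hann_window(12)
print(len(window))  # 4096, matching shape=(fft_length,)
```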

get_window_data_arr()

Get window data array.

Returns
out: np.ndarray [shape=(fft_length,)]

Window data array.

enable_padding(flag=False)

Whether to enable padding.

Default is False.

Parameters
flag: bool

Whether to enable padding.

cal_time_length(data_length)

Calculate the number of frames (time_length) for audio data of a given length:

  • fft_length = 2 ** radix2_exp

  • time_length = (data_length - fft_length) // slide_length + 1
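The formula above can be checked with a few lines of plain Python. The function name frame_count is hypothetical, and returning 0 for input shorter than one frame is this sketch's own choice; the source does not say how such input is handled, nor how padding affects the count.

```python
def frame_count(data_length, radix2_exp, slide_length):
    """Number of frames according to the documented formula (no padding)."""
    fft_length = 2 ** radix2_exp
    if data_length < fft_length:
        return 0  # assumption: not enough samples for a single frame
    return (data_length - fft_length) // slide_length + 1


# One second of audio at 32000 Hz with the default parameters
print(frame_count(32000, radix2_exp=12, slide_length=1024))  # 28
```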

Parameters
data_length: int

Length of the audio data, in samples.

Returns
out: int

Number of frames.

cal_data_length(time_length)

Calculate the audio data length that corresponds to a given number of frames.
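The source gives no formula for this method. A minimal sketch, assuming it inverts the frame-count formula documented for cal_time_length, is shown below; both function names are hypothetical and the actual return value of cal_data_length may differ.

```python
def frame_count(data_length, radix2_exp, slide_length):
    """Frame count from the formula documented for cal_time_length."""
    fft_length = 2 ** radix2_exp
    return (data_length - fft_length) // slide_length + 1


def data_length_for_frames(time_length, radix2_exp, slide_length):
    """Smallest data length that yields time_length frames.

    Assumption: the inverse of the frame-count formula, without padding.
    """
    fft_length = 2 ** radix2_exp
    return (time_length - 1) * slide_length + fft_length


length = data_length_for_frames(28, radix2_exp=12, slide_length=1024)
print(length)                         # 31744
print(frame_count(length, 12, 1024))  # 28
```

Because of the floor division, several data lengths map to the same frame count, so the inverse can only recover the smallest one.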

Parameters
time_length: int

Number of frames.

Returns
out: int

Length of the audio data, in samples.

stft(data_arr)

Calculate STFT (Short-Time Fourier Transform) data.

Parameters
data_arr: np.ndarray [shape=(…, n)]

Input audio data

Returns
out: np.ndarray [shape=(…, fft_length // 2 + 1, time_length), dtype=np.complex64]

STFT data
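To make the output shape concrete, here is a deliberately naive STFT in plain Python using a rectangular window and a direct DFT. It is a sketch of the general technique, not audioflux's implementation: naive_stft is a hypothetical name, and scaling, padding and performance are ignored.

```python
import cmath
import math


def naive_stft(samples, radix2_exp, slide_length):
    """Return spec[bin][frame] with fft_length // 2 + 1 bins (rect window)."""
    fft_length = 2 ** radix2_exp
    num_bins = fft_length // 2 + 1
    time_length = (len(samples) - fft_length) // slide_length + 1
    spec = [[0j] * time_length for _ in range(num_bins)]
    for t in range(time_length):
        start = t * slide_length
        frame = samples[start:start + fft_length]
        for k in range(num_bins):
            # Direct DFT of one frame at bin k
            spec[k][t] = sum(
                x * cmath.exp(-2j * math.pi * k * n / fft_length)
                for n, x in enumerate(frame)
            )
    return spec


# A cosine that completes exactly 2 cycles per 8-sample frame
signal = [math.cos(2 * math.pi * 2 * n / 8) for n in range(16)]
spec = naive_stft(signal, radix2_exp=3, slide_length=4)
print(len(spec), len(spec[0]))  # 5 3
```

The result has fft_length // 2 + 1 = 5 rows and 3 frames, and the energy concentrates in bin 2, as expected for this test tone.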

istft(m_data_arr, method_type=0)

Calculate ISTFT (Inverse Short-Time Fourier Transform) data.

Parameters
m_data_arr: np.ndarray [shape=(…, fft_length // 2 + 1, time_length), dtype=np.complex64]

Input STFT data

method_type: int

  • 0: weight (default)

  • 1: overlap-add

Returns
out: np.ndarray [shape=(…, data_length), dtype=np.float32]

ISTFT data
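Any inverse STFT has to merge overlapping frames back into one signal. The sketch below illustrates that merging step on time-domain frames with a rectangular window, once as a plain overlap-add and once normalized by the summed window weights. It is a generic illustration with hypothetical function names; the source does not specify how the two method_type options are implemented, so no equivalence with them is claimed.

```python
def split_frames(samples, fft_length, slide_length):
    """Cut samples into overlapping frames (rectangular window)."""
    count = (len(samples) - fft_length) // slide_length + 1
    return [samples[t * slide_length:t * slide_length + fft_length]
            for t in range(count)]


def merge_frames(frames, fft_length, slide_length, normalize):
    """Overlap-add frames; optionally divide by the summed window weights."""
    length = (len(frames) - 1) * slide_length + fft_length
    total = [0.0] * length
    weight = [0.0] * length
    for t, frame in enumerate(frames):
        start = t * slide_length
        for n, value in enumerate(frame):
            total[start + n] += value
            weight[start + n] += 1.0  # rectangular window weight
    if normalize:
        return [v / w for v, w in zip(total, weight)]
    return total


signal = [float(n) for n in range(16)]
frames = split_frames(signal, fft_length=8, slide_length=4)
plain = merge_frames(frames, 8, 4, normalize=False)
weighted = merge_frames(frames, 8, 4, normalize=True)
print(weighted == signal)  # True
print(plain[5])            # 10.0, because two frames cover sample 5
```

With a rectangular window and 50% overlap, the plain sum doubles every sample covered by two frames, while the normalized version reproduces the input.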

y_coords(samplate=32000)

Get the Y-axis (frequency) coordinates.
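As a rough sketch of what frequency coordinates look like for an STFT, the snippet below computes linearly spaced bin frequencies. This assumes standard FFT bin spacing of samplate / fft_length; bin_frequencies is a hypothetical name, and the array returned by y_coords may differ (for example, in length or by including an extra edge value).

```python
def bin_frequencies(radix2_exp, samplate=32000):
    """Center frequency in Hz of each STFT bin (assumed linear spacing)."""
    fft_length = 2 ** radix2_exp
    return [k * samplate / fft_length for k in range(fft_length // 2 + 1)]


freqs = bin_frequencies(12, samplate=32000)
print(freqs[1], freqs[-1])  # 7.8125 16000.0

# Bin closest to the 220 Hz tone used in the example above
nearest = min(range(len(freqs)), key=lambda k: abs(freqs[k] - 220.0))
print(nearest, freqs[nearest])  # 28 218.75
```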

Parameters
samplate: int

Sampling rate of the incoming audio.

Returns
out: np.ndarray [shape=(fre,)]

x_coords(data_length, samplate=32000)

Get the X-axis (time) coordinates.

Parameters
data_length: int

Length of the audio data, in samples.

samplate: int

Sampling rate of the incoming audio.

Returns
out: np.ndarray [shape=(time,)]
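As a rough sketch of time coordinates, the snippet below places each frame at its start time, t * slide_length / samplate, using the frame count formula documented for cal_time_length. The name frame_times is hypothetical, and the values returned by x_coords may differ (for example, with padding enabled or if an extra edge value is included).

```python
def frame_times(data_length, radix2_exp, slide_length, samplate=32000):
    """Start time in seconds of each frame (assumed convention)."""
    fft_length = 2 ** radix2_exp
    time_length = (data_length - fft_length) // slide_length + 1
    return [t * slide_length / samplate for t in range(max(time_length, 0))]


times = frame_times(32000, radix2_exp=12, slide_length=1024, samplate=32000)
print(len(times))  # 28
```

Here consecutive frames are 1024 / 32000 = 0.032 seconds apart.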