Spectral

class audioflux.Spectral(num, fre_band_arr)

Spectral feature class; supports all spectrum types.

Parameters
num: int

Number of frequency bins. Must match the num parameter of the transform that produced the spectrogram matrix.

fre_band_arr: np.ndarray [shape=(n_fre,)]

The array of frequency bands. Obtained by calling the get_fre_band_arr() method of the transformation.

Methods

band_width(m_data_arr[, p])

Compute the spectral band_width feature.

broadband(m_data_arr[, threshold])

Compute the spectral broadband feature.

cd(m_data_arr, m_phase_arr)

Compute the spectral cd feature.

centroid(m_data_arr)

Compute the spectral centroid feature.

crest(m_data_arr)

Compute the spectral crest feature.

decrease(m_data_arr)

Compute the spectral decrease feature.

eef(m_data_arr[, is_norm])

Compute the spectral eef feature.

eer(m_data_arr[, is_norm, gamma])

Compute the spectral eer feature.

energy(m_data_arr[, is_log, gamma])

Compute the spectral energy feature.

entropy(m_data_arr[, is_norm])

Compute the spectral entropy feature.

flatness(m_data_arr)

Compute the spectral flatness feature.

flux(m_data_arr[, step, p, is_positive, ...])

Compute the spectral flux feature.

hfc(m_data_arr)

Compute the spectral hfc feature.

kurtosis(m_data_arr)

Compute the spectral kurtosis feature.

max(m_data_arr)

Compute the spectral max feature.

mean(m_data_arr)

Compute the spectral mean feature.

mkl(m_data_arr[, tp])

Compute the spectral mkl feature.

novelty(m_data_arr[, step, threshold, ...])

Compute the spectral novelty feature.

nwpd(m_data_arr, m_phase_arr)

Compute the spectral nwpd feature.

pd(m_data_arr, m_phase_arr)

Compute the spectral pd feature.

rcd(m_data_arr, m_phase_arr)

Compute the spectral rcd feature.

rms(m_data_arr)

Compute the spectral rms feature.

rolloff(m_data_arr[, threshold])

Compute the spectral rolloff feature.

sd(m_data_arr[, step, is_positive])

Compute the spectral sd feature.

set_edge(start, end)

Set the frequency band bin boundaries (start/end) used in feature computation.

set_edge_arr(index_arr)

Set the frequency bin index array used in feature computation.

set_time_length(time_length)

Set the number of time frames.

sf(m_data_arr[, step, is_positive])

Compute the spectral sf feature.

skewness(m_data_arr)

Compute the spectral skewness feature.

slope(m_data_arr)

Compute the spectral slope feature.

spread(m_data_arr)

Compute the spectral spread feature.

var(m_data_arr)

Compute the spectral var feature.

wpd(m_data_arr, m_phase_arr)

Compute the spectral wpd feature.

set_time_length(time_length)

Set the number of time frames.

Parameters
time_length: int

Number of time frames (the time-axis length of the spectrogram matrix).

set_edge(start, end)

Set the frequency band bin boundaries (start/end) used in feature computation.

Parameters
start: int

Start frequency bin index, in the range 0 ~ end.

end: int

End frequency bin index, in the range start ~ num-1.
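
For example, to restrict subsequent feature computations to a sub-range of bins (a usage sketch; assumes spectral_obj is a Spectral instance created as in the examples below, with a bin range valid for the chosen filter bank):

>>> spectral_obj.set_edge(0, 1024)  # use only bins 0..1024 as [b1, b2]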

set_edge_arr(index_arr)

Set the frequency bin index array used in feature computation.

Parameters
index_arr: np.ndarray [shape=(n,), dtype=np.int32]

Frequency bin index array.

flatness(m_data_arr)

Compute the spectral flatness feature.

\(\qquad flatness=\frac{\left ( \prod_{k=b_1}^{b_2} s_k \right)^{ \frac{1}{b_2-b_1} } } {\frac{1}{b_2-b_1} \sum_{ k=b_1 }^{b_2} s_k}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
flatness: np.ndarray [shape=(…, time)]

flatness value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract flatness feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> flatness_arr = spectral_obj.flatness(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(flatness_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, flatness_arr, axes=ax[1], label='flatness')
../_images/spectral-1.png
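
As a rough cross-check of the formula above, a direct NumPy approximation over the same spec_arr might look like this (a sketch only: it assumes the default full-band edges and a magnitude spectrogram, and adds a small eps to avoid log(0), so values can differ slightly from flatness_arr):

>>> eps = 1e-16
>>> geo_mean = np.exp(np.mean(np.log(spec_arr + eps), axis=0))
>>> ari_mean = np.mean(spec_arr, axis=0)
>>> flatness_ref = geo_mean / (ari_mean + eps)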
flux(m_data_arr, step=1, p=2, is_positive=False, is_exp=False, tp=0)

Compute the spectral flux feature.

\(\qquad flux(t)=\left( \sum_{k=b_1}^{b_2} |s_k(t)-s_k(t-1) |^{p} \right)^{\frac{1}{p}}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • In general, only bins with \(s_k(t) \geq s_k(t-1)\) participate in the calculation

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

step: int

Step along the time axis between compared frames, e.g. 1/2/3/…

p: int, 1 or 2

Norm order: 1 uses absolute differences (abs), 2 uses squared differences (pow).

is_positive: bool

Whether to set negative numbers to 0

is_exp: bool

Whether to apply exponential scaling to the result.

tp: int, 0 or 1

0: sum; 1: mean

Returns
flux: np.ndarray [shape=(…, time)]

flux value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract flux feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> flux_arr = spectral_obj.flux(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(flux_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, flux_arr, axes=ax[1], label='flux')
../_images/spectral-2.png
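
The default flux (step=1, p=2) can be approximated directly in NumPy on the same spec_arr (a sketch; full-band edges assumed, and handling of the first frame may differ from flux_arr):

>>> diff_arr = np.diff(spec_arr, axis=-1)
>>> flux_ref = np.sqrt(np.sum(diff_arr ** 2, axis=0))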
rolloff(m_data_arr, threshold=0.95)

Compute the spectral rolloff feature.

\(\qquad \sum_{k=b_1}^{i}|s_k| \geq \eta \sum_{k=b_1}^{b_2}s_k\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • \(\eta \in (0,1)\), typically 0.95 or 0.85; the smallest \(i\) that satisfies the condition gives the rolloff frequency \(f_i\)

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

threshold: float, [0,1]

rolloff threshold. Generally take 0.95 or 0.85.

Returns
rolloff: np.ndarray [shape=(…, time)]

rolloff frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rolloff feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rolloff_arr = spectral_obj.rolloff(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rolloff_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rolloff_arr, axes=ax[1], label='rolloff')
../_images/spectral-3.png
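
A rough NumPy approximation of the rolloff with threshold=0.95 over the same spec_arr (a sketch; full-band edges assumed):

>>> fre_band_arr = bft_obj.get_fre_band_arr()
>>> cum_arr = np.cumsum(spec_arr, axis=0)
>>> idx_arr = np.argmax(cum_arr >= 0.95 * cum_arr[-1], axis=0)  # first bin meeting the condition
>>> rolloff_ref = fre_band_arr[idx_arr]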
centroid(m_data_arr)

Compute the spectral centroid feature.

\(\qquad \mu_1=\frac{\sum_{ k=b_1 }^{b_2} f_ks_k } {\sum_{k=b_1}^{b_2} s_k }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
centroid: np.ndarray [shape=(…, time)]

centroid frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract centroid feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> centroid_arr = spectral_obj.centroid(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(centroid_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, centroid_arr, axes=ax[1], label='centroid')
../_images/spectral-4.png
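
For intuition, the centroid formula can be evaluated directly in NumPy on the same spec_arr (a sketch; full-band edges assumed):

>>> fre_band_arr = bft_obj.get_fre_band_arr()
>>> centroid_ref = np.sum(fre_band_arr[:, None] * spec_arr, axis=0) / np.sum(spec_arr, axis=0)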
spread(m_data_arr)

Compute the spectral spread feature.

\(\qquad \mu_2=\sqrt{\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^2 s_k } {\sum_{k=b_1}^{b_2} s_k } }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • \(\mu_1\): Spectral.centroid

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
spread: np.ndarray [shape=(…, time)]

spread frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract spread feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> spread_arr = spectral_obj.spread(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(spread_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, spread_arr, axes=ax[1], label='spread')
../_images/spectral-5.png
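
Likewise, the spread formula can be evaluated directly in NumPy on the same spec_arr (a sketch; full-band edges assumed):

>>> fre_band_arr = bft_obj.get_fre_band_arr()
>>> w_arr = spec_arr / np.sum(spec_arr, axis=0)
>>> mu1 = np.sum(fre_band_arr[:, None] * w_arr, axis=0)
>>> spread_ref = np.sqrt(np.sum((fre_band_arr[:, None] - mu1) ** 2 * w_arr, axis=0))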
skewness(m_data_arr)

Compute the spectral skewness feature.

\(\qquad \mu_3=\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^3 s_k } {(\mu_2)^3 \sum_{k=b_1}^{b_2} s_k }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • \(\mu_1\): Spectral.centroid

  • \(\mu_2\): Spectral.spread

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
skewness: np.ndarray [shape=(…, time)]

skewness value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract skewness feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> skewness_arr = spectral_obj.skewness(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(skewness_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, skewness_arr, axes=ax[1], label='skewness')
../_images/spectral-6.png
kurtosis(m_data_arr)

Compute the spectral kurtosis feature.

\(\qquad \mu_4=\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^4 s_k } {(\mu_2)^4 \sum_{k=b_1}^{b_2} s_k }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • \(\mu_1\): Spectral.centroid

  • \(\mu_2\): Spectral.spread

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
kurtosis: np.ndarray [shape=(…, time)]

kurtosis value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract kurtosis feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> kurtosis_arr = spectral_obj.kurtosis(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(kurtosis_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, kurtosis_arr, axes=ax[1], label='kurtosis')
../_images/spectral-7.png
entropy(m_data_arr, is_norm=False)

Compute the spectral entropy feature.

Set: \(p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy1= \frac{-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)} {\log(b_2-b_1)}\)

Or

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

is_norm: bool

Whether to normalize; if True, the normalized form entropy1 above is used, otherwise entropy2.

Returns
entropy: np.ndarray [shape=(…, time)]

entropy value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract entropy feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> entropy_arr = spectral_obj.entropy(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(entropy_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, entropy_arr, axes=ax[1], label='entropy')
../_images/spectral-8.png
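
The default (is_norm=False) entropy, i.e. entropy2 above, can be approximated directly in NumPy on the same spec_arr (a sketch; full-band edges assumed, with a small eps to avoid log(0)):

>>> eps = 1e-16
>>> p_arr = spec_arr / (np.sum(spec_arr, axis=0) + eps)
>>> entropy_ref = -np.sum(p_arr * np.log(p_arr + eps), axis=0)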
crest(m_data_arr)

Compute the spectral crest feature.

\(\qquad crest =\frac{\max_{k\in[b_1,b_2]}(s_k) } {\frac{1}{b_2-b_1} \sum_{ k=b_1 }^{b_2} s_k}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
crest: np.ndarray [shape=(…, time)]

crest value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract crest feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> crest_arr = spectral_obj.crest(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(crest_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, crest_arr, axes=ax[1], label='crest')
../_images/spectral-9.png
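
The crest formula reduces to a per-frame max-over-mean ratio, e.g. in NumPy on the same spec_arr (a sketch; full-band edges assumed):

>>> crest_ref = np.max(spec_arr, axis=0) / np.mean(spec_arr, axis=0)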
slope(m_data_arr)

Compute the spectral slope feature.

\(\qquad slope=\frac{ \sum_{k=b_1}^{b_2}(f_k-\mu_f)(s_k-\mu_s) } { \sum_{k=b_1}^{b_2}(f_k-\mu_f)^2 }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • \(\mu_f\): average frequency value

  • \(\mu_s\): average spectral value

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
slope: np.ndarray [shape=(…, time)]

slope value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract slope feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> slope_arr = spectral_obj.slope(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(slope_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, slope_arr, axes=ax[1], label='slope')
../_images/spectral-10.png
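
The slope is a per-frame least-squares slope of \(s_k\) against \(f_k\); a rough NumPy equivalent on the same spec_arr (a sketch; full-band edges assumed):

>>> fre_band_arr = bft_obj.get_fre_band_arr()
>>> f_dev = fre_band_arr - np.mean(fre_band_arr)
>>> s_dev = spec_arr - np.mean(spec_arr, axis=0)
>>> slope_ref = np.sum(f_dev[:, None] * s_dev, axis=0) / np.sum(f_dev ** 2)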
decrease(m_data_arr)

Compute the spectral decrease feature.

\(\qquad decrease=\frac { \sum_{k=b_1+1}^{b_2} \frac {s_k-s_{b_1}}{k-1} } { \sum_{k=b_1+1}^{b_2} s_k }\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
decrease: np.ndarray [shape=(…, time)]

decrease value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract decrease feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> decrease_arr = spectral_obj.decrease(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(decrease_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, decrease_arr, axes=ax[1], label='decrease')
../_images/spectral-11.png
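
A rough NumPy approximation of the decrease formula on the same spec_arr (a sketch; full-band edges assumed, with the k-1 denominator read as the bin offset from b1, so values may differ slightly from decrease_arr):

>>> k_arr = np.arange(1, spec_arr.shape[0])
>>> num_arr = np.sum((spec_arr[1:] - spec_arr[:1]) / k_arr[:, None], axis=0)
>>> decrease_ref = num_arr / np.sum(spec_arr[1:], axis=0)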
band_width(m_data_arr, p=2)

Compute the spectral band_width feature.

\(\qquad bandwidth=\left(\sum_{k=b_1}^{b_2} s_k(f_k-centroid)^p \right)^{\frac{1}{p}}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(f_k\) is in Hz

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • centroid: Spectral.centroid

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

p: int, 1 or 2

Norm order: 1 uses absolute differences (abs), 2 uses squared differences (pow).

Returns
band_width: np.ndarray [shape=(…, time)]

band_width value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract band_width feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> band_width_arr = spectral_obj.band_width(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(band_width_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, band_width_arr, axes=ax[1], label='band_width')
../_images/spectral-12.png
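
With the default p=2, a rough NumPy approximation of the bandwidth on the same spec_arr (a sketch; full-band edges assumed, and any spectrum normalization applied by the library is omitted, so the scale may differ from band_width_arr):

>>> fre_band_arr = bft_obj.get_fre_band_arr()
>>> centroid_arr = spectral_obj.centroid(spec_arr)
>>> band_width_ref = np.sqrt(np.sum(spec_arr * (fre_band_arr[:, None] - centroid_arr) ** 2, axis=0))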
rms(m_data_arr)

Compute the spectral rms feature.

\(\qquad rms=\sqrt{ \frac{1}{N} \sum_{n=1}^N x^2[n] }=\sqrt {\frac{1}{N^2}\sum_{m=1}^N |X[m]|^2 }\)

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
rms: np.ndarray [shape=(…, time)]

rms value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rms feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rms_arr = spectral_obj.rms(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rms_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rms_arr, axes=ax[1], label='rms')
../_images/spectral-13.png
energy(m_data_arr, is_log=False, gamma=10.0)

Compute the spectral energy feature.

\(\qquad energy=\sum_{n=1}^N x^2[n] =\frac{1}{N}\sum_{m=1}^N |X[m]|^2\)

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

is_log: bool

Whether to apply log compression.

gamma: float

Gamma value for log compression.

Returns
energy: np.ndarray [shape=(…, time)]

energy value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract energy feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> energy_arr = spectral_obj.energy(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(energy_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, energy_arr, axes=ax[1], label='energy')
../_images/spectral-14.png
hfc(m_data_arr)

Compute the spectral hfc feature.

\(\qquad hfc(t)=\frac{\sum_{k=b_1}^{b_2} s_k(t)k }{b_2-b_1+1}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
hfc: np.ndarray [shape=(…, time)]

hfc value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract hfc feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> hfc_arr = spectral_obj.hfc(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(hfc_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, hfc_arr, axes=ax[1], label='hfc')
../_images/spectral-15.png
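
A rough NumPy approximation of the hfc formula on the same spec_arr (a sketch; full-band edges assumed, with k taken as the 0-based bin index, which may differ from the library's indexing by a constant):

>>> k_arr = np.arange(spec_arr.shape[0])
>>> hfc_ref = np.sum(spec_arr * k_arr[:, None], axis=0) / spec_arr.shape[0]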
sd(m_data_arr, step=1, is_positive=False)

Compute the spectral sd feature.

\(\qquad sd(t)=flux(t)\)

Only bins satisfying \(s_k(t) \ge s_k(t-1)\) contribute; \(p=2\), and the final \(1/p\) root is not applied.

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • flux: Spectral.flux

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

step: int

Step along the time axis between compared frames, e.g. 1/2/3/…

is_positive: bool

Whether to set negative numbers to 0

Returns
sd: np.ndarray [shape=(…, time)]

sd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract sd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> sd_arr = spectral_obj.sd(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(sd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, sd_arr, axes=ax[1], label='sd')
../_images/spectral-16.png
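
sd keeps only the increasing bins, uses p=2 and omits the final root; a rough NumPy approximation on the same spec_arr (a sketch; full-band edges assumed, first-frame handling may differ from sd_arr):

>>> diff_arr = np.diff(spec_arr, axis=-1)
>>> sd_ref = np.sum(np.maximum(diff_arr, 0.0) ** 2, axis=0)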
sf(m_data_arr, step=1, is_positive=False)

Compute the spectral sf feature.

\(\qquad sf(t)=flux(t)\)

Only bins satisfying \(s_k(t) \ge s_k(t-1)\) contribute, with \(p=1\).

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

  • flux: Spectral.flux

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

step: int

Step along the time axis between compared frames, e.g. 1/2/3/…

is_positive: bool

Whether to set negative numbers to 0

Returns
sf: np.ndarray [shape=(…, time)]

sf value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract sf feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> sf_arr = spectral_obj.sf(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(sf_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, sf_arr, axes=ax[1], label='sf')
../_images/spectral-17.png
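
sf keeps only the increasing bins with p=1; a rough NumPy approximation on the same spec_arr (a sketch; full-band edges assumed, first-frame handling may differ from sf_arr):

>>> diff_arr = np.diff(spec_arr, axis=-1)
>>> sf_ref = np.sum(np.maximum(diff_arr, 0.0), axis=0)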
mkl(m_data_arr, tp=0)

Compute the spectral mkl feature.

\(\qquad mkl(t)=\sum_{k=b_1}^{b_2} \log\left(1+ \cfrac {s_k(t)}{s_k(t-1)} \right)\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

tp: int, 0 or 1

0: sum; 1: mean

Returns
mkl: np.ndarray [shape=(…, time)]

mkl value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract mkl feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> mkl_arr = spectral_obj.mkl(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(mkl_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, mkl_arr, axes=ax[1], label='mkl')
../_images/spectral-18.png
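
With the default tp=0 (sum), a rough NumPy approximation of the mkl formula on the same spec_arr (a sketch; a small eps avoids division by zero, and first-frame handling may differ from mkl_arr):

>>> eps = 1e-16
>>> ratio_arr = spec_arr[:, 1:] / (spec_arr[:, :-1] + eps)
>>> mkl_ref = np.sum(np.log(1.0 + ratio_arr), axis=0)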
pd(m_data_arr, m_phase_arr)

Compute the spectral pd feature.

\(\qquad \psi_k(t)\): the phase of frequency bin \(k\) at time \(t\).

\(\qquad \psi_k^{\prime}(t)=\psi_k(t)-\psi_k(t-1)\)

\(\qquad \psi_k^{\prime\prime}(t)=\psi_k^{\prime}(t)-\psi_k^{\prime}(t-1) = \psi_k(t)-2\psi_k(t-1)+\psi_k(t-2)\)

\(\qquad pd(t)= \frac {\sum_{k=b_1}^{b_2} \| \psi_k^{\prime\prime}(t) \|} {b_2-b_1+1}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

m_phase_arr: np.ndarray [shape=(…, fre, time)]

Phase data.

Returns
pd: np.ndarray [shape=(…, time)]

pd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract pd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> pd_arr = spectral_obj.pd(spec_arr, phase_arr)
wpd(m_data_arr, m_phase_arr)

Compute the spectral wpd feature.

\(\qquad \psi_k(t)\): the phase of frequency bin \(k\) at time \(t\).

\(\qquad \psi_k^{\prime}(t)=\psi_k(t)-\psi_k(t-1)\)

\(\qquad \psi_k^{\prime\prime}(t)=\psi_k^{\prime}(t)-\psi_k^{\prime}(t-1) = \psi_k(t)-2\psi_k(t-1)+\psi_k(t-2)\)

\(\qquad wpd(t)= \frac {\sum_{k=b_1}^{b_2} \| \psi_k^{\prime\prime}(t) \|s_k(t)}{b_2-b_1+1}\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

m_phase_arr: np.ndarray [shape=(…, fre, time)]

Phase data.

Returns
wpd: np.ndarray [shape=(…, time)]

wpd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract wpd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> wpd_arr = spectral_obj.wpd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(wpd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, wpd_arr, axes=ax[1], label='wpd')
../_images/spectral-19.png
nwpd(m_data_arr, m_phase_arr)

Compute the spectral nwpd feature.

\(\qquad nwpd(t)= \frac {wpd} {\mu_s}\)

  • wpd: Spectral.wpd

  • \(\mu_s\): the mean of \(s_k(t)\)

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

m_phase_arr: np.ndarray [shape=(…, fre, time)]

Phase data.

Returns
nwpd: np.ndarray [shape=(…, time)]

nwpd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract nwpd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> nwpd_arr = spectral_obj.nwpd(spec_arr, phase_arr)
cd(m_data_arr, m_phase_arr)

Compute the spectral cd feature.

\(\qquad \psi_k(t)\): the phase of frequency bin \(k\) at time \(t\).

\(\qquad \alpha_k(t)=s_k(t) e^{j(2\psi_k(t)-\psi_k(t-1))}\)

\(\qquad \beta_k(t)=s_k(t) e^{j\psi_k(t)}\)

\(\qquad cd(t)=\sum_{k=b_1}^{b_2} \| \beta_k(t)-\alpha_k(t-1) \|\)

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

m_phase_arr: np.ndarray [shape=(…, fre, time)]

Phase data.

Returns
cd: np.ndarray [shape=(…, time)]

cd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract cd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> cd_arr = spectral_obj.cd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(cd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, cd_arr, axes=ax[1], label='cd')
../_images/spectral-20.png
rcd(m_data_arr, m_phase_arr)

Compute the spectral rcd feature.

\(\qquad rcd(t)=cd(t)\)

Only bins satisfying \(s_k(t) \geq s_k(t-1)\) participate in the sum.

  • cd: Spectral.cd

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

m_phase_arr: np.ndarray [shape=(…, fre, time)]

Phase data.

Returns
rcd: np.ndarray [shape=(…, time)]

rcd value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rcd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rcd_arr = spectral_obj.rcd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rcd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rcd_arr, axes=ax[1], label='rcd')
../_images/spectral-21.png
broadband(m_data_arr, threshold=0)

Compute the spectral broadband feature.

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

threshold: float, [0,1]

broadband threshold

Returns
broadband: np.ndarray [shape=(…, time)]

broadband value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract broadband feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> broadband_arr = spectral_obj.broadband(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(broadband_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, broadband_arr, axes=ax[1], label='broadband')
../_images/spectral-22.png
novelty(m_data_arr, step=1, threshold=0.0, method_type=SpectralNoveltyMethodType.SUB, data_type=SpectralNoveltyDataType.VALUE)

Compute the spectral novelty feature.

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

step: int

Step along the time axis between compared frames, e.g. 1/2/3/…

threshold: float [0,1]

Novelty threshold.

method_type: SpectralNoveltyMethodType

Novelty method type.

data_type: SpectralNoveltyDataType

Novelty data type.

Returns
novelty: np.ndarray [shape=(…, time)]

Novelty value per time step.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract novelty feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> novelty_arr = spectral_obj.novelty(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(novelty_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, novelty_arr, axes=ax[1], label='novelty')
../_images/spectral-23.png
eef(m_data_arr, is_norm=False)

Compute the spectral eef feature.

\(\qquad p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

\(\qquad eef=\sqrt{ 1+| energy\times entropy2| }\)

  • energy: Spectral.energy

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

is_norm: bool

Whether to normalize.

Returns
eef: np.ndarray [shape=(…, time)]

eef value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract eef feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> eef_arr = spectral_obj.eef(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(eef_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, eef_arr, axes=ax[1], label='eef')
../_images/spectral-24.png
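
The eef formula can also be assembled from the library's own energy and entropy outputs (a sketch; assumes the same defaults as the example above):

>>> eef_ref = np.sqrt(1 + np.abs(spectral_obj.energy(spec_arr) * spectral_obj.entropy(spec_arr)))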
eer(m_data_arr, is_norm=False, gamma=1.0)

Compute the spectral eer feature.

\(\qquad le=\log_{10}(1+\gamma \times energy), \gamma \in (0,\infty)\), represents log compression of data

\(\qquad p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

\(\qquad eer=\sqrt{ 1+\left| \cfrac{le}{entropy2}\right| }\)

  • energy: Spectral.energy

  • \(b_1\) and \(b_2\): the frequency band bin boundaries

  • \(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

is_norm: bool

Whether to normalize.

gamma: float

Log-compression factor; commonly 1.0, 10.0, or 20.0. For song material, 0.5 is typical.

Returns
eer: np.ndarray [shape=(…, time)]

eer value for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract eer feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> eer_arr = spectral_obj.eer(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(eer_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, eer_arr, axes=ax[1], label='eer')
../_images/spectral-25.png
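
With the default gamma=1.0, the eer formula can be assembled from the library's own energy and entropy outputs (a sketch; assumes the same defaults as the example above):

>>> le_arr = np.log10(1.0 + 1.0 * spectral_obj.energy(spec_arr))
>>> eer_ref = np.sqrt(1 + np.abs(le_arr / spectral_obj.entropy(spec_arr)))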
max(m_data_arr)

Compute the spectral max feature.

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
val_arr: np.ndarray [shape=(…, time)]

max value for each time

fre_arr: np.ndarray [shape=(…, time)]

max frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract max feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> max_val_arr, max_fre_arr = spectral_obj.max(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(max_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, max_val_arr, axes=ax[1], label='max_val')
../_images/spectral-26.png
mean(m_data_arr)

Compute the spectral mean feature.

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
val_arr: np.ndarray [shape=(…, time)]

mean value for each time

fre_arr: np.ndarray [shape=(…, time)]

mean frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract mean feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> mean_val_arr, mean_fre_arr = spectral_obj.mean(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(mean_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, mean_val_arr, axes=ax[1], label='mean_val')
../_images/spectral-27.png
var(m_data_arr)

Compute the spectral var feature.

Parameters
m_data_arr: np.ndarray [shape=(…, fre, time)]

Spectrogram data.

Returns
val_arr: np.ndarray [shape=(…, time)]

var value for each time

fre_arr: np.ndarray [shape=(…, time)]

var frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract var feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> var_val_arr, var_fre_arr = spectral_obj.var(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(var_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, var_val_arr, axes=ax[1], label='var_val')
../_images/spectral-28.png