Spectral

class audioflux.Spectral(num, fre_band_arr)

Spectral feature extractor; supports all spectrum types.

Parameters

num: int: Number of frequency bins. It must equal the num parameter of the transformation, i.e. the frequency dimension of the spectrogram matrix.
fre_band_arr: np.ndarray [shape=(n_fre,)]: Array of frequency bands, obtained by calling the get_fre_band_arr() method of the transformation.

Methods

`band_width`(m_data_arr[, p])	Compute the spectral band_width feature.
`broadband`(m_data_arr[, threshold])	Compute the spectral broadband feature.
`cd`(m_data_arr, m_phase_arr)	Compute the spectral cd feature.
`centroid`(m_data_arr)	Compute the spectral centroid feature.
`crest`(m_data_arr)	Compute the spectral crest feature.
`decrease`(m_data_arr)	Compute the spectral decrease feature.
`eef`(m_data_arr[, is_norm])	Compute the spectral eef feature.
`eer`(m_data_arr[, is_norm, gamma])	Compute the spectral eer feature.
`energy`(m_data_arr[, is_log, gamma])	Compute the spectral energy feature.
`entropy`(m_data_arr[, is_norm])	Compute the spectral entropy feature.
`flatness`(m_data_arr)	Compute the spectral flatness feature.
`flux`(m_data_arr[, step, p, is_positive, ...])	Compute the spectral flux feature.
`hfc`(m_data_arr)	Compute the spectral hfc feature.
`kurtosis`(m_data_arr)	Compute the spectral kurtosis feature.
`max`(m_data_arr)	Compute the spectral max feature.
`mean`(m_data_arr)	Compute the spectral mean feature.
`mkl`(m_data_arr[, tp])	Compute the spectral mkl feature.
`novelty`(m_data_arr[, step, threshold, ...])	Compute the spectral novelty feature.
`nwpd`(m_data_arr, m_phase_arr)	Compute the spectral nwpd feature.
`pd`(m_data_arr, m_phase_arr)	Compute the spectral pd feature.
`rcd`(m_data_arr, m_phase_arr)	Compute the spectral rcd feature.
`rms`(m_data_arr)	Compute the spectral rms feature.
`rolloff`(m_data_arr[, threshold])	Compute the spectral rolloff feature.
`sd`(m_data_arr[, step, is_positive])	Compute the spectral sd feature.
`set_edge`(start, end)	Set edge
`set_edge_arr`(index_arr)	Set edge array
`set_time_length`(time_length)	Set time length
`sf`(m_data_arr[, step, is_positive])	Compute the spectral sf feature.
`skewness`(m_data_arr)	Compute the spectral skewness feature.
`slope`(m_data_arr)	Compute the spectral slope feature.
`spread`(m_data_arr)	Compute the spectral spread feature.
`var`(m_data_arr)	Compute the spectral var feature.
`wpd`(m_data_arr, m_phase_arr)	Compute the spectral wpd feature.

set_time_length(time_length)

Set time length

Parameters

time_length: int

set_edge(start, end)

Set edge

Parameters

start: int: Start frequency-bin index, in the range 0 ~ end
end: int: End frequency-bin index, in the range start ~ num-1

set_edge_arr(index_arr)

Set edge array

Parameters

index_arr: np.ndarray [shape=(n,), dtype=np.int32]: Array of frequency-bin indices.

flatness(m_data_arr)

Compute the spectral flatness feature.

\(\qquad flatness=\frac{\left ( \prod_{k=b_1}^{b_2} s_k \right)^{ \frac{1}{b_2-b_1} } } {\frac{1}{b_2-b_1} \sum_{ k=b_1 }^{b_2} s_k}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

flatness: np.ndarray [shape=(…, time)]: Flatness value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract flatness feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> flatness_arr = spectral_obj.flatness(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(flatness_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, flatness_arr, axes=ax[1], label='flatness')
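As a cross-check of the formula above, here is a minimal NumPy sketch. It is not the library's implementation. It assumes the band covers every row of the array and normalises by the bin count (the conventional ratio of geometric mean to arithmetic mean), so values may differ slightly from Spectral.flatness. The names flatness_sketch and eps are hypothetical.

```python
import numpy as np

def flatness_sketch(spec, eps=1e-12):
    """Geometric mean over arithmetic mean along the frequency axis (-2)."""
    spec = np.asarray(spec, dtype=float)
    # Geometric mean in the log domain, which avoids overflow in the product.
    geo_mean = np.exp(np.mean(np.log(spec + eps), axis=-2))
    arith_mean = np.mean(spec, axis=-2)
    return geo_mean / (arith_mean + eps)

# Two frames (columns): a flat, noise-like spectrum and a single-peak spectrum.
spec = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 4.0],
                 [1.0, 0.0]])
flat = flatness_sketch(spec)
print(flat)  # close to 1 for the flat frame, close to 0 for the peaked frame
```

Flatness near 1 suggests a noise-like frame, and flatness near 0 suggests a tonal one.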

flux(m_data_arr, step=1, p=2, is_positive=False, is_exp=False, tp=0)

Compute the spectral flux feature.

\(\qquad flux(t)=\left( \sum_{k=b_1}^{b_2} |s_k(t)-s_k(t-1) |^{p} \right)^{\frac{1}{p}}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
In general, only the terms with \(s_k(t) \geq s_k(t-1)\) participate in the calculation.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
step: int: Step along the time axis between the compared frames, e.g. 1, 2, 3, …
p: int, 1 or 2: Norm type: 1 for absolute value, 2 for power.
is_positive: bool: Whether to set negative numbers to 0
is_exp: bool: Whether to apply exp
tp: int, 0 or 1: Aggregation type: 0 for sum, 1 for mean.

Returns

flux: np.ndarray [shape=(…, time)]: Flux value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract flux feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> flux_arr = spectral_obj.flux(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(flux_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, flux_arr, axes=ax[1], label='flux')
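The flux formula can also be written directly in NumPy. The sketch below is illustrative only: flux_sketch is a hypothetical helper, it covers only the step, p and is_positive options, and it returns time - step values, whereas Spectral.flux is documented to return one value per frame.

```python
import numpy as np

def flux_sketch(spec, step=1, p=2, is_positive=False):
    """p-norm of the frame-to-frame spectral difference."""
    spec = np.asarray(spec, dtype=float)
    # Difference between frame t and frame t - step, per frequency bin.
    diff = spec[..., :, step:] - spec[..., :, :-step]
    if is_positive:
        # Keep only bins whose value increased.
        diff = np.maximum(diff, 0.0)
    return np.sum(np.abs(diff) ** p, axis=-2) ** (1.0 / p)

# Two bins (rows), three frames (columns).
spec = np.array([[0.0, 3.0, 3.0],
                 [0.0, 4.0, 0.0]])
flux_all = flux_sketch(spec)                    # [5.0, 4.0]
flux_pos = flux_sketch(spec, is_positive=True)  # [5.0, 0.0]
```

The second frame pair only loses energy, so its flux drops to 0 once negative differences are discarded.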

rolloff(m_data_arr, threshold=0.95)

Compute the spectral rolloff feature.

\(\qquad \sum_{k=b_1}^{i}|s_k| \geq \eta \sum_{k=b_1}^{b_2}s_k\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
\(\eta \in (0,1)\), typically 0.95 or 0.85; the smallest \(i\) that satisfies the condition gives the rolloff frequency \(f_i\)

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
threshold: float, [0,1]: Rolloff threshold, typically 0.95 or 0.85.

Returns

rolloff: np.ndarray [shape=(…, time)]: rolloff frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rolloff feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rolloff_arr = spectral_obj.rolloff(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rolloff_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rolloff_arr, axes=ax[1], label='rolloff')
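To illustrate the rolloff condition, here is a small NumPy sketch under stated assumptions: the band spans all rows, freqs holds the centre frequency of each row, and the first bin that meets the condition is selected. rolloff_sketch and the sample values are hypothetical, and this is not the library's code.

```python
import numpy as np

def rolloff_sketch(spec, freqs, threshold=0.95):
    """Frequency below which `threshold` of the spectral sum is contained."""
    spec = np.abs(np.asarray(spec, dtype=float))
    cumsum = np.cumsum(spec, axis=-2)
    target = threshold * cumsum[..., -1:, :]
    # argmax on a boolean array returns the first index where it is True.
    idx = np.argmax(cumsum >= target, axis=-2)
    return np.asarray(freqs)[idx]

freqs = np.array([0.0, 100.0, 200.0, 300.0])
spec = np.array([[1.0, 7.0],
                 [1.0, 1.0],
                 [1.0, 1.0],
                 [1.0, 1.0]])
roll_95 = rolloff_sketch(spec, freqs)                 # [300.0, 300.0]
roll_50 = rolloff_sketch(spec, freqs, threshold=0.5)  # [100.0, 0.0]
```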

centroid(m_data_arr)

Compute the spectral centroid feature.

\(\qquad \mu_1=\frac{\sum_{ k=b_1 }^{b_2} f_ks_k } {\sum_{k=b_1}^{b_2} s_k }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

centroid: np.ndarray [shape=(…, time)]: centroid frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract centroid feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> centroid_arr = spectral_obj.centroid(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(centroid_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, centroid_arr, axes=ax[1], label='centroid')
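The centroid formula is a weighted mean of the bin frequencies. The NumPy sketch below uses the hypothetical helper centroid_sketch and a made-up freqs array; in practice the frequencies would come from get_fre_band_arr(). It is an illustration, not the library's implementation.

```python
import numpy as np

def centroid_sketch(spec, freqs):
    """Spectrum-weighted mean frequency along the frequency axis (-2)."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]  # broadcast over time
    return np.sum(f * spec, axis=-2) / np.sum(spec, axis=-2)

freqs = np.array([100.0, 200.0, 300.0])
spec = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 2.0]])
cent = centroid_sketch(spec, freqs)  # [200.0, 300.0]
```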

spread(m_data_arr)

Compute the spectral spread feature.

\(\qquad \mu_2=\sqrt{\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^2 s_k } {\sum_{k=b_1}^{b_2} s_k } }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
\(\mu_1\): Spectral.centroid

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

spread: np.ndarray [shape=(…, time)]: Spread value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract spread feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> spread_arr = spectral_obj.spread(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(spread_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, spread_arr, axes=ax[1], label='spread')
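The spread is the spectrum-weighted standard deviation around the centroid. A minimal NumPy sketch follows; the helper spread_sketch and the freqs values are hypothetical, and the band is assumed to span all rows.

```python
import numpy as np

def spread_sketch(spec, freqs):
    """Weighted standard deviation of frequency around the centroid."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]
    total = np.sum(spec, axis=-2)
    mu1 = np.sum(f * spec, axis=-2) / total   # centroid
    dev = f - mu1[..., None, :]               # deviation of each bin
    return np.sqrt(np.sum(dev ** 2 * spec, axis=-2) / total)

freqs = np.array([100.0, 200.0, 300.0])
spec = np.array([[1.0, 0.0],
                 [0.0, 1.0],
                 [1.0, 0.0]])
spr = spread_sketch(spec, freqs)  # [100.0, 0.0]
```

Energy split between the outer bins gives a spread of 100 Hz, while a single active bin gives 0.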

skewness(m_data_arr)

Compute the spectral skewness feature.

\(\qquad \mu_3=\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^3 s_k } {(\mu_2)^3 \sum_{k=b_1}^{b_2} s_k }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
\(\mu_1\): Spectral.centroid
\(\mu_2\): Spectral.spread

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

skewness: np.ndarray [shape=(…, time)]: Skewness value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract skewness feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> skewness_arr = spectral_obj.skewness(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(skewness_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, skewness_arr, axes=ax[1], label='skewness')
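As an illustration of the skewness formula, the sketch below computes the third standardised moment with NumPy. skewness_sketch and the sample data are hypothetical, and the band is assumed to span all rows; it is not the library's implementation.

```python
import numpy as np

def skewness_sketch(spec, freqs):
    """Third standardised moment of the spectrum over frequency."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]
    total = np.sum(spec, axis=-2)
    mu1 = np.sum(f * spec, axis=-2) / total                # centroid
    dev = f - mu1[..., None, :]
    mu2 = np.sqrt(np.sum(dev ** 2 * spec, axis=-2) / total)  # spread
    return np.sum(dev ** 3 * spec, axis=-2) / (mu2 ** 3 * total)

freqs = np.array([0.0, 1.0, 2.0, 3.0])
# Frame 0 is symmetric; frame 1 has its energy piled up at low frequencies.
spec = np.array([[1.0, 3.0],
                 [1.0, 1.0],
                 [1.0, 0.0],
                 [1.0, 0.0]])
skew = skewness_sketch(spec, freqs)
```

A symmetric frame yields 0, and energy concentrated at the low end with a tail toward higher bins yields a positive value.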

kurtosis(m_data_arr)

Compute the spectral kurtosis feature.

\(\qquad \mu_4=\frac{\sum_{ k=b_1 }^{b_2} (f_k-\mu_1)^4 s_k } {(\mu_2)^4 \sum_{k=b_1}^{b_2} s_k }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
\(\mu_1\): Spectral.centroid
\(\mu_2\): Spectral.spread

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

kurtosis: np.ndarray [shape=(…, time)]: Kurtosis value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract kurtosis feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> kurtosis_arr = spectral_obj.kurtosis(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(kurtosis_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, kurtosis_arr, axes=ax[1], label='kurtosis')
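The kurtosis formula is the fourth standardised moment. A hedged NumPy sketch, using the hypothetical helper kurtosis_sketch and made-up frequencies, with the band assumed to span all rows:

```python
import numpy as np

def kurtosis_sketch(spec, freqs):
    """Fourth standardised moment of the spectrum over frequency."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]
    total = np.sum(spec, axis=-2)
    mu1 = np.sum(f * spec, axis=-2) / total                # centroid
    dev = f - mu1[..., None, :]
    mu2 = np.sqrt(np.sum(dev ** 2 * spec, axis=-2) / total)  # spread
    return np.sum(dev ** 4 * spec, axis=-2) / (mu2 ** 4 * total)

freqs = np.array([100.0, 200.0, 300.0])
# Frame 0: energy only in the two outer bins; frame 1: uniform over three bins.
spec = np.array([[1.0, 1.0],
                 [0.0, 1.0],
                 [1.0, 1.0]])
kurt = kurtosis_sketch(spec, freqs)  # [1.0, 1.5]
```

Note that the formula is not excess kurtosis: no constant 3 is subtracted.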

entropy(m_data_arr, is_norm=False)

Compute the spectral entropy feature.

Set: \(p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy1= \frac{-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)} {\log(b_2-b_1)}\)

Or

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
is_norm: bool: Whether to normalize the entropy

Returns

entropy: np.ndarray [shape=(…, time)]: Entropy value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract entropy feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> entropy_arr = spectral_obj.entropy(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(entropy_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, entropy_arr, axes=ax[1], label='entropy')
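Both entropy variants can be sketched in NumPy as follows. This is an illustration, not the library's code: entropy_sketch is a hypothetical helper, zero-valued bins are treated as contributing 0, and the normalised variant divides by the log of the bin count (so a flat spectrum gives exactly 1), whereas the formula above writes the normaliser as \(\log(b_2-b_1)\).

```python
import numpy as np

def entropy_sketch(spec, is_norm=False):
    """Shannon entropy of the spectrum treated as a probability distribution."""
    spec = np.asarray(spec, dtype=float)
    p = spec / np.sum(spec, axis=-2, keepdims=True)
    # Replace zeros by 1 inside the log so that 0 * log(0) contributes 0.
    safe_p = np.where(p > 0, p, 1.0)
    h = -np.sum(p * np.log(safe_p), axis=-2)
    if is_norm:
        h = h / np.log(spec.shape[-2])
    return h

# Frame 0 is flat (maximum entropy); frame 1 has a single active bin.
spec = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 4.0],
                 [1.0, 0.0]])
h_raw = entropy_sketch(spec)                 # [log(4), 0]
h_norm = entropy_sketch(spec, is_norm=True)  # [1, 0]
```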

crest(m_data_arr)

Compute the spectral crest feature.

\(\qquad crest =\frac{\max_{k\in[b_1,b_2]} s_k } {\frac{1}{b_2-b_1} \sum_{ k=b_1 }^{b_2} s_k}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

crest: np.ndarray [shape=(…, time)]: Crest value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract crest feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> crest_arr = spectral_obj.crest(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(crest_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, crest_arr, axes=ax[1], label='crest')
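The crest factor compares the peak bin with the average bin. In the minimal sketch below, crest_sketch is a hypothetical helper and the mean is taken over the bin count, which may differ slightly from the normaliser used by Spectral.crest.

```python
import numpy as np

def crest_sketch(spec):
    """Ratio of the maximum bin to the mean bin along the frequency axis."""
    spec = np.asarray(spec, dtype=float)
    return np.max(spec, axis=-2) / np.mean(spec, axis=-2)

# Frame 0 is flat; frame 1 has all of its energy in one bin.
spec = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 4.0],
                 [1.0, 0.0]])
crest = crest_sketch(spec)  # [1.0, 4.0]
```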

slope(m_data_arr)

Compute the spectral slope feature.

\(\qquad slope=\frac{ \sum_{k=b_1}^{b_2}(f_k-\mu_f)(s_k-\mu_s) } { \sum_{k=b_1}^{b_2}(f_k-\mu_f)^2 }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
\(\mu_f\): average frequency value
\(\mu_s\): average spectral value

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

slope: np.ndarray [shape=(…, time)]: Slope value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract slope feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> slope_arr = spectral_obj.slope(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(slope_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, slope_arr, axes=ax[1], label='slope')
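The slope formula is the least-squares regression slope of the spectrum value against frequency. A small NumPy sketch, with the hypothetical helper slope_sketch and made-up frequencies:

```python
import numpy as np

def slope_sketch(spec, freqs):
    """Least-squares slope of spectrum value versus frequency, per frame."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)
    f_dev = (f - f.mean())[:, None]                   # f_k - mu_f
    s_dev = spec - spec.mean(axis=-2, keepdims=True)  # s_k - mu_s
    return np.sum(f_dev * s_dev, axis=-2) / np.sum(f_dev ** 2)

freqs = np.array([0.0, 1.0, 2.0, 3.0])
# Frame 0 follows s = 2 f + 1; frame 1 is constant.
spec = np.array([[1.0, 5.0],
                 [3.0, 5.0],
                 [5.0, 5.0],
                 [7.0, 5.0]])
slope = slope_sketch(spec, freqs)  # [2.0, 0.0]
```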

decrease(m_data_arr)

Compute the spectral decrease feature.

\(\qquad decrease=\frac { \sum_{k=b_1+1}^{b_2} \frac {s_k-s_{b_1}}{k-1} } { \sum_{k=b_1+1}^{b_2} s_k }\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

decrease: np.ndarray [shape=(…, time)]: Decrease value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract decrease feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> decrease_arr = spectral_obj.decrease(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(decrease_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, decrease_arr, axes=ax[1], label='decrease')
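A rough NumPy reading of the decrease formula is given below. It assumes the band starts at the first row of the array and that the divisor is the bin's offset from that first row (1, 2, 3, ...); both are assumptions about the indexing rather than confirmed library behaviour, and decrease_sketch is a hypothetical name.

```python
import numpy as np

def decrease_sketch(spec):
    """Average offset-weighted change relative to the first bin."""
    spec = np.asarray(spec, dtype=float)
    first = spec[..., :1, :]   # s_{b_1}
    rest = spec[..., 1:, :]    # s_k for k > b_1
    # Offset of each remaining bin from the first bin: 1, 2, ...
    offsets = np.arange(1, spec.shape[-2])[:, None]
    return np.sum((rest - first) / offsets, axis=-2) / np.sum(rest, axis=-2)

# Frame 0 decays with frequency; frame 1 is constant.
spec = np.array([[4.0, 1.0],
                 [2.0, 1.0],
                 [1.0, 1.0]])
dec = decrease_sketch(spec)  # [(-2 - 1.5) / 3, 0.0]
```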

band_width(m_data_arr, p=2)

Compute the spectral band_width feature.

\(\qquad bandwidth=\left(\sum_{k=b_1}^{b_2} s_k(f_k-centroid)^p \right)^{\frac{1}{p}}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(f_k\) is in Hz
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
centroid: Spectral.centroid

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
p: int, 1 or 2: Norm type: 1 for absolute value, 2 for power.

Returns

band_width: np.ndarray [shape=(…, time)]: Band width value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract band_width feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> band_width_arr = spectral_obj.band_width(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(band_width_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, band_width_arr, axes=ax[1], label='band_width')
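The band width formula can be sketched in NumPy as shown below. The text does not say whether \(s_k\) is normalised first, so this sketch uses the raw values and takes the absolute deviation so that p=1 is well defined; band_width_sketch and the sample frequencies are hypothetical.

```python
import numpy as np

def band_width_sketch(spec, freqs, p=2):
    """p-th order spectrum-weighted deviation from the centroid."""
    spec = np.asarray(spec, dtype=float)
    f = np.asarray(freqs, dtype=float)[:, None]
    centroid = np.sum(f * spec, axis=-2) / np.sum(spec, axis=-2)
    dev = np.abs(f - centroid[..., None, :])
    return np.sum(spec * dev ** p, axis=-2) ** (1.0 / p)

freqs = np.array([100.0, 200.0, 300.0])
spec = np.array([[1.0, 0.0],
                 [0.0, 2.0],
                 [1.0, 0.0]])
bw_p2 = band_width_sketch(spec, freqs, p=2)  # [sqrt(20000), 0.0]
bw_p1 = band_width_sketch(spec, freqs, p=1)  # [200.0, 0.0]
```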

rms(m_data_arr)

Compute the spectral rms feature.

\(\qquad rms=\sqrt{ \frac{1}{N} \sum_{n=1}^N x^2[n] }=\sqrt {\frac{1}{N^2}\sum_{m=1}^N |X[m]|^2 }\)

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

rms: np.ndarray [shape=(…, time)]: RMS value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rms feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rms_arr = spectral_obj.rms(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rms_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rms_arr, axes=ax[1], label='rms')
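The equality in the rms formula is Parseval's theorem for the DFT. The short check below verifies it numerically on a random frame using the full two-sided FFT; the signal and variable names are illustrative and independent of audioflux.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(16)   # one time-domain frame
N = x.size
X = np.fft.fft(x)             # two-sided spectrum with N bins

# Time-domain definition: sqrt(mean(x^2)).
rms_time = np.sqrt(np.sum(x ** 2) / N)
# Frequency-domain form: sqrt(sum(|X|^2) / N^2).
rms_freq = np.sqrt(np.sum(np.abs(X) ** 2) / N ** 2)

print(np.isclose(rms_time, rms_freq))
```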

energy(m_data_arr, is_log=False, gamma=10.0)

Compute the spectral energy feature.

\(\qquad energy=\sum_{n=1}^N x^2[n] =\frac{1}{N}\sum_{m=1}^N |X[m]|^2\)

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
is_log: bool: Whether to apply log
gamma: float: energy gamma value.

Returns

energy: np.ndarray [shape=(…, time)]: Energy value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract energy feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> energy_arr = spectral_obj.energy(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(energy_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, energy_arr, axes=ax[1], label='energy')
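The energy identity above sums over all N two-sided DFT bins. When only a one-sided spectrum is available, the interior bins have to be counted twice. The sketch below checks this for an even-length real frame using np.fft.rfft; the sample values and the weighting helper are illustrative assumptions, not part of audioflux.

```python
import numpy as np

x = np.array([0.5, -1.0, 2.0, 0.0, 1.5, -0.5, 0.25, 1.0])  # even length
N = x.size
X = np.fft.rfft(x)   # one-sided spectrum with N // 2 + 1 bins

# Interior bins stand for a conjugate pair; DC and Nyquist appear only once.
weights = np.full(X.shape, 2.0)
weights[0] = 1.0
weights[-1] = 1.0

energy_time = np.sum(x ** 2)
energy_freq = np.sum(weights * np.abs(X) ** 2) / N

print(np.isclose(energy_time, energy_freq))
```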

hfc(m_data_arr)

Compute the spectral hfc feature.

\(\qquad hfc(t)=\frac{\sum_{k=b_1}^{b_2} s_k(t)k }{b_2-b_1+1}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

hfc: np.ndarray [shape=(…, time)]: HFC value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract hfc feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> hfc_arr = spectral_obj.hfc(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(hfc_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, hfc_arr, axes=ax[1], label='hfc')
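The hfc formula weights each bin by its index, so energy at higher bins counts more. A minimal NumPy sketch follows; hfc_sketch is a hypothetical helper, and k is assumed to be the zero-based row index of the array, which may not match the library's indexing.

```python
import numpy as np

def hfc_sketch(spec):
    """Mean of bin-index-weighted spectrum values along the frequency axis."""
    spec = np.asarray(spec, dtype=float)
    k = np.arange(spec.shape[-2])[:, None]  # bin index, broadcast over time
    return np.sum(spec * k, axis=-2) / spec.shape[-2]

# Frame 0 is flat; frame 1 has its energy in the highest bin.
spec = np.array([[1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 0.0],
                 [1.0, 4.0]])
hfc = hfc_sketch(spec)  # [1.5, 3.0]
```

Both frames carry the same total, yet the frame with energy in the top bin scores twice as high.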

sd(m_data_arr, step=1, is_positive=False)

Compute the spectral sd feature.

\(\qquad sd(t)=flux(t)\)

Only the terms with \(s_k(t) \ge s_k(t-1)\) are included, with \(p=2\), and the \(1/p\) root is not applied to the result.

\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
flux: Spectral.flux

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
step: int: Step along the time axis between the compared frames, e.g. 1, 2, 3, …
is_positive: bool: Whether to set negative numbers to 0

Returns

sd: np.ndarray [shape=(…, time)]: sd value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract sd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> sd_arr = spectral_obj.sd(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(sd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, sd_arr, axes=ax[1], label='sd')
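Read literally, the description above makes sd a sum of squared frame-to-frame differences without the final root. The sketch below follows that reading; sd_sketch is a hypothetical helper, the output has time - step values rather than one per frame, and the library's exact behaviour may differ.

```python
import numpy as np

def sd_sketch(spec, step=1, is_positive=False):
    """Sum of squared spectral differences between frames (no root)."""
    spec = np.asarray(spec, dtype=float)
    diff = spec[..., :, step:] - spec[..., :, :-step]
    if is_positive:
        # Keep only bins whose value increased.
        diff = np.maximum(diff, 0.0)
    return np.sum(diff ** 2, axis=-2)

spec = np.array([[0.0, 3.0, 3.0],
                 [0.0, 4.0, 0.0]])
sd_all = sd_sketch(spec)                    # [25.0, 16.0]
sd_pos = sd_sketch(spec, is_positive=True)  # [25.0, 0.0]
```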

sf(m_data_arr, step=1, is_positive=False)

Compute the spectral sf feature.

\(\qquad sf(t)=flux(t)\)

Only the terms with \(s_k(t) \ge s_k(t-1)\) are included, with \(p=1\).

\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
flux: Spectral.flux

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
step: int: Step along the time axis between the compared frames, e.g. 1, 2, 3, …
is_positive: bool: Whether to set negative numbers to 0

Returns

sf: np.ndarray [shape=(…, time)]: sf value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract sf feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> sf_arr = spectral_obj.sf(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(sf_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, sf_arr, axes=ax[1], label='sf')

mkl(m_data_arr, tp=0)

Compute the spectral mkl feature.

\(\qquad mkl(t)=\sum_{k=b_1}^{b_2} \log\left(1+ \cfrac {s_k(t)}{s_k(t-1)} \right)\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
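
A minimal NumPy sketch of the formula above, for illustration only. It assumes the natural logarithm, strictly positive spectrum values (so the ratio is defined), and a zero-filled first frame; `mkl_sketch` is a hypothetical name, not part of audioflux.

```python
import numpy as np

def mkl_sketch(spec, tp=0):
    """Hypothetical helper: log(1 + s_k(t) / s_k(t-1)) reduced over the fre axis."""
    spec = np.asarray(spec, dtype=float)
    # Ratio of each frame to the previous one, per frequency bin.
    ratio = spec[..., 1:] / spec[..., :-1]
    terms = np.log1p(ratio)  # log(1 + ratio)
    # tp=0 sums over the bins, tp=1 averages them.
    reduced = terms.sum(axis=-2) if tp == 0 else terms.mean(axis=-2)
    out = np.zeros(spec.shape[:-2] + spec.shape[-1:])
    # Assumption: frame 0 has no predecessor and stays 0.
    out[..., 1:] = reduced
    return out

spec = np.array([[1.0, 1.0],
                 [2.0, 6.0]])  # toy data, shape (fre=2, time=2)
mkl_sum = mkl_sketch(spec, tp=0)   # frame 1: log(1 + 1) + log(1 + 3)
mkl_mean = mkl_sketch(spec, tp=1)
```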

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
tp: int, 0 or 1: Reduction type: 0 for sum, 1 for mean.

Returns

mkl: np.ndarray [shape=(…, time)]: mkl value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract mkl feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> mkl_arr = spectral_obj.mkl(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(mkl_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, mkl_arr, axes=ax[1], label='mkl')

pd(m_data_arr, m_phase_arr)

Compute the spectral pd feature.

\(\qquad \psi_k(t)\) denotes the phase of frequency bin \(k\) at time \(t\).

\(\qquad \psi_k^{\prime}(t)=\psi_k(t)-\psi_k(t-1)\)

\(\qquad \psi_k^{\prime\prime}(t)=\psi_k^{\prime}(t)-\psi_k^{\prime}(t-1) = \psi_k(t)-2\psi_k(t-1)+\psi_k(t-2)\)

\(\qquad pd(t)= \frac {\sum_{k=b_1}^{b_2} \| \psi_k^{\prime\prime}(t) \|} {b_2-b_1+1}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
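
The second phase difference and its average over the band can be sketched with NumPy as follows. This is an illustration under stated assumptions: the phase is used as given (no wrapping or unwrapping), the first two frames are set to 0, and `pd_sketch` is a hypothetical name.

```python
import numpy as np

def pd_sketch(phase):
    """Hypothetical helper: mean absolute second phase difference per frame."""
    phase = np.asarray(phase, dtype=float)
    # psi''_k(t) = psi_k(t) - 2 * psi_k(t-1) + psi_k(t-2)
    second_diff = phase[..., 2:] - 2.0 * phase[..., 1:-1] + phase[..., :-2]
    out = np.zeros(phase.shape[:-2] + phase.shape[-1:])
    # Averaging over the fre axis divides the sum by (b_2 - b_1 + 1).
    out[..., 2:] = np.abs(second_diff).mean(axis=-2)
    return out

phase = np.array([[0.0, 1.0, 3.0, 6.0],   # accelerating phase: psi'' = 1
                  [0.0, 0.5, 1.0, 1.5]])  # linear phase: psi'' = 0
pd_values = pd_sketch(phase)
```

A bin whose phase advances linearly contributes nothing, which is why the feature reacts to onsets rather than to steady tones.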

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
m_phase_arr: np.ndarray [shape=(…, fre, time)]: Phase data.

Returns

pd: np.ndarray [shape=(…, time)]: pd value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract pd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> pd_arr = spectral_obj.pd(spec_arr, phase_arr)

wpd(m_data_arr, m_phase_arr)

Compute the spectral wpd feature.

\(\qquad \psi_k(t)\) denotes the phase of frequency bin \(k\) at time \(t\).

\(\qquad \psi_k^{\prime}(t)=\psi_k(t)-\psi_k(t-1)\)

\(\qquad \psi_k^{\prime\prime}(t)=\psi_k^{\prime}(t)-\psi_k^{\prime}(t-1) = \psi_k(t)-2\psi_k(t-1)+\psi_k(t-2)\)

\(\qquad wpd(t)= \frac {\sum_{k=b_1}^{b_2} \| \psi_k^{\prime\prime}(t) \|s_k(t)}{b_2-b_1+1}\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
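
As a rough NumPy illustration of the weighted formula above (not the library's code): the phase is assumed to be used as given, the first two frames are set to 0, and `wpd_sketch` is a hypothetical name.

```python
import numpy as np

def wpd_sketch(spec, phase):
    """Hypothetical helper: |psi''_k(t)| weighted by s_k(t), averaged over the bins."""
    spec = np.asarray(spec, dtype=float)
    phase = np.asarray(phase, dtype=float)
    # psi''_k(t) = psi_k(t) - 2 * psi_k(t-1) + psi_k(t-2)
    second_diff = phase[..., 2:] - 2.0 * phase[..., 1:-1] + phase[..., :-2]
    # Weight each bin's phase deviation by its spectrum value at time t.
    weighted = np.abs(second_diff) * spec[..., 2:]
    out = np.zeros(spec.shape[:-2] + spec.shape[-1:])
    out[..., 2:] = weighted.mean(axis=-2)  # sum / (b_2 - b_1 + 1)
    return out

spec = np.array([[2.0, 2.0, 2.0, 2.0],
                 [1.0, 1.0, 1.0, 1.0]])   # toy magnitudes
phase = np.array([[0.0, 1.0, 3.0, 6.0],   # psi'' = 1
                  [0.0, 0.5, 1.0, 1.5]])  # psi'' = 0
wpd_values = wpd_sketch(spec, phase)
```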

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
m_phase_arr: np.ndarray [shape=(…, fre, time)]: Phase data.

Returns

wpd: np.ndarray [shape=(…, time)]: wpd value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract wpd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> wpd_arr = spectral_obj.wpd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(wpd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, wpd_arr, axes=ax[1], label='wpd')

nwpd(m_data_arr, m_phase_arr)

Compute the spectral nwpd feature.

\(\qquad nwpd(t)= \frac {wpd} {\mu_s}\)

wpd: Spectral.wpd
\(\mu_s\): the mean of \(s_k(t)\)
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
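
Because wpd and \(\mu_s\) are both divided by the number of bins, their ratio reduces to a magnitude-weighted average of \(\| \psi_k^{\prime\prime}(t) \|\). The NumPy sketch below shows this, assuming \(\mu_s\) is taken over the same band at frame \(t\) and that the frame is not silent; `nwpd_sketch` is a hypothetical name and the zero-filled first two frames are an assumption.

```python
import numpy as np

def nwpd_sketch(spec, phase):
    """Hypothetical helper: wpd divided by the mean spectrum value of each frame."""
    spec = np.asarray(spec, dtype=float)
    phase = np.asarray(phase, dtype=float)
    # psi''_k(t) = psi_k(t) - 2 * psi_k(t-1) + psi_k(t-2)
    second_diff = phase[..., 2:] - 2.0 * phase[..., 1:-1] + phase[..., :-2]
    numerator = (np.abs(second_diff) * spec[..., 2:]).sum(axis=-2)
    denominator = spec[..., 2:].sum(axis=-2)  # must be non-zero
    out = np.zeros(spec.shape[:-2] + spec.shape[-1:])
    # (sum / N) / (sum of s_k / N): the bin count N cancels out.
    out[..., 2:] = numerator / denominator
    return out

spec = np.array([[2.0, 2.0, 2.0, 2.0],
                 [1.0, 1.0, 1.0, 1.0]])   # toy magnitudes
phase = np.array([[0.0, 1.0, 3.0, 6.0],   # psi'' = 1
                  [0.0, 0.5, 1.0, 1.5]])  # psi'' = 0
nwpd_values = nwpd_sketch(spec, phase)
```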

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
m_phase_arr: np.ndarray [shape=(…, fre, time)]: Phase data.

Returns

nwpd: np.ndarray [shape=(…, time)]: nwpd value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract nwpd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> nwpd_arr = spectral_obj.nwpd(spec_arr, phase_arr)

cd(m_data_arr, m_phase_arr)

Compute the spectral cd feature.

\(\qquad \psi_k(t)\) denotes the phase of frequency bin \(k\) at time \(t\).

\(\qquad \alpha_k(t)=s_k(t) e^{j(2\psi_k(t)-\psi_k(t-1))}\)

\(\qquad \beta_k(t)=s_k(t) e^{j\psi_k(t)}\)

\(\qquad cd(t)=\sum_{k=b_1}^{b_2} \| \beta_k(t)-\alpha_k(t-1) \|\)

\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
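
The definitions above translate fairly directly into NumPy complex arithmetic. The sketch below is an illustration only: `cd_sketch` is a hypothetical name, and setting the first two frames (which lack the required history) to 0 is an assumption.

```python
import numpy as np

def cd_sketch(spec, phase):
    """Hypothetical helper: sum over bins of |beta_k(t) - alpha_k(t-1)|."""
    spec = np.asarray(spec, dtype=float)
    phase = np.asarray(phase, dtype=float)
    # beta_k(t) = s_k(t) * exp(j * psi_k(t)): the observed complex value.
    beta = spec * np.exp(1j * phase)
    # alpha_k(t) = s_k(t) * exp(j * (2 * psi_k(t) - psi_k(t-1))), defined for t >= 1.
    # Index i of `alpha` corresponds to frame t = i + 1.
    alpha = spec[..., 1:] * np.exp(1j * (2.0 * phase[..., 1:] - phase[..., :-1]))
    # Compare beta_k(t) with alpha_k(t-1) for t >= 2.
    dist = np.abs(beta[..., 2:] - alpha[..., :-1])
    out = np.zeros(spec.shape[:-2] + spec.shape[-1:])
    out[..., 2:] = dist.sum(axis=-2)
    return out

phase = np.array([[0.0, 0.5, 1.0, 1.5]])  # one bin with linearly advancing phase
steady = np.ones((1, 4))                  # constant magnitude
jump = np.array([[1.0, 1.0, 3.0, 3.0]])   # magnitude jumps at frame 2
cd_steady = cd_sketch(steady, phase)
cd_jump = cd_sketch(jump, phase)
```

With constant magnitude and linearly advancing phase, \(\alpha_k(t-1)\) matches \(\beta_k(t)\) exactly, so only the frame where the magnitude jumps produces a non-zero value.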

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
m_phase_arr: np.ndarray [shape=(…, fre, time)]: Phase data.

Returns

cd: np.ndarray [shape=(…, time)]: cd value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract cd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> cd_arr = spectral_obj.cd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(cd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, cd_arr, axes=ax[1], label='cd')

rcd(m_data_arr, m_phase_arr)

Compute the spectral rcd feature.

\(\qquad rcd(t)=cd\)

Only the bins that satisfy \(s_k(t) \geq s_k(t-1)\) participate in the sum.

cd: Spectral.cd
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
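
A NumPy sketch of this rectified variant, for illustration only. It recomputes the complex-domain distance of Spectral.cd and then masks out the bins whose spectrum value decreased; `rcd_sketch` is a hypothetical name, and the zero-filled first two frames are an assumption.

```python
import numpy as np

def rcd_sketch(spec, phase):
    """Hypothetical helper: cd restricted to bins where s_k(t) >= s_k(t-1)."""
    spec = np.asarray(spec, dtype=float)
    phase = np.asarray(phase, dtype=float)
    beta = spec * np.exp(1j * phase)
    alpha = spec[..., 1:] * np.exp(1j * (2.0 * phase[..., 1:] - phase[..., :-1]))
    dist = np.abs(beta[..., 2:] - alpha[..., :-1])
    # Keep a bin only when its spectrum value did not decrease.
    rising = spec[..., 2:] >= spec[..., 1:-1]
    out = np.zeros(spec.shape[:-2] + spec.shape[-1:])
    out[..., 2:] = np.where(rising, dist, 0.0).sum(axis=-2)
    return out

phase = np.array([[0.0, 0.5, 1.0, 1.5]])  # one bin with linearly advancing phase
rise = np.array([[1.0, 1.0, 3.0, 3.0]])   # magnitude increases at frame 2
fall = np.array([[3.0, 3.0, 1.0, 1.0]])   # magnitude decreases at frame 2
rcd_rise = rcd_sketch(rise, phase)
rcd_fall = rcd_sketch(fall, phase)
```

The increase is reported while the decrease of the same size is ignored, which is the intended difference from the unrectified feature.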

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
m_phase_arr: np.ndarray [shape=(…, fre, time)]: Phase data.

Returns

rcd: np.ndarray [shape=(…, time)]: rcd value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> phase_arr = af.utils.get_phase(spec_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract rcd feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> rcd_arr = spectral_obj.rcd(spec_arr, phase_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(rcd_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, rcd_arr, axes=ax[1], label='rcd')

broadband(m_data_arr, threshold=0)

Compute the spectral broadband feature.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
threshold: float, [0,1]: broadband threshold

Returns

broadband: np.ndarray [shape=(…, time)]: broadband value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract broadband feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> broadband_arr = spectral_obj.broadband(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(broadband_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, broadband_arr, axes=ax[1], label='broadband')

novelty(m_data_arr, step=1, threshold=0.0, method_type=SpectralNoveltyMethodType.SUB, data_type=SpectralNoveltyDataType.VALUE)

Compute the spectral novelty feature.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
step: int: Compute time axis steps, like 1/2/3/…
threshold: float [0,1]: Novelty threshold.
method_type: SpectralNoveltyMethodType: Novelty method type.
data_type: SpectralNoveltyDataType: Novelty data type.

Returns

novelty: np.ndarray [shape=(…, time)]: Novelty value for each time frame.

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract novelty feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> novelty_arr = spectral_obj.novelty(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(novelty_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, novelty_arr, axes=ax[1], label='novelty')

eef(m_data_arr, is_norm=False)

Compute the spectral eef feature.

\(\qquad p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

\(\qquad eef=\sqrt{ 1+| energy\times entropy2| }\)

energy: Spectral.energy
\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
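
The three formulas above can be chained in NumPy as sketched below. This is an illustration, not the library's implementation: the energy is passed in as an argument because its definition belongs to Spectral.energy, the natural logarithm is assumed, \(0 \log 0\) is treated as 0, and the names `entropy2_sketch` and `eef_sketch` as well as the energy values are hypothetical.

```python
import numpy as np

def entropy2_sketch(spec):
    """Hypothetical helper: -sum_k p_k * log(p_k) for each frame."""
    spec = np.asarray(spec, dtype=float)
    # p_k: each bin's share of the frame total (frames must not be all zero).
    p = spec / spec.sum(axis=-2, keepdims=True)
    # Replace zeros by 1 inside the log so that 0 * log(0) evaluates to 0.
    safe_p = np.where(p > 0, p, 1.0)
    return -(p * np.log(safe_p)).sum(axis=-2)

def eef_sketch(spec, energy):
    """Hypothetical helper: sqrt(1 + |energy * entropy2|)."""
    return np.sqrt(1.0 + np.abs(np.asarray(energy) * entropy2_sketch(spec)))

spec = np.array([[1.0, 3.0],
                 [1.0, 0.0]])   # toy data, shape (fre=2, time=2)
energy = np.array([2.0, 9.0])   # placeholder per-frame energy values
eef_values = eef_sketch(spec, energy)
```

A frame whose energy sits in a single bin has zero entropy, so its eef is exactly 1 regardless of the energy.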

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
is_norm: bool: Whether to normalize.

Returns

eef: np.ndarray [shape=(…, time)]: eef value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract eef feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> eef_arr = spectral_obj.eef(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(eef_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, eef_arr, axes=ax[1], label='eef')

eer(m_data_arr, is_norm=False, gamma=1.0)

Compute the spectral eer feature.

\(\qquad le=\log_{10}(1+\gamma \times energy), \gamma \in (0,\infty)\), represents log compression of data

\(\qquad p_k=\frac{s_k}{\sum_{k=b_1}^{b_2}s_k}\)

\(\qquad entropy2= {-\sum_{ k=b_1 }^{b_2} p_k \log(p_k)}\)

\(\qquad eer=\sqrt{ 1+\left| \cfrac{le}{entropy2}\right| }\)

energy: Spectral.energy
\(b_1\) and \(b_2\): the frequency band bin boundaries
\(s_k\): the spectrum value, which can be magnitude spectrum or power spectrum
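
A NumPy sketch of the formulas above, for illustration only. The energy is passed in as an argument because its definition belongs to Spectral.energy; the natural logarithm is assumed for the entropy, and frames with zero entropy (all energy in one bin) are not handled because the ratio is undefined there. The name `eer_sketch` and the energy values are hypothetical.

```python
import numpy as np

def eer_sketch(spec, energy, gamma=1.0):
    """Hypothetical helper: sqrt(1 + |le / entropy2|) with le = log10(1 + gamma * energy)."""
    spec = np.asarray(spec, dtype=float)
    # Log compression of the energy.
    le = np.log10(1.0 + gamma * np.asarray(energy, dtype=float))
    # p_k: each bin's share of the frame total.
    p = spec / spec.sum(axis=-2, keepdims=True)
    safe_p = np.where(p > 0, p, 1.0)  # so that 0 * log(0) evaluates to 0
    entropy2 = -(p * np.log(safe_p)).sum(axis=-2)
    return np.sqrt(1.0 + np.abs(le / entropy2))

spec = np.array([[1.0, 2.0],
                 [1.0, 2.0]])    # two equal bins per frame: entropy2 = log(2)
energy = np.array([9.0, 99.0])   # placeholder values giving le = 1 and le = 2
eer_values = eer_sketch(spec, energy, gamma=1.0)
```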

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.
is_norm: bool: Whether to normalize.
gamma: float: Log-compression factor. Typical values are 1, 10, 20, etc.; 0.5 for songs.

Returns

eer: np.ndarray [shape=(…, time)]: eer value for each time frame

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract eer feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> eer_arr = spectral_obj.eer(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(eer_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, eer_arr, axes=ax[1], label='eer')

max(m_data_arr)

Compute the spectral max feature.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

val_arr: np.ndarray [shape=(…, time)]: max value for each time
fre_arr: np.ndarray [shape=(…, time)]: max frequency for each time
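
One way to picture the two return values is the NumPy sketch below. It is an assumption about the semantics, not the library's code: the largest value in each frame is paired with the entry of the frequency band array at the same bin index. `max_sketch` and the band values are hypothetical.

```python
import numpy as np

def max_sketch(spec, fre_band_arr):
    """Hypothetical helper: per-frame maximum value and the frequency of its bin."""
    spec = np.asarray(spec, dtype=float)
    fre_band_arr = np.asarray(fre_band_arr, dtype=float)
    # Index of the strongest bin in each frame (fre is the second-to-last axis).
    idx = spec.argmax(axis=-2)
    val_arr = spec.max(axis=-2)
    # Look up the band frequency that belongs to each winning bin.
    fre_arr = fre_band_arr[idx]
    return val_arr, fre_arr

spec = np.array([[1.0, 5.0],
                 [4.0, 2.0],
                 [3.0, 0.0]])                 # toy data, shape (fre=3, time=2)
fre_band = np.array([0.0, 100.0, 200.0])      # placeholder band frequencies in Hz
val_arr, fre_arr = max_sketch(spec, fre_band)
```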

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract max feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> max_val_arr, max_fre_arr = spectral_obj.max(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(max_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, max_val_arr, axes=ax[1], label='max_val')

mean(m_data_arr)

Compute the spectral mean feature.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

val_arr: np.ndarray [shape=(…, time)]: mean value for each time
fre_arr: np.ndarray [shape=(…, time)]: mean frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract mean feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> mean_val_arr, mean_fre_arr = spectral_obj.mean(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(mean_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, mean_val_arr, axes=ax[1], label='mean_val')

var(m_data_arr)

Compute the spectral var feature.

Parameters

m_data_arr: np.ndarray [shape=(…, fre, time)]: Spectrogram data.

Returns

val_arr: np.ndarray [shape=(…, time)]: var value for each time
fre_arr: np.ndarray [shape=(…, time)]: var frequency for each time

Examples

Read chord audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('guitar_chord1')
>>> audio_arr, sr = af.read(audio_path)

Create BFT-Linear object and extract spectrogram

>>> from audioflux.type import SpectralFilterBankScaleType, SpectralDataType
>>> import numpy as np
>>> bft_obj = af.BFT(num=2049, samplate=sr, radix2_exp=12, slide_length=1024,
>>>                data_type=SpectralDataType.MAG,
>>>                scale_type=SpectralFilterBankScaleType.LINEAR)
>>> spec_arr = bft_obj.bft(audio_arr)
>>> spec_arr = np.abs(spec_arr)

Create Spectral object and extract var feature

>>> spectral_obj = af.Spectral(num=bft_obj.num,
>>>                            fre_band_arr=bft_obj.get_fre_band_arr())
>>> n_time = spec_arr.shape[-1]  # Or use bft_obj.cal_time_length(audio_arr.shape[-1])
>>> spectral_obj.set_time_length(n_time)
>>> var_val_arr, var_fre_arr = spectral_obj.var(spec_arr)

Display plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_plot, fill_wave
>>> fig, ax = plt.subplots(nrows=2, sharex=True)
>>> fill_wave(audio_arr, samplate=sr, axes=ax[0])
>>> times = np.arange(0, len(var_val_arr)) * (bft_obj.slide_length / bft_obj.samplate)
>>> fill_plot(times, var_val_arr, axes=ax[1], label='var_val')