SimpleCQT

class audioflux.SimpleCQT(num=84, samplate=32000, low_fre=32.70319566257483)

Simple CQT spectrogram class.

Creates a CQT spectrogram while exposing only a few basic parameters. For finer control over the remaining settings, use the CQT class instead.

Fixed parameters of the SimpleCQT class:
  • bin_per_octave: 12

  • low_fre: 32.703 (C1) by default; can be overridden through the low_fre argument

  • factor: 1

  • beta: 0

  • thresh: 0.01

  • window_type: HANN

  • slide_length: fft_length / 4

  • normal_type: AREA

  • is_scale: 1

Parameters
num: int

Number of frequency bins to generate, starting at low_fre.

samplate: int

Sampling rate of the incoming audio.

low_fre: float

Lowest frequency, in Hz. Default: 32.703 (C1).
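
The relationship between num, low_fre and the fixed bin_per_octave of 12 can be sketched in plain Python. This is only an illustration: it assumes the conventional geometric CQT spacing, f_k = low_fre * 2^(k / bin_per_octave), and the helper names midi_to_hz and cqt_center_freqs are hypothetical, not part of audioflux. The values returned by get_fre_band_arr may differ in detail.

```python
def midi_to_hz(midi_note, a4_hz=440.0):
    """Equal-tempered frequency of a MIDI note number (A4 = MIDI 69)."""
    return a4_hz * 2.0 ** ((midi_note - 69) / 12)


def cqt_center_freqs(num=84, low_fre=32.70319566257483, bin_per_octave=12):
    """Geometrically spaced center frequencies, starting at low_fre."""
    return [low_fre * 2.0 ** (k / bin_per_octave) for k in range(num)]


c1_hz = midi_to_hz(24)       # C1 is MIDI note 24 (assumption: A4 = 440 Hz)
freqs = cqt_center_freqs()   # 84 bins = 7 octaves starting at C1
print(round(c1_hz, 3))       # 32.703
print(round(freqs[12], 3))   # 65.406, one octave above C1
print(round(freqs[-1], 1))   # highest of the 84 bins
```

Under these assumptions the default num=84 covers seven octaves, so the sampling rate should be high enough that the top bin stays below the Nyquist frequency (samplate / 2).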

Examples

Read 220Hz audio data

>>> import audioflux as af
>>> audio_path = af.utils.sample_path('220')
>>> audio_arr, sr = af.read(audio_path)
>>> audio_arr
array([-5.5879354e-09, -9.3132257e-09,  0.0000000e+00, ...,
       3.2826858e-03,  3.2447521e-03,  3.0795704e-03], dtype=float32)

Create SimpleCQT object

>>> from audioflux.type import SpectralFilterBankNormalType
>>> from audioflux.utils import note_to_hz
>>> obj = af.SimpleCQT(num=84, samplate=sr, low_fre=note_to_hz('C1'))

Extract CQT spectrogram

>>> import numpy as np
>>> spec_arr = obj.cqt(audio_arr)
>>> spec_mag_arr = np.abs(spec_arr)
>>> spec_mag_arr
array([[4.13306177e-01, 4.24238801e-01, 4.03985798e-01, ...,
        9.09681246e-03, 7.72366347e-03, 6.51519699e-03],
       [3.45392436e-01, 3.47385347e-01, 3.18555593e-01, ...,
        3.74186062e-03, 3.57810827e-03, 5.09629818e-03],
       [2.74609476e-01, 2.69073159e-01, 2.38167673e-01, ...,
        1.73897278e-02, 1.37754036e-02, 1.05445199e-02],
       ...,
       [6.24795386e-04, 4.99437414e-02, 1.60291921e-02, ...,
        1.42244404e-04, 1.27240157e-04, 7.22698495e-03],
       [8.64346686e-04, 3.18217799e-02, 2.55288873e-02, ...,
        1.96875044e-04, 1.22444399e-04, 7.08540343e-03],
       [5.32156206e-04, 3.65380235e-02, 1.94845423e-02, ...,
        1.01396305e-04, 4.90328348e-05, 6.83017401e-03]], dtype=float32)

Show CQT spectrogram plot

>>> import matplotlib.pyplot as plt
>>> from audioflux.display import fill_spec
>>> audio_len = audio_arr.shape[0]
>>> fig, ax = plt.subplots()
>>> img = fill_spec(spec_mag_arr, axes=ax,
...                 x_coords=obj.x_coords(audio_len),
...                 y_coords=obj.y_coords(),
...                 x_axis='time', y_axis='log',
...                 title='CQT Spectrogram')
>>> fig.colorbar(img, ax=ax)

Extract Chroma-cqt data

>>> chroma_arr = obj.chroma(spec_arr, chroma_num=12)
>>> chroma_arr
array([[2.28795856e-01, 1.27659831e-02, 5.76458964e-03, ...,
        1.08233641e-03, 1.06033171e-03, 8.46851245e-02],
       ...,
       [3.88875484e-01, 4.20701923e-03, 1.00626342e-03, ...,
        7.09769898e-04, 3.13778268e-03, 2.85723597e-01]], dtype=float32)

Show Chroma-CQT spectrogram plot

>>> fig, ax = plt.subplots()
>>> img = fill_spec(chroma_arr, axes=ax,
...                 x_coords=obj.x_coords(audio_len),
...                 x_axis='time', y_axis='chroma',
...                 title='Chroma-CQT Spectrogram')
>>> fig.colorbar(img, ax=ax)
../_images/simpleCqt-1_00.png
../_images/simpleCqt-1_01.png

Methods

cal_time_length(data_length)

Calculate the number of time frames for audio data of a given length.

chroma(m_cqt_data[, chroma_num, data_type, ...])

Calculate the chroma matrix of CQT

cqcc(m_data_arr, cc_num, rectify_type)

Compute the spectral cqcc feature.

cqhc(m_data_arr[, hc_num])

Compute the spectral cqhc feature.

cqt(data_arr)

Get spectrogram data

deconv(m_data_arr)

Compute the spectral deconv feature.

get_fft_length()

Get fft_length

get_fre_band_arr()

Get an array of frequency bands of CQT scales.

set_scale(flag)

Set scale

x_coords(data_length)

Get the X-axis coordinate

y_coords()

Get the Y-axis coordinate of CQT

cal_time_length(data_length)

Calculate the number of time frames for audio data of a given length:

(data_length - fft_length) / slide_length + 1

Parameters
data_length: int

Length of the input audio data, in samples.

Returns
out: int
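
As a worked instance of the formula above, the sketch below evaluates the frame count with integer division. It assumes slide_length = fft_length / 4, as listed among the fixed parameters. The function name frame_count and the fft_length value of 4096 are illustrative only (the actual value is reported by get_fft_length), and how the library treats a trailing partial frame is not covered here.

```python
def frame_count(data_length, fft_length, slide_length=None):
    """Number of full frames: (data_length - fft_length) // slide_length + 1."""
    if slide_length is None:
        slide_length = fft_length // 4   # fixed SimpleCQT hop: fft_length / 4
    if data_length < fft_length:
        return 0                         # assumption: no full frame fits
    return (data_length - fft_length) // slide_length + 1


# One second of audio at 32000 Hz with a hypothetical fft_length of 4096
print(frame_count(32000, 4096))  # 28
```

The result corresponds to the time dimension of the matrices returned by cqt and chroma.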

get_fft_length()

Get fft_length

Returns
out: int

get_fre_band_arr()

Get an array of frequency bands of CQT scales.

Returns
out: np.ndarray [shape=(fre,)]

set_scale(flag)

Set scale

Parameters
flag: int

cqt(data_arr)

Get spectrogram data

Parameters
data_arr: np.ndarray [shape=(n,)]

Input audio data

Returns
out: np.ndarray [shape=(fre, time)]

chroma(m_cqt_data, chroma_num=12, data_type=SpectralDataType.POWER, norm_type=ChromaDataNormalType.MAX)

Calculate the chroma matrix of CQT

Parameters
m_cqt_data: np.ndarray [shape=(fre, time), dtype=complex]

CQT spectrogram matrix, as returned by the cqt method.

chroma_num: int

Number of chroma bins to produce

data_type: SpectralDataType

Data type of CQT spectrogram matrix

See: type.SpectralDataType

norm_type: ChromaDataNormalType

Normalization type of chroma

Returns
out: np.ndarray [shape=(chroma_num, time)]
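
To illustrate what a chroma matrix represents, the sketch below folds CQT bins that share a pitch class into chroma_num rows and scales each frame by its maximum. It is a simplified stand-in, not audioflux's implementation: it assumes 12 bins per octave with bin 0 on C, power input (|z|²), and MAX normalization applied per frame. The name fold_chroma is a hypothetical helper.

```python
def fold_chroma(cqt_matrix, chroma_num=12):
    """Fold a (fre, time) CQT matrix (complex or real) into (chroma_num, time)."""
    n_time = len(cqt_matrix[0])
    chroma = [[0.0] * n_time for _ in range(chroma_num)]
    for k, row in enumerate(cqt_matrix):
        for t, value in enumerate(row):
            chroma[k % chroma_num][t] += abs(value) ** 2  # power per pitch class
    for t in range(n_time):
        peak = max(chroma[c][t] for c in range(chroma_num))
        if peak > 0:
            for c in range(chroma_num):
                chroma[c][t] /= peak                      # MAX normalization per frame
    return chroma


# 24 bins (two octaves), one frame: energy at bins 9 and 21, both pitch class 9
spec = [[0j] for _ in range(24)]
spec[9][0] = 3 + 4j
spec[21][0] = 1 + 0j
chroma = fold_chroma(spec)
print(chroma[9][0])  # 1.0
```

Because both active bins map to the same pitch class, all energy ends up in a single chroma row, which MAX normalization scales to 1.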

cqcc(m_data_arr, cc_num, rectify_type) -> ndarray

Compute the spectral cqcc feature.

Parameters
m_data_arr: np.ndarray [shape=(fre, time)]

CQT spectrogram matrix, as returned by the cqt method.

  • data_type:
    1. mag: np.abs(D)

    2. power: np.abs(D) ** 2

  • If complex-valued data is passed in, it is converted to power by default.

cc_num: int

Number of cc to produce

rectify_type: CepstralRectifyType

Rectify type

Returns
out: np.ndarray [shape=(cc_num, time)]
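
Cepstral coefficients are conventionally obtained by rectifying a spectrum (for example with a logarithm) and applying a discrete cosine transform along the frequency axis. The sketch below shows that idea for a single frame under stated assumptions: log rectification, an unnormalized DCT-II, and complex input converted to power. The helper names are hypothetical, and audioflux's cqcc may differ in its rectification options, scaling, and other details.

```python
import math


def dct_ii(values, num_coeffs):
    """Unnormalized DCT-II, keeping the first num_coeffs coefficients."""
    n = len(values)
    return [
        sum(v * math.cos(math.pi * k * (i + 0.5) / n) for i, v in enumerate(values))
        for k in range(num_coeffs)
    ]


def cepstral_frame(frame, cc_num, floor=1e-10):
    """Log-rectify one frame's power spectrum, then take its DCT."""
    power = [abs(z) ** 2 for z in frame]                  # complex -> power
    log_power = [math.log(max(p, floor)) for p in power]  # floor avoids log(0)
    return dct_ii(log_power, cc_num)


# A flat spectrum puts all of its cepstral energy into coefficient 0
flat = [complex(math.e, 0.0)] * 8
coeffs = cepstral_frame(flat, cc_num=4)
print([round(c, 6) for c in coeffs])
```

Applying cepstral_frame to every column of a (fre, time) matrix would give a (cc_num, time) result, matching the shape documented above.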

cqhc(m_data_arr, hc_num=20) -> ndarray

Compute the spectral cqhc feature.

Parameters
m_data_arr: np.ndarray [shape=(fre, time)]

CQT spectrogram matrix, as returned by the cqt method.

  • data_type:
    1. mag: np.abs(D)

    2. power: np.abs(D) ** 2

  • If complex-valued data is passed in, it is converted to power by default.

hc_num: int

Number of hc to produce

Returns
out: np.ndarray [shape=(hc_num, time)]

deconv(m_data_arr) -> (ndarray, ndarray)

Compute the spectral deconv feature.

Parameters
m_data_arr: np.ndarray [shape=(fre, time)]

CQT spectrogram matrix, as returned by the cqt method.

  • data_type:
    1. mag: np.abs(D)

    2. power: np.abs(D) ** 2

  • If complex-valued data is passed in, it is converted to mag by default.

Returns
m_tone_arr: np.ndarray [shape=(…, time)]

The matrix of tone

m_pitch_arr: np.ndarray [shape=(…, time)]

The matrix of pitch

y_coords()

Get the Y-axis coordinate of CQT

Returns
out: np.ndarray [shape=(fre,)]

x_coords(data_length)

Get the X-axis coordinate

Parameters
data_length: int

Length of the input audio data, in samples.

Returns
out: np.ndarray [shape=(time,)]