Environmental Noise Prediction and Psychoacoustics on Small to Large-Scale Systems

Engine Noise Signal Separation

Engine sound influenced by human voices.

Original Signals

 Engine sound mixtures (weak influence scenario)

 PCA as pre-processing.

The two most popular variants of independent component analysis (ICA) are Infomax, i.e. Mutual Information Minimisation (MIM), and Non-Gaussian Maximisation (NGM).

Blind source separation (BSS) gave a good estimate of the engine sound signal for fault diagnosis. MIM ICA showed an overall advantage, surpassing NGM ICA in both the time and frequency domains.
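The PCA pre-processing and ICA steps can be illustrated with a minimal NumPy sketch. It is not the code used for the results above: it assumes a symmetric FastICA-style iteration with a tanh nonlinearity as a stand-in for the NGM variant (Infomax/MIM is not shown), and the function names `whiten` and `fastica` are hypothetical.

```python
import numpy as np

def whiten(X):
    """PCA whitening of mixtures X with shape (n_channels, n_samples).

    Assumes the covariance matrix has no (near-)zero eigenvalues.
    """
    Xc = X - X.mean(axis=1, keepdims=True)          # remove the mean of each channel
    cov = Xc @ Xc.T / Xc.shape[1]                   # channel covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)          # PCA of the covariance
    W_pca = (eigvecs / np.sqrt(eigvals)).T          # rows: eigenvector / sqrt(eigenvalue)
    return W_pca @ Xc                               # output has identity covariance

def fastica(Z, n_iter=200, seed=0):
    """Symmetric FastICA-style iteration on whitened data Z (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, n_samples = Z.shape
    W = rng.standard_normal((n, n))
    for _ in range(n_iter):
        Y = W @ Z
        G = np.tanh(Y)                              # contrast nonlinearity g
        Gp = 1.0 - G ** 2                           # its derivative g'
        # Fixed-point update per row: E[z g(w^T z)] - E[g'(w^T z)] w
        W_new = G @ Z.T / n_samples - np.diag(Gp.mean(axis=1)) @ W
        # Symmetric decorrelation: W <- (W W^T)^(-1/2) W, computed via the SVD
        U, _, Vt = np.linalg.svd(W_new)
        W = U @ Vt
    return W @ Z                                    # estimated sources (order and sign arbitrary)
```

Calling `fastica(whiten(X))` on a two-channel recording `X` returns two estimated sources. As with any ICA method, their order, sign and scale are arbitrary, so the engine component has to be identified afterwards, for example by listening or by inspecting its spectrum.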

 Example: Spectrogram plot and pYIN using Audacity

 Example: Waveform, peak detector and peak/spectrogram plot using Sonic Visualiser

 Hard Disk Drive (HDD) Noise Analysis

The screeching sound exhibits a higher magnitude, seen as the lighter colour in the STFT spectrogram.

The energy is higher (louder) at high frequencies than in the steady state at low frequencies.

Analysis performed at t = 1 s with a Blackman window, window size M = 1024 and FFT size N = 1024. The sound was resampled to 44100 Hz mono. No harmonics detected.
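The single-frame analysis described above can be sketched in NumPy as follows. The parameter values come from the text; the function name `frame_spectrum`, the choice to centre the frame on t, and the window normalisation are assumptions, not the settings of the tool used for the plots.

```python
import numpy as np

def frame_spectrum(x, fs=44100, t=1.0, M=1024, N=1024):
    """Magnitude spectrum (dB) of one Blackman-windowed frame centred at time t."""
    start = int(round(t * fs)) - M // 2             # assumption: frame is centred on t
    frame = x[start:start + M]
    w = np.blackman(M)
    w = w / w.sum()                                 # normalise so a unit sinusoid peaks near -6 dB
    X = np.fft.rfft(frame * w, n=N)                 # zero-pads automatically if N > M
    mag_db = 20 * np.log10(np.abs(X) + 1e-12)       # small offset avoids log(0)
    freqs = np.fft.rfftfreq(N, d=1.0 / fs)          # bin centre frequencies in Hz
    return freqs, mag_db
```

With M = N = 1024 at 44100 Hz the bins are about 43 Hz apart, so a component near 590 Hz can only be located to within roughly one bin unless the peak is interpolated.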

The fundamental frequency is approximately 590 Hz.

The phase is maintained until the frequency exceeds 900 Hz and 8700 Hz.

The energy is high at low frequencies, except at t = 0.06 s, 1.35 s, 2.54 s and 4.27 s.

Generally, the energy at high frequencies decays over time. Increasing the FFT size (and with it the window size) would improve the frequency resolution, at the cost of time resolution.
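The trade-off can be made concrete with a little arithmetic. The sketch below assumes the 44100 Hz sampling rate quoted earlier; the helper names are hypothetical. Note that enlarging only the FFT size (zero-padding) gives finer bin spacing, whereas separating closely spaced components also needs a longer window.

```python
def bin_spacing_hz(fs, n_fft):
    """Spacing between adjacent FFT bins in Hz."""
    return fs / n_fft

def window_duration_ms(fs, window_size):
    """Duration of the analysis window in milliseconds."""
    return 1000.0 * window_size / fs

fs = 44100
for n in (1024, 2048, 4096):
    # Bin spacing: 43.07, 21.53 and 10.77 Hz respectively
    print(f"N={n}: {bin_spacing_hz(fs, n):.2f} Hz per bin, "
          f"window of {n} samples lasts {window_duration_ms(fs, n):.2f} ms")
```

Doubling the window from 1024 to 2048 samples halves the bin spacing but also doubles the time span each frame averages over, which blurs short events such as the load/unload transients.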

The time resolution is quite good, but the frequency resolution is poor.

The phase spectrogram shows little repeatable structure. The magnitude increases in some areas but also shows little repeatable structure. The signal is quite noisy and random in phase, i.e. largely stochastic.

No harmonics are present, and the phase spectrogram (derivative) shows phase disruptions/discontinuities.

Sinusoidal Detection

Frequencies of the detected sinusoidal tracks. The tracks are quite unstable, i.e. discontinuous, and are not very clear because they are lost during transitions/gaps.
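The first step of sinusoidal detection, picking spectral peaks in each frame, can be sketched as below. This is a generic peak picker with parabolic interpolation, not the tracker used for the plot; the function name `detect_peaks` and the -60 dB threshold are assumptions.

```python
import numpy as np

def detect_peaks(mag_db, fs, n_fft, threshold_db=-60.0):
    """Return interpolated (frequencies in Hz, magnitudes in dB) of spectral peaks."""
    m = np.asarray(mag_db, dtype=float)
    # Local maxima above the threshold (first and last bins are excluded)
    is_peak = (m[1:-1] > threshold_db) & (m[1:-1] > m[:-2]) & (m[1:-1] >= m[2:])
    idx = np.where(is_peak)[0] + 1
    a, b, c = m[idx - 1], m[idx], m[idx + 1]
    # Parabolic interpolation through the peak bin and its two neighbours
    offset = 0.5 * (a - c) / (a - 2 * b + c)        # fractional bin offset in [-0.5, 0.5]
    freqs = (idx + offset) * fs / n_fft
    mags = b - 0.25 * (a - c) * offset
    return freqs, mags
```

A tracker then links peaks from frame to frame. When a peak drops below the threshold for a few frames, as during the transitions mentioned above, the track is broken, which is one way the discontinuities arise.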

Pitch is the quality that allows us to classify a sound as relatively high or low. Pitch is determined by the frequency of sound wave vibrations.

Estimate the pitch salience over time. See the frequency values at the bottom of the spectrogram.

Pitch contours generated by the salience function: candidates for the fundamental frequency.

 Final set of pitch contours from which the melody is obtained.
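To illustrate what a pitch salience function computes, here is a much simplified harmonic-summation sketch. It is not the algorithm of the tool that produced the plots (which the text does not name): the function name `harmonic_salience`, the number of harmonics and the weighting factor `alpha` are all assumptions.

```python
import numpy as np

def harmonic_salience(mag, fs, n_fft, f0_grid, n_harmonics=8, alpha=0.8):
    """Salience of each candidate f0: weighted sum of magnitudes at its harmonics."""
    mag = np.asarray(mag, dtype=float)              # linear magnitude spectrum (rfft bins)
    f0_grid = np.asarray(f0_grid, dtype=float)
    salience = np.zeros(len(f0_grid))
    for h in range(1, n_harmonics + 1):
        # Nearest FFT bin to the h-th harmonic of every candidate
        bins = np.round(h * f0_grid * n_fft / fs).astype(int)
        valid = bins < len(mag)                     # ignore harmonics above Nyquist
        # Higher harmonics contribute less: weight alpha**(h - 1)
        salience[valid] += alpha ** (h - 1) * mag[bins[valid]]
    return salience
```

Evaluating this per frame gives a time-frequency salience map. Peaks that persist over consecutive frames form the pitch contours, from which a final melody line is then selected. For a stochastic signal such as the HDD noise, no candidate stands out clearly.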


This is a stochastic approximation. The frequency and time resolution are good, and the synthesised sound is quite close to the original.
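A stochastic approximation of one frame can be sketched as follows: keep only a coarse spectral envelope and resynthesise it with random phases. This is an illustrative sketch, not the tool used here; the function name `stochastic_frame`, the Hann window, the block-averaged envelope and the parameter `stocf` (fraction of spectral points kept) are assumptions.

```python
import numpy as np

def stochastic_frame(frame, stocf=0.2, seed=0):
    """Approximate a frame by a coarse dB envelope plus random phases."""
    M = len(frame)
    X = np.fft.rfft(frame * np.hanning(M))
    mag_db = 20 * np.log10(np.abs(X) + 1e-12)
    n_bins = len(mag_db)
    n_env = max(2, int(n_bins * stocf))             # number of envelope points kept
    # Coarse envelope: average the dB magnitude over contiguous blocks of bins
    blocks = np.array_split(mag_db, n_env)
    env = np.array([b.mean() for b in blocks])
    lengths = np.array([len(b) for b in blocks])
    centres = np.cumsum(lengths) - (lengths + 1) / 2.0   # centre bin of each block
    # Interpolate the envelope back to full resolution
    env_full = np.interp(np.arange(n_bins), centres, env)
    # Discard the measured phase and draw a random one instead
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0.0, 2.0 * np.pi, n_bins)
    Y = 10 ** (env_full / 20) * np.exp(1j * phase)
    return np.fft.irfft(Y, n=M), env
```

Because only the envelope is stored, the model suits noise-like sounds such as this one, where the exact phase carries little perceptual information.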

Harmonic and Residual

No harmonics were found, so the background spectrogram is all residual. Hence the harmonic plus residual model gives a poor estimate.

Harmonic with stochastic

No harmonics. The background sound is stochastic.

Sound Transformation - Morphing with STFT

HDD Load/Unload STFT

 Male Speech STFT

Morphing with STFT, using a smoothing factor of 0.5 (where 1 is no smoothing) and a balance factor of 0.6 (0 gives HDD Load/Unload, 1 gives Male Speech).
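How the two factors could act on a single pair of STFT frames is sketched below. The meaning of the factors follows the text; everything else is an assumption, in particular that only the second sound (speech) is smoothed, that magnitudes are mixed in dB, and that the phase of the first sound (HDD) is kept. The function name `morph_frame` is hypothetical.

```python
import numpy as np

def morph_frame(mag1_db, phase1, mag2_db, smoothf=0.5, balancef=0.6):
    """Morph one frame: sound 1 (e.g. HDD) towards sound 2 (e.g. speech)."""
    mag1_db = np.asarray(mag1_db, dtype=float)
    mag2_db = np.asarray(mag2_db, dtype=float)
    n = len(mag2_db)
    bins = np.arange(n)
    # Smooth spectrum 2 by sampling it at fewer points, then interpolating back.
    # smoothf = 1 keeps every bin, i.e. no smoothing.
    n_smooth = max(2, int(n * smoothf))
    coarse_pos = np.linspace(0, n - 1, n_smooth)
    coarse = np.interp(coarse_pos, bins, mag2_db)
    mag2_smooth = np.interp(bins, coarse_pos, coarse)
    # balancef = 0 returns sound 1 unchanged, balancef = 1 returns (smoothed) sound 2
    mag_db = balancef * mag2_smooth + (1.0 - balancef) * mag1_db
    return mag_db, phase1                           # assumption: phase of sound 1 is kept
```

With a balance factor of 0.6 the result leans slightly towards the speech spectrum while retaining the timing and phase of the HDD sound. Applying this to every frame and inverting the STFT gives the morphed signal.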
