Experiment No. 3


Student Name: Raghvendra Singh UID: 18BEC1080

Branch: ECE Section/Group: 2/B

Semester: 7th Date of Performance: 28/08/21

Subject Name: Audio & Video Processing Subject Code: ECB-402

Aim/Objective of the Practical: To generate an audio signal and visualize its Mel spectrogram.

Apparatus/Amenities required: Python with Jupyter Notebook software

Theory:- A signal is a variation in a certain quantity over time. For audio, the quantity that varies
is air pressure. How do we capture this information digitally? We take samples of the air
pressure over time. The rate at which we sample the data can vary, but is most commonly
44.1 kHz, i.e. 44,100 samples per second. What we have captured is a waveform of the signal,
which can be interpreted, modified, and analyzed with computer software.
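As a small illustration (not part of the original notebook code), sampling can be sketched in Python with NumPy; the 440 Hz tone and 0.5 amplitude are arbitrary choices:

```python
import numpy as np

sr = 44100          # sample rate: 44,100 samples per second
duration = 1.0      # seconds of audio
t = np.arange(int(sr * duration)) / sr   # the sample instants

# A 440 Hz sine wave sampled at 44.1 kHz: this array of sampled
# air-pressure amplitudes is the digital waveform described above.
waveform = 0.5 * np.sin(2 * np.pi * 440 * t)

print(waveform.shape)   # (44100,) -> one second of audio
```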

The Fourier Transform - An audio signal is composed of several single-frequency sound
waves. When taking samples of the signal over time, we only capture the resulting amplitudes.
The Fourier transform is a mathematical formula that lets us decompose a signal into its
individual frequencies and each frequency's amplitude. In other words, it converts the signal
from the time domain into the frequency domain. The result is called a spectrum.
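A minimal NumPy sketch (illustrative, not the original code) of this decomposition: the FFT of a signal built from two tones recovers exactly those two frequencies.

```python
import numpy as np

sr = 8000
t = np.arange(sr) / sr
# A signal built from two single-frequency components
x = 1.0 * np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 1000 * t)

# Amplitude spectrum: the signal in the frequency domain
spectrum = np.abs(np.fft.rfft(x)) / (len(x) / 2)
freqs = np.fft.rfftfreq(len(x), d=1 / sr)

# The two strongest bins recover the original component frequencies
peaks = freqs[np.argsort(spectrum)[-2:]]
print(sorted(peaks.tolist()))   # [440.0, 1000.0]
```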

The Spectrogram - The fast Fourier transform (FFT) is a powerful tool for analyzing the
frequency content of a signal, but what if the frequency content varies over time? Such is the
case with most audio signals, such as music and speech; these are known as non-periodic
signals. We need a way to represent the spectrum of these signals as it varies over time, so we
compute several spectra by performing the FFT on several windowed segments of the signal.
This is called the short-time Fourier transform (STFT): the FFT is computed on overlapping
windowed segments of the signal, and stacking the resulting spectra gives us what is called
the spectrogram.
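The windowing described above can be sketched directly with NumPy (an illustrative implementation, not the original notebook code); the test signal switches frequency halfway through, which a single FFT could not localize in time:

```python
import numpy as np

def stft_magnitude(x, n_fft=512, hop=256):
    """Short-time Fourier transform: an FFT on each
    overlapping, Hann-windowed segment of the signal."""
    window = np.hanning(n_fft)
    frames = [np.abs(np.fft.rfft(x[s:s + n_fft] * window))
              for s in range(0, len(x) - n_fft + 1, hop)]
    # Rows are frequency bins, columns are time frames: the spectrogram
    return np.array(frames).T

sr = 8000
t = np.arange(sr // 2) / sr
# Frequency content that varies over time:
# 400 Hz for the first half second, 1200 Hz for the second.
x = np.concatenate([np.sin(2 * np.pi * 400 * t),
                    np.sin(2 * np.pi * 1200 * t)])

S = stft_magnitude(x)
print(S.shape)   # (257, 30): 257 frequency bins, 30 time frames
```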

The Mel Scale - Studies have shown that humans do not perceive frequencies on a linear scale:
we are better at detecting differences between lower frequencies than between higher ones. For
example, we can easily tell the difference between 500 and 1,000 Hz, but we will hardly be able
to tell the difference between 10,000 and 10,500 Hz, even though the distance between the two
pairs is the same. In 1937, Stevens, Volkmann, and Newman proposed a unit of pitch such that
equal distances in pitch sounded equally distant to the listener. This is called the mel scale. We
perform a mathematical operation on frequencies to convert them to the mel scale.
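That conversion can be sketched with the commonly used formula m = 2595 · log10(1 + f/700) (one of several equivalent forms of the mel scale); the example reproduces the 500/1,000 Hz vs 10,000/10,500 Hz comparison above:

```python
import numpy as np

def hz_to_mel(f):
    # A common form of the Hz-to-mel conversion
    return 2595 * np.log10(1 + f / 700)

# Equal distances in Hz are unequal distances in mels:
print(hz_to_mel(1000) - hz_to_mel(500))     # roughly 392 mels apart
print(hz_to_mel(10500) - hz_to_mel(10000))  # only about 51 mels apart
```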
Code:
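The notebook code itself appears in the figures below; as a self-contained reference, here is a minimal NumPy-only sketch of the same pipeline: generate a tone, compute a windowed-FFT power spectrogram, and project it onto a triangular mel filterbank. The filterbank construction is a common textbook form and not necessarily the exact one used in the experiment.

```python
import numpy as np

def hz_to_mel(f):
    return 2595 * np.log10(1 + f / 700)

def mel_to_hz(m):
    return 700 * (10 ** (m / 2595) - 1)

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular filters whose centers are evenly spaced on the mel scale."""
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):            # rising slope
            fb[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):           # falling slope
            fb[i - 1, k] = (right - k) / max(right - center, 1)
    return fb

sr, n_fft, hop, n_mels = 22050, 1024, 512, 40

# 1. Generate the audio signal: one second of a 440 Hz tone
t = np.arange(sr) / sr
y = 0.5 * np.sin(2 * np.pi * 440 * t)

# 2. Power spectrogram via FFTs of overlapping Hann-windowed segments
window = np.hanning(n_fft)
frames = [np.abs(np.fft.rfft(y[s:s + n_fft] * window)) ** 2
          for s in range(0, len(y) - n_fft + 1, hop)]
power_spec = np.array(frames).T              # (freq bins, time frames)

# 3. Project onto the mel filterbank, then convert to decibels
mel_spec = mel_filterbank(n_mels, n_fft, sr) @ power_spec
mel_db = 10 * np.log10(np.maximum(mel_spec, 1e-10))
print(mel_db.shape)   # (40, 42): 40 mel bands, 42 time frames
```

In the notebook this is done more directly with librosa, whose `librosa.feature.melspectrogram` and `librosa.power_to_db` functions wrap the same steps.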
Observations:

1. We observed how to read and write audio files and work with the mel scale in Python.

2. Using the various inbuilt audio functions helps keep the program or code simple and short.

3. These functions make processing audio files very easy.

4. We practiced the usage of functions in Python.

5. We learned how to generate an audio signal and visualize its Mel spectrogram.

Fig. 1

Fig. 2

Fig. 3

Fig. 4

Result: We learned and implemented various functions to generate an audio signal and visualize
its Mel spectrogram.

Learning outcomes:

1. Learned about various audio processing functions and operations in Python (Jupyter Notebook).

2. Learned how to use these audio functions in our programs.

3. Understood the need for these functions.

4. Wrote and ran programs using inbuilt functions.

Evaluation Grid (To be created as per the SOP and assessment guidelines of the practical):

Sr. No.   Parameters   Marks Obtained   Maximum Marks

1.

2.

3.
