Audio Compression - DCT - MP3???


yiannakop
May 14th, 2004, 07:34 AM
Hi everyone. I'm trying to develop a program in C for data compression, and I want to use it for audio data. Right now I use compression based on the DCT and get the compressed size down to 25-40% of the original; I want it to be less than 15%. I would like to know how MP3 encoders work, meaning the very simple, general principles of an MP3 coder (e.g. what kind of transforms are used).

Any suggestions will be appreciated.

Theodore.

proxima centaur
May 19th, 2004, 12:13 PM
Have you tried looking at the wavelet transform?

yiannakop
May 20th, 2004, 05:38 AM
Thanks for your suggestion. Do you know whether wavelets give a more "compact" energy distribution (I mean that a few coefficients hold almost all of the energy), so that I can achieve high compression ratios? Also, I've heard that wavelets are used in image compression but not in audio, because audio signals have an inherent periodicity that wavelets cannot handle very well.
Anyway, I was thinking of starting to work with psychoacoustic models, but I cannot find any sample code on the net. If anyone has anything, please send it.
Thanks,
Theodore

hspc
May 23rd, 2004, 05:07 PM
I think this link might be useful:
Data compression in multimedia (http://www.cis.udel.edu/~amer/CISC651/)

jham4
December 3rd, 2004, 02:41 AM
Actually, I've been trying to do the same thing using FFTs.

MP3s use 32 separate frequencies, which aren't necessarily evenly spaced. This is because the human ear is more sensitive to frequency shifts in some bands than in others. To eliminate frequencies made redundant by tone masking, a trick known as psycho-attenuation is used.

Try this link for help with psycho-attenuation:

http://is.rice.edu/~welsh/elec431/psychoAcoustic.html

I have a question myself.

Currently I use an FFT of 2048 samples (the original signal is 44 kHz, so I have about 22 Hz resolution). I convert to magnitudes, then perform the reverse transform by placing each magnitude in the sine component (so that each end of the time-domain block approaches zero, avoiding transients between blocks).

This works OK, but distortion is definitely present (particularly on voice). Is there a better transform than the FFT? I keep hearing that DCTs are the way to go, but so far they appear to be worse.

A little help anyone?

James
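
For reference, the "22 Hz" figure in the post above is the FFT bin spacing, i.e. the sample rate divided by the FFT size. The small sketch below assumes the "44k" signal is sampled at the CD rate of 44100 Hz, which the post does not state; the helper names are made up for the example.

```c
#include <assert.h>

/* Spacing between adjacent FFT bins in Hz: sample rate / FFT size. */
double bin_spacing_hz(double sample_rate_hz, unsigned fft_size)
{
    assert(fft_size > 0);
    return sample_rate_hz / (double)fft_size;
}

/* Length of one FFT block in milliseconds: the price paid in time
   resolution for a given frequency resolution. */
double block_duration_ms(double sample_rate_hz, unsigned fft_size)
{
    assert(sample_rate_hz > 0.0);
    return 1000.0 * (double)fft_size / sample_rate_hz;
}
```

With 44100 Hz and 2048 samples this gives a spacing of about 21.5 Hz and a block length of about 46 ms. A larger FFT narrows the bins but lengthens the block, so short events such as consonants in speech get smeared over more time.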

codebot
December 20th, 2004, 12:09 PM
jham4 wrote: Currently I use an FFT of 2048 samples (the original signal is 44 kHz, so I have about 22 Hz resolution). I convert to magnitudes, then perform the reverse transform by placing each magnitude in the sine component (so that each end of the time-domain block approaches zero, avoiding transients between blocks).

Why are you throwing away the phase information?

Usually with compression you would throw away the channels where the energy is small.

You will get distortion if you throw any data away, it is just a question of how much.

Regarding MP3, it works using subband coding. You split the bandwidth into different components and compress each in a way that is appropriate for that part of the signal. You can quantize each channel using a different number of bits, so that you introduce distortion in the bands that you don't care about.

MP3 uses multichannel encoding with 32 bands (not 32 frequencies).

yiannakop
December 20th, 2004, 01:26 PM
Thanks a lot for your reply. As far as I know, phase information does not matter much in audio/speech signals. In images, on the other hand, phase plays a very significant part.