"Audio watermarking through parametric signal representations"

Yi-Wen Liu, Ph.D. candidate, Department of Electrical Engineering,

Stanford University

 

Abstract:

Since the technology was invented 160 years ago, the term "watermarking"

has usually referred to the embedding of special symbols in a piece of

paper so as to prove the authenticity of a document. However,

"watermarking" has acquired a broader meaning in recent years. In this era

when everything is going digital, a variety of objects can be marked if they

need to carry secret information. Such objects include audio signals,

images, video clips, computer programs, integrated circuits, and even

synthesized molecules.

In this talk, I will present an audio watermarking scheme based on fine

frequency modulation. In the audio signal to be marked,

salient sinusoids are first parametrized by slowly varying amplitude and

frequency envelopes. Then, whereas most other schemes manipulate the

amplitude envelopes, the proposed scheme modifies frequency envelopes

using a technique called "Quantization Index Modulation" (QIM), so as to

embed binary information. Frequency shifts due to QIM are intended to be

unobjectionable even if noticeable. To this end, the talk will review basic

facts in psychoacoustics and physiological acoustics.
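
As a rough illustration of how binary QIM can act on a single frequency value, the sketch below snaps a frequency onto one of two interleaved lattices, one per bit. It is only a minimal sketch under stated assumptions: the function names, the uniform step in Hz, and the nearest-lattice decoder are hypothetical choices made for illustration, and the scheme presented in the talk may well quantize on a perceptual frequency scale instead.

```python
def qim_embed(freq_hz, bit, step_hz):
    """Move freq_hz to the nearest point of the lattice assigned to `bit`.

    Assumption for illustration: bit 0 uses multiples of step_hz, and
    bit 1 uses the same lattice shifted by half a step.
    """
    offset = bit * step_hz / 2.0
    return step_hz * round((freq_hz - offset) / step_hz) + offset


def qim_decode(freq_hz, step_hz):
    """Return the bit whose lattice has the point closest to freq_hz."""
    distances = [abs(freq_hz - qim_embed(freq_hz, b, step_hz)) for b in (0, 1)]
    return distances.index(min(distances))


# Embed bit 1 into a 440.3 Hz partial, disturb it slightly, then decode.
marked = qim_embed(440.3, 1, step_hz=2.0)       # 441.0 Hz
recovered = qim_decode(marked + 0.3, step_hz=2.0)  # 1
```

With this layout the embedding never moves a frequency by more than half a step, and decoding stays correct as long as the frequency is disturbed by less than a quarter of a step, so the step size trades audibility against robustness.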

The watermark decoder estimates the frequencies of sinusoids based on

spectral interpolation, and then extracts the binary information based on

a maximum-likelihood (ML) method. The ML method involves an optimal

combination of binary opinions, which is also known as a solution to "the

N-weatherperson problem".
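
The two decoder steps can be sketched roughly as follows. This is an illustrative sketch, not the decoder from the talk: `interpolated_peak_hz` assumes a Hann window and a parabola fitted to three log-magnitude DFT bins, and `combine_opinions` assumes independent opinions, equally likely bits, and known error probabilities strictly between 0 and 1. Both function names are hypothetical.

```python
import cmath
import math


def interpolated_peak_hz(samples, sample_rate):
    """Estimate the frequency of the strongest sinusoid in `samples`.

    Assumed method: Hann-windowed DFT, then a parabola fitted to the
    log magnitudes of the largest bin and its two neighbours.
    """
    n = len(samples)
    windowed = [s * (0.5 - 0.5 * math.cos(2 * math.pi * i / n))
                for i, s in enumerate(samples)]
    mags = []
    for k in range(n // 2):  # direct O(n^2) DFT, fine for short frames
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(windowed))
        mags.append(abs(acc))
    k = max(range(1, n // 2 - 1), key=lambda b: mags[b])
    a, b, c = (math.log(mags[j] + 1e-12) for j in (k - 1, k, k + 1))
    shift = 0.5 * (a - c) / (a - 2 * b + c)  # vertex of the parabola, in bins
    return (k + shift) * sample_rate / n


def combine_opinions(bits, error_probs):
    """Fuse independent binary opinions by a log-likelihood-weighted vote.

    Each opinion votes +1 (bit 1) or -1 (bit 0) with weight
    log((1 - p) / p), where p is its assumed error probability.
    """
    score = sum((1 if b else -1) * math.log((1 - p) / p)
                for b, p in zip(bits, error_probs))
    return 1 if score > 0 else 0


# One reliable opinion can outvote two unreliable ones.
decision = combine_opinions([1, 0, 0], [0.05, 0.4, 0.4])  # 1
```

The weighted vote reduces to a simple majority vote when all opinions are equally reliable; the weights matter when some sinusoids are tracked much more accurately than others.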

The watermarking scheme demonstrates robustness against additive colored

noise and survives format conversion, in particular lossy compression due

to perceptual audio coding. This shows the potential of the scheme for

existing applications such as audio content authentication.

Interestingly, the scheme could also be customized to help a computer

segregate sound sources from mono mixtures, which is an easy task for

human listeners but generally hard for machines.

This research is conducted at the Center for Computer Research in Music

and Acoustics (CCRMA) under the supervision of Prof. Julius Smith.

 


