Digital Watermarking at a Glance

Prepared by Antonius Cahya Prihandoko

A digital watermark is generally a code, embedded physically or non-physically inside host content, that acts as a digital signature providing information about the ownership or authenticity of the content. In a typical content screening system, the client’s equipment searches for hidden information inside the content in order to verify whether a valid license is present. The embedded watermark can also later be extracted or detected to make an assertion about the content for security purposes.

Digital watermarking can be done in two domains: the spatial domain and the frequency domain. In the spatial domain, the watermarking process is relatively easy: the watermark signal is embedded directly into the host content signal. However, almost all spatial techniques share the same weaknesses: they alter the host content during the embedding process, and they have the lowest bit capacity and the lowest resistance to signal compression. Frequency domain watermarking is more robust and is compatible with common signal compression standards. It is undertaken by first applying a transform, such as the discrete Fourier transform (DFT, commonly computed via the fast Fourier transform, FFT), the discrete cosine transform (DCT), or the discrete wavelet transform (DWT). The watermarking process is then carried out by altering frequency values in the host content’s spectrum. Though they differ in robustness, spatial and frequency domain watermarking produce the same visual appearance.
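A common spatial-domain technique is least-significant-bit (LSB) embedding. The sketch below is illustrative only (the pixel values and watermark bits are made up): it hides one watermark bit in the LSB of each pixel value, which changes each pixel by at most 1 — imperceptible to a viewer, but easily destroyed by compression, which is exactly the fragility described above.

```python
# Toy spatial-domain watermarking via LSB embedding.
# Pixel values and the watermark bit string are illustrative.

def embed_lsb(pixels, watermark_bits):
    """Embed one watermark bit into the LSB of each pixel value."""
    out = list(pixels)
    for i, bit in enumerate(watermark_bits):
        out[i] = (out[i] & ~1) | bit   # clear the LSB, then set it to the bit
    return out

def extract_lsb(pixels, length):
    """Recover the watermark by reading back the LSBs."""
    return [p & 1 for p in pixels[:length]]

pixels = [200, 35, 118, 64, 91, 250, 13, 77]   # toy 8-pixel "image"
mark = [1, 0, 1, 1, 0, 1, 0, 0]
stego = embed_lsb(pixels, mark)
assert extract_lsb(stego, len(mark)) == mark   # watermark survives extraction
```

Note that every pixel changes by at most one intensity level, which illustrates both the low perceptual impact and the low robustness of spatial techniques.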

Based on the watermark verification mechanism, watermarking schemes can be classified into three categories. First, a non-blind scheme, or private watermarking, requires both the original host content and the secret key(s) for extracting or detecting the watermark. Second, a semi-blind scheme, or semi-private watermarking, requires the secret key(s) and the watermark bit sequence. Finally, a blind scheme, or public watermarking, remains the most challenging problem, as it needs only the secret key in its watermark detection stage.

In electronic commerce applications, digital content can be categorized into four basic types of media data: audio, video, text, and image. Each type requires a specific watermark according to watermark design patterns (WDPs). A watermark for audio data has to be inaudible and robust. A watermark for video data has to be invisible and robust. A watermark for text data has to be invisible and fragile. A watermark for image data can be either visible and fragile, or invisible and robust. These watermark specifications are provided to meet customer satisfaction and content security.

In terms of rights management, digital watermarks are applied to accomplish copyright protection. Initially, a watermark can be used to prove the ownership of watermarked content. The capability of watermarks has since been extended to content encryption to provide secure content delivery. Watermarking applications enable content providers to protect their intellectual property by integrating essential information about the content in a hidden mark. For this purpose, the watermark needs to be secure and robust: hard to remove, and resistant to non-malicious distortions.


Cryptography at a Glance


Cryptography  is the practice and study of techniques for secure communication in the presence of third parties (called adversaries). More generally, it is about constructing and analyzing protocols that overcome the influence of adversaries and which are related to various aspects in information security such as data confidentiality, data integrity, authentication, and non-repudiation. Modern cryptography intersects the disciplines of mathematics, computer science, and electrical engineering. Applications of cryptography include ATM cards, computer passwords, and electronic commerce.

Cryptography prior to the modern age was effectively synonymous with encryption, the conversion of information from a readable state to apparent nonsense. The originator of an encrypted message shared the decoding technique needed to recover the original information only with intended recipients, thereby precluding unwanted persons from doing the same. Since World War I and the advent of the computer, the methods used to carry out cryptography have become increasingly complex and its applications more widespread.

Modern cryptography is heavily based on mathematical theory and computer science practice; cryptographic algorithms are designed around computational hardness assumptions, making such algorithms hard to break in practice by any adversary. It is theoretically possible to break such a system but it is infeasible to do so by any known practical means. These schemes are therefore termed computationally secure; theoretical advances (e.g., improvements in integer factorization algorithms) and faster computing technology require these solutions to be continually adapted. There exist information-theoretically secure schemes that provably cannot be broken even with unlimited computing power—an example is the one-time pad—but these schemes are more difficult to implement than the best theoretically breakable but computationally secure mechanisms.

In typical Digital Rights Management (DRM) implementations, cryptography is a part of the security solution that employs a known strong algorithm and relies on the secrecy of the cryptographic key.  Cryptographic techniques are employed for securing content distribution using both encryption and signature mechanisms. Encryption is the mechanism of transforming content using a secret key so that the encrypted content cannot be accessed or used unless it is decrypted back into the accessible form. Signature mechanism is a process to attach associated data to a content for its authentication purposes.

Encryption Algorithms

Encryption algorithms are categorized into two types: symmetric and asymmetric. A symmetric algorithm (a.k.a. private key cryptography) uses a single secret key for both the encryption and decryption processes. For example, suppose A wants to deliver content, x, to B. A first encrypts x using a secret key k,

y=e_{k}(x),

and sends the encrypted content, y, to B. On receiving y, B has to decrypt it using the same key k to access x. The secret key k must be known only to A and B. Well-known examples of symmetric algorithms are the Data Encryption Standard (DES) and the Advanced Encryption Standard (AES). DES relies on 16 rounds of substitutions and permutations, while AES is based on byte substitution, permutation, and XOR operations, which makes the algorithm easy to implement in hardware. One big issue with symmetric algorithms is how A can send the secret key k to B in a safe way. Once an unauthorized party intercepts the key, everything collapses.
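The shared-key symmetry y = e_k(x) and x = d_k(y) can be sketched as below. XOR with a repeating key is used purely to make the symmetry visible — it is not a secure cipher, and a real system would use AES; the key and message values are made up for illustration.

```python
# Toy symmetric cipher: the same key k both encrypts and decrypts.
# XOR is its own inverse, so e_k and d_k are the same function here.

def xor_cipher(data: bytes, key: bytes) -> bytes:
    # XOR each byte with the key, repeating the key as needed.
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret-key"                # k, known only to A and B
x = b"content from A to B"
y = xor_cipher(x, key)             # A encrypts: y = e_k(x)
assert xor_cipher(y, key) == x     # B decrypts with the same key k
```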

An asymmetric algorithm (a.k.a. public key cryptography) can overcome the symmetric algorithm’s key exchange problem. Such an algorithm employs two different keys for encrypting and decrypting content, named the public and private key, respectively. The public key can be freely distributed or placed in an accessible directory, while the private one has to be kept secret. When A wants to send content, x, to B, for example, A encrypts x using B‘s public key, K_{B}^{Pu},

y=K_{B}^{Pu}(x).

Thus, B is the only party who is able to decrypt y, using his own private key, K_{B}^{Pr}. The private and public keys are mathematically linked; however, in real applications they are numbers so large that it is computationally hard to derive the private key from the public key. The security of public key cryptography mostly relies on hard number-theoretic problems: integer factorization for the RSA and Rabin algorithms, and the discrete logarithm problem in a finite field for ElGamal.
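The RSA mechanics can be shown with deliberately tiny primes (these numbers are far too small to be secure — real keys use moduli of roughly 2048 bits — and are chosen only to keep the arithmetic visible):

```python
# Minimal RSA sketch with toy primes, for illustration only.

p, q = 61, 53
n = p * q                  # modulus n = 3233, part of both keys
phi = (p - 1) * (q - 1)
e = 17                     # public exponent: B's public key is (n, e)
d = pow(e, -1, phi)        # private exponent: B's private key

x = 42                     # message encoded as an integer < n
y = pow(x, e, n)           # A encrypts with B's public key: y = K_B^Pu(x)
assert pow(y, d, n) == x   # only B's private key K_B^Pr recovers x
```

Deriving d from (n, e) requires factoring n; with 61 × 53 that is trivial, but with primes of hundreds of digits it is infeasible, which is exactly the hardness assumption described above.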

Although asymmetric algorithms solve the issue of communicating the secret key, this does not mean they are better than symmetric ones. In applications that do not involve more than one party, such as storing content in a database or protecting data inside the content, a symmetric algorithm is more efficient than an asymmetric one in terms of complexity. In addition, since asymmetric algorithms are significantly slower than symmetric algorithms (for example, hardware and software implementations of RSA are about 1000 and 100 times slower than DES, respectively), they are less appropriate than symmetric algorithms for encrypting and decrypting large amounts of data in a short time. Thus, it is suitable to combine them in a hybrid system, whereby a symmetric algorithm, such as DES or AES, is used to encrypt and decrypt messages, and an asymmetric algorithm is used to communicate the key. Furthermore, the security of an asymmetric algorithm also depends on the validity of a user’s public key. Recalling the previous example, how can A be sure that K_{B}^{Pu} is indeed B‘s public key? It is possible that a third party has impersonated B and placed its own key in an accessible directory as B‘s public key. Therefore, the distributed public key has to be certified.

Key Validation

Public key infrastructure (PKI) was initiated to overcome the issue of public key validity in asymmetric algorithms by supporting the management of keys and certificates. A PKI framework consists of policies and procedures for generating and publishing keys and certificates. Besides the policies and procedures, the essential components of a PKI include the certificate authority (CA), the registration authority (RA), certificate storage, and software. The CA is the trusted entity that generates the certificates, ensuring that the correct key and identity are tied to each certificate. The RA is responsible for verifying the legitimacy of the users who receive certificates. The functionality of the CA and RA can be handled by a single entity. The certificates, as well as their updated information, are stored in the CA’s database.

Typically, a digital certificate consists of the certificate holder’s public key, information about the holder and the CA, the CA’s signature, the issuing and expiry dates, and the certificate serial number. The keys can either be generated by the CA for the client, or be generated by the client and certified by the CA. The signature is generated from the CA’s secret authentication key and the parameters held within the CA’s own authentication certificate. Additionally, in a hierarchy of CAs, a CA is certified by a higher-level CA, while the root CA generates and signs its own authentication certificate.

Digital Signature

A digital signature scheme consists of two algorithms, a signing and a verifying algorithm, and employs two different keys for these algorithms. However, unlike the encryption mechanism, a signature scheme uses the private key for signing a message and the public key for verifying the signature. For example, before sending a message, x, to B, A signs it using a secret key. Subsequently, B can verify the signature using A‘s public key.

A formal definition of a digital signature scheme is as follows.

A digital signature scheme includes five sets \mathcal{P}, \mathcal{A}, \mathcal{K}, \mathcal{S}, and \mathcal{V}, which are finite sets of possible messages, possible signatures, possible keys, signing algorithms, and verification algorithms, respectively. For each K \in \mathcal{K}, there is a signing algorithm sig_{K} \in \mathcal{S} and a corresponding verification algorithm ver_{K} \in \mathcal{V}. Each sig_{K}: \mathcal{P} \rightarrow \mathcal{A} and ver_{K}: \mathcal{P} \times \mathcal{A} \rightarrow \{true, false\} are functions such that, for every message x \in \mathcal{P} and every signature y \in \mathcal{A}, ver_{K}(x,y) = true if and only if y = sig_{K}(x).

The sig_{K} is secret while ver_{K} is public; however, both of them should be polynomial-time functions, and it should be computationally infeasible for an unauthorized person, say C, to counterfeit A‘s signature on a message x. Since the message x belongs to A, A must be the only person able to compute the signature y = sig_{K}(x) such that ver_{K}(x,y) = true. Otherwise, the signature can be forged.

A digital signature scheme cannot be unconditionally secure due to the finiteness of its set of signatures, \mathcal{A}. For A‘s message, x, C can try all possible signatures using the public algorithm ver_{K} until finding the right one. Thus, given adequate time, C can always counterfeit A‘s signature. Therefore, it is desirable to have a signature scheme that is computationally secure.

In practice, the signing and verifying algorithms involve multiplication- or exponentiation-based computation. As a result, the signature is at least as long as the message, which is inefficient when the message is long. To overcome this problem, a hash function is applied. Given a message, x, of arbitrary length as input, a hash function, h, produces a message digest, h(x), of fixed length. A hash function has the following characteristics:

  • h is a one-way function: given a message x, it is easy to compute h(x), but hard to derive x from h(x).
  • Given a message x, it is hard to find another message x’ such that h(x) = h(x’).
  • h needs to be collision-resistant: it is hard to find two distinct messages, x and x’, such that h(x) = h(x’).
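The fixed-length property is easy to demonstrate with SHA-256 from Python's standard hashlib module: inputs of any length map to a 256-bit (32-byte) digest.

```python
# SHA-256 maps arbitrary-length input to a fixed 32-byte digest.
import hashlib

short = hashlib.sha256(b"x").hexdigest()
long_ = hashlib.sha256(b"a much longer message " * 1000).hexdigest()
assert len(short) == len(long_) == 64   # 64 hex characters = 32 bytes
assert short != long_                   # different inputs, different digests
```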

To create a digital signature, y, for a message, x, the signer, A, first computes the hash of the message, h(x), and then encrypts it using a private key to obtain the signature. Thus,

y=K_{A}^{Pr}(h(x)).

The pair (x,y) is then transmitted. At the destination, the receiver, B, uses the same hash function to create a hash of the received message, h(x), then decrypts y using A‘s public key, and accepts y as a valid signature if and only if

h(x) = K_{A}^{Pu}(y).
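The whole hash-then-sign flow can be sketched by reusing the toy RSA parameters from before. The primes are insecure illustrative values, and the SHA-256 digest is reduced modulo n only so it fits the tiny modulus; a real scheme uses a full-size modulus and a standardized padding of the digest.

```python
# Toy hash-then-sign: sign h(x) instead of x, so the signature stays
# short regardless of message length. RSA values are insecure toys.
import hashlib

p, q = 61, 53
n, phi = p * q, (p - 1) * (q - 1)
e = 17                     # A's public key (n, e)
d = pow(e, -1, phi)        # A's private key

def digest(x: bytes) -> int:
    # Reduce the SHA-256 digest mod n so it fits the toy modulus.
    return int.from_bytes(hashlib.sha256(x).digest(), "big") % n

x = b"a message of arbitrary length"
y = pow(digest(x), d, n)               # A signs: y = K_A^Pr(h(x))
assert pow(y, e, n) == digest(x)       # B verifies: h(x) = K_A^Pu(y)
```

Note the key roles are reversed relative to encryption: the private key signs, and anyone holding the public key can verify.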

Last updated: Townsville, March 13, 2014


Greeting

This blog is dedicated to publishing my small works in Mathematics and Cryptography, including lecture notes, slide presentations, and published and unpublished papers.
