ENCRY Aug 1 2022 at 11:06

Systematic coding and digital signature

10 min

2.7K

Tutorial

Once the Teacher asked the Author:

Are there methods of redundancy introducing at an informational level, other than those that are studied by the theory of error-correcting codes? Emphasizing that he is talking about information redundancy, the Teacher thus made it clear that the question does not imply various ways of energy redundancy introducing, which are well studied in communication theory. After all, the noise immunity of information transmission is traditionally assessed by means of a threshold value that is calculated as the ratio of signal energy to noise energy. It is known that the methods of the theory of error-correcting codes offer an alternative solution, allowing energy saving.

After a cogitative pause, the Author answered in the affirmative, following intuition rather than rational knowledge. Upon hearing the answer, the Teacher noticed that this is a wrong conclusion and there are no such methods.

However, over time, the Author began to suspect that the immutability of the paradigm formulated above could be questioned. First of all, because it is reasonable to consider digital signature (DS) as some constructive way of redundancy introducing, which, however, due to specific functional properties, cannot be attributed to any known class of objects in the theory of error-correcting codes. In short, any code is similar to the DS in terms of error detection, but it does not have other unique properties.

It should be noted that we are not talking about the issues of the so-called "code cryptography", whose purpose is the usage of methods of the error-correcting codes theory for constructing cryptographic primitives. In fairness, we note that the mentioned concept attracts the attention of specialists due to its potential possibility of developing cryptographic primitives that are resistant to attacks using a quantum computer. In addition, even if we address DS schemes based on methods of the error-correcting codes theory, and such are known, for example, the scheme of G. A. Kabatyansky, E. A. Kruk and B. J. M. Smits (Kabatianskii G., Krouk E., Smeets B.J.M. A Digital Signature Scheme Based on Random Error-Correcting Codes // Lect. Notes in Comp. Sci. Springer-Verlag. 1997. Vol. 1355. Pp.161-167), then the reasoning presented below acquires additional meaning.

This note should be regarded as an attempt to explain the functional similarities and differences between the DS mechanism and methods of the error-correcting codes theory, in particular systematic coding.

Taking into account that the theory of error-correcting codes is known for its rigorous discourse that makes it difficult to reveal key ideas to neophytes, we decided not to focus on algebraic notations but concentrate attention on an interpretative manner of presentation without excessive formalism. It’s up to readers to decide whether we have succeeded in our pursuits.

At first glance, if we confine ourselves to practical application, despite the fact that there is also a purely theoretical aspect, which we will not touch upon, error-correcting codes offer a number of solutions for compensating the distortion of useful information during transmission over noisy channels. Basically, it's about finding and fixing errors. Let's briefly explain.

Errors occur when transmitting information over a noisy communication channel. The nature of these errors vary, but one thing is important: what is put in the channel may differ from what is received in output. The channel model is essential. Many different models are known, for example, a channel with erasures, a channel with insertions and fallouts of symbols, and so on, but we will rely on the most famous and simple one – the binary memoryless symmetric channel (BSC) model. From its name it follows that we are dealing with the binary alphabet and arithmetic modulo two. In this model, a binary 1 at the input of the channel can go to a binary 1 at the output with some probability P and go to a binary 0 with a probability of 1-P. The same mechanism applies to binary 0. That’s why such a model is called symmetrical. There is a generalization of this model to q-ary alphabets, where q is some power of a prime. The channel is memoryless, that is, all previous transitions do not have any effect on subsequent ones. Error detection and correction is carried out in the context of the BSC model.

In particular, if the position where the error occurred is known, then in the case of BSC, it is sufficient for correction to evaluate modulo two addition of the symbol at this position with a binary 1. Indeed, if 0 distorts to 1, then 1+1=0, and if 1 distorts to 0, then 0+1=1. In the case of q-ary alphabets, in order to correct the error, one must know not only the position, but also the value of the error.

If approached constructively, some information redundancy should be introduced at the input of the channel, so that then it could be used at the output to detect and correct errors. In essence, the theory of error-correcting codes studies various ways to introduce redundancy and the quantitative estimates associated with it. We will not dig into details discussing the concepts of the minimum code distance, code rate and other important parameters. All necessary information about them can be gleaned from textbooks, monographs, and publications of varying degrees of complexity.

We’ll apply the term ‘coding’ to the method of redundancy introduction, and ‘decoding’ to the detection and correction of errors using the previously introduced redundancy. A ‘codeword’ is a result of coding. It is transmitted over the channel with noise. In fact, a codeword is information symbols, to which, in accordance with a certain rule, additional check symbols are added. The decoding allows to determine the positions where errors occurred, and for the q-ary alphabet, the value of the error in the corresponding position.

For further reasoning, we need to define the concept of systematic coding. In the standard notation, the coding procedure is reduced to such a linear algebra operation as the multiplication of a vector by a matrix. Moreover, the matrix, known as a generating one, can be arranged in such a way that the original information symbols in the resulting codeword will be separated from the check symbols in such a way that they can be easily distinguished. There are other encoding methods based on other algebraic transformations that differ from vector-matrix multiplication, but they do not provide systematic encoding.

Linearity is another important property of the code. If there are two different codewords belonging to a particular code, then linearity means that the bitwise addition of these codewords makes it possible to obtain another codeword of the same code. For simplicity, we will assume that the code is binary and the addition is performed modulo two. In other words, this is a closed space of codewords with respect to a certain linear operator. For example, the operator of addition. There are also non-linear codes without such a property.

The simplest methods allow only to detect errors without correction. Sometimes this is enough. In this case, the decoding leads to creation of a sign that indicates the presence or absence of errors. However, in this case, there is no information about positions with erroneous symbols. A decoding failure is also possible. It occurs when the number of errors is greater than the error-correcting capability of the code.

The information presented above is sufficient for comparing DS schemes and methods of systematic coding with error detection.

To begin with, it is necessary to describe the purpose and rules for DS usage. The one who creates a DS will be called a witness.

1. The DS is generated by the witness based on the given message and the secret key known only by the witness.

2. The DS and the message are logically connected and can be transmitted over unsecured communication channels or stored in a publicly accessible long-term memory both together and separately.

3. The verification is based on the DS for some message, which does not necessarily match the original message, as well as on the current public key of the witness, the authenticity and integrity of which is confirmed by a valid certificate.

4. The DS check results in the value of a Boolean variable, which takes a true if the signature is valid (the message corresponds to the one for which the DS was previously generated), otherwise the Boolean variable takes a false.

5. The private and public keys constitute a unique pair. Moreover, the secret key is known only to the witness, while the public key is delivered upon request, or is located in the public long-term memory and can be stored either as part of the certificate or separately.

6. The public key certificate has a limited validity period. After the expiration of this period, the key pair is withdrawn. As a result, the DS generated by means of the secret key from this pair does not guarantee the authenticity and integrity of the message signed.

7. Witnesses are classified as honest and dishonest ones. The honest witness follows the established rules for signing, while the dishonest one is not limited by anything and acts like an attacker.

8. The verifier always follows the prescribed rules of signature verification.

9. Anyone who has a pair of keys is able to generate DS for some message, the validity of which can be verified by anyone who has access to the current public key, the authenticity and integrity of which is confirmed by a valid certificate or other reliable method.

10. An attacker seeks to change/replace a message with an unknown secret key so that the fact of falsification could not be detected during signature verification: the witness initially generates DS for a given message, and an attacker changes/replaces DS and/or a given message with alternative ones, such that, when checking the validity, the Boolean variable takes on a true.

11. Apart from falsification, forgery is also possible: an attacker generates a DS for a given message with an unknown secret key, such that when checking for validity using the public key of the witness on whose behalf the attacker generated this DS, the Boolean variable takes on a true.

12. For the purpose of forgery and/or falsification, an attacker can operate with a single signature for a particular message or signatures for dissimilar messages, as well as many different signatures for a fixed message.

13. An attacker is able to impose on the witness many different messages and force him to generate DS for them. It should be considered a common practice to prove the security level of the DS scheme in the focus of the model of existential falsification/forgery in a message brute force attack. The peculiarity of the model is that the message is changed/chosen arbitrarily, without saving the semantic content. The security level is interpreted as the inability of an attacker, with an unknown secret key, having polynomially limited resources, both computing and memory, to create DS for a certain message using a set of arbitrary signed messages , such that for a message that does not belong to the mentioned set, when checking for validity, a Boolean variable takes on a true.

Let us formulate provisions that allow us to understand the existing similarities and differences.

1. DS schemes and codes that detect/correct errors use a single mathematical apparatus – arithmetic operations are performed in a finite field or its extension within the boundaries of such algebraic structures as a multiplicative group of a finite field or a group of points of an elliptic curve, if algebraic geometry codes and DS schemes on elliptic curves.

2. The redundancy, which in the DS scheme is involved in the procedure for checking the signature for validity, is separated from the information symbols.

3. The systematic code and DS scheme make it possible to detect errors.

4. For a systematic code, the error probability is determined by the nature of the communication channel, while in the case of DS, the error is also man-made.

5. The detecting ability of a systematic code depends on the number of information and check symbols. These parameters are chosen in advance, taking into account the probability of errors in the channel.

6. If there is a limit on the number of information symbols in DS schemes (depends on the hash function used), then this limit does not affect practical application, but the number of redundant symbols is fixed.

7. The probability of DS falsification depends on the probability of a hash function collision. Collisions always exist and can be found, however, the asymptotic estimate of the complexity (number of trials) of the universal method for collisions finding for an arbitrary hash function does not exceed the square root of the cardinality of the value space. As a rule, this is not less than 2²⁵⁶.

8. The decoding error probability, when a sign indicating the absence of errors, despite the fact that there were errors which led to the transformation of one codeword into another, can be calculated in advance, taking into account the error probability in the channel and code parameters.

9. A systematic code within the error-detection capability guarantees a service such as integrity, but does not guarantee the authenticity.

10. The DS scheme guarantees both authenticity and integrity.

11. The linearity property for DS schemes is not fulfilled. If such a property was fulfilled, then the DS would be vulnerable in terms of falsification/forgery.

12. As a rule, systematic block codes have a fixed block length (in alphabetical symbols). The parameters chosen with reference to the estimate of the probability of erroneous decoding, namely the number of information and check symbols, are also fixed.

13. DS schemes do not provide a final conclusion similar to decoding failure.

14. DS schemes are not bound to a specific channel model, but are focused on the existential falsification/forgery model, where symbol transformations, as well as insertions and fallouts of symbols, are possible at the same time.

Let us assume that all the above mentioned is sufficient to formulate following conclusions.

From the point of view of redundancy and functional content, DS schemes differ from all other cryptographic tools. The techniques that provide confidentiality, namely block encryption/decryption algorithms, are non-redundant in nature. As for the message authentication code in the context of the conditional authenticity model, where the sender and receiver share a joint secret key and trust each other, then the similarity to the error-detecting systematic block code is undeniable. However, unlike DS, the message authentication code does not guarantee unconditional authenticity.

First of all, it is obvious that, just like a systematic block code, the DS scheme provides error detection. However, since no significant restrictions are imposed on the number of information symbols at a fixed redundancy, it can be assumed that, all other factors held equal, the probability of falsification/forgery for DS schemes is significantly lower than the probability of erroneous decoding. For more precise reasoning, it is necessary to use an adequate mathematical apparatus, but this is not consistent with our previously stated intention to use a simplified manner of explanation.

It is possible that the correct comparison lies in the field of non-linear codes, because, as indicated in paragraph 11, the linearity for DS schemes is a parasitic property that significantly increases the risk of falsification/forgery.

No other similarities can be found, and the identified differences are of a fundamental nature. The main point is that a systematic block code is not a tool of asymmetric cryptography and therefore does not guarantee authenticity. In general, when discussing codes that detect/correct errors, such functionality is not even mentioned. This is understandable, the methods of the theory of error-correcting codes are related to different tasks. But at the same time, coding and DS certainly belong to the methods of redundancy introduction. And the decoding and verification of DS use this redundancy for the final conclusion within the framework of the established decision rule.

Thus, are there other methods of redundancy introduction at the informational level, except for those that are studied by the theory of error-correcting codes? Yes, there are. These methods include DS schemes.

Hubs: