Strncpy And Safety

Introduction

Some time ago I saw a presentation on secure coding in which the author claimed that the strcpy family of functions is not secure and that the strncpy family should be used instead.

The word "instead" is the crucial part of this recommendation, which seems to be quite widespread among bug hunters and security consultants. In fact, it is even part of some coding standards.

In this article I will debunk the myth of "secure strncpy" and show that, from the point of view of safe programming practice, the strncpy family of functions is actually weaker and more error-prone.

The Strncpy Myth

First, let's take a look at the central motivation for using strncpy - the assumption that it protects the buffer from overrun errors.

char short_buffer[5];
size_t big_number = 10;
const char * long_string = "This is some long string to copy.";

strncpy(short_buffer, long_string, big_number);

The above code example overruns the target buffer: strncpy is told to write 10 characters into a 5-character array, which is undefined behaviour.

Where is the "protection", then? strncpy cannot protect anything on its own, because it has no inherent knowledge of the buffers in use. It cannot get this knowledge from the target buffer itself, because arrays in C do not carry their lengths with them. The only thing strncpy can offer is assistance in checking the array boundaries, and that checking still has to be organized by the programmer. In other words, the programmer remains responsible for proper buffer length management, so that at every point where strings are copied the length of the target buffer is known - and it has to be correct.
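Even a correct length does not give a usable result on its own. The following is a minimal sketch, assuming the buffer is a true array that is still in scope so that sizeof yields its size; the function names are made up for illustration. It shows that strncpy stays within bounds but leaves the buffer without a terminating '\0' when the source is too long:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: reports whether a '\0' occurs
   within the first buffer_length bytes of buffer. */
int is_terminated(const char *buffer, size_t buffer_length)
{
    assert(buffer != NULL);
    return memchr(buffer, '\0', buffer_length) != NULL;
}

/* Hypothetical demonstration: returns 1 if the copy ended up
   terminated inside the buffer, 0 otherwise. */
int copy_with_correct_limit(void)
{
    char short_buffer[5];
    const char *long_string = "This is some long string to copy.";

    /* sizeof works only because short_buffer is an array here;
       for a char * parameter it would give the pointer size. */
    strncpy(short_buffer, long_string, sizeof short_buffer);

    /* No overrun this time, but the source is longer than the
       limit, so strncpy did not write any terminator. */
    return is_terminated(short_buffer, sizeof short_buffer);
}
```

Passing such a buffer to strlen or printf would read past its end, so the programmer has to add the terminator by hand as well.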

We could conclude that strncpy is safer exactly because of this assistance in checking buffer lengths, but even such a conclusion would be too optimistic. Just protecting the buffer from overrun is not enough to declare that the program is safe. The program cannot just keep running with incorrect data, and incorrect data is exactly what we get when strings are truncated unintentionally. Consider, for example, a post-office application that uses strncpy for copying addresses and where the destination address gets truncated somewhere at the end due to buffer limitations - this is not an acceptable outcome. Interestingly, the subject of incorrect information is rarely mentioned in books and articles that recommend using strncpy.

Proper use of strncpy requires some scaffolding and the example code might be:

strncpy(target_buffer, source_string, target_buffer_length);
target_buffer[target_buffer_length - 1] = '\0';

if (strcmp(target_buffer, source_string) != 0)
{
    /* oops, the string was not copied correctly */
    /* do something with it...                   */
}

In the above example, the program verifies if the string was properly copied by comparing the result with the original. If they are different, then the string was truncated.
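For reference, the same scaffolding might be packaged as a function. This is only a sketch: copy_then_verify is a hypothetical name, not a library function, and it assumes the caller passes the real size of the target buffer, which must be at least 1.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
   Returns 1 if the whole string was copied, 0 if it was truncated.
   In both cases the target buffer has been overwritten. */
int copy_then_verify(char *target_buffer, size_t target_buffer_length,
                     const char *source_string)
{
    assert(target_buffer != NULL && source_string != NULL);
    assert(target_buffer_length > 0);

    strncpy(target_buffer, source_string, target_buffer_length);
    target_buffer[target_buffer_length - 1] = '\0';

    /* Any difference from the original means truncation. */
    return strcmp(target_buffer, source_string) == 0;
}
```

Note that the result is only useful if the caller actually checks it; nothing in the language makes that check mandatory.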

Is the above safe? It works, but it is error-prone. Nothing actually forces the programmer to verify that the copy was correct, and it is very tempting not to do it. For proof of how widespread this temptation is, see almost any article devoted to strncpy, for example this one from US-CERT.

Another potential problem with the above is that it detects the incorrect copy after it has occurred, whereas a more plausible solution would be to prevent it in the first place. The difference is in how easy it is to organize proper error recovery or transaction-like behavior. If the string is not copied properly, the above example detects it only after the target buffer was modified, losing whatever was there before. There is no way to recover the old information, and even more code would be needed to preserve the old buffer content for later recovery. This is not a problem if the target buffer is short-lived and not needed in case of error, but it might just as well be a reusable buffer that is supposed to be modified either successfully or not at all.
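To give an idea of how much extra code that recovery takes, here is one possible sketch. The function name copy_or_restore and the choice of a heap-allocated backup are assumptions made for illustration; other strategies are possible.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
   Returns  1 if the string was copied completely,
            0 if it was truncated and the old content was restored,
           -1 if no backup could be allocated (target untouched). */
int copy_or_restore(char *target_buffer, size_t target_buffer_length,
                    const char *source_string)
{
    char *backup;
    int copied_ok;

    assert(target_buffer != NULL && source_string != NULL);
    assert(target_buffer_length > 0);

    /* Save the whole old content before touching the target. */
    backup = malloc(target_buffer_length);
    if (backup == NULL)
    {
        return -1;
    }
    memcpy(backup, target_buffer, target_buffer_length);

    strncpy(target_buffer, source_string, target_buffer_length);
    target_buffer[target_buffer_length - 1] = '\0';

    copied_ok = strcmp(target_buffer, source_string) == 0;
    if (!copied_ok)
    {
        /* Truncation detected after the fact: undo the damage. */
        memcpy(target_buffer, backup, target_buffer_length);
    }

    free(backup);
    return copied_ok;
}
```

The backup, the extra failure mode and the two additional copies exist only because the problem is detected after the target has already been modified.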

A more plausible solution is to prevent problems instead of detecting them after the fact, and the following is an example of this approach:

if (strlen(source_string) < target_buffer_length)
{
    strncpy(target_buffer, source_string, target_buffer_length);
    /* ... */
}
else
{
    /* oops, the string cannot be copied correctly */
    /* do something with it...                     */
}

The above example prevents problems by checking whether the operation can be executed properly and, if not, it does not even attempt it. The advantage of this approach is that the original content of the target buffer is not modified, which makes it much easier to organize proper error recovery - there is nothing to recover, actually.

What is most interesting, however, is that in the above code there is no benefit from using strncpy at all. The good old strcpy would do just as well, and the code would actually be shorter and would not have to repeat the target_buffer_length name:

if (strlen(source_string) < target_buffer_length)
{
    strcpy(target_buffer, source_string);
    /* ... */
}
else
{
    /* oops, the string cannot be copied correctly */
    /* do something with it...                     */
}

Contrary to popular belief, the above version is better than any version with strncpy: not only does it protect the target buffer from overrun, but it also prevents incorrect information from being created, and it does so in a compact form.
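The check-then-copy pattern can be wrapped in a small function so that the length test cannot be forgotten. The sketch below uses a hypothetical name, checked_copy, and assumes that target_buffer_length is the true size of the target buffer:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not a standard function.
   Returns 1 if the string was copied, 0 if it does not fit.
   On failure the target buffer is left exactly as it was. */
int checked_copy(char *target_buffer, size_t target_buffer_length,
                 const char *source_string)
{
    assert(target_buffer != NULL && source_string != NULL);

    /* Strictly less than: one byte is needed for the '\0'. */
    if (strlen(source_string) < target_buffer_length)
    {
        strcpy(target_buffer, source_string);
        return 1;
    }

    return 0;
}
```

With a buffer that holds "old", a call with a string that is too long returns 0 and the buffer still holds "old", so the caller has nothing to roll back.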

Does it mean that strncpy is useless? No. It still has its place in the library, but the only motivation for using it is when the truncation is intended - for example to provide fixed-length roots for hash functions or for indexing or for partial display in length-limited fields or ...
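As an illustration of intended truncation, the sketch below fills a fixed-width field. FIELD_WIDTH and fill_fixed_field are hypothetical names chosen for this example, and the field is deliberately not treated as a C string:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { FIELD_WIDTH = 8 }; /* hypothetical width of the field */

/* Hypothetical helper: fills a fixed-width field that is NOT a
   C string. Longer text is cut on purpose; shorter text is padded
   with '\0' bytes, which strncpy does by itself. */
void fill_fixed_field(char field[FIELD_WIDTH], const char *text)
{
    assert(field != NULL && text != NULL);
    strncpy(field, text, FIELD_WIDTH);
}
```

Because the field may have no terminator, it has to be processed with length-aware functions such as memcmp or printf with the "%.8s" format, never with strlen or strcpy.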

Interestingly, this motivation has nothing to do with safety and security.