C++: std::string_view not so useful when calling C functions

string_view does not own string data

C++17 adds std::string_view, which is a thin view of a character array, holding just a pointer and a length. This makes it easy to provide just one method that can efficiently take either a const char*, or a std::string, without unnecessary copying of the underlying array. For instance:

void use_string(std::string_view str);

You can then call that function like so:

use_string("abc");

or

std::string str("abc");
use_string(str);

This involves no deep copying of the character array until the function’s implementation needs to do that. Most obviously, it involves no copying when you are just passing a string literal to the function. For instance, it doesn’t create a temporary std::string just to call the function, as would be necessary if the function took a const std::string&.
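As a rough illustration of the no-copy claim, the sketch below compares the address that the callee sees with the address of the caller's characters. The helper names sees_same_buffer() and string_sees_same_buffer() are hypothetical, invented only for this example.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: true if the callee's view points straight at `buffer`,
// meaning no copy of the characters was made on the way in.
bool sees_same_buffer(std::string_view str, const char* buffer) {
  assert(buffer != nullptr);
  return str.data() == buffer;
}

// The same check for a function that takes const std::string& instead.
bool string_sees_same_buffer(const std::string& str, const char* buffer) {
  assert(buffer != nullptr);
  return str.data() == buffer;
}
```

Passing a string literal or a std::string to sees_same_buffer() should report the caller's own buffer. Passing a string literal to string_sees_same_buffer() should not, because a temporary std::string holding a copy has to be built first.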

string_view knows nothing of null-termination

However, though the string literal (“abc”) is null-terminated, and the std::string’s array is null-terminated too (guaranteed since C++11), our use_string() function cannot know for sure that the underlying array is null-terminated. It could have been called like so:

const char* str = "abc"; // null-terminated
use_string(std::string_view(str, 2));  // not 3.

or even like so:

const char str[] = {'a', 'b', 'c'}; //not null-terminated
use_string(std::string_view(str, 3));

or as a part of a much larger string that we are parsing.
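To make the mismatch concrete, here is a minimal sketch of what a length-unaware C function would see. The helper length_seen_by_c() is hypothetical, and it is only safe to call when the underlying array happens to be null-terminated somewhere, as it is for views over string literals.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: the length that a C function relying on null
// termination would compute from the view's data() pointer.
// Precondition: the underlying array is null-terminated at or after the
// end of the view; otherwise this reads out of bounds.
std::size_t length_seen_by_c(std::string_view str) {
  assert(str.data() != nullptr);
  return std::strlen(str.data());
}
```

For a view over the first 2 characters of "abc", size() is 2 but the C side sees 3 characters. For the first 3 characters of "key=value", the C side sees all 9.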

Unlike std::string, there is no std::string_view::c_str() which will give you a null-terminated character array. There is std::string_view::data() but, unlike std::string::data() (which has been guaranteed to be null-terminated since C++11), it doesn’t guarantee that the character array will be null-terminated.

So if you call a typical C function, such as gtk_label_set_text(), you have to construct a temporary std::string, like so:

void use_string(std::string_view str) {
  gtk_label_set_text(label, std::string(str).c_str());
}

But that creates a copy of the array inside the std::string, even if that wasn’t really necessary. std::string_view has no way to know if the original array is null-terminated, so it can’t copy only when necessary.
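The same workaround can be tried without GTK. In this sketch, c_api_length() is a made-up stand-in for a C function such as gtk_label_set_text() that expects a null-terminated string, and call_c_api() is a hypothetical wrapper that always copies.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Stand-in (an assumption for this example) for a C function that relies
// on null termination. It reports how many characters it saw.
std::size_t c_api_length(const char* text) {
  assert(text != nullptr);
  return std::strlen(text);
}

// Hypothetical wrapper: copies the view into a std::string so that c_str()
// is guaranteed to be null-terminated right after the view's last character.
std::size_t call_c_api(std::string_view str) {
  const std::string copy(str);  // always allocates or copies, needed or not
  return c_api_length(copy.c_str());
}
```

The copy makes the call correct for every view, including one over a character array with no terminator at all, at the cost of copying even when the original array was already null-terminated.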

This is understandable, and certainly useful for pure C++ code bases, or when using C APIs that deal with lengths instead of just null termination. I do like that it’s in the standard library now. But it’s a little disappointing in my real world of integrating with typical C APIs, for instance when implementing gtkmm.

Implementation affecting the interface

Of course, any C function that is expected to take a large string would have a version that takes a length. For instance, gtk_text_buffer_set_text(), so we can (and gtkmm will) use std::string_view as the parameter for any C++ function that uses that C function. But it’s a shame that we can’t have a uniform API that uses the same type for all string parameters. I don’t like it when the implementation details dictate the types used in our API.
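For C functions that do accept a length, no copy into a std::string is needed. As a small sketch that does not depend on GTK, the standard snprintf() function accepts a precision for %s, which bounds how many characters it reads. The function format_label() and its 64-byte buffer are assumptions made for this example.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdio>
#include <string>
#include <string_view>

// Hypothetical function: wraps the view's characters in brackets by passing
// pointer and length to a C function, without creating a std::string first.
std::string format_label(std::string_view str) {
  // The "%.*s" precision argument must be an int.
  assert(str.size() <= static_cast<std::size_t>(INT_MAX));
  char buffer[64];  // arbitrary size; longer output is truncated by snprintf
  const int written = std::snprintf(buffer, sizeof buffer, "[%.*s]",
                                    static_cast<int>(str.size()), str.data());
  if (written < 0) {
    return std::string();  // encoding error reported by snprintf
  }
  return std::string(buffer);
}
```

Because the precision limits how far snprintf() reads, this works even for a view over an array that is not null-terminated.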

There is probably no significant performance issue for small strings, even when using the temporary std::string() technique, but it wouldn’t be nice to make the typical cases worse, even just theoretically.

In gtkmm, we could create our own string_view type, which is aware of null termination, but we are trying to be as standard as possible so our API is as obvious as possible.
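A null-termination-aware view might look something like the sketch below. The name zstring_view and its interface are hypothetical, not part of the standard library or of gtkmm. The idea is to allow construction only from sources that are known to be null-terminated, so that c_str() never needs to copy.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical type: a non-owning view that is null-terminated by construction.
class zstring_view {
public:
  // A const char* is assumed to be null-terminated, as C APIs assume.
  zstring_view(const char* s) : view_(checked(s)) {}

  // std::string::c_str() is always null-terminated.
  zstring_view(const std::string& s) : view_(s.c_str(), s.size()) {}

  // Deliberately no constructor from std::string_view: an arbitrary view
  // cannot promise a terminator after its last character.

  const char* c_str() const { return view_.data(); }
  std::size_t size() const { return view_.size(); }

  // Converting to a plain view is always safe; it just forgets the guarantee.
  operator std::string_view() const { return view_; }

private:
  static const char* checked(const char* s) {
    assert(s != nullptr);
    return s;
  }

  std::string_view view_;
};
```

Like std::string_view, this type does not own its data, so it must not outlive the string it was created from. It also cannot represent a substring, which is the price of keeping the guarantee.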

17 thoughts on “C++: std::string_view not so useful when calling C functions”

  1. There are some missing underscores in your code example which makes it very misleading :)

    1. Thanks. I found (and fixed) some missing closing parentheses, but I don’t see the missing underscores. Can you give me another clue, please?

  2. `string_view::to_string()` is not in C++17 because that would make `string_view` depend on `std::string` and we wanted the dependency to be the other way around: `string_view` only depends on `char_traits` and then `string` depends on `string_view`. So to create a `string` from a `string_view` you just construct it: `string(str)`

    It wasn’t done that way for `std::experimental::string_view` because the TS that defined that couldn’t make changes to `std::string` as that was defined outside the TS. When we merged `string_view` into the standard that restriction no longer existed.

  3. Since C++11 std::string::data() is guaranteed to be null-terminated (it’s identical to c_str now).

    In practice that was always true even before C++11.

    1. > std::string::data() is guaranteed to be null-terminated

      That’s true, but an std::string_view might represent a substring of that string and thus not have a \0 right after its last character.

    2. Since C++17 std::string_view::data() is guaranteed to be null-terminated, thus you can use it universally in the C++ API interface of your codebase if you’re abstracting/wrapping a certain C library’s API calls.

      So, if you can use a recent C++17 compiler, the author’s example of interfacing with the ‘gtk_label_set_text()’ C function from the gtk library can avoid the C++11 version which does the copy in order to ensure null-termination,
      as follows: (C++11)
      void use_string(std::string_view str) {
      gtk_label_set_text(label, std::string(str).c_str());
      }
      to : (C++17)
      void use_string(std::string_view str) {
      gtk_label_set_text(label, str.data());
      }

      no copy — simply returns a pointer to the first element in the char array (null-terminated by the standard implementation design).
      thus, you get the benefit of the constant complexity
      which leads to perfect zero-overhead of passing strings around.

      I may not have taken everything into account, but that’s how I use it and it seems to work flawlessly, no dangling pointers whatsoever.

      1. This is dangerously wrong. It’s std::string::data() that became guaranteed to be NUL-terminated in C++17. std::string_view debuted in that version, and its ::data() is *not* NUL-terminated (necessarily). Your code here is incredibly dangerous.

        1. In fact, std::string::data() became guaranteed to be NUL-terminated way back in C++11. So there isn’t even the excuse of accidentally reading the wrong class name.

  4. std::string_view can be considered as a convenient C++ wrapper around the “strn” C API.

    C++ is missing something like

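    #include <cstddef>
    #include <cstring>
    #include <string_view>

    class cstr_view {
    const char* s_;
    public:
    explicit cstr_view(const char* s) : s_(s) {}
    template <std::size_t N> cstr_view(const char (&s)[N]) : s_(s) {}
    template <class String> cstr_view(const String& string) : s_(string.c_str()) {}
    const char* c_str() const { return s_; }
    explicit operator const char*() const { return s_; }
    std::size_t length() const { return std::strlen(s_); }
    bool empty() const { return *s_ == '\0'; } // true when the first character is the terminator
    std::string_view substr(std::size_t pos, std::size_t count = std::string_view::npos) const;
    cstr_view remove_prefix(std::size_t n);
    std::string_view remove_postfix(std::size_t n);
    // ….
    };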

  5. Although you’ve removed use of `to_string` from the code, the text of the post still *says* it uses `to_string()`:

    > you have to use std::string_view::to_string(), like so:
    >
    > void use_string(std::string_view str) {
    > gtk_label_set_text(label, std::string(str).c_str());
    > }

    At least until you read through the comments, this is rather confusing to say the least.

  6. For a method taking a single string argument, you could avoid the data copy in the std::string case by providing a second version of the method taking a const std::string reference.

    This could get unwieldy if you wanted to handle all variations for a method taking multiple string arguments, but maybe it would be enough to provide only all string_view and all string versions.
