string_view does not own string data
C++17 adds std::string_view, which is a thin view of a character array, holding just a pointer and a length. This makes it easy to provide just one method that can efficiently take either a const char*, or a std::string, without unnecessary copying of the underlying array. For instance:
void use_string(std::string_view str);
You can then call that function like so:
use_string("abc");
or
std::string str("abc");
use_string(str);
This involves no deep copying of the character array until the function’s implementation needs to do that. Most obviously, it involves no copying when you are just passing a string literal to the function. For instance it doesn’t create a temporary std::string just to call the function, as would be necessary if the function took std::string.
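One way to convince yourself that no copy happens is to compare addresses. The sketch below is only an illustration: viewed_address() and demo_no_copy() are hypothetical names, not part of any library. It checks that the view handed to the function still points at the caller's own characters.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical helper: returns the address of the characters the view refers to.
const char* viewed_address(std::string_view str) {
    return str.data();
}

// Checks that passing a literal or a std::string creates no new character array.
void demo_no_copy() {
    const char* literal = "abc";
    assert(viewed_address(literal) == literal);  // view points at the literal itself

    std::string s("abc");
    assert(viewed_address(s) == s.data());       // view points into s's own buffer
}
```

If the function took std::string by value instead, the address seen inside the function would belong to a separate copy.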
string_view knows nothing of null-termination
However, though the string literal (“abc”) is null-terminated, and the std::string is almost-certainly null-terminated (but implementation defined), our use_string() function cannot know for sure that the underlying array is null-terminated. It could have been called like so:
const char* str = "abc"; // null-terminated
use_string(std::string_view(str, 2)); // not 3.
or even like so:
const char str[] = {'a', 'b', 'c'}; // not null-terminated
use_string(std::string_view(str, 3));
or as a part of a much larger string that we are parsing.
Unlike std::string, std::string_view has no c_str() that will give you a null-terminated character array. There is std::string_view::data(), but it doesn’t guarantee that the character array is null-terminated. (Update: since C++11, std::string::data() is guaranteed to be null-terminated, but std::string_view::data() in C++17 is not.)
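To make the mismatch concrete, here is a small sketch. The names length_seen_by_c() and demo_truncated_view() are made up for this example. It hands data() from a two-character view to strlen(), which is only safe here because the underlying literal happens to be null-terminated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical helper: the length a C function would see if given view.data().
// Only safe in this sketch because the view points into a null-terminated
// literal; in general this would read past the end of the view.
std::size_t length_seen_by_c(std::string_view view) {
    return std::strlen(view.data());
}

void demo_truncated_view() {
    const char* str = "abc";
    std::string_view view(str, 2);        // the view covers only "ab"
    assert(view.size() == 2);
    assert(length_seen_by_c(view) == 3);  // C code keeps going until the '\0'
}
```

With an array that has no terminator at all, the same call would be undefined behaviour rather than just a wrong length.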
So if you call a typical C function, such as gtk_label_set_text(), you have to construct a temporary std::string, like so:
void use_string(std::string_view str) {
  gtk_label_set_text(label, std::string(str).c_str());
}
But that creates a copy of the array inside the std::string, even if that wasn’t really necessary. std::string_view has no way to know if the original array is null-terminated, so it can’t copy only when necessary.
This is understandable, and certainly useful for pure C++ code bases, or when using C APIs that deal with lengths instead of just null termination. I do like that it’s in the standard library now. But it’s a little disappointing in my real world of integrating with typical C APIs, for instance when implementing gtkmm.
Implementation affecting the interface
Of course, any C function that is expected to take a large string would have a version that takes a length. For instance, gtk_text_buffer_set_text(), so we can (and gtkmm will) use std::string_view as the parameter for any C++ function that uses that C function. But it’s a shame that we can’t have a uniform API that uses the same type for all string parameters. I don’t like when the implementation details dictate the types used in our API.
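As a rough sketch of the length-based case, the example below uses fake_buffer_set_text(), a hypothetical stand-in that takes a pointer and a length in the style of gtk_text_buffer_set_text(). It is not the real GTK function, and g_buffer and set_text() are likewise invented for illustration. Because the length travels with the pointer, no temporary std::string is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a C function that takes text plus a length.
// It stores the text only so the result can be inspected.
static std::string g_buffer;
void fake_buffer_set_text(const char* text, int len) {
    g_buffer.assign(text, static_cast<std::size_t>(len));
}

// The wrapper forwards pointer and length, so it never needs a terminator.
void set_text(std::string_view str) {
    fake_buffer_set_text(str.data(), static_cast<int>(str.size()));
}
```

Calling set_text(std::string_view("abcdef", 3)) would pass only "abc" to the stand-in, even though the characters that follow are not a terminator.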
There is probably no significant performance issue for small strings, even when using the temporary std::string() technique, but it wouldn’t be nice to make the typical cases worse, even just theoretically.
In gtkmm, we could create our own string_view type, which is aware of null termination, but we are trying to be as standard as possible so our API is as obvious as possible.
There are some missing underscores in your code example which makes it very misleading :)
Thanks. I found (and fixed) some missing closing parentheses, but I don’t see the missing underscores. Can you give me another clue, please?
Hmm.. that’s weird.
This is how I see your page in my browser:
https://s24.postimg.org/511trtj45/Screenshot_from_2017-06-26_13_33_59.png
FF54.0, with ublock, https everywhere, and a bunch of other addons :)
Maybe those messing with something here. idk.
I don’t know if I’m looking at a screenshot of part of the page, or the whole page.
`string_view::to_string()` is not in C++17 because that would make `string_view` depend on `std::string` and we wanted the dependency to be the other way around: `string_view` only depends on `char_traits` and then `string` depends on `string_view`. So to create a `string` from a `string_view` you just construct it: `string(str)`
It wasn’t done that way for `std::experimental::string_view` because the TS that defined that couldn’t make changes to `std::string` as that was defined outside the TS. When we merged `string_view` into the standard that restriction no longer existed.
Thanks. I’ve updated the code example and removed the mention of to_string().
Since C++11 std::string::data() is guaranteed to be null-terminated (it’s identical to c_str now).
In practice that was always true even before C++11.
> std::string::data() is guaranteed to be null-terminated
That’s true, but an std::string_view might represent a substring of that string and thus not have a \0 right after its last character.
Thanks. I’ve updated the text to mention that.
Since C++17 std::string_view::data() is guaranteed to be null-terminated, thus you can use it universally in the C++ API interface of your codebase if you’re abstracting/wrapping a certain C library’s API calls.
So, if you can use a recent C++17 compiler, the author’s example of interfacing with the ‘gtk_label_set_text()’ C function from the gtk library can avoid the C++11 version which does the copy in order to ensure null-termination,
as follows: (C++11)
void use_string(std::string_view str) {
gtk_label_set_text(label, std::string(str).c_str());
}
to : (C++17)
void use_string(std::string_view str) {
gtk_label_set_text(label, str.data());
}
no copy — simply returns a pointer to the first element in the char array (null-terminated by the standard implementation design).
thus, you get the benefit of the constant complexity
which leads to perfect zero-overhead of passing strings around.
I may not have taken everything into account, but that’s how I use it and it seems to work flawlessly, with no dangling pointers whatsoever.
This is dangerously wrong. It’s std::string::data() that became guaranteed to be NUL-terminated in C++17. std::string_view debuted in that version, and its ::data() is *not* NUL-terminated (necessarily). Your code here is incredibly dangerous.
In fact, std::string::data() became guaranteed to be NUL-terminated way back in C++11. So there isn’t even the excuse of accidentally reading the wrong class name.
std::string_view can be considered as a convenient C++ wrapper around the “strn” C API.
C++ is missing something like
#include <cstddef>
#include <cstring>
#include <string_view>

class cstr_view {
  const char* s_; // expected to point at a null-terminated array
public:
  explicit cstr_view(const char* s) : s_(s) {}
  template <std::size_t N> cstr_view(const char (&s)[N]) : s_(s) {}
  template <class String> cstr_view(const String& string) : s_(string.c_str()) {}
  const char* c_str() const { return s_; }
  explicit operator const char*() const { return s_; }
  std::size_t length() const { return std::strlen(s_); }
  bool empty() const { return *s_ == '\0'; }
  // Cutting off the end loses the terminator, so these return std::string_view.
  std::string_view substr(std::size_t pos, std::size_t count = std::string_view::npos) const;
  cstr_view remove_prefix(std::size_t n) const;
  std::string_view remove_postfix(std::size_t n) const;
  // ….
};
Although you’ve removed use of `to_string` from the code, the text of the post still *says* it uses `to_string()`:
> you have to use std::string_view::to_string(), like so:
>
> void use_string(std::string_view str) {
> gtk_label_set_text(label, std::string(str).c_str());
> }
At least until you read through the comments, this is rather confusing to say the least.
Fixed. Thanks.
For a method taking a single string argument, you could avoid the data copy in the std::string case by providing a second version of the method taking a const std::string reference.
This could get unwieldy if you wanted to handle all variations for a method taking multiple string arguments, but maybe it would be enough to provide only all string_view and all string versions.
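A minimal sketch of the overload idea follows. fake_label_set_text() is a hypothetical stand-in for gtk_label_set_text(), and g_last_text and g_temporary_copies exist only to make the behaviour observable. One addition not mentioned above: with only the std::string and std::string_view overloads, a call with a string literal is ambiguous, so the sketch also provides a const char* overload.

```cpp
#include <cassert>
#include <string>
#include <string_view>

static std::string g_last_text;      // what the stand-in C function received
static int g_temporary_copies = 0;   // how often the copying overload ran

// Hypothetical stand-in for a C function that needs a null-terminated string.
void fake_label_set_text(const char* text) { g_last_text = text; }

// std::string is already null-terminated: no temporary needed.
void use_string(const std::string& str) { fake_label_set_text(str.c_str()); }

// A raw C string is assumed to be null-terminated: no temporary needed.
void use_string(const char* str) { fake_label_set_text(str); }

// An arbitrary view might not be terminated: copy into a temporary std::string.
void use_string(std::string_view str) {
    ++g_temporary_copies;
    fake_label_set_text(std::string(str).c_str());
}
```

Only callers that really pass a std::string_view pay for the temporary; the cost is one extra overload per string parameter, which is the unwieldiness mentioned above.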