StringView

A StringView references an immutable range of memory. It consists of a pointer bytes and an integer numBytes. A StringView does not own the memory it points to, and no heap memory is freed when the StringView is destroyed. The caller is responsible for ensuring that the memory referenced by a StringView remains valid for the lifetime of the StringView object.

StringView objects often reference UTF-8-encoded text, but this isn't a requirement. The memory referenced by a StringView can hold any kind of data.

Even though StringView objects aren't required to contain text, they provide many member functions to help manipulate text, such as trim(), splitByte() and upperAsc(). These text manipulation functions are generally intended to work with UTF-8-encoded strings, but most of them will also work with any 8-bit format compatible with ASCII, such as ISO 8859-1 or Windows-1252.

When a StringView object does contain text, the text string is generally not null-terminated. If you need a null-terminated string (for example, to pass to a third-party library), you must construct a string that includes the null terminator byte yourself. The null terminator then counts towards the total number of bytes in numBytes. A convenience function withNullTerminator() is provided for this.

Several StringView functions accept byte offset arguments, such as subStr(), left() and right(). Be aware that when a StringView contains UTF-8-encoded text, a byte offset is not necessarily the same as a count of characters (Unicode code points), because a single code point can occupy up to four bytes. For more information, see Unicode Support.
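The difference between byte counts and code point counts can be illustrated with a small standalone sketch. It uses std::string_view as a stand-in for StringView, and countCodePoints is a hypothetical helper that is not part of Plywood. It assumes well-formed UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper, not part of Plywood. Counts code points in well-formed
// UTF-8 by counting every byte that is not a continuation byte (10xxxxxx).
inline std::size_t countCodePoints(std::string_view utf8) {
    std::size_t count = 0;
    for (char c : utf8) {
        if ((static_cast<unsigned char>(c) & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Usage sketch: "héllo" occupies 6 bytes but encodes 5 code points.
inline void codePointExample() {
    std::string_view text = "h\xC3\xA9llo";
    assert(text.size() == 6);
    assert(countCodePoints(text) == 5);
}
```

A byte offset of 2 into this text would land in the middle of the two-byte sequence for é, which is why offsets passed to byte-oriented functions should come from a search or scan rather than from a character count.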

Header File

#include <ply-runtime/string/StringView.h>

Also included from <ply-runtime/Base.h>.

Data Members

const char* bytes [code]: The first byte in the immutable memory range.
u32 numBytes [code]: The number of bytes in the immutable memory range.

Member Functions

StringView::StringView() [code]

Constructs an empty StringView.

StringView::StringView(const char* s) [code]
template <typename U, std::size_t N, std::enable_if_t<std::is_same<U, std::decay_t<decltype(*u8"")>>::value, bool> = false> StringView::StringView(const U& s[]) [code]

Constructs a StringView from a null-terminated string. The string memory is expected to remain valid for the lifetime of the StringView. Note that the null terminator character does not count towards numBytes. For example, StringView{"hello"} results in numBytes equal to 5.

When the argument is a C-style string literal, compilers are able to compute numBytes at compile time if optimization is enabled.

The second form of this constructor exists in order to support C++20 compilers, where the type of UTF-8 string literals such as u8"hello" has been changed from const char* (before C++20) to const char8_t* (C++20 and later). StringView simply interprets such literals as const char*.

StringView::StringView(const char& c) [code]

Constructs a StringView from a single byte. c is expected to remain valid for the lifetime of the StringView.

StringView::StringView(const char* bytes, u32 numBytes) [code]

Constructs a StringView explicitly from the arguments. The string memory is expected to remain valid for the lifetime of the StringView.

static StringView StringView::fromRange(const char* startByte, const char* endByte) [code]

Returns a StringView referencing an immutable range of memory between two pointers. The number of bytes in the memory range is given by endByte - startByte, and endByte is considered a pointer to the first byte after the memory range.
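The half-open range convention can be sketched as follows. This is an illustration built on std::string_view, not Plywood's implementation; viewFromRange is a hypothetical name, and the assertion is an assumption about the precondition.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for StringView::fromRange, built on std::string_view.
// endByte points one past the last byte, so the length is endByte - startByte.
inline std::string_view viewFromRange(const char* startByte, const char* endByte) {
    assert(startByte <= endByte); // assumed precondition: the range is not reversed
    return std::string_view(startByte, static_cast<std::size_t>(endByte - startByte));
}
```

With const char* text = "hello world", calling viewFromRange(text + 6, text + 11) yields a 5-byte view of "world", and passing the same pointer twice yields an empty view.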

const char& StringView::operator[](u32 index) const [code]

Subscript operator with runtime bounds checking.

const char& StringView::back(s32 ofs = -1) const [code]

Reverse subscript operator with runtime bounds checking. Expects a negative index. -1 returns the last byte of the given string; -2 returns the second-last byte, etc.
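The negative indexing scheme can be sketched like this. The function byteFromBack is a hypothetical stand-in written against std::string_view, with assertions taking the place of Plywood's runtime bounds checks.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for StringView::back. ofs must be negative:
// -1 selects the last byte, -2 the second-last byte, and so on.
inline char byteFromBack(std::string_view view, int ofs = -1) {
    assert(ofs < 0);
    std::size_t distance = static_cast<std::size_t>(-static_cast<long long>(ofs));
    assert(distance <= view.size()); // stands in for the runtime bounds check
    return view[view.size() - distance];
}
```

For the text "hello", byteFromBack(view) returns 'o' and byteFromBack(view, -5) returns 'h'.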

const char* StringView::end() const [code]

Returns bytes + numBytes. This pointer is considered to point to the first byte after the memory range.

void StringView::offsetHead(u32 numBytes) [code]

Moves bytes forward and subtracts the given number of bytes from numBytes.

void StringView::offsetBack(s32 ofs) [code]

Advances the end of the memory range by ofs bytes while keeping the start of the memory range unchanged.
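The effect of these two functions on bytes and numBytes can be sketched with a minimal stand-in struct. ByteRange is hypothetical and is not Plywood's StringView. The assertions, and the reading that a negative ofs shrinks the range, are assumptions based on the signed parameter type.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for illustration only; not Plywood's StringView.
struct ByteRange {
    const char* bytes;
    std::uint32_t numBytes;

    // Drops ofs bytes from the front: the start moves forward and the length shrinks.
    void offsetHead(std::uint32_t ofs) {
        assert(ofs <= numBytes); // assumed precondition
        bytes += ofs;
        numBytes -= ofs;
    }

    // Moves the end by ofs bytes while the start stays put.
    // Assumption: a negative ofs shrinks the range from the back.
    void offsetBack(std::int32_t ofs) {
        assert(static_cast<std::int64_t>(numBytes) + ofs >= 0); // assumed precondition
        numBytes = static_cast<std::uint32_t>(static_cast<std::int64_t>(numBytes) + ofs);
    }
};
```

Starting from an 11-byte range over "hello world", offsetHead(6) leaves a 5-byte range beginning at 'w', and a subsequent offsetBack(-2) leaves the 3 bytes "wor".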

template <typename T> T StringView::to(const T& defaultValue = subst::createDefault<T>()) const [code]

Parses the given string directly as type T. Whitespace is trimmed from the beginning and end of the string before parsing occurs. If the string cannot be parsed, or if the string is not completely consumed by the parse operation, defaultValue is returned.

StringView{"123"}.to<u32>();    // returns 123
StringView{" 123 "}.to<u32>();  // returns 123
StringView{"abc"}.to<u32>();    // returns 0
StringView{"123a"}.to<u32>();   // returns 0
StringView{""}.to<u32>();       // returns 0
StringView{""}.to<s32>(-1);     // returns -1

This function uses ViewInStream internally. If you need to distinguish between a successful parse and an unsuccessful one, create and use a ViewInStream object directly.
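The parsing rules described for to() (trim, parse, require full consumption, otherwise fall back to the default) can be sketched for integer types with the standard library. This is not how Plywood implements it; parseOr is a hypothetical name, std::from_chars replaces ViewInStream, and the whitespace set is an assumption.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical stand-in for StringView::to<T>() restricted to integer types.
template <typename T>
T parseOr(std::string_view text, T defaultValue = T{}) {
    // Trim ASCII whitespace (assumed set) from both ends before parsing.
    const char* whitespace = " \t\r\n";
    std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string_view::npos) {
        return defaultValue; // empty or all whitespace
    }
    std::size_t last = text.find_last_not_of(whitespace);
    text = text.substr(first, last - first + 1);

    T value{};
    const char* end = text.data() + text.size();
    std::from_chars_result result = std::from_chars(text.data(), end, value);
    // Reject parse failures and input that was not completely consumed.
    if (result.ec != std::errc{} || result.ptr != end) {
        return defaultValue;
    }
    return value;
}

// Usage sketch: a signed value, and input with an interior space.
inline void parseExample() {
    assert(parseOr<int>(" -42 ") == -42);
    assert(parseOr<int>("4 2", 7) == 7); // the trailing " 2" is left unconsumed
}
```

As with to(), the return value alone cannot distinguish a failed parse from a successful parse of the default value.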

explicit StringView::operator bool() const [code]

Explicit conversion to bool. Returns true if the length of the string is greater than 0. Allows you to use a StringView object inside an if condition.

if (str) {
    ...
}

bool StringView::isEmpty() const [code]

Returns true if the length of the string is 0.

StringView StringView::subStr(u32 start) const [code]
StringView StringView::subStr(u32 start, u32 numBytes) const [code]

Returns a substring that starts at the offset given by start. The optional numBytes argument determines the length of the substring in bytes. If numBytes is not specified, the substring continues to the end of the string.

bool StringView::contains(const char* curByte) const [code]

Returns true if curByte points to a byte inside the memory range.

StringView StringView::left(u32 numBytes) const [code]

Returns a substring that contains only the first numBytes bytes of the string.

StringView StringView::shortenedBy(u32 numBytes) const [code]

Returns a substring with the last numBytes bytes of the string omitted.

StringView StringView::right(u32 numBytes) const [code]

Returns a substring that contains only the last numBytes bytes of the string.
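The relationship between left(), shortenedBy() and right() can be sketched with std::string_view. The function names below are hypothetical, and the assertions are an assumption, since the text does not say how Plywood treats a numBytes larger than the string.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical stand-ins built on std::string_view. All counts are in bytes.
inline std::string_view leftBytes(std::string_view view, std::size_t numBytes) {
    assert(numBytes <= view.size()); // assumed precondition
    return view.substr(0, numBytes);
}

inline std::string_view shortenedByBytes(std::string_view view, std::size_t numBytes) {
    assert(numBytes <= view.size()); // assumed precondition
    return view.substr(0, view.size() - numBytes);
}

inline std::string_view rightBytes(std::string_view view, std::size_t numBytes) {
    assert(numBytes <= view.size()); // assumed precondition
    return view.substr(view.size() - numBytes);
}
```

For the 9-byte text "hello.txt", leftBytes(view, 5) and shortenedByBytes(view, 4) both yield "hello", while rightBytes(view, 3) yields "txt". None of these copy any bytes; each result refers to the same memory as the input.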

s32 StringView::findByte(char matchByte, u32 startPos = 0) const [code]

Returns the offset of the first occurrence of matchByte in the string, or -1 if not found. The search begins at the offset specified by startPos. This function can find ASCII codes in UTF-8 encoded strings, since ASCII codes are always encoded as a single byte in UTF-8.

template <typename MatchFunc> s32 StringView::findByte(const MatchFunc& matchFunc, u32 startPos = 0) const [code]

A template function that returns the offset of the first byte for which matchFunc returns true, or -1 if none. The search begins at the offset specified by startPos. This function can be used to find a whitespace character by calling findByte(isWhite).
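A predicate-based search of this kind might look like the following sketch. It is written against std::string_view; findByteIf and isAsciiWhite are hypothetical names, and the exact set of bytes accepted by Plywood's isWhite may differ from the one assumed here.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Assumed whitespace test; Plywood's isWhite may accept a different set of bytes.
inline bool isAsciiWhite(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical stand-in for the predicate form of findByte. Returns the offset
// of the first byte at or after startPos for which matchFunc is true, or -1.
template <typename MatchFunc>
int findByteIf(std::string_view view, MatchFunc matchFunc, std::size_t startPos = 0) {
    assert(startPos <= view.size()); // assumed precondition
    for (std::size_t i = startPos; i < view.size(); ++i) {
        if (matchFunc(view[i])) {
            return static_cast<int>(i);
        }
    }
    return -1;
}
```

For "hello world", findByteIf(view, isAsciiWhite) returns 5, and the same call with startPos set to 6 returns -1 because no whitespace follows that offset.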

s32 StringView::rfindByte(char matchByte, u32 startPos) const [code]
template <typename MatchFunc> s32 StringView::rfindByte(const MatchFunc& matchFunc, u32 startPos) const [code]

Reverse findByte functions. Returns the offset of the last byte in the string that matches the first argument (if passed a char) or for which the first argument returns true (if passed a function). The optional startPos argument specifies an offset at which to begin the search.

bool StringView::startsWith(StringView arg) const [code]

Returns true if the string starts with arg.

bool StringView::endsWith(StringView arg) const [code]

Returns true if the string ends with arg.

StringView StringView::trim(bool* matchFunc(char) = isWhite, bool left = true, bool right = true) const [code]
StringView StringView::ltrim(bool* matchFunc(char) = isWhite) const [code]
StringView StringView::rtrim(bool* matchFunc(char) = isWhite) const [code]

Returns a substring with leading and/or trailing bytes removed. A byte is removed if matchFunc returns true for it. These functions can be used to trim whitespace characters from a UTF-8 string, for example by calling trim(isWhite), since ASCII whitespace characters are each encoded as a single byte in UTF-8.

ltrim() trims leading bytes only, rtrim() trims trailing bytes only, and trim() trims both leading and trailing bytes.
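The trimming behavior can be sketched as a loop that shrinks a view from either end. This is an illustration using std::string_view, not Plywood's code; trimBytes and isAsciiWhite are hypothetical names, and the whitespace set is an assumption.

```cpp
#include <cassert>
#include <string_view>

// Assumed whitespace test; Plywood's isWhite may accept a different set of bytes.
inline bool isAsciiWhite(char c) {
    return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

// Hypothetical stand-in for trim(). The left and right flags select which
// ends are trimmed, mirroring ltrim() and rtrim().
inline std::string_view trimBytes(std::string_view view,
                                  bool (*matchFunc)(char) = isAsciiWhite,
                                  bool left = true, bool right = true) {
    assert(matchFunc != nullptr);
    if (left) {
        while (!view.empty() && matchFunc(view.front())) {
            view.remove_prefix(1);
        }
    }
    if (right) {
        while (!view.empty() && matchFunc(view.back())) {
            view.remove_suffix(1);
        }
    }
    return view;
}
```

The result is a narrower view of the same memory, so no allocation takes place. A string consisting entirely of matching bytes trims down to an empty view.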

String StringView::join(ArrayView<const StringView> comps) const [code]

Returns a new String containing every item in comps concatenated together, with the given string used as a separator. For example:

StringView{", "}.join({"a", "b", "c"}); // returns "a, b, c"
StringView{""}.join({"a", "b", "c"});   // returns "abc"
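The separator placement performed by join() can be sketched with standard library types. joinWith is a hypothetical name, and std::string and std::vector stand in for Plywood's String and ArrayView.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical stand-in for StringView::join. The separator is written
// between consecutive items, never before the first or after the last.
inline std::string joinWith(std::string_view separator,
                            const std::vector<std::string_view>& comps) {
    std::string result;
    for (std::size_t i = 0; i < comps.size(); ++i) {
        if (i > 0) {
            result.append(separator);
        }
        result.append(comps[i]);
    }
    return result;
}

// Usage sketch: building a path from its components.
inline void joinExample() {
    assert(joinWith("/", {"usr", "local", "bin"}) == "usr/local/bin");
}
```

In this sketch a single-item list produces that item unchanged, and an empty list produces an empty string.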

Array<StringView> StringView::splitByte(char sep) const [code]

Returns a list of the words in the given string using sep as a delimiter byte.
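Splitting on a delimiter byte can be sketched as below, using std::string_view and std::vector as stand-ins. splitOnByte is a hypothetical name. The description above does not say how adjacent or trailing separators are handled; this sketch emits an empty piece in those cases, which is an assumption and may differ from what splitByte() does.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical stand-in for StringView::splitByte. Each piece is a view into
// the original memory; nothing is copied.
// Assumption: adjacent separators produce an empty piece between them.
inline std::vector<std::string_view> splitOnByte(std::string_view view, char sep) {
    std::vector<std::string_view> pieces;
    std::size_t start = 0;
    while (true) {
        std::size_t pos = view.find(sep, start);
        if (pos == std::string_view::npos) {
            pieces.push_back(view.substr(start)); // final piece
            break;
        }
        pieces.push_back(view.substr(start, pos - start));
        start = pos + 1;
    }
    return pieces;
}

// Usage sketch: splitting a path on '/'.
inline void splitExample() {
    std::vector<std::string_view> parts = splitOnByte("usr/local/bin", '/');
    assert(parts.size() == 3);
    assert(parts[0] == "usr");
    assert(parts[2] == "bin");
}
```

Because the pieces are views, they remain valid only as long as the memory behind the original string does.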

String StringView::upperAsc() const [code]

Returns a new String with all lowercase ASCII characters converted to uppercase. This function works with UTF-8 strings and with any 8-bit text encoding compatible with ASCII.

String StringView::lowerAsc() const [code]

Returns a new String with all uppercase ASCII characters converted to lowercase. This function works with UTF-8 strings and with any 8-bit text encoding compatible with ASCII.
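ASCII-only case conversion is safe on UTF-8 because every byte of a multi-byte sequence has a value of 128 or higher and therefore never falls in the range 'a' to 'z' or 'A' to 'Z'. The following sketch shows the uppercase direction. upperAscii is a hypothetical name, and std::string stands in for Plywood's String.

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Hypothetical stand-in for StringView::upperAsc. Only bytes in 'a'..'z' are
// changed, so multi-byte UTF-8 sequences pass through untouched.
inline std::string upperAscii(std::string_view view) {
    std::string result(view);
    for (char& c : result) {
        if (c >= 'a' && c <= 'z') {
            c = static_cast<char>(c - 'a' + 'A');
        }
    }
    return result;
}

// Usage sketch: the two bytes encoding é are left as they are.
inline void upperExample() {
    assert(upperAscii("h\xC3\xA9llo") == "H\xC3\xA9LLO");
}
```

A consequence is that non-ASCII letters such as é are not converted. Full Unicode case mapping is outside the scope of a byte-wise function like this one.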

String StringView::reversedBytes() const [code]

Returns a new String with the bytes reversed. This function is really only suitable when you know that all characters contained in the string are encoded in a single byte.

String StringView::reversedUTF8() const [code]

Returns a new String with UTF-8 characters reversed. For example, if s contains the string "😋🍺🍕" encoded as UTF-8, s.reversedUTF8() returns the string "🍕🍺😋" encoded as UTF-8.
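One way to reverse UTF-8 text without splitting multi-byte sequences is to walk backwards and copy each encoded code point as a unit. The sketch below does this with std::string_view. reverseUtf8 is a hypothetical name, the input is assumed to be well-formed UTF-8, and Plywood's own implementation may work differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for StringView::reversedUTF8. Assumes well-formed UTF-8.
inline std::string reverseUtf8(std::string_view utf8) {
    std::string result;
    result.reserve(utf8.size());
    std::size_t end = utf8.size();
    while (end > 0) {
        // Step back over continuation bytes (10xxxxxx) to reach the lead byte.
        std::size_t start = end - 1;
        while (start > 0 && (static_cast<unsigned char>(utf8[start]) & 0xC0) == 0x80) {
            --start;
        }
        // Copy the whole encoded code point in its original byte order.
        result.append(utf8.substr(start, end - start));
        end = start;
    }
    return result;
}

// Usage sketch: the two-byte sequence for é stays intact.
inline void reverseExample() {
    assert(reverseUtf8("a\xC3\xA9z") == "z\xC3\xA9" "a");
}
```

This sketch reverses code points, not user-perceived characters, so a base letter followed by a combining mark would end up with the mark in front of the letter.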

String StringView::filterBytes(char* filterFunc(char)) const [code]

Returns a new String with each byte passed through the provided filterFunc. It's safe to call this function on UTF-8 encoded strings as long as filterFunc leaves byte values greater than or equal to 128 unchanged. Therefore, this function is mainly suitable for filtering ASCII codes.

bool StringView::includesNullTerminator() const [code]

Returns true if the last byte in the string is a zero byte.

HybridString StringView::withNullTerminator() const [code]

If the last byte of the given string is not a zero byte, this function allocates memory for a new string, copies the contents of the given string to it, appends a zero byte and returns the new string. In that case, the new string's numBytes will be one greater than the numBytes of the original string. If the last byte of the given string is already a zero byte, a view of the given string is returned and no new memory is allocated.

StringView StringView::withoutNullTerminator() const [code]

If the last byte of the given string is not a zero byte, returns a view of the given string. If the last byte of the given string is a zero byte, returns a substring with the last byte omitted.
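The three null terminator functions can be sketched as follows, using std::string_view and std::string as stand-ins. All names are hypothetical. Unlike withNullTerminator(), which returns a HybridString and avoids allocating when the terminator is already present, copyWithTerminator in this sketch always returns an owning copy.

```cpp
#include <cassert>
#include <cstring>
#include <string>
#include <string_view>

// Hypothetical stand-in for includesNullTerminator().
inline bool endsWithZeroByte(std::string_view view) {
    return !view.empty() && view.back() == '\0';
}

// Hypothetical stand-in for withNullTerminator(). The zero byte is part of
// the returned string's contents, so it counts towards size().
inline std::string copyWithTerminator(std::string_view view) {
    std::string result(view);
    if (!endsWithZeroByte(view)) {
        result.push_back('\0');
    }
    return result;
}

// Hypothetical stand-in for withoutNullTerminator(). Returns a view only.
inline std::string_view stripTerminator(std::string_view view) {
    if (endsWithZeroByte(view)) {
        view.remove_suffix(1);
    }
    return view;
}

// Usage sketch: five text bytes plus the zero byte.
inline void terminatorExample() {
    std::string owned = copyWithTerminator("hello");
    assert(owned.size() == 6);
    assert(std::strlen(owned.data()) == 5);
}
```

The resulting data pointer can be handed to a C API that expects a null-terminated string, as long as the owning string outlives the call.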

s32 compare(StringView a, StringView b) [code]

Returns:

-1 if a precedes b in sorted order
0 if the strings are equal
1 if a follows b in sorted order

Strings are sorted by comparing the unsigned value of each byte. If one of the strings contains the other as a prefix, the shorter string comes first in sorted order.
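The ordering rule can be sketched with std::memcmp, which also compares bytes as unsigned values. compareBytes is a hypothetical name, and this is an illustration of the rule, not Plywood's implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string_view>

// Hypothetical stand-in for compare(). Bytes are compared as unsigned values
// over the common prefix. If the prefix matches, the shorter string sorts first.
inline int compareBytes(std::string_view a, std::string_view b) {
    std::size_t common = a.size() < b.size() ? a.size() : b.size();
    int diff = common > 0 ? std::memcmp(a.data(), b.data(), common) : 0;
    if (diff != 0) {
        return diff < 0 ? -1 : 1;
    }
    if (a.size() == b.size()) {
        return 0;
    }
    return a.size() < b.size() ? -1 : 1;
}

// Usage sketch: a prefix sorts before the longer string, and a byte of 0xC3
// sorts after 'z' because bytes are treated as unsigned.
inline void compareExample() {
    assert(compareBytes("abc", "abcd") == -1);
    assert(compareBytes("\xC3\xA9", "z") == 1);
}
```

Comparing unsigned byte values has the useful property that well-formed UTF-8 strings sort in the same order as their code points.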

bool operator==(StringView a, StringView b) [code]
bool operator!=(StringView a, StringView b) [code]
bool operator<(StringView a, StringView b) [code]
bool operator<=(StringView a, StringView b) [code]
bool operator>(StringView a, StringView b) [code]
bool operator>=(StringView a, StringView b) [code]

Comparison functions. a < b if a precedes b in sorted order.

String operator+(StringView a, StringView b) [code]

Returns a new String containing the concatenation of two StringViews.

String operator*(StringView str, u32 count) [code]

Returns a new String containing the contents of the given StringView repeated count times.

StringView{'*'} * 10;    // returns "**********"