class String
Overview
A String represents an immutable sequence of UTF-8 characters.
A String is typically created with a string literal, enclosing UTF-8 characters in double quotes:
"hello world"
A backslash can be used to denote some characters inside the string:
"\"" # double quote
"\\" # backslash
"\e" # escape
"\f" # form feed
"\n" # newline
"\r" # carriage return
"\t" # tab
"\v" # vertical tab
You can use a backslash followed by at most three digits to denote a code point written in octal:
"\101" # == "A"
"\123" # == "S"
"\12" # == "\n"
"\1" # string with one character with code point 1
You can use a backslash followed by a u and four hexadecimal digits to denote a Unicode code point written in hexadecimal:
"\u0041" # == "A"
Or you can use curly braces and specify up to six hexadecimal digits (0 to 10FFFF):
"\u{41}" # == "A"
A string can span multiple lines:
"hello
world" # same as "hello \nworld"
Note that in the above example trailing and leading spaces, as well as newlines, end up in the resulting string. To avoid this, you can split a string into multiple lines by joining multiple literals with a backslash:
"hello " \
"world, " \
"no newlines" # same as "hello world, no newlines"
Alternatively, a backslash followed by a newline can be inserted inside the string literal:
"hello \
world, \
no newlines" # same as "hello world, no newlines"
In this case, leading whitespace is not included in the resulting string.
If you need to write a string that has many double quotes, parentheses, or similar characters, you can use alternative literals:
# Supports double quotes and nested parentheses
%(hello ("world")) # same as "hello (\"world\")"
# Supports double quotes and nested brackets
%[hello ["world"]] # same as "hello [\"world\"]"
# Supports double quotes and nested curlies
%{hello {"world"}} # same as "hello {\"world\"}"
# Supports double quotes and nested angles
%<hello <"world">> # same as "hello <\"world\">"
To create a String with embedded expressions, you can use string interpolation:
a = 1
b = 2
"sum = #{a + b}" # "sum = 3"
This ends up invoking Object#to_s(IO) on each expression enclosed by #{...}.
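As an illustrative sketch, a type can control how it appears in an interpolated string by defining to_s(io). The Point class below is hypothetical and exists only for this example:

```crystal
# Hypothetical type, not part of the standard library.
class Point
  def initialize(@x : Int32, @y : Int32)
  end

  # Interpolation appends this object to an IO through this method.
  def to_s(io : IO)
    io << "(" << @x << ", " << @y << ")"
  end
end

point = Point.new(1, 2)
message = "point = #{point}" # => "point = (1, 2)"
```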
If you need to dynamically build a string, use String#build or MemoryIO.
Included Modules
Defined in:
string.cr
big/big_int.cr
big/big_float.cr
json/to_json.cr
yaml/to_yaml.cr
Class Method Summary
-
.build(capacity = 64, &block) : self
Builds a String by creating a String::Builder with the given initial capacity, yielding it to the block and finally getting a String out of it.
- .new(capacity : Int, &block)
- .new(pull : JSON::PullParser)
- .new(pull : YAML::PullParser)
-
.new(chars : Pointer(UInt8), bytesize, size = 0)
Creates a new String from a pointer, indicating its bytesize count and, optionally, the UTF-8 codepoints count (size).
-
.new(bytes : Slice(UInt8), encoding : String, invalid : Symbol | Nil = nil) : String
Creates a new String from the given bytes, which are encoded in the given encoding.
-
.new(chars : Pointer(UInt8))
Creates a String from a pointer.
-
.new(slice : Slice(UInt8))
Creates a String from the given slice.
Instance Method Summary
-
#%(other)
Interpolates other into the string using
Kernel#sprintf
-
#*(times : Int)
Makes a new string by adding str to itself times times.
-
#+(other : self)
Concatenates str and other.
-
#+(char : Char)
Concatenates str and other.
-
#<=>(other : self)
Compares this string with other, returning -1, 0 or +1 depending on whether this string is less, equal or greater than other.
- #==(other : self)
-
#=~(other)
Tests whether str matches regex.
-
#=~(regex : Regex)
Tests whether str matches regex.
- #[](str : String)
-
#[](range : Range(Int, Int))
Returns a substring by using a Range's begin and end as character indices.
- #[](regex : Regex, group)
-
#[](start : Int, count : Int)
Returns a substring starting from the start character of size count.
- #[](regex : Regex)
-
#[](index : Int)
Returns the Char at the given index, or raises IndexError if out of bounds.
- #[]?(regex : Regex, group)
- #[]?(regex : Regex)
- #[]?(str : String)
- #[]?(index : Int)
- #ascii_only?
- #at(index : Int, &block)
- #at(index : Int)
- #byte_at(index)
- #byte_at(index, &block)
- #byte_at?(index)
- #byte_index(string : String, offset = 0)
- #byte_index(byte : Int, offset = 0)
-
#byte_index_to_char_index(index)
Returns the char index of a byte index, or nil if out of bounds.
- #byte_slice(start : Int, count : Int)
- #byte_slice(start : Int)
- #bytes
-
#bytesize : Int32
Returns the number of bytes in this string.
-
#camelcase
Converts underscores to camelcase boundaries.
-
#capitalize
Returns a new string with the first letter converted to uppercase and every subsequent letter converted to lowercase.
- #char_at(index)
-
#char_index_to_byte_index(index)
Returns the byte index of a char index, or nil if out of bounds.
-
#chars
Returns an array of all characters in the string.
-
#check_no_null_byte
Raises an ArgumentError if self has null bytes.
-
#chomp(char : Char)
Returns a new String with char removed if the string ends with it.
-
#chomp
Returns a new String with the last carriage return removed (that is, it will remove \n, \r, and \r\n).
-
#chomp(str : String)
Returns a new String with str removed if the string ends with it.
-
#chop
Returns a new String with the last character removed.
- #codepoint_at(index)
-
#codepoints
Returns an array of the codepoints that make the string.
-
#compare(other : String, case_insensitive = false)
Compares this string with other, returning -1, 0 or +1 depending on whether this string is less, equal or greater than other, optionally in a case_insensitive manner.
-
#count(&block)
Yields each char in this string to the block, returns the number of times the block returned a truthy value.
-
#count(other : Char)
Counts the occurrences of other in this string.
-
#count(*sets)
Sets should be a list of strings following the rules described at Char#in_set?.
-
#delete(&block)
Yields each char in this string to the block.
-
#delete(char : Char)
Returns a new string with all occurrences of char removed.
-
#delete(*sets)
Sets should be a list of strings following the rules described at Char#in_set?.
-
#downcase
Returns a new string with each uppercase letter replaced with its lowercase counterpart.
- #dump(io)
- #dump
- #dump_unquoted(io)
- #dump_unquoted
-
#each_byte
Returns an iterator over each byte in the string.
-
#each_byte(&block)
Yields each byte in the string to the block.
-
#each_char
Returns an iterator over each character in the string.
-
#each_char(&block)
Yields each character in the string to the block.
-
#each_char_with_index(&block)
Yields each character and its index in the string to the block.
-
#each_codepoint
Returns an iterator for each codepoint.
-
#each_codepoint(&block)
Yields each codepoint to the block.
-
#each_line(&block)
Splits the string after each newline and yields each line to a block.
-
#each_line
Returns an Iterator which yields each line of this string (see String#each_line).
-
#empty?
Returns true if this is the empty string, "".
-
#encode(encoding : String, invalid : Symbol | Nil = nil) : Slice(UInt8)
Returns a slice of bytes containing this string encoded in the given encoding.
- #ends_with?(char : Char)
- #ends_with?(str : String)
-
#gsub(string : String, &block)
Returns a string where all occurrences of the given string are replaced with the block's value.
-
#gsub(&block : Char -> _)
Returns a string where each character yielded to the given block is replaced by the block's return value.
-
#gsub(char : Char, replacement)
Returns a string where all occurrences of the given char are replaced with the given replacement.
-
#gsub(string : String, replacement)
Returns a string where all occurrences of the given string are replaced with the given replacement.
-
#gsub(hash : Hash(Char, _))
Returns a string where all chars in the given hash are replaced by the corresponding hash values.
-
#gsub(pattern : Regex, hash : Hash(String, _))
Returns a string where all occurrences of the given pattern are replaced with a hash of replacements.
-
#gsub(pattern : Regex, replacement, backreferences = true)
Returns a string where all occurrences of the given pattern are replaced with the given replacement.
-
#gsub(pattern : Regex, &block)
Returns a string where all occurrences of the given pattern are replaced by the block's return value.
-
#has_back_references?
This returns true if this string has '\\' in it.
-
#hash
Returns a hash based on this string’s size and content.
-
#includes?(search : Char | String)
Returns true if the string contains search.
-
#index(search : String, offset = 0)
Returns the index of search in the string, or nil if the string is not present.
-
#index(search : Char, offset = 0)
Returns the index of search in the string, or nil if the string is not present.
-
#insert(index : Int, other : String)
Returns a new String that results from inserting other in self at index.
-
#insert(index : Int, other : Char)
Returns a new String that results from inserting other in self at index.
- #inspect(io)
- #inspect_unquoted(io)
- #inspect_unquoted
- #lines
-
#ljust(len, char : Char = ' ')
Adds instances of char to the right of the string until its size is at least len.
-
#lstrip
Returns a new string with leading whitespace removed.
-
#match(regex : Regex, pos = 0, &block)
Searches the string for regex starting at pos, yielding the match if there is one.
-
#match(regex : Regex, pos = 0)
Finds match of regex, starting at pos.
-
#reverse
Reverses the order of characters in the string.
-
#rindex(search : String, offset = size - search.size)
Returns the index of the last appearance of search in the string. If offset is present, it defines the position to end the search (characters beyond this point are ignored).
-
#rindex(search : Char, offset = size - 1)
Returns the index of the last appearance of search in the string. If offset is present, it defines the position to end the search (characters beyond this point are ignored).
-
#rjust(len, char : Char = ' ')
Adds instances of char to the left of the string until its size is at least len.
-
#rstrip
Returns a new string with trailing whitespace removed.
-
#scan(pattern : Regex)
Searches the string for instances of pattern, returning an array of Regex::MatchData for each match.
-
#scan(pattern : String, &block)
Searches the string for instances of pattern, yielding the matched string for each match.
-
#scan(pattern : String)
Searches the string for instances of pattern, returning an array of the matched string for each match.
-
#scan(pattern : Regex, &block)
Searches the string for instances of pattern, yielding a Regex::MatchData for each match.
-
#size
Returns the number of unicode codepoints in this string.
-
#split(separator : String, limit = nil)
Makes an array by splitting the string on separator (and removing instances of separator).
-
#split(separator : Regex, limit = nil)
Makes an array by splitting the string on separator (and removing instances of separator).
-
#split(limit : Int32 | Nil = nil)
Makes an array by splitting the string on any ASCII whitespace characters (and removing that whitespace).
-
#split(separator : Char, limit = nil)
Makes an array by splitting the string on the given character separator (and removing that character).
-
#squeeze(char : Char)
Returns a new string, with all runs of char replaced by one instance.
-
#squeeze(&block)
Yields each char in this string to the block.
-
#squeeze(*sets : String)
Sets should be a list of strings following the rules described at Char#in_set?.
-
#squeeze
Returns a new string in which each run of identical consecutive characters is replaced by a single instance.
- #starts_with?(str : String)
- #starts_with?(char : Char)
-
#strip
Returns a new string with leading and trailing whitespace removed.
-
#sub(pattern : Regex, hash : Hash(String, _))
Returns a string where the first occurrence of the given pattern is replaced with the matching entry from the hash of replacements.
-
#sub(char : Char, replacement)
Returns a string where the first occurrence of char is replaced by replacement.
-
#sub(&block : Char -> _)
Returns a new string where the first character is yielded to the given block and replaced by its return value.
-
#sub(pattern : Regex, &block)
Returns a string where the first occurrence of pattern is replaced by the block's return value.
-
#sub(string : String, &block)
Returns a string where the first occurrence of the given string is replaced with the block's value.
-
#sub(hash : Hash(Char, _))
Returns a string where the first char in the string matching a key in the given hash is replaced by the corresponding hash value.
-
#sub(string : String, replacement)
Returns a string where the first occurrence of the given string is replaced with the given replacement.
-
#sub(pattern : Regex, replacement, backreferences = true)
Returns a string where the first occurrence of pattern is replaced by replacement
-
#succ
Returns the successor of the string.
- #to_big_f
-
#to_big_i(base = 10) : BigInt
Returns a BigInt from this string, in the given base.
-
#to_f
Returns the result of interpreting leading characters in this string as a floating point number (Float64).
-
#to_f32
Returns the result of interpreting leading characters in this string as a floating point number (Float32).
-
#to_f64
Same as #to_f.
-
#to_i(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i, but returns the block's value if there is not a valid number at the start of this string, or if the resulting integer doesn't fit an Int32.
-
#to_i(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true)
Returns the result of interpreting leading characters in this string as an integer base base (between 2 and 36).
-
#to_i16(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int16
Same as #to_i but returns an Int16.
-
#to_i16(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns an Int16 or the block's value.
-
#to_i16?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int16 | Nil
Same as #to_i but returns an Int16 or nil.
-
#to_i32(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i.
-
#to_i32(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int32
Same as #to_i.
-
#to_i32?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int32 | Nil
Same as #to_i.
-
#to_i64(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns an Int64 or the block's value.
-
#to_i64(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int64
Same as #to_i but returns an Int64.
-
#to_i64?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int64 | Nil
Same as #to_i but returns an Int64 or nil.
-
#to_i8(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int8
Same as #to_i but returns an Int8.
-
#to_i8(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns an Int8 or the block's value.
-
#to_i8?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : Int8 | Nil
Same as #to_i but returns an Int8 or nil.
-
#to_i?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true)
Same as #to_i, but returns nil if there is not a valid number at the start of this string, or if the resulting integer doesn't fit an Int32.
- #to_json(io)
- #to_s(io)
- #to_s
- #to_slice
-
#to_u16(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns a UInt16 or the block's value.
-
#to_u16(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt16
Same as #to_i but returns a UInt16.
-
#to_u16?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt16 | Nil
Same as #to_i but returns a UInt16 or nil.
-
#to_u32(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns a UInt32 or the block's value.
-
#to_u32(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt32
Same as #to_i but returns a UInt32.
-
#to_u32?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt32 | Nil
Same as #to_i but returns a UInt32 or nil.
-
#to_u64(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns a UInt64 or the block's value.
-
#to_u64(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt64
Same as #to_i but returns a UInt64.
-
#to_u64?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt64 | Nil
Same as #to_i but returns a UInt64 or nil.
-
#to_u8(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true, &block)
Same as #to_i but returns a UInt8 or the block's value.
-
#to_u8(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt8
Same as #to_i but returns a UInt8.
-
#to_u8?(base : Int = 10, whitespace = true, underscore = false, prefix = false, strict = true) : UInt8 | Nil
Same as #to_i but returns a UInt8 or nil.
- #to_unsafe
- #to_yaml(yaml : YAML::Generator)
-
#tr(from : String, to : String)
Returns a new string translating characters using from and to as a map.
-
#underscore
Converts camelcase boundaries to underscores.
- #unsafe_byte_at(index)
- #unsafe_byte_slice(byte_offset)
- #unsafe_byte_slice(byte_offset, count)
-
#upcase
Returns a new string with each lowercase letter replaced with its uppercase counterpart.
Macro Summary
Instance methods inherited from module Comparable(T)
<, <=, <=>, ==, >, >=
Instance methods inherited from class Reference
==, hash, inspect, object_id, same?, to_s
Instance methods inherited from class Object
!=, !~, ==, ===, =~, class, clone, crystal_type_id, dup, hash, inspect, itself, not_nil!, tap, to_json, to_pretty_json, to_s, to_yaml, try
Class methods inherited from class Object
==, ===, cast, from_json, from_yaml, hash, inspect, name, to_s, |
Class Method Detail
Builds a String by creating a String::Builder with the given initial capacity, yielding it to the block and finally getting a String out of it. The String::Builder automatically resizes as needed.
str = String.build do |str|
  str << "hello "
  str << 1
end
str # => "hello 1"
Creates a new String by allocating a buffer (Pointer(UInt8)) with the given capacity, then yielding that buffer. The block must return a tuple with the bytesize and size (UTF-8 codepoints count) of the String. If the returned size is zero, the UTF-8 codepoints count will be lazily computed.
This method is unsafe: the bytesize returned by the block must be less than or equal to the capacity given to this String. In the future this method might check that the returned bytesize is less than or equal to the capacity, making it a safe method.
If you need to build a String where the maximum capacity is unknown, use String#build.
str = String.new(4) do |buffer|
  buffer[0] = 'a'.ord.to_u8
  buffer[1] = 'b'.ord.to_u8
  {2, 2}
end
str # => "ab"
Note: if the buffer doesn't end up denoting a valid UTF-8 sequence, this method still succeeds. However, when iterating it or indexing it, an InvalidByteSequenceError will be raised.
Creates a new String from a pointer, indicating its bytesize count and, optionally, the UTF-8 codepoints count (size). Bytes will be copied from the pointer.
If the given size is zero, the amount of UTF-8 codepoints will be lazily computed when needed.
ptr = Pointer.malloc(4) { |i| ('a'.ord + i).to_u8 }
String.new(ptr, 2) # => "ab"
Note: if the chars don't denote a valid UTF-8 sequence, this method still succeeds. However, when iterating it or indexing it, an InvalidByteSequenceError will be raised.
Creates a new String from the given bytes, which are encoded in the given encoding.
The invalid argument can be:
- nil: an exception is raised on invalid byte sequences
- :skip: invalid byte sequences are ignored
slice = Slice.new(2, 0_u8)
slice[0] = 186_u8
slice[1] = 195_u8
String.new(slice, "GB2312") # => "好"
Creates a String from a pointer. Bytes will be copied from the pointer.
This method is unsafe: the pointer must point to data that eventually contains a zero byte that indicates the end of the string. Otherwise, the result of this method is undefined and might cause a segmentation fault.
This method is typically used in C bindings, where you get a char* from a library and the library guarantees that this pointer eventually has an ending zero byte.
ptr = Pointer.malloc(5) { |i| i == 4 ? 0_u8 : ('a'.ord + i).to_u8 }
String.new(ptr) # => "abcd"
Note: if the chars don't denote a valid UTF-8 sequence, this method still succeeds. However, when iterating it or indexing it, an InvalidByteSequenceError will be raised.
Creates a String from the given slice. Bytes will be copied from the slice.
This method is always safe to call, and the resulting string will have the contents and size of the slice.
slice = Slice.new(4) { |i| ('a'.ord + i).to_u8 }
String.new(slice) # => "abcd"
Note: if the slice doesn't denote a valid UTF-8 sequence, this method still succeeds. However, when iterating it or indexing it, an InvalidByteSequenceError will be raised.
Instance Method Detail
Interpolates other into the string using Kernel#sprintf.
"Party like it's %d!!!" % 1999 # => "Party like it's 1999!!!"
Makes a new string by adding str to itself times times.
"Developers! " * 4
# => "Developers! Developers! Developers! Developers!"
Concatenates str and other.
"abc" + "def" # => "abcdef"
"abc" + 'd' # => "abcd"
Compares this string with other, returning -1, 0 or +1 depending on whether this string is less, equal or greater than other.
Comparison is done byte-per-byte: if a byte is less than the other corresponding byte, -1 is returned and so on.
If the strings are of different lengths, and the strings are equal when compared up to the shortest length, then the longer string is considered greater than the shorter one.
"abcdef" <=> "abcde" # => 1
"abcdef" <=> "abcdef" # => 0
"abcdef" <=> "abcdefg" # => -1
"abcdef" <=> "ABCDEF" # => 1
Tests whether str matches regex.
If successful, it returns the position of the first match.
If unsuccessful, it returns nil.
If the argument isn't a Regex, it returns nil.
"Haystack" =~ /ay/ # => 1
"Haystack" =~ /z/ # => nil
"Haystack" =~ 45 # => nil
Returns a substring by using a Range's begin and end as character indices. Indices can be negative to start counting from the end of the string.
Raises IndexError if the range's start is not in range.
"hello"[0..2] # "hel"
"hello"[0...2] # "he"
"hello"[1..-1] # "ello"
"hello"[1...-1] # "ell"
Returns a substring starting from the start character of size count.
The start argument can be negative to start counting from the end of the string.
Raises IndexError if start isn't in range.
Raises ArgumentError if count is negative.
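A short sketch of the start and count form. The last line assumes that a count running past the end of the string is clamped rather than raising:

```crystal
middle = "hello"[1, 3]    # => "ell"
from_end = "hello"[-3, 2] # => "ll" (start counts back from the end)
clamped = "hello"[3, 10]  # => "lo" (count is larger than what remains)
```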
Returns the Char at the given index, or raises IndexError if out of bounds.
Negative indices can be used to start counting from the end of the string.
"hello"[0] # 'h'
"hello"[1] # 'e'
"hello"[-1] # 'o'
"hello"[-2] # 'l'
"hello"[5] # raises IndexError
Returns the char index of a byte index, or nil if out of bounds.
It is valid to pass #bytesize to index, and in this case the answer will be the size of this string.
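For illustration, a sketch that mirrors the #char_index_to_byte_index examples in the opposite direction. Each character of "こんにちは" occupies three bytes, so byte index 3 is the start of the second character:

```crystal
ascii_index = "hello".byte_index_to_char_index(1)      # => 1
second_char = "こんにちは".byte_index_to_char_index(3)  # => 1
at_bytesize = "こんにちは".byte_index_to_char_index(15) # => 5 (the string's size)
past_end = "hello".byte_index_to_char_index(6)         # => nil
```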
Returns this string's bytes as an Array(UInt8).
"hello".bytes # => [104, 101, 108, 108, 111]
"你好".bytes # => [228, 189, 160, 229, 165, 189]
Returns the number of bytes in this string.
"hello".bytesize # => 5
"你好".bytesize # => 6
Converts underscores to camelcase boundaries.
"eiffel_tower".camelcase # => "EiffelTower"
Returns a new string with the first letter converted to uppercase and every subsequent letter converted to lowercase.
"hEllO".capitalize # => "Hello"
Returns the byte index of a char index, or nil if out of bounds.
It is valid to pass #size to index, and in this case the answer will be the bytesize of this string.
"hello".char_index_to_byte_index(1) # => 1
"hello".char_index_to_byte_index(5) # => 5
"こんにちは".char_index_to_byte_index(1) # => 3
"こんにちは".char_index_to_byte_index(5) # => 15
Raises an ArgumentError if self has null bytes. Returns self otherwise.
This method should sometimes be called before passing a String to a C function.
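A minimal sketch of both outcomes. The "\0" escape is the octal form described in the overview and yields a null byte:

```crystal
# No null byte: the string itself is returned.
safe = "hello".check_no_null_byte # => "hello"

# An embedded null byte makes the check raise.
rejected = begin
  "hel\0lo".check_no_null_byte
  false
rescue ArgumentError
  true
end
rejected # => true
```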
Returns a new String with char removed if the string ends with it.
"hello".chomp('o') # => "hell"
"hello".chomp('a') # => "hello"
Returns a new String with the last carriage return removed (that is, it will remove \n, \r, and \r\n).
"string\r\n".chomp # => "string"
"string\n\r".chomp # => "string\n"
"string\n".chomp # => "string"
"string".chomp # => "string"
"x".chomp.chomp # => "x"
See also: #chop
Returns a new String with str removed if the string ends with it.
"hello".chomp("llo") # => "he"
"hello".chomp("ol") # => "hello"
Returns a new String with the last character removed.
If the string ends with \r\n, both characters are removed.
Applying chop to an empty string returns an empty string.
"string\r\n".chop # => "string"
"string\n\r".chop # => "string\n"
"string\n".chop # => "string"
"string".chop # => "strin"
"x".chop.chop # => ""
See also: #chomp
Returns an array of the codepoints that make the string. See Char#ord
"ab☃".codepoints # => [97, 98, 9731]
Compares this string with other, returning -1, 0 or +1 depending on whether this string is less, equal or greater than other, optionally in a case_insensitive manner.
If case_insensitive is false, this method delegates to #<=>. Otherwise, the strings are compared char-by-char, and ASCII characters are compared in a case-insensitive way.
"abcdef".compare("abcde") # => 1
"abcdef".compare("abcdef") # => 0
"abcdef".compare("abcdefg") # => -1
"abcdef".compare("ABCDEF") # => 1
"abcdef".compare("ABCDEF", case_insensitive: true) # => 0
"abcdef".compare("ABCDEG", case_insensitive: true) # => -1
Yields each char in this string to the block, returns the number of times the block returned a truthy value.
"aabbcc".count { |c| ['a', 'b'].includes?(c) } # => 4
Counts the occurrences of other in this string.
"aabbcc".count('a') # => 2
Sets should be a list of strings following the rules described at Char#in_set?. Returns the number of characters in this string that match the given set.
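A hedged sketch of counting with sets. It assumes the Char#in_set? rules treat "a-b" as a range and a leading "^" as negation, and that a character must match every set passed:

```crystal
in_range = "aabbcc".count("a-b")       # => 4 ('a' and 'b')
negated = "aabbcc".count("^a")         # => 4 (everything except 'a')
intersect = "aabbcc".count("a-c", "^b") # => 4 ('a' and 'c')
```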
Yields each char in this string to the block. Returns a new string with all characters for which the block returned a truthy value removed.
"aabbcc".delete { |c| ['a', 'b'].includes?(c) } # => "cc"
Returns a new string with all occurrences of char removed.
"aabbcc".delete('b') # => "aacc"
Sets should be a list of strings following the rules described at Char#in_set?. Returns a new string with all characters that match the given set removed.
"aabbccdd".delete("a-c") # => "dd"
Returns a new string with each uppercase letter replaced with its lowercase counterpart.
"hEllO".downcase # => "hello"
Returns an iterator over each byte in the string.
bytes = "ab☃".each_byte
bytes.next # => 97
bytes.next # => 98
bytes.next # => 226
bytes.next # => 152
bytes.next # => 131
Yields each byte in the string to the block.
"ab☃".each_byte do |byte|
  byte # => 97, 98, 226, 152, 131
end
Returns an iterator over each character in the string.
chars = "ab☃".each_char
chars.next # => 'a'
chars.next # => 'b'
chars.next # => '☃'
Yields each character in the string to the block.
"ab☃".each_char do |char|
  char # => 'a', 'b', '☃'
end
Yields each character and its index in the string to the block.
"ab☃".each_char_with_index do |char, index|
  char # => 'a', 'b', '☃'
  index # => 0, 1, 2
end
Returns an iterator for each codepoint. See Char#ord
codepoints = "ab☃".each_codepoint
codepoints.next # => 97
codepoints.next # => 98
codepoints.next # => 9731
Yields each codepoint to the block. See Char#ord
"ab☃".each_codepoint do |codepoint|
  codepoint # => 97, 98, 9731
end
Splits the string after each newline and yields each line to a block.
haiku = "the first cold shower
even the monkey seems to want
a little coat of straw"
haiku.each_line do |stanza|
  puts stanza.upcase
end
# => THE FIRST COLD SHOWER
# => EVEN THE MONKEY SEEMS TO WANT
# => A LITTLE COAT OF STRAW
Returns a slice of bytes containing this string encoded in the given encoding.
The invalid argument can be:
- nil: an exception is raised on invalid byte sequences
- :skip: invalid byte sequences are ignored
"好".encode("GB2312") # => [186, 195]
"好".bytes # => [229, 165, 189]
Returns a string where all occurrences of the given string are replaced with the block's value.
"hello yellow".gsub("ll") { "dd" } # => "heddo yeddow"
Returns a string where each character yielded to the given block is replaced by the block's return value.
"hello".gsub { |x| (x.ord + 1).chr } # => "ifmmp"
"hello".gsub { "hi" } # => "hihihihihi"
Returns a string where all occurrences of the given char are replaced with the given replacement.
"hello".gsub('l', "lo") # => "heloloo"
"hello world".gsub('o', 'a') # => "hella warld"
Returns a string where all occurrences of the given string are replaced with the given replacement.
"hello yellow".gsub("ll", "dd") # => "heddo yeddow"
Returns a string where all chars in the given hash are replaced by the corresponding hash values.
"hello".gsub({'e' => 'a', 'l' => 'd'}) # => "haddo"
Returns a string where all occurrences of the given pattern are replaced with a hash of replacements. If the hash contains the matched pattern, the corresponding value is used as a replacement. Otherwise the match is not included in the returned string.
# "he" and "l" are matched and replaced,
# but "o" is not and so is not included
"hello".gsub(/(he|l|o)/, {"he" => "ha", "l" => "la"}) # => "halala"
Returns a string where all occurrences of the given pattern are replaced with the given replacement.
"hello".gsub(/[aeiou]/, '*') # => "h*ll*"
Within replacement, the special match variable $~ will not refer to the current match.
If backreferences is true (the default value), replacement can include backreferences:
"hello".gsub(/[aeiou]/, "(\\0)") # => "h(e)ll(o)"
When substitution is performed, any backreferences found in replacement will be replaced with the contents of the corresponding capture group in pattern. Backreferences to capture groups that were not present in pattern or that did not match will be skipped. See Regex for information about capture groups.
Backreferences are expressed in the form "\\d", where d is a group number, or "\\k<name>" where name is the name of a named capture group.
A sequence of literal characters resembling a backreference can be expressed by placing "\\" before the sequence.
"foo".gsub(/o/, "x\\0x") # => "fxoxxox"
"foofoo".gsub(/(?<bar>oo)/, "|\\k<bar>|") # => "f|oo|f|oo|"
"foo".gsub(/o/, "\\\\0") # => "f\\0\\0"
Raises ArgumentError if an incomplete named back-reference is present in replacement.
Raises IndexError if a named group referenced in replacement is not present in pattern.
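The backreference rules above can be sketched with a few small substitutions. The input strings and variable names are made up for illustration, and the last call assumes that backreferences can be passed positionally as the third argument.

```crystal
# Numbered backreferences: swap the two captured groups in every match.
swapped = "a=1, b=2".gsub(/(\w+)=(\w+)/, "\\2=\\1")
p swapped # => "1=a, 2=b"

# Named backreference: wrap each captured word in brackets.
tagged = "hello world".gsub(/(?<word>\w+)/, "[\\k<word>]")
p tagged # => "[hello] [world]"

# With backreferences disabled, the replacement is inserted verbatim.
literal = "hello".gsub(/[aeiou]/, "(\\0)", false)
p literal # => "h(\\0)ll(\\0)"
```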
Returns a string where all occurrences of the given pattern are replaced by the block's return value.
"hello".gsub(/./) { |s| s[0].ord.to_s + ' ' } # => "104 101 108 108 111 "
Returns true if this string contains a backslash ('\\'). The backslash might not be part of a backreference, but since '\\' is most likely used for backreferences, this check is faster than parsing the whole string.
Returns true if the string contains search.
"Team".includes?('i') # => false
"Dysfunctional".includes?("fun") # => true
Returns the index of search in the string, or nil if search is not found.
If offset is present, it defines the position to start the search.
"Hello, World".index('o') # => 4
"Hello, World".index('Z') # => nil
"Hello, World".index("o", 5) # => 8
"Hello, World".index("H", 2) # => nil
Returns the index of search in the string, or nil if search is not found.
If offset is present, it defines the position to start the search.
"Hello, World".index('o') # => 4
"Hello, World".index('Z') # => nil
"Hello, World".index("o", 5) # => 8
"Hello, World".index("H", 2) # => nil
Returns a new String that results from inserting other into self at index. Negative indices count from the end of the string, and insert after the given index.
Raises IndexError if the index is out of bounds.
"abcd".insert(0, "FOO") # => "FOOabcd"
"abcd".insert(3, "FOO") # => "abcFOOd"
"abcd".insert(4, "FOO") # => "abcdFOO"
"abcd".insert(-3, "FOO") # => "abFOOcd"
"abcd".insert(-1, "FOO") # => "abcdFOO"
"abcd".insert(5, "FOO") # raises IndexError
"abcd".insert(-6, "FOO") # raises IndexError
Returns a new String that results from inserting other into self at index. Negative indices count from the end of the string, and insert after the given index.
Raises IndexError if the index is out of bounds.
"abcd".insert(0, 'X') # => "Xabcd"
"abcd".insert(3, 'X') # => "abcXd"
"abcd".insert(4, 'X') # => "abcdX"
"abcd".insert(-3, 'X') # => "abXcd"
"abcd".insert(-1, 'X') # => "abcdX"
"abcd".insert(5, 'X') # raises IndexError
"abcd".insert(-6, 'X') # raises IndexError
Adds instances of char to the right of the string until its size is at least len.
"Purple".ljust(8) # => "Purple  "
"Purple".ljust(8, '-') # => "Purple--"
"Aubergine".ljust(8) # => "Aubergine"
Returns a new string with leading whitespace removed.
" hello ".lstrip # => "hello "
"\tgoodbye\r\n".lstrip # => "goodbye\r\n"
Searches the string for regex starting at pos, yielding the match if there is one.
"Pine".match(/P/) do |match|
puts match
end
# => #<Regex::MatchData "P">
"Oak".match(/P/) do |match|
# This is never invoked.
puts match
end
Reverses the order of characters in the string.
"Argentina".reverse # => "anitnegrA"
"racecar".reverse # => "racecar"
Returns the index of the last appearance of c in the string, or nil if c is not found.
If offset is present, it defines the position to end the search (characters beyond this point are ignored).
"Hello, World".rindex('o') # => 8
"Hello, World".rindex('Z') # => nil
"Hello, World".rindex("o", 5) # => 4
"Hello, World".rindex("H", 2) # => nil
Returns the index of the last appearance of c in the string, or nil if c is not found.
If offset is present, it defines the position to end the search (characters beyond this point are ignored).
"Hello, World".rindex('o') # => 8
"Hello, World".rindex('Z') # => nil
"Hello, World".rindex("o", 5) # => 4
"Hello, World".rindex("H", 2) # => nil
Adds instances of char to the left of the string until its size is at least len.
"Purple".rjust(8) # => "  Purple"
"Purple".rjust(8, '-') # => "--Purple"
"Aubergine".rjust(8) # => "Aubergine"
Returns a new string with trailing whitespace removed.
" hello ".rstrip # => " hello"
"\tgoodbye\r\n".rstrip # => "\tgoodbye"
Searches the string for instances of pattern, returning an array of Regex::MatchData for each match.
Searches the string for instances of pattern, yielding the matched string for each match.
Searches the string for instances of pattern, returning an array of the matched string for each match.
Searches the string for instances of pattern, yielding a Regex::MatchData for each match.
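The scan variants above carry no examples, so here is a brief sketch of three of them. The sample text and variable names are invented for illustration.

```crystal
text = "x=10, y=20"

# Regex pattern, no block: an array of Regex::MatchData, one per match.
pairs = text.scan(/(\w)=(\d+)/).map { |m| {m[1], m[2].to_i} }
p pairs # => [{"x", 10}, {"y", 20}]

# String pattern, no block: an array of the matched strings.
hits = "banana".scan("an")
p hits # => ["an", "an"]

# Regex pattern with a block: each Regex::MatchData is yielded in turn.
total = 0
text.scan(/\d+/) { |m| total += m[0].to_i }
p total # => 30
```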
Returns the number of unicode codepoints in this string.
"hello".size # => 5
"你好".size # => 2
Makes an array by splitting the string on separator (and removing instances of separator).
If limit is present, the array will be limited to limit items and the final item will contain the remainder of the string.
If separator is an empty string (""), the string will be separated into one-character strings.
long_river_name = "Mississippi"
long_river_name.split("ss") # => ["Mi", "i", "ippi"]
long_river_name.split("i") # => ["M", "ss", "ss", "pp"]
long_river_name.split("") # => ["M", "i", "s", "s", "i", "s", "s", "i", "p", "p", "i"]
Makes an array by splitting the string on separator (and removing instances of separator).
If limit is present, the array will be limited to limit items and the final item will contain the remainder of the string.
If separator is an empty regex (//), the string will be separated into one-character strings.
long_river_name = "Mississippi"
long_river_name.split(/s+/) # => ["Mi", "i", "ippi"]
long_river_name.split(//) # => ["M", "i", "s", "s", "i", "s", "s", "i", "p", "p", "i"]
Makes an array by splitting the string on any ASCII whitespace characters (and removing that whitespace).
If limit is present, up to limit new strings will be created, with the entire remainder added to the last string.
old_pond = "
Old pond
a frog leaps in
water's sound
"
old_pond.split # => ["Old", "pond", "a", "frog", "leaps", "in", "water's", "sound"]
old_pond.split(3) # => ["Old", "pond", "a frog leaps in\n water's sound\n"]
Makes an array by splitting the string on the given character separator (and removing that character).
If limit is present, up to limit new strings will be created, with the entire remainder added to the last string.
"foo,bar,baz".split(',') # => ["foo", "bar", "baz"]
"foo,bar,baz".split(',', 2) # => ["foo", "bar,baz"]
Returns a new string, with all runs of char replaced by one instance.
"a    bbb".squeeze(' ') # => "a bbb"
Yields each char in this string to the block. Returns a new string in which every character that repeats the previous one is removed, provided the block returned a truthy value for it.
"aaabbbccc".squeeze { |c| ['a', 'b'].includes?(c) } # => "abccc"
"aaabbbccc".squeeze { |c| ['a', 'c'].includes?(c) } # => "abbbc"
Sets should be a list of strings following the rules described at Char#in_set?. Returns a new string with all runs of the same character replaced by one instance, if they match the given set.
If no set is given, all characters are matched.
"aaabbbcccddd".squeeze("b-d") # => "aaabcd"
"a bbb".squeeze # => "a b"
Returns a new string in which every character that repeats the previous one is removed.
"a bbb".squeeze # => "a b"
Returns a new string with leading and trailing whitespace removed.
" hello ".strip # => "hello"
"\tgoodbye\r\n".strip # => "goodbye"
Returns a string where the first occurrence of the given pattern is replaced with the matching entry from the hash of replacements. If the first match is not included in the hash, nothing is replaced.
"hello".sub(/(he|l|o)/, {"he": "ha", "l": "la"}) # => "hallo"
"hello".sub(/(he|l|o)/, {"l": "la"}) # => "hello"
Returns a string where the first occurrence of char is replaced by replacement.
"hello".sub('l', "lo") # => "helolo"
"hello world".sub('o', 'a') # => "hella world"
Returns a new string where the first character is yielded to the given block and replaced by its return value.
"hello".sub { |x| (x.ord + 1).chr } # => "iello"
"hello".sub { "hi" } # => "hiello"
Returns a string where the first occurrence of pattern is replaced by the block's return value.
"hello".sub(/./) { |s| s[0].ord.to_s + ' ' } # => "104 ello"
Returns a string where the first occurrence of the given string is replaced with the block's value.
"hello yellow".sub("ll") { "dd" } # => "heddo yellow"
Returns a string where the first char in the string matching a key in the given hash is replaced by the corresponding hash value.
"hello".sub({'a' => 'b', 'l' => 'd'}) # => "hedlo"
Returns a string where the first occurrence of the given string is replaced with the given replacement.
"hello yellow".sub("ll", "dd") # => "heddo yellow"
Returns a string where the first occurrence of pattern is replaced by replacement.
"hello".sub(/[aeiou]/, "*") # => "h*llo"
Within replacement, the special match variable $~ will not refer to the current match.
If backreferences is true (the default value), replacement can include backreferences:
"hello".sub(/[aeiou]/, "(\\0)") # => "h(e)llo"
When substitution is performed, any backreferences found in replacement will be replaced with the contents of the corresponding capture group in pattern. Backreferences to capture groups that were not present in pattern or that did not match will be skipped. See Regex for information about capture groups.
Backreferences are expressed in the form "\\d", where d is a group number, or "\\k<name>", where name is the name of a named capture group.
A sequence of literal characters resembling a backreference can be expressed by placing "\\" before the sequence.
"foo".sub(/o/, "x\\0x") # => "fxoxo"
"foofoo".sub(/(?<bar>oo)/, "|\\k<bar>|") # => "f|oo|foo"
"foo".sub(/o/, "\\\\0") # => "f\\0o"
Raises ArgumentError if an incomplete named back-reference is present in replacement.
Raises IndexError if a named group referenced in replacement is not present in pattern.
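As a sketch of numbered backreferences with sub, the following reorders the parts of a date and shows that only the first match is affected. The input strings and variable names are invented for illustration.

```crystal
# Reorder year-month-day into day/month/year using the three capture groups.
reordered = "2024-01-15".sub(/(\d+)-(\d+)-(\d+)/, "\\3/\\2/\\1")
p reordered # => "15/01/2024"

# Only the first match is rewritten; later matches are left untouched.
first_only = "a-b a-b".sub(/(\w)-(\w)/, "\\2-\\1")
p first_only # => "b-a a-b"
```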
Returns the successor of the string. The successor is calculated by incrementing characters starting from the rightmost alphanumeric (or the rightmost character if there are no alphanumerics) in the string. Incrementing a digit always results in another digit, and incrementing a letter results in another letter of the same case.
If the increment generates a “carry”, the character to the left of it is incremented. This process repeats until there is no carry, adding an additional character if necessary.
"abcd".succ # => "abce"
"THX1138".succ # => "THX1139"
"((koala))".succ # => "((koalb))"
"1999zzz".succ # => "2000aaa"
"ZZZ9999".succ # => "AAAA0000"
"***".succ # => "**+"
Returns a BigInt from this string, in the given base.
Raises ArgumentError if this string doesn't denote a valid integer.
Returns the result of interpreting leading characters in this string as a floating point number (Float64).
Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of this string, 0.0 is returned. This method never raises an exception.
"123.45e1".to_f # => 1234.5
"45.67 degrees".to_f # => 45.67
"thx1138".to_f # => 0.0
Returns the result of interpreting leading characters in this string as a floating point number (Float32).
Extraneous characters past the end of a valid number are ignored. If there is not a valid number at the start of this string, 0.0 is returned. This method never raises an exception.
See #to_f.
Same as #to_i, but returns the block's value if there is not a valid number at the start of this string, or if the resulting integer doesn't fit an Int32.
"12345".to_i { 0 } # => 12345
"hello".to_i { 0 } # => 0
Returns the result of interpreting leading characters in this string as an integer in base base (between 2 and 36).
If there is not a valid number at the start of this string, or if the resulting integer doesn't fit an Int32, an ArgumentError is raised.
Options:
- whitespace: if true, leading and trailing whitespace is allowed
- underscore: if true, underscores in numbers are allowed
- prefix: if true, the prefixes "0x", "0" and "0b" override the base
- strict: if true, extraneous characters past the end of the number are disallowed
"12345".to_i # => 12345
"0a".to_i # => 0
"hello".to_i # => raises
"0a".to_i(16) # => 10
"1100101".to_i(2) # => 101
"1100101".to_i(8) # => 294977
"1100101".to_i(10) # => 1100101
"1100101".to_i(base: 16) # => 17826049
"12_345".to_i # => raises
"12_345".to_i(underscore: true) # => 12345
" 12345 ".to_i # => 12345
" 12345 ".to_i(whitespace: false) # => raises
"0x123abc".to_i # => raises
"0x123abc".to_i(prefix: true) # => 1194684
"99 red balloons".to_i # => raises
"99 red balloons".to_i(strict: false) # => 99
Same as #to_i but returns an Int16.
Same as #to_i but returns an Int16 or the block's value.
Same as #to_i but returns an Int16 or nil.
Same as #to_i.
Same as #to_i.
Same as #to_i.
Same as #to_i but returns an Int64 or the block's value.
Same as #to_i but returns an Int64.
Same as #to_i but returns an Int64 or nil.
Same as #to_i but returns an Int8.
Same as #to_i but returns an Int8 or the block's value.
Same as #to_i but returns an Int8 or nil.
Same as #to_i, but returns nil if there is not a valid number at the start of this string, or if the resulting integer doesn't fit an Int32.
"12345".to_i? # => 12345
"99 red balloons".to_i? # => 99
"0a".to_i? # => 0
"hello".to_i? # => nil
Same as #to_i but returns a UInt16 or the block's value.
Same as #to_i but returns a UInt16.
Same as #to_i but returns a UInt16 or nil.
Same as #to_i but returns a UInt32 or the block's value.
Same as #to_i but returns a UInt32.
Same as #to_i but returns a UInt32 or nil.
Same as #to_i but returns a UInt64 or the block's value.
Same as #to_i but returns a UInt64.
Same as #to_i but returns a UInt64 or nil.
Same as #to_i but returns a UInt8 or the block's value.
Same as #to_i but returns a UInt8.
Same as #to_i but returns a UInt8 or nil.
Returns a new string translating characters using from and to as a map. If to is shorter than from, the last character in to is used for the rest.
"aabbcc".tr("abc", "xyz") # => "xxyyzz"
"aabbcc".tr("abc", "x") # => "xxxxxx"
"aabbcc".tr("a", "xyz") # => "xxbbcc"
Converts camelcase boundaries to underscores.
"DoesWhatItSaysOnTheTin".underscore # => "does_what_it_says_on_the_tin"
"PartyInTheUSA".underscore # => "party_in_the_usa"
"HTTP_CLIENT".underscore # => "http_client"
Returns a new string with each lowercase letter replaced with its uppercase counterpart.
"hEllO".upcase # => "HELLO"