struct Char

Overview

A Char represents a Unicode code point. It occupies 32 bits.

It is created by enclosing a UTF-8 character in single quotes.

'a'
'z'
'0'
'_'
'あ'

You can use a backslash to denote some characters:

'\'' # single quote
'\\' # backslash
'\e' # escape
'\f' # form feed
'\n' # newline
'\r' # carriage return
'\t' # tab
'\v' # vertical tab

You can use a backslash followed by at most three digits to denote a code point written in octal:

'\101' # == 'A'
'\123' # == 'S'
'\12'  # == '\n'
'\1'   # code point 1

You can use a backslash followed by a u and four hexadecimal digits to denote a Unicode code point:

'\u0041' # == 'A'

Or you can use curly braces and specify up to four hexadecimal digits:

'\u{41}' # == 'A'
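
As a rough check of the literal forms above, the following sketch compares the plain and escaped spellings of one code point and reads the size of a Char. It uses only literals shown in this overview plus the built-in sizeof; the variable names are illustrative.

```crystal
# Different spellings of the same code point compare equal.
plain = 'A'
four_digit = '\u0041' # \u followed by exactly four hex digits
braced = '\u{41}'     # \u{...} with leading zeros omitted

puts plain == four_digit # true
puts plain == braced     # true

# A Char occupies 32 bits, so sizeof reports 4 bytes.
char_size = sizeof(Char)
puts char_size # 4
```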

Included Modules

Comparable(Char)

Defined in:

primitives.cr
char.cr
char/reader.cr

Constant Summary

ZERO = '\u{0}': The character representing the end of a C string.

Instance Method Summary

#!=(other : Char) : Bool
Returns true if self's codepoint is not equal to other's codepoint.
#+(str : String)
Concatenates this char and string.
#-(other : Char)
Returns the difference of the codepoint values of this char and other.
#<(other : Char) : Bool
Returns true if self's codepoint is less than other's codepoint.
#<=(other : Char) : Bool
Returns true if self's codepoint is less than or equal to other's codepoint.
#<=>(other : Char)
Implements the comparison operator.
#==(other : Char) : Bool
Returns true if self's codepoint is equal to other's codepoint.
#===(byte : Int)
#>(other : Char) : Bool
Returns true if self's codepoint is greater than other's codepoint.
#>=(other : Char) : Bool
Returns true if self's codepoint is greater than or equal to other's codepoint.
#alpha?
Returns true if this char is an ASCII letter ('a' to 'z', 'A' to 'Z').
#alphanumeric?
Returns true if this char is an ASCII letter or digit ('0' to '9', 'a' to 'z', 'A' to 'Z').
#bytes
Returns this Char's bytes as encoded by UTF-8, as an Array(UInt8).
#bytesize
Returns the number of UTF-8 bytes in this char.
#control?
Returns true if this char is an ASCII control character.
#digit?
Returns true if this char is an ASCII digit ('0' to '9').
#downcase
Returns the ASCII downcase equivalent of this char.
#dump(io)
Appends this Char as a String that contains a char literal as written in Crystal to the given IO.
#dump
Returns this Char as a String that contains a char literal as written in Crystal, with characters with a codepoint greater than 0x7F written as \u{...}.
#each_byte(&block)
Yields each of the bytes of this Char as encoded by UTF-8.
#hash
Returns this char's codepoint.
#in_set?(*sets : String)
Returns true if this char is matched by the given sets.
#inspect
Returns this Char as a String that contains a char literal as written in Crystal.
#inspect(io)
Appends this Char as a String that contains a char literal as written in Crystal to the given IO.
#lowercase?
Returns true if this char is a lowercase ASCII letter.
#ord : Int32
Returns the codepoint of this char.
#pred
Returns a Char that is one codepoint smaller than this char's codepoint.
#succ
Returns a Char that is one codepoint bigger than this char's codepoint.
#to_i(&block)
Returns the integer value of this char if it's an ASCII char denoting a digit, otherwise the value returned by the block.
#to_i(base, &block)
Returns the integer value of this char if it's an ASCII char denoting a digit in the given base, otherwise the value returned by the block.
#to_i(base, or_else = 0)
Returns the integer value of this char if it's an ASCII char denoting a digit in the given base, otherwise the value of or_else.
#to_i
Returns the integer value of this char if it's an ASCII char denoting a digit, 0 otherwise.
#to_s
Returns this Char as a String containing this Char as a single character.
#to_s(io : IO)
Appends this Char to the given IO.
#upcase
Returns the ASCII upcase equivalent of this char.
#uppercase?
Returns true if this char is an uppercase ASCII letter.
#whitespace?
Returns true if this char is an ASCII whitespace character.

Instance methods inherited from module `Comparable(T)`

Instance methods inherited from struct `Value`

Instance methods inherited from class `Object`

Class methods inherited from class `Object`

Instance Method Detail

def !=(other : Char) : Bool

Returns true if self's codepoint is not equal to other's codepoint.

def +(str : String)

Concatenates this char and string.

'f' + "oo" # => "foo"

def -(other : Char)

Returns the difference of the codepoint values of this char and other.

'a' - 'a' # => 0
'b' - 'a' # => 1
'c' - 'a' # => 2

def <(other : Char) : Bool

Returns true if self's codepoint is less than other's codepoint.

def <=(other : Char) : Bool

Returns true if self's codepoint is less than or equal to other's codepoint.

def <=>(other : Char)

Implements the comparison operator.

'a' <=> 'c' # => -2

def ==(other : Char) : Bool

Returns true if self's codepoint is equal to other's codepoint.

def ===(byte : Int)

def >(other : Char) : Bool

Returns true if self's codepoint is greater than other's codepoint.

def >=(other : Char) : Bool

Returns true if self's codepoint is greater than or equal to other's codepoint.
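
The comparison operators documented above (!=, <, <=, ==, >, >=) all compare codepoints. A minimal sketch, assuming nothing beyond those operators and the standard Array#sort; the variable names are illustrative:

```crystal
# 'a' has codepoint 97 and 'b' has codepoint 98.
lower = 'a'
higher = 'b'

puts lower < higher   # true
puts lower <= lower   # true
puts higher > lower   # true
puts higher >= higher # true
puts lower == 'a'     # true
puts lower != higher  # true

# Char includes Comparable(Char), so an array of chars sorts by codepoint.
sorted = ['c', 'a', 'b'].sort
puts sorted.inspect # ['a', 'b', 'c']
```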

def alpha?

Returns true if this char is an ASCII letter ('a' to 'z', 'A' to 'Z').

'c'.alpha? # => true
'8'.alpha? # => false

def alphanumeric?

Returns true if this char is an ASCII letter or digit ('0' to '9', 'a' to 'z', 'A' to 'Z').

'c'.alphanumeric? # => true
'8'.alphanumeric? # => true
'.'.alphanumeric? # => false

def bytes

Returns this Char's bytes as encoded by UTF-8, as an Array(UInt8).

'a'.bytes # => [97]
'あ'.bytes # => [227, 129, 130]

def bytesize

Returns the number of UTF-8 bytes in this char.

'a'.bytesize # => 1
'好'.bytesize # => 3

def control?

Returns true if this char is an ASCII control character.

('\u0000'..'\u001F').each do |char|
  char.control? # => true
end

('\u007F'..'\u009F').each do |char|
  char.control? # => true
end

# false in every other case

def digit?

Returns true if this char is an ASCII digit ('0' to '9').

'4'.digit? # => true
'z'.digit? # => false

def downcase

Returns the ASCII downcase equivalent of this char.

'Z'.downcase # => 'z'
'x'.downcase # => 'x'
'.'.downcase # => '.'

def dump(io)

Appends this Char as a String that contains a char literal as written in Crystal to the given IO.

See #dump.
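
As an illustration of the IO variant, the sketch below appends a dumped char to a string under construction and compares it with the String returned by #dump. It assumes String.build from the standard library, which yields an IO and is not part of this page; the "dumped: " prefix is purely illustrative.

```crystal
# String.build yields an IO and returns everything written to it.
built = String.build do |io|
  io << "dumped: "
  'あ'.dump(io) # appends the char literal form directly to the IO
end

puts built

# The appended text is the same as the String returned by #dump.
same = built == "dumped: " + 'あ'.dump
puts same # true
```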

def dump

Returns this Char as a String that contains a char literal as written in Crystal, with characters with a codepoint greater than 0x7F written as \u{...}.

'a'.dump      # => "'a'"
'\t'.dump     # => "'\t'"
'あ'.dump      # => "'\u{3042}'"
'\u0012'.dump # => "'\u{12}'"

def each_byte(&block)

Yields each of the bytes of this Char as encoded by UTF-8.

puts "'a'"
'a'.each_byte do |byte|
  puts byte
end
puts

puts "'あ'"
'あ'.each_byte do |byte|
  puts byte
end

Output:

'a'
97

'あ'
227
129
130

def hash

Returns this char's codepoint.

def in_set?(*sets : String)

Returns true if this char is matched by the given sets.

Each parameter defines a set. The character is matched against the intersection of those sets; in other words, it must match all of them.

If a set starts with a ^, it is negated. The sequence c1-c2 means all characters between and including c1 and c2 and is known as a range.

The backslash character \ can be used to escape ^ or - and is otherwise ignored unless it appears at the end of a range or the end of a set.

'l'.in_set? "lo"          # => true
'l'.in_set? "lo", "o"     # => false
'l'.in_set? "hello", "^l" # => false
'l'.in_set? "j-m"         # => true

'^'.in_set? "\\^aeiou" # => true
'-'.in_set? "a\\-eo"   # => true

'\\'.in_set? "\\"    # => true
'\\'.in_set? "\\A"   # => false
'\\'.in_set? "X-\\w" # => true
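
To show how a range and a negated set intersect, here is a small sketch that keeps only the lowercase consonants of a word. It assumes String#chars and Array#select from the standard library, which are not part of this page; the word and variable name are illustrative.

```crystal
# A char passes only if it is in "a-z" AND not in "aeiou".
consonants = "hello".chars.select do |char|
  char.in_set?("a-z", "^aeiou")
end

puts consonants.inspect # ['h', 'l', 'l']
```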

def inspect

Returns this Char as a String that contains a char literal as written in Crystal.

'a'.inspect      # => "'a'"
'\t'.inspect     # => "'\t'"
'あ'.inspect      # => "'あ'"
'\u0012'.inspect # => "'\u{12}'"

def inspect(io)

Appends this Char as a String that contains a char literal as written in Crystal to the given IO.

See #inspect.

def lowercase?

Returns true if this char is a lowercase ASCII letter.

'c'.lowercase? # => true
'G'.lowercase? # => false
'.'.lowercase? # => false

def ord : Int32

Returns the codepoint of this char.

The codepoint is the integer representation. The Universal Coded Character Set (UCS) standard, commonly known as Unicode, assigns names and meanings to numbers; these numbers are called codepoints.

For values up to and including 127, the codepoint matches the ASCII code and thus the char's single-byte representation.

'a'.ord      # => 97
'\0'.ord     # => 0
'\u007f'.ord # => 127
'☃'.ord      # => 9731
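
The relation between the codepoint and the UTF-8 bytes can be sketched with #ord, #bytes and #bytesize as documented on this page. For an ASCII char the single byte equals the codepoint, while a char above 127 needs several bytes. The variable names are illustrative.

```crystal
ascii = 'a'
snowman = '☃'

# ASCII: one byte, equal to the codepoint.
puts ascii.ord      # 97
puts ascii.bytesize # 1
puts ascii.bytes.first == ascii.ord # true

# Above 127: the codepoint no longer matches a single byte.
puts snowman.ord      # 9731
puts snowman.bytesize # 3
```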

def pred

Returns a Char that is one codepoint smaller than this char's codepoint.

'b'.pred # => 'a'
'ぃ'.pred # => 'あ'

def succ

Returns a Char that is one codepoint bigger than this char's codepoint.

'a'.succ # => 'b'
'あ'.succ # => 'ぃ'

This method allows creating a Range of chars.
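
A minimal sketch of such a range, assuming only the standard range literal and Range#to_a; the variable name is illustrative:

```crystal
# The range steps from 'a' to 'e' by calling succ on each char.
letters = ('a'..'e').to_a
puts letters.inspect # ['a', 'b', 'c', 'd', 'e']
```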

def to_i(&block)

Returns the integer value of this char if it's an ASCII char denoting a digit, otherwise the value returned by the block.

'1'.to_i { 10 } # => 1
'8'.to_i { 10 } # => 8
'c'.to_i { 10 } # => 10

def to_i(base, &block)

Returns the integer value of this char if it's an ASCII char denoting a digit in the given base, otherwise the value returned by the block.

'1'.to_i(16) { 20 } # => 1
'a'.to_i(16) { 20 } # => 10
'f'.to_i(16) { 20 } # => 15
'z'.to_i(16) { 20 } # => 20

def to_i(base, or_else = 0)

Returns the integer value of this char if it's an ASCII char denoting a digit in the given base, otherwise the value of or_else.

'1'.to_i(16)     # => 1
'a'.to_i(16)     # => 10
'f'.to_i(16)     # => 15
'z'.to_i(16)     # => 0
'z'.to_i(16, 20) # => 20

def to_i

Returns the integer value of this char if it's an ASCII char denoting a digit, 0 otherwise.

'1'.to_i # => 1
'8'.to_i # => 8
'c'.to_i # => 0

def to_s

Returns this Char as a String containing this Char as a single character.

'a'.to_s # => "a"
'あ'.to_s # => "あ"

def to_s(io : IO)

Appends this Char to the given IO, writing its bytes as encoded by UTF-8.
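
A short sketch of appending chars to an IO, assuming String.build from the standard library (not part of this page) as a convenient IO to write into; the variable name is illustrative:

```crystal
# Each to_s(io) call writes the char's UTF-8 bytes to the IO.
joined = String.build do |io|
  'a'.to_s(io) # 1 byte
  'あ'.to_s(io) # 3 bytes
end

puts joined          # aあ
puts joined.bytesize # 4
```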

def upcase

Returns the ASCII upcase equivalent of this char.

'z'.upcase # => 'Z'
'X'.upcase # => 'X'
'.'.upcase # => '.'

def uppercase?

Returns true if this char is an uppercase ASCII letter.

'H'.uppercase? # => true
'c'.uppercase? # => false
'.'.uppercase? # => false

def whitespace?

Returns true if this char is an ASCII whitespace character.

' '.whitespace?  # => true
'\t'.whitespace? # => true
'b'.whitespace?  # => false
