3. Data types and literals
HSL has multiple data types: strings, booleans, numbers, arrays (which also work as ordered maps that store key-value pairs, similar to PHP’s array), objects (created by classes) and functions (both anonymous functions and named function pointers). Some of these data types may be represented as literals. There is also a none (or null) data type that is often used to represent an error or the absence of a valid value or response (e.g. a return statement without a value, or a failed json_decode(), both of which return none).
3.1. Boolean
The keywords true and false represent boolean true and boolean false. They are treated as 1 and 0 in arithmetic operations.
Warning
Avoid comparing values against the booleans true and false in if statements unless you are fully aware of the truthiness and loose comparison rules.
if (5 == true) { } // false: 5 is not equal to 1
if (5) { } // true: 5 is not false, hence true
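The difference between loose comparison and truthiness can be sketched as follows. The variable $count and the echoed strings are hypothetical and chosen only for illustration; the comments restate the rules described in this section.

```hsl
$count = 5;

// Loose comparison: true is treated as 1, and 5 is not equal to 1
if ($count == true) { echo "not reached"; }

// Truthiness: 5 is not false, so this branch runs
if ($count) { echo "truthy"; }

// An explicit comparison states the intent and avoids both pitfalls
if ($count > 0) { echo "positive"; }

// In arithmetic, true counts as 1 and false as 0
echo true + true; // 2
echo false + 1;   // 1
```

When the intent is "is this number non-zero" or "is this value set", an explicit comparison is usually easier to review than a comparison against true or false.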
3.2. Number
The number type is a double-precision 64-bit IEEE 754 value. If converted to a string, it will be presented in the most accurate way possible without trailing decimal zeros. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.
echo 1.0; // 1
echo 1_000_000; // 1000000
Note
The number type can safely represent all integers between -9007199254740991 and +9007199254740991 (the equivalent of ±((2 ** 53) - 1)).
Warning
After some arithmetic operations on floating point numbers, the equality (==) of two floating point numbers may not be true even if they mathematically “should” be equal. This caveat is not unique to HSL; it is the result of how computers calculate and store floating point numbers. Arithmetic operations on numbers without decimals are not affected.
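A minimal sketch of the caveat and one common workaround, comparing with a tolerance. The helper nearly_equal() and its $epsilon parameter are hypothetical (not part of HSL), and the sketch assumes that ** is the exponentiation operator, as the note above suggests.

```hsl
// Hypothetical helper: true if $a and $b differ by less than $epsilon
function nearly_equal($a, $b, $epsilon) {
    $diff = $a - $b;
    if ($diff < 0) { $diff = 0 - $diff; } // absolute difference
    return $diff < $epsilon;
}

// 0.1 and 0.2 have no exact binary representation, so the sum is not exactly 0.3
if (0.1 + 0.2 == 0.3) { echo "equal"; } else { echo "not equal"; } // not equal

// Comparing with a tolerance gives the mathematically expected answer
if (nearly_equal(0.1 + 0.2, 0.3, 0.000001)) { echo "nearly equal"; } // nearly equal

// Numbers without decimals are not affected
if (1 + 2 == 3) { echo "equal"; } // equal

// Beyond (2 ** 53) - 1, neighbouring integers can no longer be told apart
if ((2 ** 53) + 1 == (2 ** 53)) { echo "precision lost"; } // precision lost
```

The right tolerance depends on the magnitude of the values being compared; 0.000001 is only a placeholder.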
3.2.1. Hexadecimal
Numbers may be entered in hexadecimal form (also known as base 16) using the 0x prefix, followed by [0-9a-f]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.
echo 0xfa; // 250
echo 0x00_fa; // 250
3.2.2. Octal
Numbers may be entered in octal form (also known as base 8) using the 0o prefix, followed by [0-7]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.
echo 0o372; // 250
3.2.3. Binary
Numbers may be entered in binary form (also known as base 2) using the 0b prefix, followed by [0-1]+. A numeric separator (_) is allowed between digits for readability; it does not affect the value of the number.
echo 0b11111010; // 250
echo 0b1111_1010; // 250
3.3. String
There are two kinds of string literals: double-quoted strings and raw strings. Double-quoted strings support language features such as variable interpolation and escape sequences. Most functions (e.g. length() and str_slice()) are not UTF-8 aware, with the exception of regular expression matching (e.g. pcre_match()), which may be configured to be UTF-8 aware with the /u modifier.
3.3.1. Double-quoted string
Variable interpolation replaces $variable placeholders within string literals. Variables are matched in strings with the following pattern: $[a-zA-Z_]+[a-zA-Z0-9_]. If needed, there is also a more explicit syntax, ${variable}, which allows variables mid-word. Interpolating an undeclared variable raises a runtime error.
"$variable"
"${variable}abc"
Escape sequence | Meaning |
---|---|
\\ | Backslash (\) |
\" | Double quote (") |
\$ | Dollar sign ($) |
\n | ASCII Linefeed (LF) |
\r | ASCII Carriage Return (CR) |
\t | ASCII Horizontal Tab (TAB) |
\xhh | Character with hex value hh |
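The following sketch combines interpolation and escape sequences in double-quoted strings. The variables $unit and $count are hypothetical, and the escapes are written with the conventional backslash spellings (\", \$, \t, \xhh), which is an assumption where the table does not spell them out.

```hsl
$unit = "byte";
$count = 3;

echo "$count items";       // 3 items
echo "${count} ${unit}s";  // 3 bytes (explicit syntax allows a variable mid-word)
echo "Price: \$5";         // Price: $5 (escaped dollar sign, no interpolation)
echo "She said \"hi\"";    // She said "hi"
echo "name\tvalue";        // name and value separated by a horizontal tab
echo "\x41";               // A (character with hex value 41)
```

Without the braces, "$units" would refer to a variable named units, which is why the explicit form is needed mid-word.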
3.3.2. Raw string
Raw strings support neither variable interpolation nor escape sequences. This makes them suitable for regular expressions. Raw strings start and end with two single quotes (''), with an optional delimiter in between. The delimiter can be any of [\x21-\x26\x28-\x7e]*; simply put, any word.
''raw string''
'DELIMITER'raw string'DELIMITER'
'#'raw string'#'
Note
There is no performance difference between double-quoted and raw strings containing the same value. However, if special characters need to be escaped, raw strings are recommended for clarity.
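As an illustration of why raw strings suit regular expressions, the sketch below writes the same pattern both ways. The variable names are hypothetical, and passing a string (rather than a regex literal) as the pattern to pcre_match() is an assumption here.

```hsl
// The same pattern: one or more digits followed by a literal dot
$quoted = "\\d+\\.";  // each backslash must be escaped in a double-quoted string
$raw = ''\d+\.'';     // raw string: written exactly as the regex engine sees it

if ($quoted == $raw) { echo "same value"; } // same value

// The raw form stays readable when it is used as a pattern
if (pcre_match(''^\d+$'', "2024")) { echo "digits only"; }
```

A $ inside a raw string is a plain character, so regex anchors need no escaping either.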
3.4. Regex
A regex literal is a pre-compiled regular expression object. The regular expression implementation is “Perl Compatible” (hence the function names pcre_…); for syntax, see the perlre documentation. See also the supported pattern modifiers.
#/pattern/[modifiers]
This type is mainly used with the regular expression operators and as an argument to the pcre_match() family of functions.
if ($string =~ #/hello/i) {}
if (pcre_match(#/hello/i, $string)) {}
3.4.1. Pattern modifiers
Use pattern modifiers to change the behavior of the pattern engine. For example, they can make the match case-insensitive or activate UTF-8 support (where one UTF-8 character may be matched using only one dot).
Modifier | Internal define | Description |
---|---|---|
i | PCRE_CASELESS | Do case-insensitive matching |
m | PCRE_MULTILINE | See perl documentation |
u | PCRE_UTF8 | Enable UTF-8 support |
s | PCRE_DOTALL | See perl documentation |
x | PCRE_EXTENDED | See perl documentation |
U | PCRE_UNGREEDY | See perl documentation |
X | PCRE_EXTRA | See perl documentation |
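A short sketch of how a few of these modifiers change matching. The subject strings are hypothetical, and the u example assumes the script file itself is saved as UTF-8.

```hsl
$subject = "Hello\nWORLD";

// i: case-insensitive matching
if ($subject =~ #/hello/i) { echo "i: matched"; }

// Without i, the lowercase pattern does not match the capitalised text
if ($subject =~ #/hello/) { echo "not reached"; }

// s (PCRE_DOTALL): the dot also matches the linefeed between the two words
if ($subject =~ #/Hello.WORLD/s) { echo "s: matched"; }

// u: one dot matches one UTF-8 character, even if it is several bytes long
if ("å" =~ #/^.$/u) { echo "u: one character"; }
```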
3.5. Array
An array is a very useful container; it can act as an indexed array (automatically indexed starting at zero, or at the highest current index + 1) or as an ordered map (associative array) with any and mixed data types as key and value. The short array syntax for literal arrays, [], is recommended.
// indexed arrays
echo array("value", "value2");
echo ["value", "value2"];
echo [0 => "value", 1 => "value2"];
// associative arrays
echo array("key" => "value");
echo ["key" => "value"];
// multidimensional arrays
echo ["key" => ["key" => "value"]];
// automatic indexing
echo ["foo", 3=>"bar", "baz"]; // 0=>foo, 3=>bar, 4=>baz
// delete index
$foo = [];
$foo["bar"] = "hello";
unset($foo["bar"]);
echo $foo; // []
Note
Accessing any element in a zero indexed array using the subscript or slice operator is very fast (it has the complexity of O(1)).
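The sketch below uses one array as an indexed list and another as an ordered map. The variable names and addresses are hypothetical, and the foreach ($array as $key => $value) form is assumed from the foreach statement's documentation rather than shown in this section.

```hsl
// Indexed array: elements are read back by position
$queue = ["first", "second", "third"];
echo $queue[0]; // first

// Ordered map: keys keep their insertion order
$headers = ["From" => "a@example.org", "Subject" => "Hi"];
$headers["To"] = "b@example.org"; // added last, so iterated last

foreach ($headers as $name => $value) {
    echo "$name: $value"; // From, then Subject, then To
}

// Mixed data types are allowed as values
$mixed = ["count" => 2, "tags" => ["a", "b"], "ok" => true];
echo $mixed["tags"][1]; // b
```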
3.6. Function
Both anonymous functions (closures) and named function pointers (references to functions) are available. This data type is primarily used for passing callbacks to other functions.
3.6.1. Anonymous functions
An anonymous function is an unnamed function; it can be passed as a value to a function or assigned to a variable. An anonymous function can also act as a closure. The global variable scoping rules apply.
$multiply = function ($x, $y) { return $x * $y; };
echo $multiply(3, 5); // 15
3.6.2. Named function pointers
A named function pointer is a reference to a named function. It can reference either a builtin function or a user-defined function. Prepending the function name with the builtin keyword references the builtin function, even if a user-defined function with the same name exists.
function length($str) { return 42; }
$function = length;
echo $function("Hello"); // 42
$function = builtin length;
echo $function("Hello"); // 5
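Since function values are mostly used as callbacks, here is a minimal sketch that passes both kinds to the same function. The functions apply_twice() and increment() are hypothetical user-defined helpers, not part of HSL.

```hsl
// Hypothetical helper: calls $callback on $value, then on the result
function apply_twice($callback, $value) {
    return $callback($callback($value));
}

function increment($x) { return $x + 1; }

// Anonymous function passed directly as an argument
echo apply_twice(function ($x) { return $x * 2; }, 3); // 12

// Named function pointer stored in a variable, then passed on
$inc = increment;
echo apply_twice($inc, 3); // 5
```

The receiving function does not need to know which kind of function value it was given; both are called the same way.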
3.7. Object
An object is an instance of a class statement or of a builtin class (such as Socket or File).
- class Iterator
  The builtin iterator class is used for iterators such as generators or Map.
  - next()
    Return the next value from the iterator.
    - Returns: iterator data
    - Return type: array
    The iterator data returned is an associative array with two fields:
    - value (any): the value of the iterator
    - done (boolean): whether the iteration/iterator is completed
Note
Some iterators (such as Map) return an array containing a [key, value] pair as value. The foreach statement will then map these to its key and value variables.
3.8. None
This data type is represented by the keyword none. It may be used to indicate an error result or the absence of a return value, for example from json_decode() (in case of a decode error) or from a user-defined function with no return statement or an empty one. This data type should not be used as an argument to other built-in functions, as it yields undefined behavior for the most part. The only functions that can safely handle this data type are:
$obj = json_decode("...");
if ($obj == none)
echo "None";
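The same check applies to user-defined functions. In the sketch below, find_port() is a hypothetical function that falls through to an empty return statement when it has no answer.

```hsl
// Hypothetical lookup: returns a port number, or none for unknown services
function find_port($service) {
    if ($service == "smtp") { return 25; }
    return; // empty return statement: the caller receives none
}

$port = find_port("gopher");
if ($port == none) {
    echo "unknown service"; // this branch runs for "gopher"
} else {
    echo "port $port";
}
```

Checking for none immediately after the call keeps the value from being passed on to built-in functions.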