Falcon Wiki - Survival:Basic Structures

Basic Datatypes

Arrays

The most important basic data structure is the array. An array is a list of items, in which any element may be accessed by an index. Items can also be added, removed or changed.

Arrays are defined using square brackets ([]). Items are separated from one another by commas, and may be of any kind, including the result of expressions:

array = ["person", 1, 3.5, int( "123" ), var1 ]

Actually, the square brackets are optional; if the list is short, the array may be declared without them:

array = "person", 1, 3.5, int( "123" ), var1

But when using this method it is not possible to spread the list over multiple lines without using the backslash. Compare:

array = [ "person",
         1,
         3.5,
         int( "123" ),
         var1
       ]

array = "person", \
       1, \
       3.5, \
       int( "123" ), \
       var1

These two statements do the same thing, but the first is less confusing.

A list may be immediately assigned to a literal list of symbols to "expand it". This code:

a, b, c = 1, 2, 3

Will cause 1 to be stored in a, 2 to be stored in b, and 3 in c. More interestingly:

array = 1, 2, 3     // a regular array declaration
/* Some code here */
a, b, c = array     // the array's contents get copied to single variables

This will accomplish the same thing, but having the items packed in one variable makes it easier to carry them around. For example, you may return multiple values from a function and unpack them into a set of target variables. If the size of the list is different from the target set, the compiler (or the VM if the compiler cannot see this at compile time) will raise an error.
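
As a minimal sketch of the multiple-return idiom just described (functions have not been introduced yet, so take it as-is; minMax is a hypothetical helper written only for this illustration):

function minMax( a, b )
   // pack the two results in one array, smallest first
   if a < b
      return [ a, b ]
   end
   return [ b, a ]
end

low, high = minMax( 7, 3 )               // unpack the returned array
printl( "Low: ", low, "; High: ", high )  // Low: 3; High: 7

The function hands back a single array, and the caller expands it into two variables in one assignment.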

An item may be accessed by the [] operator. Items are numbered from 0 to the size of the array minus 1. For example, you may traverse an array with a while loop like this:

var1 = "something"
array = [ "person", 1, 3.5, int( "123" ), var1 ]
i = 0
while i < len( array )
   printl( "Element ", i,": ", array[i] )
   i++
end

The function len() will return the number of items in the array. Arrays also provide a method called len, which returns the length of the array through the "dot" object access operator; the above loop may be rewritten as:

while i < array.len(): > "Element ", i,": ", array[i++]

A single item in an array may be modified the same way by assigning something else to it:

array[3] = 10

An element of an array may be any kind of Falcon item, including arrays. So it is perfectly legal to nest arrays like this:

array = [ [1,2], [2,3], [3,4] ]

Then, array[0] will contain the array [1, 2]; array[0][0] will contain the number 1 from the [1,2] array.

Array indexes can be negative; a negative index means "distance from the end", -1 being the last element, -2 the element before the last and so on. So

array[0] == array[ - len(array) ]

always holds true (with a list that has at least one element).

Trying to access an item outside the array boundaries will cause a runtime error; this error can be avoided by checking beforehand the array size and the type of the expression used to access the array, or it can be intercepted, as we'll see later.
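
A minimal sketch of such a preliminary check, assuming idx is a hypothetical variable holding an integer index that may also be negative:

array = [ "person", 1, 3.5 ]
idx = 5
// valid indexes go from -len( array ) up to len( array ) - 1
if idx >= -len( array ) and idx < len( array )
   printl( "Element: ", array[idx] )
else
   printl( "Index ", idx, " is out of bounds" )
end

With three items and idx set to 5, the else branch is taken and no runtime error is raised.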

It is possible to access more than one item at a time; a particular expression called "range" can be used to access arrays and extract or alter parts of them. A range is defined as a pair of integers so that R=[n : m] means "all items from n to m-1". The higher index is exclusive, that is, the element at that index is not included, for a reason that will be clear below. The high end of the range may be open, having the meaning "up to the end of the array". As the beginning of the array is always 0, an open range starting from zero will include all the elements of the array (possibly none, if the array is empty). The following shows how a range is used:

var1 = "something"
list = [ "person", 1, 3.5, int( "123" ), var1 ]
list1 = list[2:4]   // 3.5, int( "123" )
list2 = list[2:]    // 3.5, int( "123" ), "something"
list3 = list[0:3]   // "person", 1, 3.5
list4 = list[0:]    // "person", 1, 3.5, int( "123" ), "something"
list5 = list[:]     // "person", 1, 3.5, int( "123" ), "something"

A range can contain negative indexes. A negative index means "distance from the end", -1 being the last item:

list1 = list[-2:-1]  // the element before the last
list2 = list[-4:]    // the last 4 elements.
list3 = list[-1:]    // the last element.

Finally, an array can have a range with the first number being greater than the last one; in this special case the last index is inclusive (note that the last element is counted in the resulting list). This produces a reverse sequence:

list1 = list[3:0]   // the first 4 elements in reverse order
list2 = list[4:2]   // elements 4, 3 and 2 in this order
list3 = list[-1:4]  // from the last element to the 4th
list4 = list[-1:0]  // the whole array reversed.

Don't be confused by the fact that negative numbers are "usually" smaller than positive ones. A negative array index -x means "the size of the array minus x", which may be smaller or greater than a positive index. In an array with 10 elements, the index -4 is greater than the index 4 (10-4 = 6), while in an array of 6 elements, -4 is smaller than 4 (6-4 = 2).

Ranges can be independently assigned to a variable and then used as indexes at a later time:

if a < 5
   rng = [a:5]
else
   rng = [5:a]
end
array1 = array[rng]

Of course, both the array indexes and the range indexes may be a result from any kind of expression, provided that expression evaluates to a number.

To access the beginning or the end of a range, you may use the array accessors; the index 0 is the first element, and the index 1 (or -1) is the last. If the range is open, the value of the last element will be nil.

rng = [1:5]
printl( "Start: ", rng[0], ";   End: ", rng[1] )
rng = [1:]
printl( "Will print nil: ", rng[1] )

It is possible to assign items to array ranges:

b, c = 2, 3
list[0:2] = b       // removes items 0 and 1, and adds b in their place
list[1:1] = c       // inserts c at position 1.
list[1] = []        // puts an empty array in place of element 1
list[1:2] = []      // removes item 1, reducing the array size.

As the last two rows of this example demonstrate, assigning a list to an array range causes all the original items in the range to be replaced with the items of the new list, which may be fewer, more or the same number as the original ones. In particular, assigning an empty list to a range causes the destruction of all the items in the range without replacing them.

The fact that the end index is not inclusive allows for item insertion when using a range that does not include any items: [0:0] means "insert some item at place 0", while [0:1] indicates exactly the first item.
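
The difference between the two ranges can be sketched as follows; the sample values are arbitrary, and the comments state the contents expected from the rules above:

list = [ "a", "b", "d" ]
list[2:2] = "c"       // empty range: "c" is inserted at position 2
inspect( list )       // expected items: "a", "b", "c", "d"
list[0:1] = "A"       // one-item range: the first item is replaced
inspect( list )       // expected items: "A", "b", "c", "d"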

To extend a list it is possible to use the plus operator "+" or the self assignment operator:

a = [ 1, 2 ]
b = [ 3, 4 ]
c = a + b         // c = [1, 2, 3, 4]
c += b            // c = [1, 2, 3, 4, 3, 4]
c += "data"       // c = [1, 2, 3, 4, 3, 4, "data"]
a += [[]]           // a = [1, 2, [] ]
a[2] += ["data"]  // a = [1, 2, ["data"] ]

To selectively remove elements from an array, it is possible to use the "-" (minus) operator. Nothing is done when trying to remove an item that is not contained in the array:

a = [ 1, 2, 3, 4, "alpha", "beta" ]
b = a - 2                         // b = [ 1, 3, 4, "alpha", "beta" ]
c = a - [ 1, "alpha" ]            // c = [ 2, 3, 4, "beta" ]
c -= 2                            // c = [ 3, 4, "beta" ]
a -= c                            // a = [ 1, 2, "alpha"]
a -= "no item"                    //  a is unchanged; no effect

Array manipulation functions

Falcon provides a set of powerful functions that complete the support for arrays. A preallocated buffer containing all nil elements can be created with the arrayBuffer function:

arr = arrayBuffer(4)
arr[0] = 0
arr[1] = 1
arr[2] = 2
arr[3] = 3
inspect( arr )

This prevents unneeded resizing of the array when its dimension is known in advance.

To access the first or last element of an array, for example, in loops, arrayHead and arrayTail functions can be used. They retrieve and then remove the first or last element of the array. For example, to pop the last element of an array:

arr = [ "a", "b", "c", "d" ]
while arr.len() > 0
   > "Popping from back... ", arrayTail( arr )
end
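
The same loop can consume the array from the front with arrayHead; this is only a mirror image of the example above:

arr = [ "a", "b", "c", "d" ]
while arr.len() > 0
   // arrayHead returns the first element and removes it from arr
   > "Popping from front... ", arrayHead( arr )
end

Here the items come out in their original order, "a" first and "d" last.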

It is possible to remove an arbitrary element with the arrayRemove function, which must be given the array to work on and the index (possibly negative, to count from the end). More flexible are the arrayDel and arrayDelAll functions. The former removes the first element matching a given value; the latter removes all the matching elements:

a = [ 1, 2, "alpha", 4, "alpha", "beta" ]
arrayDelAll( a, "alpha" )
inspect( a )   // "alpha" has been removed
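
For comparison, here is a short sketch of arrayRemove and arrayDel on the same data, assuming the argument order described above (array first, then index or value); see the Function Reference for the exact signatures:

a = [ 1, 2, "alpha", 4, "alpha", "beta" ]
arrayRemove( a, 0 )       // removes the element at index 0
arrayRemove( a, -1 )      // removes the last element
arrayDel( a, "alpha" )    // removes only the first "alpha"
inspect( a )              // expected items: 2, 4, "alpha"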

The filter function is a bit more complete (and we'll see more about the filter() functional construct later on). This function calls a given function providing it with one element at a time; if the function returns true, the given element is added to a final array, otherwise it is skipped. We haven't introduced the functions yet, so just take the following example as-is:

function passEven( item )
   return item.typeId() == NumericType and item % 2 == 0
end

array = [1, 2, 3, 4, "string", 5, 6]
inspect( filter( passEven, array ) )

To search for an element in an array, the arrayFind and arrayScan functions can be used. The arrayFind function returns the index in the array of the first element matching the second parameter. For example:

a = [ 1, 2, "alpha", 4, "alpha", "beta" ]
> "First alpha is found at... ", arrayFind( a, "alpha" )

Be sure to read the Array function section in the Function Reference for more details on the topic.

Comma-less arrays

When there are very long sequences of items, or when functional programming is involved, using a comma to separate tokens can be a bit clumsy and error prone.

Commas offer a certain protection against simple writing errors, but once you gain a bit of confidence with the language, it is easier to use the "dot-square" array declarator. The following declarations are equivalent:

a1 = [1, 2, 3, 'a', 'b', var1 + var2, var3 * var4, [x,y,z]]
a2 = .[ 1 2 3 'a' 'b' var1 + var2 var3 * var4 .[x y z]]

When using this second notation, care is needed with parentheses (which may be read as function calls), with strings (adjacent strings may get merged, as we'll see in the next chapter), with sub-arrays (which may be interpreted as the index accessor of the previous item) and so on. However, when programming in a functional context, where function calls and in-line expression evaluations are rare if not forbidden, this second notation may feel more natural.

The arrays declared with dot-square notation may contain commas if this is necessary to distinguish different elements; in this case, consider putting the comma at the immediate left of the element that they are meant to separate. For example, here's an array in which we need a range after a symbol:

array = .[ somesym  ,[1:2] ]

In this case without the comma separating the two, the range would be applied to the preceding symbol.

Strings

Besides being a basic type, strings can be considered a basic data structure, as they can be accessed exactly like arrays that only have characters as items. Characters are treated as single-element strings (they are just strings of length 1). It is possible to assign a new string to any element of an existing one. Here is an example of the string functionality:

string = "Hello world"

/* Access test */
i = 0
while i < len( string ) - 1
   >> string[i], ","  // H,e,l,l,o, ,w,o,r,l,
   i++  // add one to i
end
> string[-1]          // d

/* Range access tests */
printl( string[0:5] )       // Hello
printl( string[6:] )        // world
printl( string[-1:6] )     // dlrow
printl( string[-2:] )       // ld
printl( string[-1:0] )     // dlrow olleH

/* Range assignment tests */
string[5:6] = " - "
printl( string )            // Hello - world
string[5:8] = " "
printl( string )            // Hello world

/* Concatenation tests */
string = string[0:6] + "old" + string[5:]
printl( string )            // Hello old world
string[0:5] = "Goodbye"
string[8:] = "cruel" + string[7:]
printl( string )            // Goodbye cruel old world

/* end */

Assigning a string to a single character of another string will cause that character to be changed with the first character from the other string:

string = "Hello world"
string[5] = "-xxxx"    // "Hello-world", the x characters are not used

Multiline strings

Strings can span multiple lines; starting the string with a single or double quote followed directly by an End Of Line (EOL) will cause the string to extend over multiple lines, until another quote character is found.

longString = "
     Aye, matey, this is a very long string.
     Let me tell you about my remembering
     of when men were men, women were women,
     and life was so great."

printl( longString )

You'll notice that the spaces and tabs in front of each line are not inserted in the final string; this is to allow you to create wide strings respecting the indentation of the original block. To insert a newline, the literal \n can be used. It is also possible to use the literal multiline string (see below).

To preserve the whole formatting (including newlines) in a string declaration, one can use single quotes. Here is a quick example:

str = '
ABC
123
'

dstr = "
ABC
123
"

iStr = '
国際ストリング
国際ストリング
'

iDStr = "
国際ストリング
国際ストリング
"

> @ "1 $str"
> @ "2 $dstr"
> @ "3 $iStr"
> @ "4 $iDStr"

The above will output:

1 ABC
123

2 ABC 123
3 国際ストリング
国際ストリング

4 国際ストリング 国際ストリング

Finer control can be achieved through explicit string concatenation, using the + operator (which can also be placed at the end of a line to concatenate with the string on the following line):

longString = "Aye, matey, this is a very long string.\n" +
             "  Let me tell you about my remembering\n" +
             "      of when men were men, women were women,\n" +
             "            and life was so great."
printl( longString )

You will have a long string on the console.

Falcon strings support escape characters in C-like style: a backslash introduces a special character of some sort. Suppose you want to format the above text so that every line goes one after another, with a little indentation so that it reads as a "citation":

longString = "\t Aye, matey, this is a very long string.\n" +
              "\t   Let me tell you about my remembering\n" +
              "\t   of when men were men, women were women,\n" +
              "\t   and life was so great."

printl( longString )

The \n sequence tells Falcon to skip to the next line, while the \t instructs it to add a "tab" character, a single special character that is usually displayed as an indentation of up to eight spaces.

Other escape sequences are \", \\, \b and \r. The sequence \" will put a quote in the string, so that it is possible to also print quotes; for example:

printl( "This is a \"quoted\" string." )

The "\\" sequence allows the insertion of a literal backslash in the string, for example:

myfile = "C:\\mydir\\file.txt"

will be expanded into C:\mydir\file.txt

The \r escape sequence is used to make the output restart from the beginning of the current line. It's a very rudimentary way to print some changing text without scrolling the text all over the screen, but is commonly used for effects like debug counters or console based progress indicators. Try this:

i = 0
while i < 100000
   print( "I is now: ", i++ , "\r" )
end

Similarly, the \b escape causes the output to go back exactly one character.

print( "I is now: " )
i = 0
while i < 100000
   print( i )
   if i < 10
        print( "\b" )
   elif i < 100
        print( "\b\b" )
   elif i < 1000
        print( "\b\b\b" )
   elif i < 10000
        print( "\b\b\b\b" )
   else
        print( "\b\b\b\b\b" )
   end
   i++
end
printl()

International strings

Falcon strings can contain any Unicode character. The Falcon compiler can read source files written in various encodings; UTF-8, UTF-16 and ISO8859-1 (also known as Latin-1) are the most common. Unicode characters can also be inserted directly into a string via escapes. For example, it is possible to write the following statement:

string = "国際ストリング"
printl( string )

The printl function will write the contents of the string on the standard Virtual Machine output stream. The final outcome will depend on the output encoding. The Falcon command line sets the output stream to be a text stream using the encoding detected on the machine. If the output encoder is not able to render some characters, they will be translated into "?".

Another method to input Unicode characters is to use numeric escapes. Falcon parses two kinds of numeric escapes: "\0" followed by an octal number and "\x" followed by a hexadecimal number. For example:

string = "Alpha, beta, gamma: \x03b1, \x03B2, \x03b3"
printl( string )

The case of the hexadecimal digits is not relevant.

Finally, when assigning an integer number between 0 and 2^32 to a string portion via the array accessor operator (square brackets), the given portion will be changed into the character having that Unicode value. (The Unicode standard itself defines code points only up to 0x10FFFF.)

string = "Beta: "
string[5] = 0x3B2
printl( string )  // will print Beta:β

Accessing the nth character with the square brackets operator will cause a single-character string to be produced. However, it is possible to query the Unicode value of the nth character with the star-square operator ([*]):

string = "Beta:β"
i = 0 
while i < string.len()
    > string[i], "=", string[* i++]
end

This code will print each character in the string along with its Unicode ID in decimal format. If you need to internationalize your program, you may want to examine the Program Internationalization section.

String replication

It is possible to replicate a string a certain number of times using the * (star) operator. For example:

  sep = "-*-" * 12
  > sep
  > " "*25 + "Hello there!"
  > sep

Notice the expression " "*25 + "Hello there!", concatenating the result of the first string expansion to the last part of the string.

String-to-number concatenation

Adding an item to a string causes the item to be converted to a string and then concatenated. For example, adding 100 to "Value: ":

string = "Value: " + 100       
> string                         // prints "Value: 100"

In the special case of numbers, it is possible to add a character to a string by its Unicode value through the % (modulo) operator. For example, to add an "A" character, whose Unicode value is 65, it is possible to do:

string = "Value: " % 65
> string                         // prints "Value: A"
string %= 0x3B2            
> string                         // "Value: Aβ"

The / (slash) operator modifies the value of the last character in the string, adding to its Unicode value the value you provide. For example, to get 'd' from 'a' and the other way around:

d_letter = "a" / 3           // chr( ord('a') + 3) == 'd'
a_letter = d_letter / -3     // chr( ord('d') - 3) == 'a'
> a_letter, ", ", d_letter

String polymorphism

To store and handle strings efficiently, Falcon builds them on a buffer in which each character occupies a fixed space. The size of each character is determined by the number of bytes needed by the widest character to be stored. For Latin letters, and for all the Unicode characters whose code is less than 256, only one byte is needed. For the vast majority of currently used alphabets, including Chinese, Japanese, Arabic, Hebrew, Hindi and so on, two bytes are required. For unusual symbols like musical notation characters, four bytes are needed. In this example:

string = "Beta: "
string[5] = 0x3B2
printl( string )  // will print "Beta:β"

the string variable was initially holding a string in which each character could have been represented with one byte.

The string was occupying exactly six bytes in memory. When we added β, the character size requirement changed, and the string was copied into a wider space. Now twelve bytes are needed (six characters of two bytes each), as the Unicode value of β is 946 and two bytes are needed to represent it.

When reading raw data from a file or a stream (e.g. a network stream), the incoming data is always stored byte by byte in a Falcon string. In this way binary data can be manipulated very efficiently: the string can be seen simply as a vector of bytes, since the [*] operator gives access to the nth byte value.

However, those strings are not special. They are just loaded by inserting 0-255 character values into each memory slot, which is declared to be 1 byte long. Inserting a character requiring more space will create a copy of each byte in the string in a wider memory area.

Files and streams can be directly loaded using transcoders. With transcoder usage, loaded strings may contain any character the transcoder is able to recognize and decode.

Strings can be saved to files by both just considering their binary content or by filtering them through a transcoder. In the case that a transcoded stream is used, the output file will be a binary file representing the characters held in the string as per the encoding rules.

Although this dual nature of strings, which can hold both fully internationalized multi-byte character sequences and binary byte buffers, could be confusing at first, it allows for flexible and extremely efficient manipulation of binary data and string characters depending on the need.

It is possible to know the number of bytes occupied by every character in a string through the String.charSize method of each string; the same method also allows changing the character size at any moment. See the following example:

str = "greek: αβγ"
> str.charSize()      // prints 2
str.charSize( 1 )      // squeeze the characters
> str                       // "greek: " + some garbage

This may be useful to prepare a string to receive international characters in advance, avoiding the cost of a character size conversion later. For example, suppose you're reading a text file in which you expect to find some international characters at some point. By configuring the character size of the accumulator string ahead of time, you avoid the overhead of widening the string when the first wide character arrives, which gives you a constant insertion time for each operation:

str = ""
str.charSize( 2 )
file = ...

while not file.eof()
   str += file.read( 512 )
end

Valid values for String.charSize() are 1, 2 and 4.
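
As a rough sketch, the memory figures given earlier for the "Beta: " example can be recomputed by multiplying the length of the string by its character size. This assumes the character data takes exactly len times charSize bytes; the buffer actually allocated by the engine may be larger:

str = "Beta: "
> "Bytes before: ", str.len() * str.charSize()   // six characters, one byte each
str[5] = 0x3B2                                    // β needs two bytes
> "Bytes after: ", str.len() * str.charSize()    // six characters, two bytes each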

Literal Strings

Strings can also be declared with single quotes, like this:

str = 'this is a string'

The difference with respect to the double quote is that literal strings do not support any escape sequence. If you need to insert a single quote in a literal string, use the special sequence '' (two single quotes one after another), like in the following example:

> 'Hello ''quoted'' world!'      // Will print "Hello 'quoted' world"

Parsing of literal strings is not particularly faster or more efficient than parsing of standard strings; they have been introduced mainly to allow short strings with backslashes to be more readable. For example, they are useful with Regular expressions where backslashes already have a meaning.

When used as multiline strings, single quoted strings will include all the EOL and blanks present in the source. For example:

multi = '
   Three spaces before me...
    And four here, and on a new line
     and now five on the final line.'

printl( multi )

String expansion operator

Many scripting languages have a means to "expand" strings with inline variables. Falcon is no exception, and it adds an important feature to the commonly used string expansion constructs: inline format specifications. This combination allows for an extremely precise and powerful "pretty print" construct, which we are going to show now in detail.

Strings containing a "$" followed by a variable can be expanded using the unary operator "@". For example:

value = 1000
printl( @ "Value is $value" )

This will print "Value is 1000". Of course, the string can be a variable, or even composed of many parts. For example:

value = 1000
chr = "$"
string = "Value is " + chr +"value"
> "Expanding ", string, " into ", @ string

The variable after the "$" sign is actually interpreted as an "accessible" variable; this means that it may have an array accessor like this:

array = [ 100, 200, 300 ]
printl( @ "Array is $array[0], $array[1], $array[2]"  )

Actually, everything parsed inside an accessor will be expanded. For example:

array = [ 100, 200, 300 ]
value = 2
printl( @ "The selected value is $(array[ value ])"  )

The object member "dot" accessor can also be used and interleaved with the array accessor, but we'll see this in the chapter dedicated to objects. For now, just remember that a "." cannot immediately follow a symbol introduced by "$", or it will be interpreted as an attempt to access a property of an object. To place a dot right after an expanded symbol, enclose the symbol in parentheses:

printl( @ "Value is now $(value)."  )

In this way the parser will understand that the "." after the value symbol is not meant to be a part of the symbol itself. A string literal may be used as a dictionary accessor in an expanded string either by using single quotes (') or by escaping double quotes, but always inside parentheses, as in this example:

dict = [ "a" => 1, "b" => 2]
> @ "A is $(dict['a']), and B is $(dict[\"b\"])"

To specify how to format a certain variable, use the ":" colon after the inlined symbol name and use a format string. A format string is a sequence of commands used to define how the expansion should be performed; a complete exposition is beyond the scope of this guide (the full reference is in the function reference manual, in the "Format" class chapter), but we'll describe a minimum set of commands here to explain basic usage:

A plain number indicates "the size of the field", that is, the number of characters to which the output should be padded.
The 'r' letter forces alignment to the right.
A dot followed by a plain number indicates the number of decimals.

For example, to print an account book with 3 decimal precision, do the following:

data = [ 'a' => 1.32, 'b2' => 45.15, 'k69' => 12.4 ]

for id, value in data
   printl( @ "Account number $(id:3):$(value:8.3r)" )
end

The result is:

Account number a  :   1.320
Account number b2 :  45.150
Account number k69:  12.400

As can be seen, the normal (left) alignment was applied to the ID, while right alignment and a fixed decimal count were applied to the value. Formats can also be applied to strings and even to objects, as in this example:

data = [ "brown", "smith", "o'neill", "yellow" ]
i = 0
while i < data.len()
   value = data[i++]
   printl( @ "Agents in matrix:$(value:10r)" )
end

The result is:

Agents in matrix:     brown
Agents in matrix:     smith
Agents in matrix:   o'neill
Agents in matrix:    yellow

The sequence "$$" is expanded as "$". This makes it possible to have iterative string expansions like the following:

value = 1000
str = @ "$$$value"
printl( str )

Or more compactly:

value = 1000
str = "$$$value"
> @ str

In case of a parsing error, or if a variable is not present in the VM (i.e. not declared in the module and not explicitly imported), or if an invalid format is given, an error will be raised that can be managed by the calling script. We'll see more about error raising and management later on.

String manipulation functions

Falcon provides functions meant to operate on strings and make string management easier. Classic functions such as trim (elimination of front/rear blank characters), uppercase/lowercase transformations, split and join, substrings and so on are provided. For example, the following code will split "Hello world" and work on each side:

h, w = strSplit( "Hello world", " " )
> "First letter of first part: ", strFront( h, 1 )
> "Last letter of the second part: ", strBack( w, 1 )
> "World uppercased: ", strUpper( w )

Other interesting functions are strReplicate, which builds a "stub" sequence by repeating a string, and strBuffer, which creates a pre-allocated empty string. A string allocated with strBuffer can then be used as a buffer for memory-based operations such as iterative reading of data blocks from binary files.
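
As a small sketch, strReplicate can be compared with the * operator seen in the String replication section. The argument order used here (the string first, then the number of repetitions) is an assumption to be checked against the Function Reference:

sep1 = strReplicate( "-*-", 3 )   // assumed signature: string, repeat count
sep2 = "-*-" * 3                  // the operator form shown earlier
> sep1
> sep2

If the assumption holds, both lines print the same separator.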

For more details on Falcon's support of strings, read the String functions section in the Function Reference.

Dictionaries

The most flexible basic structure is the Dictionary. A dictionary looks like an array that may have any object as its index. Most notably, the dictionary index may be a string. More formally, a dictionary is defined as a set of pairs, of which the first element is called key and the second value. It is possible to find a value in a dictionary by knowing its key. Dictionaries are defined using the arrow operator (=>) that couples a key with its value. Here is a minimal example:

dict = [ => ]         // creates an empty dictionary
dict = [ "a" => 123, "b" => "onetwothree" ]
printl( dict["a"] ,":", dict["b"] )     // 123:onetwothree

Of course, the keys and values can be the result of expressions, both in dictionary definition and in dictionary access:

a = "one"
b = "two"
dict = [ a + b => 12, a => 1, b => 2 ]
printl( dict[ "onetwo" ] )             // 12
printl( dict[ a + b ] )                // 12 again

Dictionaries do not support ranges. To extend a dictionary, it is possible to simply use a nonexistent key as its index. If the key already exists, the value associated with that key is changed; if the key does not exist, the pair is added:

dict = [ "one" => 1 ]
dict[ "two" ] = 2
dict[ "three" ] = 3
// dict is now [ "one" => 1, "two" => 2, "three" => 3 ]
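
The example above only adds new pairs; a minimal sketch of the other case, where assigning to an existing key replaces its value, could look like this (the values are arbitrary):

dict = [ "one" => 1 ]
dict[ "one" ] = "uno"     // the key exists: its value is replaced
dict[ "two" ] = 2         // the key is missing: the pair is added
inspect( dict )           // expected pairs: "one" => "uno", "two" => 2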

It is also possible to “sum” two dictionaries; the resulting dictionary is a copy of the first addend, with the items of the second addend inserted over it. This means that where the two key sets intersect, the value of the second addend will be used:

dict = [ "one" => 1, "two" => 2 ] + [ "three" => 3, "four" => 4 ]
// dict is now [ "one" => 1, "two" => 2, "three" => 3, "four" => 4 ]

dict = [ "one" => 1, "two" => 2 ] + [ "two" => "new value", "three" => 3 ]
// dict is now [ "one" => 1, "two" => "new value", "three" => 3 ]

dict += [ "two" => -2, "four" => 4 ]
// dict is now [ "one" => 1, "two" => -2, "three" => 3, "four" => 4 ]

On the other hand, accessing a nonexistent key will raise an error, just like trying to access an array out of its bounds:

dict = [ "one" => 1 ]
printl( dict[ "two" ] )    // raises an error

To selectively remove elements from a dictionary, it is possible to use the “-” (minus) operator. Nothing is done if trying to remove an item that is not contained in the dictionary:

a = [ 1=>'1', 2=>'2', "alpha"=>0, "beta"=>1 ]
b = a - 2              // b = [ 1=>'1', "alpha"=>0, "beta"=>1 ] 
c = a - [ 1, "alpha" ] // c = [ 2=>'2', "beta"=>1 ]
c -= 2                 // c = [ "beta"=>1 ]
a -= b                 // a = [ 2=>'2' ]
a -= "no item"         // a is unchanged; no effect

Dictionary support functions

Falcon offers some functions that are highly important to complete the dictionary model. For example, the most direct and simple way to remove an item from a dictionary is to use the dictRemove function (or remove method):

a = [ 1=>'1', 2=>'2', "alpha"=>0, "beta"=>1 ]
dictRemove( a, "alpha" )
inspect( a )   // alpha is not in the dictionary anymore.

a.remove( "beta" )
inspect( a )   // and now, beta is gone too.

It is also possible to remove all the elements using the dictClear function (or clear method). Other interesting functions are dictKeys and dictValues (again, with corresponding dictionary methods keys and values), which create a vector containing respectively all the keys and all the values in the dictionary.
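
A short sketch of these functions follows, mixing the function and the method forms; the comments state the contents expected from the description above:

a = [ "alpha" => 0, "beta" => 1 ]
inspect( dictKeys( a ) )     // a vector with the keys "alpha" and "beta"
inspect( a.values() )        // a vector with the values 0 and 1
dictClear( a )
inspect( a )                 // the dictionary is now empty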

Serious operations on dictionaries require recording a position and proceeding in some direction. For example, to retrieve all the values having a key which starts with the letter “N”, it is necessary to get the position of the first element whose key starts with “N” and then scan forward in the dictionary until the first letter of the key changes or the end of the dictionary is reached.

To work on dictionaries like this, Falcon provides two functions called dictFind and dictBest (or find and best methods), which return an instance of a class called Iterator. We'll see more about iterators in the next sections.

Be sure to read the section called Dictionary functions in the function reference.

Lists

We have seen that arrays can be used to add or remove elements randomly from any position. However, this has a cost that grows geometrically as the size of an array grows. Insertion and removal in lists are more efficient, by far, when the number of elements grows beyond the size a simple script usually deals with. Switching from arrays to lists should be considered at about 100 items.

Unlike strings, arrays and dictionaries, Lists are full-featured Falcon objects. They are a class, and when a list is created, it is an instance of the List class. Objects and classes are described in a later chapter, but Lists are treated here for completeness.

A list is declared by assigning the return value of the List constructor to a variable.

l = List( "a", "b", "c" )

Operations that can be performed on a list are inspection of the first and last element, and insertion and removal of an element at both ends. Some examples:

> "Elements in list: ", l.len()
> "First element: ", l.front()
> "Last element: ", l.back()

// inserting an element in front
l.pushFront( "newFront" )
> "New first element: ", l.front()

// Pushing an element at the back
l.push( "newBack" )
> "New last element: ", l.back()

// Removing first and last element
l.popFront()
l.pop()
> "Element count now: ", l.len()

Lists also support iterator access; it's possible to traverse a list, and to insert or remove an element at a certain position, through an iterator or using a for/in loop. We'll treat these topics below.

The "in" operator

The in relational operator checks whether the item to its left is present in the sequence to its right. It never raises an error, even if the right operand is not a sequence; instead, it evaluates to true (1) if the item is found, or to false (0) if the item is not found or if the right operand is not a sequence.

The in operator can check for substrings in strings, or for items in arrays, or for keys in dictionaries. This is an example:

print( "Enter your name > " )
name = input()

if "abba" in name
   printl( "Your name contains a famous pop group name" )
end

dict = [ "one" => 1 ]
if "one" in dict
   printl( "always true" )
end

There is also a notin operator, which works as not (x in name):

if "abba" notin name
   printl( "Your name does not contain a famous pop group name" )
end
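
The examples above cover strings and dictionaries; the same operators applied to an array look like this. This is just an illustrative sketch, and the array contents are arbitrary:

primes = [ 2, 3, 5, 7 ]

if 5 in primes
   printl( "5 is in the array" )
end

if 4 notin primes
   printl( "4 is not in the array" )
end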

The for/in loop

The for/in loop traverses a collection of items (an array, a dictionary, a list or other application/module specific collections), usually from the first item to the last one, and provides the user with a variable assuming the value of each element in turn. The loop can be interrupted at any point using the break statement, and it is possible to skip immediately to the next item with the continue statement. The value being currently processed can be changed with a special operator, called “dot assign”, and the continue dropping statement discards the currently processed item, continuing the processing loop from the next one.

The for/in loop can also be applied to strings, where it picks all the characters from the first to the last, and to ranges, to generate sequences of integer numbers. A special application of the for/in loop to ranges is the for/to loop, which follows slightly different semantics.

Besides the main body, the for/in loop can contain three special blocks: forfirst, forlast and formiddle. They contain code that is respectively executed before the first item, after the last item, and after every item that is not the last (between items, essentially).

Loop control statements (namely break, continue and continue dropping) executed in the main block will prevent the formiddle and forlast blocks from being executed. If they are contained in the forfirst block, even the main block for the first item is skipped, as the forfirst block is executed before the main block.

This is the formal declaration of the for/in block:

for variable[,variable...] in collection
   ...statements...
   [break | continue | continue dropping]
   ...statements...

   forfirst
      ... first time only statements ...
   end

   formiddle
      ... statements executed between element processing ...
   end

   forlast
      ... last time only statements ...
   end
end

The forfirst, forlast and formiddle blocks can be declared in any order or position; they can even be interleaved with the main for/in block code, in which case the code will just be separated and executed sequentially. As with any block, they can be abbreviated using the “:” colon shortcut.

This example will print “Content of the array: Have a nice day!” on a single line.

array = [ "Have", "a", "nice", "day" ]

for element in array
   forfirst: print( "Content of the array: " )
   
   // this is the main for/in body
   print( element )
   
   formiddle: print( " " )
   forlast: printl( "!" )
end

Using forfirst and forlast blocks in the for/in loop allows actions to take place only if the collection is not empty, exactly before the first element and after the last one. With those blocks, there is no need for extra checks around the collection traversal loop. Also, the special blocks in the for/in loop are managed at VM level, and are considerably faster than repeated checks in a normal loop.

An empty set, that is, an array or a dictionary with no elements, will cause the for/in loop to be skipped altogether. A nil value will be interpreted as an empty set, so the following:

array = nil

for element in array
   print( element, " " )
end

will simply be skipped. A for/in loop applied to a dictionary requires two variables to be used; the first one will receive the current key, and the second one will store the entry value:

dict = [ "Have" => 1 , "a" => 2, "nice" => 3, "day" => 4 ]

for key, value in dict
   printl( "Key: ", key, " Value: ", value )
end

This technique will also work with multidimensional arrays, provided that every element is an array of the same size:

matrix = [ [1, 2, 3], [3, 4, 5], [5, 6, 7] ]

for i1, i2, i3 in matrix
   printl( i1, ",", i2, ",", i3  )
end

The values retrieved in the for/in loop can also be changed on the fly. To do this, use the unary operator “.=” (called dot-assign), which changes the currently scanned item without altering the loop variable, as in this example:

array = [ 1, 2, 3 ,4, 5 ]

for elem in array
   .= 0              // sets all the array elements to zero...
   printl( elem )    // ... but prints the original items
end

for elem in array
   printl( elem )      // prints five zeros
end

When a dictionary is being traversed, the dot-assign operator changes the value of the current entry in the dictionary. The current key of a dictionary cannot be changed.
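
As a hedged sketch of dot-assign on a dictionary (the names prices, name and price are made up for illustration, and it is assumed that the operator accepts any expression, not only a constant):

prices = [ "apple" => 1, "pear" => 2 ]

for name, price in prices
   .= price * 10    // replaces the value stored under the current key
end

inspect( prices )   // by the rule above, the values should now be 10 and 20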

To remove an item from a collection, use the continue dropping statement; for example, the following code will filter the source array so that only even numbers are left:

array = [ 1, 2, 3, 4, 5 ]

for elem in array
   if elem % 2 == 1
      continue dropping
   end
   printl( "We accepted the even number: ", elem )
end

As shown, the continue dropping statement will also skip the rest of the main for/in body, as well as formiddle and forlast blocks, if present.

For/in ranges

The range based for/in loop is an efficient way to generate increasing or decreasing values. The target variable of the for/in is filled each loop with an integer value.

If the range beginning is lower than the end, the index variable will be filled with values from the beginning, included, to the end, excluded. So:

for value in [1:10]
   printl( value )
end

will print the sequence from 1 to 9. Conversely, if the beginning of the range is higher than the end, the variable will be filled with decreasing values, including the end limit. This resembles the way that ranges are used to extract substrings or subarrays.
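
For instance, going by the rule just stated, a descending range such as the following should print 5, 4, 3, 2 and 1, each on its own line, as the end limit is included:

for value in [5:1]
   printl( value )
end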

If the range is open, or if it is empty ([n:n]), then the for/in loop is completely skipped.

In a range-based loop, the continue dropping statement is accepted, but it is treated as a simple continue. The dot-assignment has no effect.

Ranges in for/in loops support steps; a third parameter may be specified in the range to indicate a stepping value. For example, the following loop shows the even numbers between 0 and 10:

for value in [ 0: 11: 2 ]
   > value
end

If the step is zero, or if it's greater than zero when the loop would be descending or if it's less than zero when the loop would be ascending, the for/in statement is skipped. In case the direction of the loop is unknown because of variable parameters, the step can be used to ensure a certain processing order:

array = [ 1, 2, 3, 4, 5 ]
start = 2   // example value; in real code it may come from elsewhere
for index in [ start : len( array ) : 1 ]  // so if start is too high, we'll just skip
   > array[ index ]
end

For/to loops

The for/to loop works as a for/in loop in ranges, but it includes the upper limit of the range. It is declared as:

for variable = lowerbound to upperbound [, step]
   // for/to body, same as for/in
end

For example, this will count from 1 to 10 inclusive, treating 1 and 10 a bit specially:

for i = 1 to 10
   forfirst: >> "Starting: "
   
   >> i

   formiddle: >> ", "
   forlast: > "."
end

For/to loops maintain their own loop counter variable. Consider this example: the loop counter is assigned a new value by adding two to it, but after each iteration the cnt variable is reset to the next value of the loop counter.

for cnt = 1 to 3
   cnt += 2
   > cnt
end

Outputs:

3  // cnt = 1; 1 + 2 = 3
4  // cnt = 2, 2 + 2 = 4
5  // cnt = 3, 3 + 2 = 5

The step clause works exactly as in for/in ranges, declaring the direction of the loop and possibly causing the loop to be skipped if the direction is wrong. For example, this prints all the even numbers between 1 and 10 inclusive:

for i = 2 to 10, 2
   > i
end
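
A descending count can be sketched the same way with a negative step. Assuming the step behaves as described for ranges, this should print “10, 8, 6, 4, 2.” on a single line:

for i = 10 to 2, -2
   >> i
   formiddle: >> ", "
   forlast: > "."
end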

For/in lists

Lists can be processed through a for/in loop exactly like arrays. Positional blocks work as for any other type; the continue dropping statement removes the current element, while the dot-assign operator changes the value of the current element. For example:

list = List( "Have", "a", "nice", "day" )
for value in list
   forfirst: >> "The list is... "
   >> value
   formiddle: >> " "
   forlast: > "!"
end

Although lists can also be traversed with iterators, the for/in loop is completely VM driven, and thus it is more efficient; it also uses a simpler internal representation of the iterator, sparing memory.

Iterators have a small chapter of their own, as they are tightly bound to the Object Oriented Programming paradigm supported by Falcon; so, they will be presented after the chapter in which OOP support is described.

For/in generators

The for/in loop supports generator functions since version Eagle (0.9.4). As we haven't yet introduced functions and the other functional programming elements useful for this construct, we'll go into the details of this structure in the List Comprehension chapter.

Memory buffers

Memory buffers are “tables” of raw memory which can be directly manipulated through Falcon scripts. They are mainly meant to access binary streams or to represent memory mapped data (such as images). They may also be used by modules and applications to pass a set of data to the script in a very efficient way, or the other way around, as each access to them refers to a small unsigned integer value in memory. Memory buffers can be sequences of numbers occupying one to four bytes (including three bytes, which is a quite common size for memory mapped images).

Memory buffers can neither grow nor shrink, and it is not possible to access a subrange of them. From a script standpoint, they are tables of small integer values. Consider the following example:

memory = MemBuf( 5, 2 )  // creates a table of 5 elements, each 2 bytes long.
for value in [0:5]
   memory[ value ] = value * 256
end

inspect( memory )

The inspect function will show a set of two-byte binary data inside the buffer:

MemBuf(5,2) [
0000 0100 0200 0300 0400 ]

Hexadecimal 0100 value equals 256, 0200 is 512 and so on.
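
Elements can be read back with the same [] accessor. The following is only a sketch, using one-byte elements and a made-up variable name (bytes):

bytes = MemBuf( 4, 1 )   // a table of 4 elements, each 1 byte long

for i in [0:4]
   bytes[ i ] = i * 16   // every value fits in one byte
end

for i in [0:4]
   > "Element ", i, ": ", bytes[ i ]
end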

Functions dealing with files may be given a string or a memory buffer to fill. In the second case, manipulation of binary data may be easier. Strings can be used to manipulate binary data too (as it is possible to access their content by the value of each character), but memory buffers are better suited to that task.

Bitwise operators

Dealing with binary data often requires checking, setting or resetting specific bits. Falcon supports bitwise operations on integer data (memory buffer elements, string characters accessed by numeric value or integer items).

The bitwise and '&&', bitwise or '||', bitwise xor '^^' and bitwise not '~' operators allow changing the bits of integer values through binary math. The and, or and xor operators are binary, while the not operator is unary. For example,

value = 0x1 || 0x2   // or bits 0 and 1

// display binary:
> @"$(value:b)b = $(value:X)H = $value"

value = value && 0x1 // turns off bit 1
> @"$(value:b)b = $(value:X)H = $value"

value = (~value) && 0xFFFF // shows the first 2 bytes of the inverted value
> @"$(value:b)b = $(value:X)H = $value"

value = value ^^ 0x3  // turns off bit 1 and turns on bit 0
> @"$(value:b)b = $(value:X)H = $value"

Shift operators are also provided. Shift left “<<” and shift right “>>” allow moving bits to an arbitrary position:

for power in [0:16]
   value = 1 << power
   > @"2^$(power:2r): $(value:b17r)b = $(value:X4r)H = $value"
end

And, or, xor, shift left and shift right operators are also provided in the short assignment version; for brevity, and, or and xor assignments are respectively “&=”, “|=” and “^=”, while to avoid confusion with relational operators shift left and shift right assignments are indicated with “<<=” and “>>=”.

value = 0xFF00       // set an initial value
value &= 0xF00F      // try an and...
> @"$(value:X)H"     // shall be F000H

value >>= 4          // drag a nibble to the right
> @"$(value:X)H"     // shall be F00H
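
The remaining short assignments follow the same pattern. This is a small sketch with arbitrary starting values:

value = 0x0F         // set an initial value
value |= 0xF0        // turn on the upper four bits
> @"$(value:X)H"     // shall be FFH

value ^= 0x0F        // flip the lower four bits
> @"$(value:X)H"     // shall be F0H

value <<= 4          // push everything a nibble to the left
> @"$(value:X)H"     // shall be F00H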