This is the user’s/programmer’s guide for SensorWare, a framework that allows easy, efficient dynamic programmability of sensor networks. Once SensorWare is installed on your sensor nodes, you can inject a distributed algorithm into the network without having to serially reprogram each sensor node, or even carry new binary images over the wireless medium. The invocation requires minimal user involvement (just start running the algorithm in the user node), and the sensor network can still be shared among multiple non-coordinating users. Writing a distributed algorithm in SensorWare is considerably facilitated by the proper abstractions. The programmer does not have to worry about the details of embedded programming, as these are abstracted away through a set of services. These services are exposed and tied together into tasks with the help of a scripting language. The programmer does still have to design the distributed algorithm, which means both the distribution and the operation of individual tasks. This design work is facilitated by, among other things, the event-driven programming model and the ability of the code to autonomously roam the network.

SensorWare could prove beneficial for researchers of sensor network algorithms, as it allows them to go beyond validation by plain simulation and move onto a real platform. If the algorithm is written in SensorWare, the researcher can do a first-order test using our SensorWare-based simulation platform and then seamlessly move to the real platform without changing one byte of the code.

In this document we will not go into details about the design choices of SensorWare. The interested reader can refer to the published papers in OpenArch 2002, MobiSys 2003 and the recent submission to IEEE Transactions on Mobile Computing. We provide information on 1) the goals and utility of SensorWare, 2) how to find and install SensorWare, 3) SensorWare’s features and programming model, and 4) how you actually program using SensorWare.

Before reading section 2.1 and beyond, familiarize yourself with the Tcl scripting language, as SensorWare’s scripting language is based on Tcl. Only the very basic parts of Tcl are used (i.e., variables, expressions, substitution rules, control flow commands, procedure definition), so you do not need the newest textbook about Tcl. A good source of information is Ousterhout’s book “Tcl and the Tk Toolkit” (Addison-Wesley, 1994). For quicker access to documentation try http://www.tcl.tk/doc/. In particular, look at http://www.tcl.tk/scripting/primer.html. The book “Practical Programming in Tcl and Tk” also has some chapters online at http://www.beedub.com/book/. You can look at the chapters “Tcl Introduction” and “Eval and Quoting”.

Happy programming!

2 Basics

SensorWare abstracts the run-time environment of a sensor node using a set of services, and a scripting language (an extension of Tcl) to form scripts out of these services. These scripts perform certain tasks when executing in a node. Scripts can also move their code and data from node to node, autonomously. Thus, distributed algorithms are realized as control scripts that are autonomously populated (i.e. replicated or migrated) in the “proper” sensor nodes after a triggering user injection.

Figure 1 shows SensorWare’s place in a sensor node and provides an opportunity to introduce some key notions of the framework. The architecture of a sensor node can be viewed in layers. The lower layers are the raw hardware and the hardware abstraction layer (i.e., the device drivers). An operating system (OS) is on top of the lower layers. The OS provides all the standard functions and services of a multi-threaded environment that are needed by the layers above it. The SensorWare layer uses the functions and services offered by the OS to provide the run-time environment for the control scripts. The control scripts rely completely on the SensorWare layer while populating the network. Control scripts use the native services that SensorWare provides, as well as services provided by other scripts, to construct distributed applications. Most of the time, scripts will communicate with scripts of the same kind (i.e., having the same code) at remote nodes. Scripts can also interact with scripts of a different kind through specific interfaces, as we will see in subsequent sections. Usually this happens with scripts in the same node, with one script offering a service to the other script. As seen in the figure, the communications can be roughly divided into two levels: a script level and a SensorWare native level. Certainly all communications happen through the real hardware, but this distinction is meant to highlight the layer where the source and destination entities reside. At script level we have custom communication created by individual scripts. At SensorWare level we have communication created by fixed distributed services that SensorWare provides (e.g., neighborhood discovery service, time synchronization service). The part of such a service in a node needs to communicate with its corresponding parts at remote nodes (hence the characterization “distributed service”).

Figure 1: SensorWare’s place in a sensor node

We will not delve deeper into SensorWare's features now. Instead, we take a tutorial-like approach to revealing SensorWare to the reader and present key ideas as the "tutorial" unfolds. After enough key concepts are exposed and the reader has some familiarity with SensorWare, we proceed (in section 3) to present and explain pieces of code that actually do something useful. Section 4 discusses some advanced features and gives more code examples.

2.1 Downloading and installing SensorWare

Once you have a general idea of SensorWare’s goals and utility, as well as a relative familiarity with Tcl basics, it is time to dive into a more hands-on experience with SensorWare.

If you want to use SensorWare only as a tool to easily code your distributed algorithm and test it in a simulation platform (which has several simplifying assumptions) then you just need a Linux box. If you are planning to test your algorithm on a real platform you will need sensor nodes that have an ARM-based processor and can run Linux (we have used iPAQs as the basis for our nodes; the new StarGates could be used as well). In order to get started, let’s assume that you run SensorWare in a single machine, as a code development and simulation tool.

SensorWare is available at http://sourceforge.net/projects/sensorware. From the web site you can browse the CVS repository and get the instructions to download SensorWare. Here we provide the simple steps to anonymous CVS access.

On your Linux machine, go to the directory in which you want the sensorware subdirectory to be created and do the following:

[boulis@surya]$ cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/sensorware login

When prompted for a password for anonymous, simply press the Enter key. Then type:

[boulis@surya]$ cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/sensorware co sensorware

All the directory structure and files of SensorWare should now be in your machine. To create the executable(s) use the provided Makefile. For our case (simulation) do:

[boulis@surya]$ cd sensorware

[boulis@surya]$ make cygwin

The executables are placed in sensorware/obj.cygwin. There is essentially one executable file called sw_cygwin that you can run in interactive or non-interactive mode. In interactive mode the user opens a new console and executes one instance of SensorWare there. The user can then input commands or invoke whole scripts from this console. This is essentially the mode of SensorWare running at the user node. In non-interactive mode the user can start a bunch of SensorWare instances from the same console (with no input capabilities). These are supposed to act as the instances of SensorWare running at the sensor nodes. Usually you start the non-interactive instances using the sim.sh shell script. You type sim.sh followed by the number of instances you want to execute. To kill the non-interactive instances use the end_sim.sh.

Before executing any instance of SensorWare, a file named topo.def should exist in your obj.cygwin directory (or in any other directory where the executables are invoked). A default file is provided. This file defines the topology of your network in a text format. The first line is the number of nodes. Each subsequent line starts with the node id, followed by the location of the node (x coordinate, then y coordinate) and the radius of the radio connectivity (a simplistic disc model with binary connectivity is adopted for the radio). Each time you run an instance of SensorWare (of the cygwin version) the system couples that instance with the (n+1)th node from the topo.def file, where n is the number of SensorWare instances that are already running. Here is what happens if you run an interactive instance of SensorWare:

[boulis@surya]$ ./sw_cygwin

assign node id to 1 (no other instances were running, so id=1)

swsh@1.1.1.1%

This is the user prompt. “swsh” stands for “sensorware shell”. The number following the @ sign is the full id of the special script that allows the interaction with the user. This full id has the form “node_id.user_id.1.1”.

Since the process of using SensorWare is interactive (i.e., the user types some command(s) at the user node prompt, and the system replies) we need a clear way to separate between characters output by the system and characters input by the user. Thus all code samples are written with the following formatting conventions: System output is given in Courier New font and user input as bold Times New Roman. The user prompt ends with the character %. We may include comments in the input or outputs in the form of (italicized Times New Roman inside parentheses).

If you want to install and test SensorWare in a real distributed platform you have to build the executables for the right platform. Right now we have executables for the arm-linux platform. You should run the executable sw_arm in each node (e.g., iPAQ) you want to have SensorWare. You should also include a topo.def file in each node (the same file for all nodes). The topo.def file is slightly different in the real platform case. After id, the MAC address of the 802.11 radio follows, so that the correct node-SensorWare_instance coupling is made.

2.2 Testing simple code snippets

Using the cygwin version run one instance of interactive SensorWare. After logging in (with user id 1) you are faced with the prompt swsh@1.1.1.1%

How do you write the hello program in SensorWare? Simply type:

swsh@1.1.1.1% debug hello

debug (1.1.1.1): hello (this is what you get as a reply.)

“Debug” is a simple command that outputs strings to a node’s output interface (if it exists) for debugging purposes. For now, ignore the cryptic string before the colon. Other simple commands (used for debugging or monitoring purposes) are “ps”, which lists the active scripts in the current node, and “ls”, which lists the current devices (services) of this SensorWare instance. Try them out.

swsh@1.1.1.1% ps

1.1.1 (only the “shell” script is active. Node_id is omitted since it is implied)

swsh@1.1.1.1% ls

voronoi light monitor gpsr location timer mbox wlan neighbor time video (this is the list of 11 devices, in no particular order)

You do not need to bother about the utility of these devices (services) yet. We will see more in section 2.4. For now, know that there are certain commands that operate on the devices. One of them is query. The device neighbor keeps the node’s current list of neighbors (among other things). Querying this device returns the list of neighbors.

swsh@1.1.1.1% query neighbor

swsh@1.1.1.1% (no neighbors! No other SensorWare instances are running anyway)

Let’s get some company (neighbors). Let’s run another interactive instance of SensorWare. Open a new console window and type:

[boulis@surya]$ ./sw_cygwin

assign node id to 2 (already one instance running so id=2)

swsh@2.1.1.1% (note the ‘2’ at the prompt indicating that this swsh runs on node 2)

Go back to the first console and query for neighbors.

swsh@1.1.1.1% query neighbor

2 (this is the only neighbor for now)

Let’s start instances of SensorWare in all the remaining nodes (as defined by the topo.def file), and watch how the neighbor list changes. The default topo.def file contains 16 nodes; we have already started 2 instances so we need 14 more. Let’s start them in non-interactive mode.

Open a new console and from the executables directory type:

[boulis@surya]$ ./sim.sh 14

16956 (you get the 14 pids of the processes that execute the

16957 non-interactive version of sw_cygwin)

16958

16961

16962

16963

16964

16965

16966

16967

16968

16969

16970

16971

assign node id to 3 (you also get a simple message from each process)

assign node id to 4

assign node id to 5

assign node id to 6

assign node id to 7

assign node id to 8

assign node id to 9

assign node id to 10

assign node id to 11

assign node id to 12

assign node id to 13

assign node id to 14

assign node id to 15

assign node id to 16

Go back to the first console and query for neighbors once more.

swsh@1.1.1.1% query neighbor

5 2 (we have company…)

The default topo.def file contains 16 nodes in a 4 x 4 square grid. Each node communicates with the nodes immediately up, down, left, and right of it. Node 1 is in the lower left corner, so it has only two neighbors.
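The neighbor list seen above follows directly from the disc model. The sketch below is a plain Python illustration, not SensorWare code. It assumes a topo.def-like text with the node count on the first line and "id x y radius" on each following line, and it further assumes unit grid spacing, a radius of 1, and row-by-row numbering starting from the lower left corner. The real default file may use different coordinates and a different exact syntax.

```python
import math

def make_grid_topology(side=4, spacing=1.0, radius=1.0):
    """Build topo.def-style text for a side x side grid (assumed layout)."""
    lines = [str(side * side)]
    for row in range(side):
        for col in range(side):
            node_id = row * side + col + 1  # node 1 sits at the lower left corner
            lines.append(f"{node_id} {col * spacing} {row * spacing} {radius}")
    return "\n".join(lines)

def parse_topology(text):
    """Return {node_id: (x, y, radius)} from the assumed text format."""
    rows = text.strip().splitlines()
    count = int(rows[0])
    nodes = {}
    for row in rows[1:count + 1]:
        node_id, x, y, radius = row.split()
        nodes[int(node_id)] = (float(x), float(y), float(radius))
    return nodes

def neighbors(nodes, node_id):
    """Binary disc model: a node is a neighbor if it lies within our radius."""
    x, y, radius = nodes[node_id]
    result = []
    for other, (ox, oy, _) in nodes.items():
        if other != node_id and math.hypot(ox - x, oy - y) <= radius + 1e-9:
            result.append(other)
    return sorted(result)

nodes = parse_topology(make_grid_topology())
print(neighbors(nodes, 1))   # corner node: two neighbors
print(neighbors(nodes, 6))   # interior node: four neighbors
```

Under these assumptions node 1 ends up with neighbors 2 and 5, which matches the reply of the query neighbor command above.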

At any point you can type exit as a SensorWare command, to terminate the current swsh. You will be faced again with the login prompt, where you can enter 0 to exit the program.

Let us now play with the ability of SensorWare to move and execute code at remote nodes. There is a command called spawn that takes a node specification (more on node specifications in section 2.5.1) and a script, and spawns the script at the specified node. A script in SensorWare starts with a Tcl comment (a line beginning with a ‘#’) that has the form ‘#code_id <number>’, where <number> is a positive integer identifying the code_id of the script. The rest of the script comprises SensorWare commands, like the ones we have used so far. For a node specification we will use the description "neighbor 2", which for now you can assume means "one of our neighbors with node id 2".

swsh@1.1.1.1% spawn neighbor 2 "#code_id 61 \n debug Hey, I am at the other node! "

(Note the quotes around the script, which are used to bundle what is inside them as one Tcl argument. Note also that the quotes allow Tcl to do a one-pass substitution inside the quoted string. This way the \n is replaced by an actual newline.)

52 (the reply in the console of node 1. this is the number of bytes spawned)

In the second console you should see:

debug (2.1.61.2): Hey, I am at the other node!

(the debug (x.x.x.x): part is system generated and informs us that the output is produced by the debug command executed by script x.x.x.x. In our case: node id 2, user id 1, code id 61, and script instance 2)
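To make the structure of the full id explicit, here is a small Python sketch (an illustration only, not part of SensorWare) that splits an id such as 2.1.61.2 into the four fields named above. The class and function names are hypothetical.

```python
from typing import NamedTuple

class ScriptId(NamedTuple):
    node_id: int
    user_id: int
    code_id: int
    instance_id: int

def parse_script_id(text):
    """Split a full id such as '2.1.61.2' into its four numeric fields."""
    parts = text.split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four dot-separated fields, got {text!r}")
    return ScriptId(*(int(p) for p in parts))

sid = parse_script_id("2.1.61.2")
print(sid.node_id, sid.user_id, sid.code_id, sid.instance_id)
```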

Play with the spawn command. What will happen if you spawn to neighbor 0 (hint: this is the broadcast address)? To a non-existent neighbor? What if you include a deliberate mistake in the script? What will happen if you nest a spawn command in the spawned script? Try it out. You might see more interesting things happening when you use the replicate command. Replicate is a spawn of the script that is currently executing the command (the only exception is the swsh script, which cannot be moved). The migrate command does the same and also stops the execution of the current script. Add a replicate command at the end of the spawned script we used for the previous example. Don't forget to type a '\n' or ';' after '... I am at the other node!' so that Tcl interprets it as a new command.

swsh@1.1.1.1% spawn neighbor 2 "#code_id 61 \n debug Hey, I am at the other node!; replicate neighbor 0" (take a deep breath and press enter)

By now, you should see an endless loop of messages rolling in the consoles where SensorWare is running. First, let's stop this madness using brute force. In the interactive consoles just press ctrl+z. Then from one of these consoles type ./end.sh. This way all the non-interactive instances are killed but not the interactive ones, which are just stopped. To kill them, do a ps -a to find the pids of the ./sw_cygwin processes and do kill -9 <pid> for the first of them. If you are running n interactive instances you need to repeat this procedure (find the first pid and kill it) n times. Since a user would usually run just one interactive SensorWare instance (we have used at most two interactive instances), we have not automated this procedure yet.

What do you think happened? What created the endless loop? If you follow the actions of the script step by step you’ll find out. The script is first spawned (i.e., transferred and executed) in node 2. The script executes the debug command and then replicates itself in all neighboring nodes. We should clarify here that the script does not wait for confirmation that the remote nodes are actually executing its copies; it just exits as soon as the replicate command is finished (i.e., as soon as the bytes to be transmitted are handed over to the radio service), since there are no other commands to execute in that particular script. In the neighboring nodes the script executes and replicates once more. One of the replicas comes back to node 2 and executes again. Hence the loop. Had the script still been executing in node 2, the duplicate copy would not have been started. The default behavior of replicate is to execute the script at the remote node only if there is no script with the same user_id and code_id already running at that node.

How can we stop this loop? If we could only add a reasonable delay after the execution of the replicate command (say 1-2 secs) then the loop would be broken due to the default behavior of the replicate command. The script shown in the following spawn command does just this.

swsh@1.1.1.1% spawn neighbor 2 "#code_id 51\n debug Hey!; replicate neighbor 0; interest timer t1 2000; wait t1;"

You should not worry too much about understanding the new commands now. The interest command declares a timer named t1 for 2000 milliseconds and the wait command waits on that timer. Run the command. Now you should get the message “Hey!” just once in each node. What would happen if you had the replicate command after the wait command? (Yes, here comes the loop again.) In general you should be careful about short-lived scripts that use the replicate command. An added delay in the form shown above usually addresses the loop problem. In fact, in order to relieve the programmer of such worries, we plan to change SensorWare so that a script is still considered active for 2-3 seconds after its termination. We feel, though, that the programmer should know what is happening behind the scenes, so we included this example here.
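The effect of the delay can be reproduced with a small discrete-event model. The Python sketch below is an illustration and not SensorWare code. It assumes the 4 x 4 grid of the default topology, a fixed (made-up) transfer latency of 10 ms per hop, and the rule described above: a replica is executed only if the same script is not still active on the receiving node. The parameter lifetime stands for how long the script stays active after it starts, and max_runs is just a safety cap for the looping case.

```python
import heapq

def grid_neighbors(side=4):
    """Neighbor lists for a side x side grid, nodes numbered 1..side*side."""
    table = {}
    for row in range(side):
        for col in range(side):
            node = row * side + col + 1
            near = []
            if col > 0:
                near.append(node - 1)
            if col < side - 1:
                near.append(node + 1)
            if row > 0:
                near.append(node - side)
            if row < side - 1:
                near.append(node + side)
            table[node] = near
    return table

def simulate(lifetime, latency=0.01, first_node=2, max_runs=1000):
    """Return (runs per node, total runs) for a script that replicates itself."""
    table = grid_neighbors()
    active_until = {node: -1.0 for node in table}
    runs = {node: 0 for node in table}
    arrivals = [(0.0, first_node)]           # (arrival time, node)
    total = 0
    while arrivals and total < max_runs:
        now, node = heapq.heappop(arrivals)
        if now < active_until[node]:
            continue                         # same script still active: drop the copy
        runs[node] += 1
        total += 1
        active_until[node] = now + lifetime
        for peer in table[node]:             # replicate to every neighbor
            heapq.heappush(arrivals, (now + latency, peer))
    return runs, total

runs, total = simulate(lifetime=2.0)         # script waits on a 2000 ms timer
print(total, set(runs.values()))
runs, total = simulate(lifetime=0.0)         # script exits right after replicating
print(total)
```

Under these assumptions the first run executes the script exactly once on each of the 16 nodes and then dies out, while the second run never stops on its own and is only ended by the max_runs cap.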

Forget spawning and replicating for a while. What about sending a simple message to another script? There is a command for that, called send. Send takes a node specification, a script address, and the message to be sent as arguments. The script address has the form user_id.code_id.instance_id when the node_id is known (through the node specification). So the following command sends a message to the swsh of node 2.

swsh@1.1.1.1% send neighbor 2 1.1.1 "where is this message?"

22 (the number of bytes sent as a message)

But there is no output in node 2’s console. What happened to the message? The message arrived at node 2 and was delivered to script 1.1.1 (i.e., the swsh), but since the recipient was not waiting for any network message, it was just discarded.

How do we write a script that waits on a packet and does something when the packet is received? How do scripts wait for events in general? Surely if we want to do anything useful we should react to external events. This question brings us to the next section.

2.3 The programming model

SensorWare scripts (at least the ones that aspire to have some utility) look like state machines that are influenced by external events. Such events include network messages from peers, sensing data, and expiration of timers. It is interesting to discuss here the kind of events that we want the scripts to process and depend upon. Figure 2 gives a classification of events into high-level and low-level events. At the lower level, events are produced by specific hardware triggers. In a sensor node there are three possible types of such triggers: 1) a packet (or bit) is received at the radio, 2) a sample is acquired by a sensing device, and 3) one of the hardware timers expires. These events usually occur at higher frequencies than a script could handle, and they are neither controllable nor parametric.

Figure 2: Classification of events

These raw events, however, when filtered through SensorWare’s node-resident services, can produce higher-level events. Examples of such events include: the filling of a buffer with filtered sensor data, the reception of a packet with a specific header (intended for a specific script), the expiration of a virtual timer, or even the completion of a distributed service (e.g., ad-hoc node localization). Clearly these are the types of events that SensorWare scripts should handle. They abstract away the details of processing the raw events, they usually have a much lower frequency than the raw events, and, most importantly, they are parametric, so the scripts can describe which events they wish to handle.

The programming model that is adopted is equivalent to the following: An event (of the high-level type) is described, and it is tied to the definition of an event handler. The event handler, according to the current state, will do some (light) processing and possibly create some new events and/or alter the current state. Figure 3 illustrates an abstraction of SensorWare's programming model with an example.

Figure 3: Abstraction of the programming model

The behavior described above is achieved through the wait command. Using this command, the programmer can define all the events that the script is waiting upon, at a given time. Examples of events that a script can wait upon are: i) reception of a message of a given format, ii) traversal of a threshold for a given sensing device reading, iii) filling of a buffer with sensing data of a given sampling rate, iv) expiration of several timers. When one of the events declared in the wait command occurs, the command terminates, returning the event that caused the termination. The command after the wait command processes the return value and invokes the code that implements the proper event handler. After the execution of the event handler, the script moves to a new wait command, or more usually it loops around and waits for events from the same wait command.

Here is how you would write it in SensorWare:

wait event1 event2 ;# wait on 2 events

if {$event_name == "event1"} { # event_name is a predefined variable name

# containing the name of the returning event

...some code...

} else { ... some other code... }
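The same wait-then-dispatch pattern can be sketched outside SensorWare. The Python model below is only an illustration of the programming model: the wait function, the event names "sample" and "timeout", and the choice to silently discard events the script is not waiting on are all assumptions of this sketch, not SensorWare behavior taken from the framework's code.

```python
import queue

def wait(events, *names):
    """Block until an event whose name is in `names` arrives; skip the rest."""
    while True:
        event_name, event_body = events.get()
        if event_name in names:
            return event_name, event_body

def run_script(events):
    """Tiny state machine: count samples until a timeout event ends the script."""
    state = {"samples": 0}
    while True:
        event_name, event_body = wait(events, "sample", "timeout")
        if event_name == "sample":
            state["samples"] += 1          # handler for the first event
        else:
            return state                   # handler for the second event

events = queue.Queue()
for item in [("sample", 10), ("noise", None), ("sample", 12), ("timeout", None)]:
    events.put(item)
final_state = run_script(events)
print(final_state)
```

The script loops around the same wait call, as described above, and each handler does only light processing before control returns to wait.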

2.4 Everything is a device

How does a programmer describe an event in order to use it with the wait command? The solution is given by the way SensorWare handles variability in sensor nodes. Different sensor node platforms may have different capabilities. For instance, imagine that platform A has a radio and a magnetometer, while platform B has two radios (a normal one and a paging one) and a camera. How can we abstract the two platforms with the same framework? SensorWare advocates a modular and well-structured solution. SensorWare declares, defines, and supports virtual devices (an idea triggered by Linux's virtual devices). Any module or service is represented as a virtual device. For example, a radio, a sensing device, the timer service, and a location discovery protocol are all viewed as virtual devices.

There is a fixed interface for all devices. More specifically, there are four commands that are used to communicate with a device: query, act, interest, and dispose. Query asks for a piece of information from the device and expects an immediate reply. Act instructs the device to perform an action (e.g., modify some parameters of the device or, if the device is an actuator, perform an actuation). Interest describes a specific event that the device can produce and gives this event a name/ID. The name can subsequently be used by the wait command to wait on this specific event. Dispose simply disposes of that name. Additionally, if a device can produce events, a task is needed to accept interest and dispose commands and to react to wait commands that are waiting on the device's events. The task definition and the parsing of the arguments of the four commands are defined in a custom fashion by the developer. This is where the expandability stems from, while a structured form is kept at the same time.
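The idea of one fixed interface with device-specific argument parsing can be sketched as follows. This is a Python illustration only: the class names, the dispatcher, and the toy timer device are hypothetical and do not reflect how SensorWare implements devices internally. The only detail borrowed from the text is the argument order of "interest timer t1 2000".

```python
class VirtualDevice:
    """Assumed shape of the fixed four-command device interface."""
    def query(self, *args):
        raise NotImplementedError
    def act(self, *args):
        raise NotImplementedError
    def interest(self, event_name, *args):
        raise NotImplementedError
    def dispose(self, event_name):
        raise NotImplementedError

class TimerDevice(VirtualDevice):
    """Toy timer service: interest names a timer, dispose forgets it."""
    def __init__(self):
        self.timers = {}
    def query(self, *args):
        return sorted(self.timers)          # names of the declared timers
    def act(self, *args):
        return None                         # nothing to actuate in this toy device
    def interest(self, event_name, *args):
        (milliseconds,) = args              # device-specific argument parsing
        self.timers[event_name] = int(milliseconds)
        return event_name
    def dispose(self, event_name):
        self.timers.pop(event_name, None)

devices = {"timer": TimerDevice()}

def command(verb, device_name, *args):
    """Dispatch one of the four commands to the named device."""
    if verb not in ("query", "act", "interest", "dispose"):
        raise ValueError(f"unknown command {verb!r}")
    return getattr(devices[device_name], verb)(*args)

command("interest", "timer", "t1", 2000)
print(command("query", "timer"))
command("dispose", "timer", "t1")
print(command("query", "timer"))
```

Adding a new device in this sketch means writing one more subclass and registering it in the devices table; the four commands themselves never change.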

You can view your current devices by typing ls in your swsh prompt.

In the current version of SensorWare there are 15 devices. These are:

light      a "virtual" light sensor. Query it to get a light sample back
location   query it to get the location of the node back
timer      timer service. Declaring interest in this device defines timers
time       query it to get the absolute time back [secs msecs] (not synchronized among nodes)
self       route to self
flood      routing device that broadcasts n hops away from the calling node
neighbor   neighbor discovery and routing. Query it to get the neighbors. Use it to route
gpsr       routing device implementing the GPSR geo routing protocol
user       routing device that keeps the next hop to all (heard) users
voronoi    query it to return the area of the Voronoi cell around the node
monitor    special device to measure the network traffic generated
measure    special empty device used in measuring commands operating on devices
mbox       query it to return the number of events in the event queue
wlan       the wavelan radio. Obsolete device
video      empty for now

With the current functionality of devices you cannot define all the high-level events described in the programming model section. For instance, you cannot define an event "fill an n byte buffer with light samples". In section 5 you can find a complete description of what the devices can be used for.

2.5 How code and data are moved around

We have already seen how to move code and data around in the "Testing simple code snippets" section. It is time to re-approach the subject in a more general way now that we have discussed the key idea of devices.

2.5.1 Addressing and routing

Addresses in SensorWare have the generic format: [nodes_specification mailbox]. Nodes_specification, as the name suggests, is a description of nodes. The resolution of this description results in a set of nodeIDs and a way to reach them (i.e., routing). In its simplest form the specification could just be the node ID of a neighbor (no need to do any routing, just transmit to that particular neighbor). A more advanced specification would be a remote node ID along with the routing protocol used to reach it. An even more advanced specification is a set of attribute-described nodes (e.g., a geographical region in the form (point, radius)) along with a geographical protocol to interpret the attributes and perform the routing.

Mailbox is a tuple needed to describe a specific instance of a script. It has the string form of: userID.codeID.appID. UserID is just the user’s id that spawned the script in question. CodeID is an id describing the script. AppID is an id denoting the application (i.e. collection of scripts) that the particular instance of a script belongs to. It is also used to distinguish instances of the same script (same codeID) running under the same node and under the same user (but for different applications).

As hinted in the above paragraphs, addressing and routing are interwoven. The format of addressing is fixed but allows for multiple routing protocols. Routing can take many forms in a WASN (e.g., directed diffusion, geographical routing, energy-aware unicast, multicast to members of a cluster, etc.). All of these can be used by different applications or even by the same application, according to circumstances. Furthermore, many applications can use their own custom-made routing or, more frequently, no routing at all, as they are restricted to purely 1-hop local interaction (e.g., the aggregation application we describe in the paper). Thus, SensorWare needs to provide a way to easily export the functionality of multiple routing protocols to the scripts and to allow the easy insertion of new routing protocols at SensorWare compile-time or even run-time. The nodes_specification parameter, which ties routing to node attributes, is a step in this direction. We have not yet seen how routing protocols are represented in SensorWare. The clearest way is to define routing protocols as devices. Thus, nodes_specification becomes (<device_name>, <attributes>). When a SensorWare command that takes a nodes_specification as one of its arguments (e.g., the send command or the spawn command) is called, a special function of the device is called to handle the routing part of the command. That function is similar in nature to the act, query, interest, and dispose functions that the device has to define in order to provide the interface to the corresponding SensorWare commands. Furthermore, in order to support application-level routing and run-time updates of routing protocols, SensorWare has the ability to define dynamic devices in the form of scripts, as we will see in subsection 4.1.
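The pairing of a device name with attributes can be pictured as a lookup followed by a call to that device's routing function. The Python sketch below is an assumption-laden illustration: the table routing_devices, the function names, and in particular the choice to return an empty set of destinations for an unknown neighbor are inventions of this sketch. Only the "neighbor <id>" form and the use of id 0 as the broadcast address come from the text.

```python
def route_neighbor(attributes, local_neighbors):
    """'neighbor <id>': id 0 addresses all neighbors, otherwise one neighbor."""
    target = int(attributes[0])
    if target == 0:
        return sorted(local_neighbors)
    return [target] if target in local_neighbors else []

# One routing function per routing device (only one is modeled here).
routing_devices = {"neighbor": route_neighbor}

def resolve(nodes_specification, local_neighbors):
    """Split '<device_name> <attributes...>' and let the device do the routing."""
    device_name, *attributes = nodes_specification.split()
    return routing_devices[device_name](attributes, local_neighbors)

print(resolve("neighbor 2", {2, 5}))   # a single 1-hop destination
print(resolve("neighbor 0", {2, 5}))   # broadcast to every neighbor
```

A new routing protocol would be added to this model by registering another function in the table, which mirrors the idea of inserting routing protocols as devices.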

2.5.2 State passing with code

Another important feature is the ability to pass state (i.e., data) along with the code of the scripts. This feature is certainly needed if we want to create capable agents. Initially, this was achieved by passing some variables of the executing script to the newly spawned script by name. The names and the values of the variables were transferred along with the packed code, and the variables were re-instantiated on the remote node. Although this method is very simple to use (i.e., the programmer just appends the variable name after the code in a spawn, replicate, or migrate command), it creates problems for multiple collaborating scripts that were written by different authors. Passing by name requires the transferred variables to be unique in the clique of scripts that use them, which is not easily achieved when one considers multiple large scripts written by different programmers at different times. Currently, SensorWare supports a friendlier way of passing state between scripts. It adopts passing parameters by value, much like functions in C. We introduce two new commands to achieve this: parameter and carry. Parameter declares the externally defined state of the script in the form of a list. Each list element is a variable name that gets initialized by an externally provided argument. The list elements can be used as any regular Tcl variable inside the script. Carry performs the complementary operation. It gives the values of the argument list that is to be carried with the next spawn, replicate, or migrate command. These are the externally provided arguments that will initialize the variables in the parameter list. An example shows how all this works. Here is a simple script:

#code_id 51

parameter a b c

set b 7

debug a= $a b= $b c= $c

Suppose that the variable test_script holds the code of the above script. Executing the following code in node 1:

set z dog

carry $z cat [expr 3+3]

spawn neighbor 1 $test_script

will yield

debug (1.1.51.2): a= dog b= 7 c= 6

Notice that the carry command carried the 3 arguments "dog cat 6" initializing the variables a b and c in the script. Variable b though was set to another value before executing the debug command hence the different result.
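The by-value semantics can be mimicked in a few lines of Python. This is a model of the example above and not SensorWare code: the function name run_test_script is hypothetical, and the positional pairing of carried values with parameter names is the behavior described in the text, restated here as an assumption of the sketch.

```python
def run_test_script(carried):
    """Model of the script above: 'parameter a b c', 'set b 7', then debug."""
    parameter_names = ["a", "b", "c"]
    variables = dict(zip(parameter_names, carried))   # by-value initialization
    variables["b"] = 7                                 # the script overrides b
    return "a= {a} b= {b} c= {c}".format(**variables)

z = "dog"
carried = [z, "cat", 3 + 3]        # what 'carry' attaches to the next spawn
z = "horse"                        # later changes in the sender are not seen
print(run_test_script(carried))
```

Because the values are copied when carry is executed, the spawned script and the spawning script do not need to agree on variable names, which is the problem that passing by name used to cause.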

The above discussion relied on an assumption: "Suppose that the variable test_script holds the code of the above script". How can we arrange this easily? SensorWare has a command called load that takes a variable name as its first argument and a file as its second argument, and assigns the contents of that file (if it is a script) to that variable. Assume that the script discussed above is stored in the file /home/boulis/test_script1.txt; then the following command puts it in a variable:

load test_script /home/boulis/test_script1.txt

Finally, when replicate or migrate are used the carry does not have to be explicit. If a carry is not provided and the script has a parameter list defined, a hidden carry is executed.

If the script has parameters p1 p2...pn, then the command carry $p1 $p2...$pn is implicitly executed before the replicate or migrate.

2.6 Other SensorWare features

2.6.1 Predefined variable names and event names

One easy thing that SensorWare can do to make programming easier and more compact is to predefine some Tcl variables. By doing so, common state of the node or the script environment is made easily available without the need to execute script commands to acquire it. We have already encountered “event_name”, which contains the name of the event that caused the wait command to return. Another variable name is “event_body”, which contains the main body of the event (e.g., sensing data, body of a data packet).

Here are the rest of the predefined variable names and their meaning:

parent_node (holds the node id that spawned the current script into the current node)

msg_src_node (holds the node id of the script that sent the latest network message)

msg_src_user (holds the user id of the script that sent the latest network message)

msg_src_app (holds the app id of the script that sent the latest network message)

Apart from variables we can also predefine some common events. Currently SensorWare has only one predefined event, namely packet. Packet describes the event of a network message reception, i.e., any message received through any radio.

2.6.2 Script compression

Since SensorWare scripts take the form of Tcl scripts, they are initially written in ASCII, using the full mnemonic names for commands and variables. These text files are directly executable (by interpretation) on any node running SensorWare. One could compress the files from their original text form in order to save communication energy. Doing so creates a computation overhead for decompression in every node that executes these scripts (compression is performed once at the backend node, and the compressed script is then carried around the network). One wants to make sure that the communication savings are much larger than the computation overhead. Structured text files of this kind, that is, files that follow a language syntax, are amenable to custom compression that can achieve better compression rates than generic compression algorithms. Furthermore, the decompression can be made fast by exploiting the same principle.

We have devised such a custom compression/decompression algorithm for SensorWare that we call semC (from semantic compression). Apart from using a symbol table to replace commands, reserved words, and reserved variables with single bytes, semC uses semantic information to identify special places in the script and apply a kind of lossy compression: the total information of the text is not kept, but the result of executing the script is kept intact. This is achieved by shortening the variable names from their mnemonic forms and not keeping the original names in the compressed form of the script. The mnemonic names are not needed once the script is executing inside the network, so the script’s result is unaffected. The compression is also designed so that decompression is especially fast, since this is the operation that will occur many times within the network.

Although the semC code is included in the current SensorWare distribution, it is not in use yet. Currently zlib is used for code compression.
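To make the idea of lossy variable renaming concrete, here is a small plain-Tcl sketch. It is not the semC algorithm: the renaming table, the short names v0, v1, v2, and the use of string map are assumptions chosen only for illustration, and a real implementation would have to parse the script to avoid renaming text that merely looks like a variable name.

```tcl
# Hypothetical renaming table: mnemonic name -> short name.
# This is an illustration, not semC's actual encoding.
set rename_table {target_loc_x v0 target_loc_y v1 dist_squared v2}

# One line of a script, kept as literal text inside braces
set script {set dist_squared [dist $my_x $my_y $target_loc_x $target_loc_y]}

# Replace every mnemonic name by its short form
set compressed [string map $rename_table $script]

# Bytes saved on this single line
set saved [expr {[string length $script] - [string length $compressed]}]
```

The shortened line behaves identically when executed, but the original names cannot be recovered from it, which is the sense in which the compression is lossy.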

3 Code Examples

Having reviewed all the basic concepts, it is time to look at more complex code examples. We chose a geographical routing task to act as our application for illustrative purposes. The application is the following: We wish to spawn a certain script in a geographic area, defined by a point in space and a radius. The script will do its job (in our case it will just send a message to the node’s debugging device) and then send a message back to the user.

We provide code for two versions of the application:

1) Implementation based on a native service/device providing point (not area) geographical routing. Other basic services/devices are used.

2) Implementation based only on the basic native devices neighbor and location.

In Section 4 we will see a third version of the application's implementation, based on a new SensorWare feature.

3.1 Geo-routing using the gpsr device

We begin with the listing of the script we wish to spawn in a geographical area. Since we are using this script only to illustrate the abilities of our application, it remains fairly simple. As already mentioned, all scripts start with a Tcl comment line (i.e., a line beginning with the character ‘#’) that begins with the word “code_id” and is followed by a number. This line is used by the spawn command (a command used to spawn a new script) to assign a code id to the script. After the number we can have more characters, which are ignored by the spawn command. We use this space to give a human-readable name to the script. None of the comment lines are transferred with the code around the network. The first SensorWare command of this specific script is parameter, which gives the names of the two script parameters (i.e., variables that are externally set).

#code_id 65 populated_script

# to be used with gpsr test

parameter user_x user_y

debug "POPULATED SCRIPT: I am here"

# send arguments: node specification (gpsr $user_x $user_y),

# mailbox specification ([id -u].63.[id -a]), and message.

# The backslash makes Tcl ignore the following newline

# (i.e., no new Tcl command starts on the next line).

send gpsr $user_x $user_y [id -u].63.[id -a] \

"nodeid [id -n]"

Listing 1: The script we wish to populate in an area

Notice that send uses the device gpsr (which implements the well-known Greedy Perimeter Stateless Routing protocol) along with two parameters (i.e., the x and y coordinates of the destination point) to give a <nodes_specification>, as discussed in the previous section. The message is delivered to the script with code id 63 (we will see shortly which script this is). The id command, used inside send, returns different identifying values, such as node_id, user_id, code_id, and app_id, depending on the command’s arguments.

As mentioned already, the device gpsr routes to a single target point and not to an area, as our application requires. Nevertheless, we can use gpsr to route to the center of the target area and then use another script to flood an area of radius r around this point. This is the functionality of the script given in Listing 2. The parameters of this script are: the values that define the spawn area, the script-to-be-spawned (an interesting feature of SensorWare is that scripts can be treated as any other variable), and the user coordinates, since the script-to-be-spawned needs them (as we saw in Listing 1). The script in Listing 2 defines a procedure to calculate the squared Euclidean distance between two points and uses it to determine whether or not it should flood.

#code_id 64 radius_flooding2

# to be used with gpsr test

parameter target_loc_x target_loc_y radius \

populated_script user_x user_y

proc dist {x1 y1 x2 y2} {

set diff_x [expr "$x1 - $x2"]

set diff_y [expr "$y1 - $y2"]

return [expr "($diff_x * $diff_x) + ($diff_y * $diff_y)"]

}

set my_loc [query location]

set my_loc_x [lindex $my_loc 0]

set my_loc_y [lindex $my_loc 1]

set dist_squared [dist $my_loc_x $my_loc_y \

$target_loc_x $target_loc_y]

#****************************************************************

# We check if we are inside the area that must be flooded.

# if yes, we replicate in all neighbors and spawn the

# populated_script

#****************************************************************

if {($radius < 0) || ([expr $dist_squared-$radius*$radius] <= 0)} {

debug "flooding into neighborhood"

replicate neighbor 0

carry $user_x $user_y

spawn self $populated_script

}

Listing 2: The script that floods an area of radius r around a target point
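The area test of Listing 2 compares squared distances, so no square root is needed. As a worked example, the following plain-Tcl sketch (runnable in tclsh) evaluates the test for the target point (45, 37) and radius 8 used later in the invocation script. The helper name in_area and the node coordinates are hypothetical and chosen only for illustration.

```tcl
# Squared Euclidean distance, as in Listing 2
proc dist {x1 y1 x2 y2} {
    set diff_x [expr {$x1 - $x2}]
    set diff_y [expr {$y1 - $y2}]
    return [expr {$diff_x * $diff_x + $diff_y * $diff_y}]
}

# in_area (hypothetical helper): 1 if the node at (x, y) should flood.
# A negative radius means "flood everywhere", as in Listing 2.
proc in_area {x y tx ty radius} {
    return [expr {$radius < 0 || [dist $x $y $tx $ty] <= $radius * $radius}]
}

set inside  [in_area 40 33 45 37 8]   ;# 25 + 16 = 41 <= 64
set outside [in_area 30 37 45 37 8]   ;# 225 + 0 = 225 > 64
set always  [in_area 0 0 45 37 -1]    ;# negative radius
```

With these values, inside is 1, outside is 0, and always is 1.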

Listing 3 presents the code that uses the gpsr device and the flood script to achieve the application’s goal. This is the script with code id 63 (i.e., the script that the replies are routed back to). This script needs to be spawned only in the user node. Its main function is to use the gpsr device to route the flooding script to the center of the target area. Then it simply waits for replies from the spawned scripts.

#code_id 63 gpsr_test

parameter target_loc_x target_loc_y radius \

populated_script radius_flood_code

set my_loc [query location]

carry $target_loc_x $target_loc_y \

$radius $populated_script \

[lindex $my_loc 0] [lindex $my_loc 1]

spawn gpsr $target_loc_x $target_loc_y \

$radius_flood_code

while {1} {

wait packet

debug "Arrived at user: $event_body"

}

Listing 3: The script that routes to an area

Finally we present the invocation script in Listing 4. This is not a SensorWare script in the sense that it does not have a code id and does not roam around the network. It simply loads all the relevant SensorWare scripts and sets the parameters to the desired values. The user simply executes an invocation script at the swsh (by typing: source <script_name>) and the distributed application is deployed.

# This is an invoke script.

# Just source it in the swsh to start the app.

# Populates script in a geographical area.

# Uses native service GPSR

set path /home/boulis/scripts/geo/GPSR

set target_loc_x 45

set target_loc_y 37

set radius 8

load populated_script $path/populated_script2.txt

load radius_flood $path/radius_flooding2.txt

load gpsr_test $path/gpsr_test.txt

carry $target_loc_x $target_loc_y $radius \

$populated_script $radius_flood

spawn self $gpsr_test

Listing 4: The invocation script

3.2 Geo-routing using only basic devices

Imagine that the gpsr device is not included in the package of native services offered by SensorWare. This might be true, for example, if we decide to reduce the footprint of SensorWare by removing some of the implemented devices. Or simply imagine that a programmer, finding the gpsr service inadequate (since it only routes to a point), wants to implement the application from scratch. This is the case for the second version of the geo-routing application. The first change is to the populated script. Since there is no gpsr device to route replies back to the user, we have to use another service. This is the updated script:

#code_id 66 populated_script

debug "POPULATED SCRIPT: I am here"

send self "[id -u].73.[id -a]" "nodeid [id -n]"

Listing 5: The modified script-to-be-populated

The message "nodeid [id -n]" is now routed to another script in the same node. The recipient script is a user relay script that keeps the next node towards the user. Following is its listing:

#code_id 73 relay

parameter relay_node

while {1} {

wait packet

if {$relay_node == [id -n]} {

debug "ARRIVED USER: $event_body"

} else {

send neighbor $relay_node \

[id -u].[id -f].[id -a] $event_body

debug $event_body

}

if {[lindex $event_body 0] == "Error:"} {

debug "EXIT relay"; exit}

}

Listing 6: The user relay script

Following is the listing of the new flooding script. Its code is very similar to the corresponding script of the first version with the addition that now it has to spawn the relay script too. This includes setting the parameters of the relay script. The single parameter of the relay script is set differently depending on whether the script was created through self-replication or another script spawned it.

#code_id 74 radius_flood

# uses relay

parameter target_loc_x target_loc_y radius \

populated_script relay_code relay_node

proc dist {x1 y1 x2 y2} {

set diff_x [expr "$x1 - $x2"]

set diff_y [expr "$y1 - $y2"]

return [expr "($diff_x * $diff_x) + ($diff_y * $diff_y)"]

}

set my_loc [query location]

set my_loc_x [lindex $my_loc 0]

set my_loc_y [lindex $my_loc 1]

set dist_squared [dist $my_loc_x $my_loc_y \

$target_loc_x $target_loc_y]

if {($radius < 0) || ([expr $dist_squared-$radius*$radius] <= 0)} {

debug "flooding into neighborhood"

replicate neighbor 0

# we have to set correctly the parameter to relay_code

if {$parent_node == [id -n]} {

# this script was spawned through

# geo_routing_flooding

carry $relay_node

} else {

# this script was spawned through self-replication

carry $parent_node

}

spawn self $relay_code

spawn self $populated_script

}

Listing 7: New flooding script

Following is the script that achieves the geographical routing with the help of the flooding script. The comments in the code offer an explanation of the script’s basic functions.

#code_id 70 geo_route

# uses radius_flood relay ask_loc

parameter target_loc_x target_loc_y radius \

populated_script radius_flood_code \

relay_code ask_loc_code

proc dist {x1 y1 x2 y2} {

set diff_x [expr "$x1 - $x2"]

set diff_y [expr "$y1 - $y2"]

return [expr "($diff_x * $diff_x) + ($diff_y * $diff_y)"]

}

set my_loc [query location]

set my_loc_x [lindex $my_loc 0]

set my_loc_y [lindex $my_loc 1]

set dist_squared [dist $my_loc_x $my_loc_y \

$target_loc_x $target_loc_y]

if {($radius < 0) || ([expr $dist_squared-$radius*$radius] <= 0)} {

carry $target_loc_x $target_loc_y $radius \

$populated_script $relay_code \

$parent_node

spawn self $radius_flood_code

exit

}

# **************************************************************

# Query each neighbor to find their location. We broadcast

# this script since most nodes will have more than one

# neighbor. Note that we clear the queue before we start

# (by waiting on nothing)

# **************************************************************

wait

spawn neighbor 0 $ask_loc_code

# **************************************************************

# We wait for some time to hear back from the neighbors and

# choose the best candidate to proceed

# (node closest to target point)

# **************************************************************

set best_node 0

interest timer t 2000

while {1} {

wait packet t

if {$event_name == "t"} { break }

set tmp_x [lindex $event_body 0]

set tmp_y [lindex $event_body 1]

set neigh_dist_squared [dist $tmp_x $tmp_y \

$target_loc_x $target_loc_y]

if {$neigh_dist_squared < $dist_squared} {

set dist_squared $neigh_dist_squared

set best_node $msg_src_node

}

}

# **************************************************************

# We need to set up a relay agent to get packets back to the user

# ***************************************************************

carry $parent_node

spawn self $relay_code

# **************************************************************

# Check if a better node was actually found. If yes proceed

# with the replicate (and exit), otherwise send an error

# message

# to the relay agent (and exit)

# ***************************************************************

if {$best_node == 0} {

debug "no easy route to target"

send self "[id -u].73.[id -a]" \

"Error: no route, stop at node [id -n]"

} else {

replicate neighbor $best_node

}

Listing 8: The script that routes to an area
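The forwarding decision in Listing 8 is a greedy choice: among the neighbors that answered, pick the one closest to the target, but only if it is closer than the current node. The plain-Tcl sketch below (runnable in tclsh) isolates that decision. The procedure name best_neighbor, the flat {node x y ...} list format, and the coordinates are assumptions made for illustration; in Listing 8 the same data arrives one packet at a time.

```tcl
# Squared Euclidean distance, as in the listings
proc dist {x1 y1 x2 y2} {
    set diff_x [expr {$x1 - $x2}]
    set diff_y [expr {$y1 - $y2}]
    return [expr {$diff_x * $diff_x + $diff_y * $diff_y}]
}

# best_neighbor (hypothetical helper): returns the id of the neighbor
# that is closest to the target and closer than this node, or 0 if no
# neighbor improves on the current position.
# neighbors is a flat list: node1 x1 y1 node2 x2 y2 ...
proc best_neighbor {my_x my_y tx ty neighbors} {
    set best_node 0
    set best_dist [dist $my_x $my_y $tx $ty]
    foreach {node x y} $neighbors {
        set d [dist $x $y $tx $ty]
        if {$d < $best_dist} {
            set best_dist $d
            set best_node $node
        }
    }
    return $best_node
}

# A node at (10, 10) routing towards (45, 37)
set next [best_neighbor 10 10 45 37 {2 12 10 3 20 18 4 5 5}]

# A node at (44, 37) whose neighbors are all farther from the target
set stuck [best_neighbor 44 37 45 37 {2 40 37 4 44 30}]
```

In the first call node 3 is selected. In the second call the result is 0, which corresponds to the "no easy route to target" case of Listing 8: greedy forwarding has reached a local minimum.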

It is interesting to see how the script in Listing 8 gets the location information from the neighboring nodes. From Listing 8 we can see that a script stored in the variable “ask_loc_code” is used. Here is its single-line code:

#code_id 71 ask_loc

send neighbor $parent_node [id -u].70.[id -a] [query location]

Listing 9: The script that retrieves the location information

The invocation script is similar to the corresponding script of the first version, so it is not listed here. It is interesting to note that although we did not use any high-level services, we were able to implement our geographical routing application, at the price of larger scripts. Note also that the relay script and the ask_loc script could be omitted, since there are native SensorWare services that cover them (the service/routing protocol named "user" essentially implements the relay script natively, and a feature in the location service can return the location of neighbors), but abiding by the no-high-level-services rule we chose not to use them.

4 Advanced Features and more Examples

4.1 Dynamic device registration (scripts providing services)

SensorWare scripts use the services provided by the devices to build up distributed algorithms. As we have seen so far, these devices are implemented in native code and provided along with a specific port of SensorWare to a platform. Thus, they are fixed. One can easily see the benefit of on-the-fly services. Since SensorWare is built for dynamic algorithm deployment, one might argue that the capability for on-the-fly services is already present: simply define a script that implements the new service and provide a custom way for other scripts to access it through regular message passing between the scripts. Apart from the possible conflicts that such a method might cause (since the interfaces are arbitrary and defined by different programmers), certain types of message passing require extra effort from the programmer and appear awkward (e.g., passing a script’s own code to a dynamic routing service for population).

It would be better if we had a standard way of interacting with dynamic services implemented by scripts. The clearest and most natural way to achieve this is to use the standard device interface and treat such dynamic services as yet another device. This calls for methods to allow dynamic device registration and the ability to register scripts as devices. We have implemented dynamic device registration in SensorWare. A script that wishes to be registered as a device should include the following:

· Tcl procedures that implement some or all of the five basic functions a device implements (i.e., the four functions that process the commands act, query, interest, dispose, plus the function that implements routing when send or spawn are called). These procedures must be named using the special names sdev_query, sdev_act, sdev_interest, sdev_dispose, and sdev_spawn.

· A device command followed by the name of the new device. The command must appear after the script has defined its special procedures and has done all initializations. After the new device is registered it is ready to accept service requests.
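The two requirements above suggest the following overall shape for a script device. This is only a hedged skeleton: the service (a simple counter), the device name COUNTER_script, and the variable names are hypothetical, and the assumption that sdev_query and sdev_act receive a single argument list is borrowed from the sdev_spawn procedure of Listing 10 rather than taken from a specification. The debug and device commands are stubbed so that the sketch runs in a plain tclsh; in a real node SensorWare provides them, and the stubs would be removed.

```tcl
# Stubs so the sketch runs in plain tclsh.
# SensorWare provides the real debug and device commands.
proc debug {args} { puts [join $args " "] }
proc device {name} { set ::registered_device $name }

# State of the hypothetical counter service
set counter 0

# Handles "query COUNTER_script": returns the current count.
# Assumption: a single argument list, as for sdev_spawn in Listing 10.
proc sdev_query {argList} {
    global counter
    return $counter
}

# Handles "act COUNTER_script <n>": adds n to the count.
proc sdev_act {argList} {
    global counter
    incr counter [lindex $argList 0]
    return 0
}

# Register only after the special procedures are defined
# and all initializations are done.
debug "DEVICE about to register"
device COUNTER_script
debug "DEVICE registered"
```

Only two of the five special procedures are defined here, which matches the statement that a script may implement some or all of them.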

4.2 Geo-routing using a script device

Suppose that we have many requests for routing to several geographical areas. In the second implementation version (i.e., the one using only two basic devices), we developed code that met the needs of a geographical routing request. It would be desirable if other scripts could use this code as a resident service. We can achieve such a dynamic instantiation through the use of the script devices. This is the third version of our implementation. We first use the code from the second version to define a new script device called "GEO_script". This modification keeps most of the previous code and simply rearranges it in a form of a device description (i.e., defining the special device functions and the device task) and registers the new device under the name GEO_script.

#code_id 70 geo_route

# uses relay ask_loc

# ***********************************************************

# we do not need parameters to specify the region we want

# to route, these will be provided in the messages we receive

# we do not need to spawn code for radius_flood

# Since all the nodes will run the same algorithm, we

# include the flood code in the main script body.

# only the relay code and ask_loc code are still useful

# ***********************************************************

parameter relay_code ask_loc_code

debug "device initializing..."

set my_addr [id -n]

set my_loc [query location]

set my_loc_x [lindex $my_loc 0]

set my_loc_y [lindex $my_loc 1]

# this is needed to keep track of the messages received

# already and avoid loops in message passing

set msg_bank(0) ""

set msg_bank(1) ""

set msg_bank(2) ""

set bank_pointer 0

# *************************************************************

# We need to set up a relay agent to get packets back to the

# user (Remember, we do not use any routing with this example)

# ***************************************************************

carry $parent_node

spawn neighbor $my_addr $relay_code

# **************************************************************

# Query each neighbor to find their location. We broadcast this

# message since most nodes will have more than one neighbor.

# Note that we clear the queue before we start.

# **************************************************************

wait

spawn neighbor 0 $ask_loc_code

# ***************************************************************

# We wait for some time to hear back from the neighbors and

# store the result in an array for easy future access

# ***************************************************************

interest timer t 2000

while {1} {

wait packet t

if {$event_name == "t"} { break }

lappend neigh_list $msg_src_node

set neigh_loc_x($msg_src_node) [lindex $event_body 0]

set neigh_loc_y($msg_src_node) [lindex $event_body 1]

}

proc dist {x1 y1 x2 y2} {

set diff_x [expr "$x1 - $x2"]

set diff_y [expr "$y1 - $y2"]

return [expr "($diff_x * $diff_x) + ($diff_y * $diff_y)"]

}

proc sdev_spawn { argList } {

# *** argList format is: target_loc_x target_loc_y radius

# *** packet_to_be_sent

#we use the following variables from the script's main body

upvar my_loc_x x my_loc_y y my_addr id \

neigh_list n_list neigh_loc_x n_x neigh_loc_y n_y

# ******** parse the argList *******

set target_loc_x [lindex $argList 0]

set target_loc_y [lindex $argList 1]

set radius [lindex $argList 2]

set main_body [lindex $argList 3]

set dist_squared [dist $x $y $target_loc_x $target_loc_y]

# ******** flooding case *********

if {($radius < 0) || \

([expr $dist_squared-$radius*$radius] <= 0)} {

# append a flooding_flag = 1

lappend argList "1"

send neighbor 0 "[id -u].[id -f].[id -a]" $argList

spawn neighbor $id $main_body

# **** success flooding case ****

return 0

}

# ******** forwarding case *********

set best_node 0

foreach i $n_list {

set neigh_dist_squared [dist $n_x($i) $n_y($i) \

$target_loc_x $target_loc_y]

if {$neigh_dist_squared < $dist_squared} {

set dist_squared $neigh_dist_squared

set best_node $i

}

}

# ********************************************************

# Check if a better node was actually found. If not send

# an error message to the relay agent

# ********************************************************

if {$best_node == 0} {

debug "no easy route to target"

send neighbor $id "[id -u].relay.[id -a]" \

"Error: no route, stop at node $id"

#**** error forwarding case *****

return 1

} else {

# append a flooding_flag = 0

debug "best node = $best_node"

lappend argList "0"

send neighbor $best_node "[id -u].[id -f].[id -a]" \

$argList

#**** success forwarding case *****

return 0

}

# end of proc sdev_spawn

}

# ************************************************************

# replicate the device code throughout the network

# ************************************************************

replicate neighbor 0

# ***************************************************************

# Finally after the initializations and just before the main

# wait loop we register the device

# ***************************************************************

debug "DEVICE about to register"

device GEO_script

debug "DEVICE registered"

while {1} {

wait packet

# packet format is: target_loc_x target_loc_y radius

# packet_to_be_sent flooding_flag

set target_loc_x [lindex $event_body 0]

set target_loc_y [lindex $event_body 1]

set radius [lindex $event_body 2]

set main_body [lindex $event_body 3]

set flooding_flag [lindex $event_body 4]

set dist_squared [dist $my_loc_x $my_loc_y $target_loc_x $target_loc_y]

if {($radius < 0)||([expr $dist_squared-$radius*$radius] <= 0)} {

# check if flooding flag needs to be flipped from 0 to 1

if {$flooding_flag == "0"} {

set event_body [lreplace $event_body 4 4 "1"]

}

# check if the message has already passed from this node

if {$msg_bank(0) == $event_body} {debug FOUND0; continue}

if {$msg_bank(1) == $event_body} {debug FOUND1; continue}

if {$msg_bank(2) == $event_body} {debug FOUND2; continue}

# the message is new: update the msg bank

set msg_bank($bank_pointer) $event_body

incr bank_pointer

if {$bank_pointer == 3} {set bank_pointer 0}

send neighbor 0 "[id -u].[id -f].[id -a]" $event_body

# here we should check if we have a spawn or send msg

# now only spawn is implemented

spawn neighbor $my_addr $main_body

#try to route only if we haven't reached the flooding phase

} elseif {$flooding_flag == "0"} {

set best_node 0

foreach i $neigh_list {

set neigh_dist_squared [dist $neigh_loc_x($i) \

$neigh_loc_y($i) $target_loc_x $target_loc_y]

if {$neigh_dist_squared < $dist_squared} {

set dist_squared $neigh_dist_squared

set best_node $i

}

}

# *******************************************************

# Check if a better node was actually found. If not send

# an error message to the relay agent

# *********************************************************

if {$best_node == 0} {

debug "no easy route to target"

send neighbor $my_addr "[id -u].relay.[id -a]" \

"Error: no route, stop at node $my_addr"

} else {

debug "best node : $best_node"

send neighbor $best_node "[id -u].[id -f].[id -a]" \

$event_body

}

# end of elseif

}

# end of while {1} loop

}

Listing 10: The script device
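The loop-avoidance logic in Listing 10 keeps the last three flooded messages in a small circular buffer (msg_bank) and drops any message already stored there. The plain-Tcl sketch below (runnable in tclsh) isolates that mechanism. The procedure name seen_before and the sample messages are hypothetical; the array name, the three slots, and the wrap-around of bank_pointer follow the listing.

```tcl
# Three-slot circular buffer of recently seen messages, as in Listing 10
set msg_bank(0) ""
set msg_bank(1) ""
set msg_bank(2) ""
set bank_pointer 0

# seen_before (hypothetical helper): returns 1 if msg is in the bank;
# otherwise stores it in the next slot and returns 0.
proc seen_before {msg} {
    global msg_bank bank_pointer
    foreach slot {0 1 2} {
        if {$msg_bank($slot) eq $msg} { return 1 }
    }
    set msg_bank($bank_pointer) $msg
    incr bank_pointer
    if {$bank_pointer == 3} { set bank_pointer 0 }
    return 0
}

set first  [seen_before "45 37 8 codeA 1"]   ;# new message
set second [seen_before "45 37 8 codeA 1"]   ;# duplicate
seen_before "10 10 2 codeB 1"
seen_before "20 20 2 codeC 1"
seen_before "30 30 2 codeD 1"                ;# overwrites slot 0
set third  [seen_before "45 37 8 codeA 1"]   ;# forgotten by now
```

Here first is 0, second is 1, and third is 0 again. The last result shows the limit of such a small bank: a message is recognized as a duplicate only while fewer than three newer messages have arrived.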

The device script is pre-deployed throughout the network using the following invocation script:

# This is an invoke script.

# just source it in the terminal to start the app.

# Deploys a device-script in the network.

set path /home/boulis/scripts/geo/script-device

load geo_routing $path/geo_routing_flooding_service.txt

load ask_loc $path/ask_loc.txt

load relay $path/relay.txt

carry $relay $ask_loc

spawn neighbor [id -n] $geo_routing

Listing 11: Invoking the script device

Other scripts can then use this new device. Listing 12 shows such a script, using the newly instantiated script device.

#code_id 63 test GEO_script service

parameter target_loc_x target_loc_y radius populated_script

spawn GEO_script $target_loc_x $target_loc_y \

$radius $populated_script

while {1} {

wait packet

debug "Arrived at user: $event_body"

}

Listing 12: A script using the new device/service

5 Complete API and device functionality

SensorWare supports Tcl syntax and the following 41 Tcl commands: append, array, break, case, catch, concat, continue, error, eval, expr, for, foreach, format, global, if, incr, info, join, lappend, lindex, linsert, list, llength, lrange, lreplace, lsearch, lsort, proc, regexp, regsub, rename, return, scan, set, split, string, trace, unset, uplevel, upvar, while.

There are 18 other commands defined by SensorWare that essentially abstract the node's run-time environment. They are:

spawn <nodes_specification> <code>

replicate <nodes_specification>

migrate <nodes_specification>

send <nodes_specification> <userID>.<codeID>.<appID> <message>

query <device_name> [ var_arg... ]

act <device_name> [ var_arg... ]

interest <device_name> <eventID> [ var_arg... ]

dispose <device_name> <eventID>

wait <event_name>...

carry value …

parameter <variable_name>…

device <device_name>

id [-n | -u | -f | -a] (outputs the node ID, user ID, code ID, and app ID info)

random (outputs a random number in [0..1])

and the helper commands:

ps (outputs the active scripts in the current node)

ls (outputs the current device list)

debug value (outputs the value string in the debugging terminal)

load <variable_name> filename (used from a terminal; loads the script code contained in filename into variable_name)

Legend: [ ] indicates optional; < > indicates a variable (either a Tcl variable or a SensorWare variable such as an event_name); the suffix "_list" in variable names indicates that the variable is a list (i.e., zero or more elements). The symbol "var_arg ..." indicates variable arguments. The modifier "..." indicates a list of arguments of the preceding argument type. Nodes_specification is a device name that implements a routing protocol, along with its arguments. Value is a string, after all Tcl substitutions are made in the command.

There are 6 reserved Tcl variable names. These are: parent_node, event_name, event_body, msg_src_node, msg_src_family, msg_src_app. There is one reserved event name: packet.

The following table summarizes the use of the 13 useful devices.

	query	act	interest	dispose	routing (spawn or send)
light	value from virtual sensor
location	node’s x,y coordinates
timer			defines a timer (name & value)	disposes timer
time	absolute time
self					Routes to self
flood					Broadcasts n hops away
*user	“heard” users				routes to user n
neighbor	neighbors list				routes to neighbor n
gpsr					routes to x, y point
voronoi	voronoi area
measure	null operation	null operation	null operation	null operation	null operation
monitor	bytes transmitted by node				routes to x, y point (bytes not counted)
mbox	# of events in event queue

*not operational yet.