Pony capabilities
Pony is a programming language that supports concurrent programming using actors. The language features an intelligent type system that prevents data races through a system called “capabilities”. In this blog post I will attempt to explain them.
Let’s start with the definition of a capability in Pony:
A capability is an unforgeable token that
- designates an object and
- gives the program the authority to perform a specific set of actions on that object.
In other words with a capability you can do anything with the object. You can think of the capability as being the object itself.
So how do capabilities help us with concurrency problems? Well, they don’t, at least not on their own. The thing that makes capabilities useful is limiting how they are used. In Pony, the type system limits how you can use these unforgeable tokens. When working with actor-based concurrency, these limitations allow us to make some useful guarantees.
References
The objects that are created in Pony are stored in a space that is accessible by all actors. An actor can only change an object through a reference to the object’s capability. When an object is created, a reference to the capability of the newly created object is returned.
let a: Object = Object.create()
After executing the above, a holds a reference to the newly created object. In Pony, a reference grants its holder one of the following sets of permissions:
- M: If the reference points to an actor, the reference can be used to send messages to it
- RM: The referenced object can be read, and messages can be sent
- RWM: The referenced object can be read from, written to, and messages can be sent
Aliases
Let’s look at what happens if we clone a reference:
let b: Object = a
Now b references the same object that a was pointing to. We say that a and b are aliases: they reference the same object. Aliases can cause problems during parallel computation. Race conditions can occur when aliases live on different actors. Consider the following example:
Actors A and B are counting the number of ponies on a farm. Both actors work on a different field, and increment a counter each time a pony is found. The example below shows what happens when both actors find a pony at the same moment, and increment the count by looking at what the current value of the counter is, adding one and storing it. The result is wrong.
| actor A | a,b | actor B | |
|---|---|---|---|
| | 0 | | |
| a' = a (0) | 0 | b' = b (0) | Both read 0 as counter value |
| | 1 | b = b' + 1 | B overwrites content with 1 |
| a = a' + 1 | 1 | | A overwrites content with 1 |
| | 1 | | The count is 1 (not 2) |
When aliases are created
Apart from explicit assignments such as b = a, aliases are also created for every argument in a function call, and for this in every method call.
So,
object.f(a, b, c)
creates an alias for object, a, b and c.
Managing aliases
As we will see, Pony’s type system relies strongly on counting the number of aliases of a certain kind. For this reason it will come in handy to be able to move a capability over to another variable, without aliasing it. There are two ways to move a capability.
- Destructive read: In Pony the value of the expression a = b is the old value of a. By doing the following, the number of aliases to the location b points to in the beginning remains unchanged.
a = b = ...
a now refers to what b used to refer to, and b refers to whatever the result of ... is.
- Consuming the variable: This takes the value out of the variable you gave it. The given variable is now empty (think of it as null). The type checker will prevent you from using a consumed variable.
consume b
This is effectively the same as the following destructive read (which is not valid Pony code):
b = null
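Both moves can be sketched in real Pony syntax. This is a minimal illustration, not taken from the original post: the Car class and the variable names are hypothetical, and the constructor is marked iso so that the new objects can be stored in iso variables.

```pony
// Hypothetical class, used only for illustration.
class Car
  let name: String

  new iso create(name': String) =>
    name = name'

actor Main
  new create(env: Env) =>
    var a: Car iso = Car("first")
    var b: Car iso = Car("second")

    // Destructive read: `b = ...` evaluates to the old value of b,
    // which is then moved into a. No alias is created along the way.
    a = b = Car("third")
    env.out.print(a.name) // the car that b used to hold
    env.out.print(b.name) // the newly created car

    // Consume: moves the value out of b. The compiler rejects any
    // later read of b.
    let c: Car iso = consume b
    env.out.print(c.name)
```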
Reference capabilities
Tag
To avoid concurrency problems, we could simply say that it is forbidden to read or write through a reference. This is what the tag reference capability does. You are allowed to make as many aliases of a tag as you like, you can store them and you can compare two tag variables for identity.
Aliases of a tag can safely live on different actors as they cannot be read from or written to. The only thing you can do is send messages to it (if it is an actor). These messages will then be handled sequentially by the actor in the order in which they arrived.
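As a sketch of how this helps with the pony-counting example: if the counter lives inside one actor and everybody else only holds a tag to it, the increments arrive as messages and are handled one at a time. The actor names Counter and FieldWorker and the behaviour names are assumptions made up for this illustration.

```pony
// Hypothetical actors: the count is owned by Counter,
// the workers only ever hold a tag to it.
actor Counter
  let _env: Env
  var _count: U64 = 0

  new create(env: Env) =>
    _env = env

  // Messages are processed one at a time, so two increments
  // can never interleave their read and write steps.
  be found_pony() =>
    _count = _count + 1
    _env.out.print("ponies so far: " + _count.string())

actor FieldWorker
  be search(counter: Counter tag) =>
    // The fields of counter cannot be read or written from here;
    // sending a message is the only thing a tag permits.
    counter.found_pony()

actor Main
  new create(env: Env) =>
    let counter: Counter tag = Counter(env)
    let a = FieldWorker
    let b = FieldWorker
    a.search(counter)
    b.search(counter)
```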
Value
In some cases you really only need to read the referred information. The only way that we can safely read data at the same location from multiple actors is when we know the data is immutable. The main principle is:
If I can read, no other actor can write.
The val
reference capability guarantees something even stronger, that no actor has write permissions. A val
can only be used to read or to send messages, never to write.
There may also exist tag
(or box
) aliases of a val, but that’s fine because they too don’t allow writing.
Isolate
So, now that we know how to read safely, we want to write safely. Mutable data must reside on one actor (thread). There must not be a reference through which data can be read by another actor. This is where the iso reference capability comes in. It stands for “Isolate”. An iso variable must not have any other alias (of any kind) to it, or to an internal part of it. It is isolated, only accessible from the outermost layer and exactly once.
Let’s look at an example.
If you have a Car iso reference that is stored in car, you cannot do the following:
let wheels: Wheels iso = car.wheels // Won't work
Because if you could, there would be two aliases to the memory location of car.wheels
(wheels
and car.wheels
).
The benefit of these strong restrictions is that you know there is exactly one reference to a piece of memory referred to by an iso. There are no aliases. If you give up your local alias you can pass it on to another actor. As you remember, giving up an alias can be done with consume. consume car is of type Car iso^; here the ^ modifier on the type indicates that there is no alias to the value. The reference is ephemeral (short-lived, it has no alias). When there is no alias it is safe to send it to another actor using a give_automobile(car: Car iso) behaviour.
otherActor.give_automobile(consume car)
Now only otherActor has access to the car in memory. The type checker no longer allows the original owner to use car.
This is cool: by using an iso we ensure that there is only one reference to that piece of memory. No aliases are allowed. But an iso is very restrictive. As you recall from the previous section, it is very easy to create aliases. Using an iso as an argument to a function will not work without consuming it. If we don’t plan on sharing a reference, an iso is far too restrictive. This is where the ref reference capability comes in.
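Put together, handing an iso to another actor could look like the sketch below. The Garage actor, the mileage field and the printed text are hypothetical details added for illustration; only the give_automobile behaviour and the consume come from the text above.

```pony
// Hypothetical class with one mutable field.
class Car
  var mileage: U64 = 0

  new iso create() =>
    None

actor Garage
  let _env: Env

  new create(env: Env) =>
    _env = env

  be give_automobile(car: Car iso) =>
    // This actor now holds the only reference, so writing is safe.
    car.mileage = car.mileage + 1
    _env.out.print("mileage: " + car.mileage.string())

actor Main
  new create(env: Env) =>
    let car: Car iso = Car
    let other_actor = Garage(env)
    other_actor.give_automobile(consume car)
    // Any use of car from here on is rejected by the compiler.
```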
Reference
The ref
reference capability is the most permissive read-write capability. It permits as many aliases as you like as long as all the aliases with read and/or write capabilities are on the same actor. There may still be tag
(no read, nor write) aliases on other actors. The principle is:
If I can write, no other actor can read.
A reference with the ref
reference capability is like the variables you are used to from, for example, Java. The only thing you can’t do with it is send it to another actor. That is, a ref
reference capability is not sendable. The sendable reference capabilities are iso
, val
and tag
.
Transition and box
A more flexible variant of the iso
that allows reading and writing is the trn
(transition). This reference capability is designed to create a read-only variable (a val
). As opposed to the iso
a trn
may have read aliases. But these aliases must remain on the same actor. In other words, these aliases must not be sendable. Luckily the type system of pony has a reference capability that is just that: box
.
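The typical life cycle of a trn can be sketched as follows: write through the trn, hand out box aliases while you are still building, then consume the trn to obtain a val. The array contents and variable names are made up for this example.

```pony
actor Main
  new create(env: Env) =>
    // Build the data through a trn: we are the only writer...
    let building: Array[String] trn = recover trn Array[String] end
    building.push("Applejack")
    building.push("Rarity")

    // ...while read-only box aliases may exist on this actor.
    let peek: Array[String] box = building
    env.out.print("ponies so far: " + peek.size().string())

    // Giving up the trn turns the data into an immutable, sendable val.
    let finished: Array[String] val = consume building
    env.out.print("ponies in total: " + finished.size().string())
```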
Summary
The following “deny matrix” summarizes the aliases that are forbidden for each of the reference capabilities. RW means read and write access, W means write access. The upper right corner is empty because it is not possible to deny more on your own actor (local) than on other actors (global). On the diagonal, we find the sendable reference capabilities; they have the same restrictions locally and globally.
| other local aliases \ other global aliases | no RW | no W | all |
|---|---|---|---|
| no RW | iso | | |
| no W | trn | val | |
| all | ref | box | tag |
| | mutable | immutable | opaque |
Syntax
So how do you assign a reference capability to a variable? Well, you add the name of the reference capability to the end of the type name:
let myCar: Car iso = ...
let sharedPicture: Picture val = ...
Here myCar
is an iso
reference to a Car
and sharedPicture
is a val
reference to a Picture
.
Default capabilities
Often the default reference capability of an object can be derived from the “meaning” of the class. Therefore, when you define a class you can specify a default reference capability:
class val Picture
  ...
And then just use:
let sharedPicture: Picture = ...
Capabilities of methods
In a method you can make use of this
, which refers to the current instance of the class. Remember that when a method is called, an alias is created for this. By default, the reference capability for this is box, but you can change it by adding a reference capability to your function definition:
class iso Car
  fun ref doIt() => ...
  fun val doThat() => ...
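A small sketch of what the receiver capability means in practice. The Car class, its mileage field and the method names are hypothetical; a fun written without a capability gets a box receiver.

```pony
// Hypothetical class, used only for illustration.
class Car
  var mileage: U64 = 0

  new create() =>
    None

  // box receiver (no capability written): may only read this.
  fun report(): String =>
    "mileage: " + mileage.string()

  // ref receiver: may modify this.
  fun ref drive(distance: U64) =>
    mileage = mileage + distance

actor Main
  new create(env: Env) =>
    let car: Car ref = Car
    car.drive(42)
    env.out.print(car.report())

    let read_only: Car box = car
    env.out.print(read_only.report())
    // read_only.drive(1) would not compile: drive needs a ref receiver.
```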
Recovering capabilities
Recover: Lifting your capabilities
You might have a ref
and know that there is only one reference to it. You want to turn your ref
into an iso
. The type checker would not let you do this, unless you can prove everything is fine. The recover expression is a mechanism to provide such a proof. From within a recover expression you can access all sendable variables from the enclosing lexical scope (iso, val and tag). You can do complex things inside it and get out an iso if your expression evaluates to any mutable reference capability.
let thing: Thing iso = recover
  ... complex stuff ...
  aRefThing
end
If your expression evaluates to an immutable, you can get out a val
. And tags stay tags.
So you can recover an
- iso from {iso, trn, ref} (preserve mutability)
- val from {val, box} (preserve immutability)
- tag from {tag} (preserve opacity)
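As a concrete sketch of the first case, the block below builds an array through an ordinary ref and gets an iso out of the recover expression. The variable names and contents are chosen for illustration only.

```pony
actor Main
  new create(env: Env) =>
    let numbers: Array[U64] iso = recover
      // Inside the block we can work freely with a ref, because
      // nothing created here can be aliased from the outside.
      let scratch: Array[U64] ref = Array[U64]
      scratch.push(1)
      scratch.push(2)
      scratch
    end

    env.out.print("recovered " + numbers.size().string() + " numbers")
```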
Receiver recovery
When you call a method, there is an alias to this
. As a consequence, we should not be able to call non-tag
functions on an iso
. Unless we use what we just learned and write the following:
var obj: Object iso = Object
var returnValue: String val = ""
obj = recover
  let refObj: Object ref = consume obj
  returnValue = refObj.toString()
  refObj
end
We consume our iso
and make it into a ref
, do what we want to do and recover our original object with the iso
reference capability. Luckily we don’t need to write this ourselves, as Pony does it for us in a process called automatic receiver recovery.
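A minimal sketch of automatic receiver recovery at work, assuming a hypothetical Car class with a ref method. To my understanding the compiler accepts the call because the argument is sendable and the result is not an alias into the object.

```pony
// Hypothetical class, used only for illustration.
class Car
  var mileage: U64 = 0

  new iso create() =>
    None

  fun ref drive(distance: U64) =>
    mileage = mileage + distance

actor Main
  new create(env: Env) =>
    let car: Car iso = Car
    // drive needs a ref receiver, yet this compiles: the compiler
    // recovers the receiver for us instead of aliasing it as a tag.
    car.drive(42)
    env.out.print(car.mileage.string())
```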
Subtyping
There is a hierarchy to reference capabilities. If a function expects a ref
you can give it an iso
or a trn
(if you consume it); after all, you have write permissions. A box tolerates writeable aliases on the same actor, a val guarantees there are none, therefore a function that expects a box will happily accept a val.
Apart from using generic functions, you can also use subtyping to just throw away your capabilities:
let myRefCar: Car ref = consume myIsoCar
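The sketch below shows both directions of this hierarchy: a function that asks for a box accepts a val as well as a ref, and a consumed iso can be stored as a ref. The Car class and the Inspector primitive are hypothetical names introduced for this example.

```pony
// Hypothetical class, used only for illustration.
class Car
  var mileage: U64 = 0

  new iso create() =>
    None

primitive Inspector
  // Only needs to read, so it asks for a box.
  fun mileage_of(car: Car box): U64 =>
    car.mileage

actor Main
  new create(env: Env) =>
    // A freshly created (ephemeral) iso can be stored as a val.
    let shared: Car val = Car
    env.out.print(Inspector.mileage_of(shared).string())

    // Throwing away isolation: a consumed iso becomes a ref.
    let my_iso_car: Car iso = Car
    let my_ref_car: Car ref = consume my_iso_car
    my_ref_car.mileage = 7
    env.out.print(Inspector.mileage_of(my_ref_car).string())
```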
Receiver reference capabilities
Reference capabilities are checked when you are trying to access the value or the fields of a variable. If you have an iso
variable you can read any of its iso
or val
fields. All other types of fields are read as a tag
.
At first, that seems odd, you might be wondering “If I have an iso
why can’t I read its ref
field?” The reason is that an iso must maintain the property that it is isolated, and thus that there is no other alias that can read or write to that memory. This includes its fields. If you were able to make an alias to one of the ref fields of an iso variable, you could still read from and write to the internals of the iso through the alias of this ref field even if you passed the iso to another actor. The same holds for trn and box fields. val fields are fine because they are immutable, and they are always safe to read.
The following table summarizes the restrictions. The row indicates your capability on an object, the column specifies the capability the object itself has on the field you are trying to access. Because you can’t read fields from a tag, that row only contains “n/a”.
▷ | iso field | trn field | ref field | val field | box field | tag field |
---|---|---|---|---|---|---|
iso origin | iso | tag | tag | val | tag | tag |
trn origin | iso | trn | box | val | box | tag |
ref origin | iso | trn | ref | val | box | tag |
val origin | val | val | val | val | val | tag |
box origin | tag | box | box | val | box | tag |
tag origin | n/a | n/a | n/a | n/a | n/a | n/a |
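The iso row of the table can be tried out with a sketch like the following. The Car and Wheels classes, their fields and the name "Herbie" are hypothetical.

```pony
// Hypothetical classes, used only for illustration.
class Wheels
  var pressure: U64 = 30

  new create() =>
    None

class Car
  let name: String val = "Herbie"
  let wheels: Wheels ref = Wheels

  new iso create() =>
    None

actor Main
  new create(env: Env) =>
    let car: Car iso = Car

    // iso origin, val field: read as val.
    let name: String val = car.name
    env.out.print(name)

    // iso origin, ref field: only a tag comes out, so no readable or
    // writeable alias to the inside of the isolated car can escape.
    let wheels: Wheels tag = car.wheels

    // let leaked: Wheels ref = car.wheels // would not compile
```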
When you are calling a method on an object, the restrictions from the call-site still need to hold. You can’t call the method setRefField(...)
on an iso
variable. For this reason functions are annotated with a receiver reference capability. You can only call a method that is compatible with your capabilities on the object. The default receiver reference capability of a method is box
.
Refcap recap
Reference capabilities guard the number of references there are to a certain piece of memory.
A variable is a pointer to an object. Or, to be more precise, a variable references a capability. When creating a variable, you need to assign a reference capability to it: one of iso, trn, val, ref, box or tag. The reference capability you choose is your promise to the compiler that states how you will use this variable. The Pony compiler will strictly hold you to your promise.
In the following overview R means read rights, RW means read and write rights.
iso = Isolate (RW, no aliases, can be passed): If you have an iso variable, this means that you are certain that there is no other alias to that piece of memory with read (or write) access. It is safe to read and write to this variable. You can pass an iso to another actor if you also pass the ownership by using consume.
trn = Transition (RW, may have R aliases, cannot be passed): A trn variable is designed to create a read-only variable. Having one allows you to make edits to its contents and allows you to create read-only variants of it (boxes). These read-only variants can’t be passed to other actors but may come in handy when constructing your read-only data, for example when you are creating a val data structure with cyclic references.
box = Box (R, may have R and RW aliases, cannot be passed): This box should be thought of as transparent with a slot for messages. You can read its (internal public) data but you cannot alter its state through function calls. You can still send messages to it if it is an actor.
ref = Reference (RW, may have RW aliases, cannot be passed): A ref variable is your default reference capability. It states that you can modify the data the variable is pointing to and that there may be other aliases with the same RW capability.
val = Value (R, can have R aliases, can be passed): A val reference to data implies that all references to that data are read-only. You can safely use the data without worrying about concurrency problems.
tag = Tag (only allows sending of messages, can be passed): A tag variable references a place in memory that has no guarantees. The only thing you can do is send a message to it. Since a tag does not allow changing the internals directly, it can be passed without problem.