Home Automation Protocols

I’ve been using the xPL Protocol since 2005; before that I was using Jabber (or XMPP, as it is now known). As I’m thinking about changing protocol, I thought I’d write a bit about xPL and, in later posts, something about the candidates to replace it.

The xPL Protocol has three types of messages: commands (xpl-cmnd), triggers (xpl-trig) and status (xpl-stat). They have the same simple format consisting of the message type, common “header” fields, the schema type and schema-specific “body” fields. For example:

xpl-cmnd
{
hop=1
source=vendor-device.host
target=*
}
x10.basic
{
command=on
device=c3
}
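
To show how little is needed to read this format, here is a minimal parsing sketch in Python. It is only an illustration, not taken from any real xPL library: the function name parse_xpl is made up, and it assumes one message per string, one name=value pair per line and unique field names within a block.

```python
def parse_xpl(text):
    """Parse one xPL message into (msg_type, header, schema, body).

    Hypothetical helper for illustration; it does only minimal validation.
    """
    lines = [ln.strip() for ln in text.strip().splitlines() if ln.strip()]
    msg_type = lines[0]
    if msg_type not in ("xpl-cmnd", "xpl-trig", "xpl-stat"):
        raise ValueError("unknown message type: " + msg_type)

    def read_block(start):
        # A block is "{", then name=value lines, then "}".
        if start >= len(lines) or lines[start] != "{":
            raise ValueError("expected '{'")
        fields = {}
        i = start + 1
        while i < len(lines) and lines[i] != "}":
            name, _, value = lines[i].partition("=")
            fields[name] = value  # a repeated name would overwrite the earlier value
            i += 1
        if i >= len(lines):
            raise ValueError("missing '}'")
        return fields, i + 1

    header, pos = read_block(1)
    if pos >= len(lines):
        raise ValueError("missing schema line")
    schema = lines[pos]
    body, _ = read_block(pos + 1)
    return msg_type, header, schema, body


MESSAGE = """xpl-cmnd
{
hop=1
source=vendor-device.host
target=*
}
x10.basic
{
command=on
device=c3
}
"""

msg_type, header, schema, body = parse_xpl(MESSAGE)
```

All values are kept as strings here; a real client would convert fields such as hop to integers where it needs to.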

xPL doesn’t use a server; its “message bus” is simply UDP port 3865 on the local network broadcast address. To avoid multiple clients on one host having to listen on the same port on the same broadcast address, each host runs a hub that distributes messages to all local xPL clients. I think this is a poor solution to the problem, since it introduces an unnecessary single point of failure (albeit one whose behaviour is very simple to test). To avoid this single point of failure, my xPL clients also support a hubless mode where they simply set the SO_REUSEADDR socket option on the listen socket so that all clients can coexist on the same port on the same broadcast address. However, if you need to interoperate with clients from other developers then you’ll need to use the standard hub model.
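
The hubless idea can be sketched in a few lines of Python. This is an illustration rather than the code from my clients: the function name open_hubless_listener is made up, and it assumes the operating system lets several UDP sockets share a port once SO_REUSEADDR is set. That behaviour differs between operating systems, and some may need SO_REUSEPORT instead.

```python
import socket

XPL_PORT = 3865  # the xPL port mentioned above


def open_hubless_listener(port=XPL_PORT):
    """Open a UDP socket that can share its port with other local clients.

    Hypothetical helper; it assumes SO_REUSEADDR is enough on this platform.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # Must be set before bind() so that other clients can bind the same port.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # An empty host means "all local addresses", so broadcasts are received.
    sock.bind(("", port))
    return sock
```

A client would call open_hubless_listener() once at start-up and then loop on sock.recvfrom(1500), handing each datagram to its message parser.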

Using UDP broadcasts to avoid any (unnecessary) single points of failure is a good idea. However, I have more than 20 xPL clients on my network and all clients process all messages they receive, discarding most of them and responding to relatively few. This means that all of the clients have to wake up and consume CPU resource for every single message. (This is one of the issues that is prompting me to re-evaluate my choice of protocol.)
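
The check each client runs on every message looks roughly like the sketch below. It is deliberately simplified: the function name is_for_me and the client names my-client.host and other-client.host are made up, and it ignores group targets and the schema/body filtering that a real client would also do.

```python
def is_for_me(header, my_id):
    """Return True if a message with this header is addressed to this client."""
    target = header["target"]
    return target == "*" or target == my_id


# Three received headers; every client on the network sees all of them.
headers = [
    {"hop": "1", "source": "vendor-device.host", "target": "*"},
    {"hop": "1", "source": "vendor-device.host", "target": "other-client.host"},
    {"hop": "1", "source": "vendor-device.host", "target": "my-client.host"},
]

# The client has already woken up and parsed each message before it can discard it.
accepted = [h for h in headers if is_for_me(h, "my-client.host")]
```

The test itself is cheap, but it only happens after the process has been scheduled, read the datagram and parsed it, which is where the wasted CPU time goes.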

I thought about avoiding this problem by combining several clients into one more complex client, but this seems like a bad idea as it will inevitably lead to more complexity and thus more complicated failure cases. (My xPL clients are structured in such a way that this is very easy to do and some users with lower-powered servers do this.)

For me, the xPL Message Schema are what make the protocol work so well. Essentially, the schema define the fields to expect in the body of an xPL message. This means I can write a client for a UDIN USB Relay Controller or a Phaedrus VIOM that supports the control.basic schema and someone else can write a client that uses the same schema to control relays and it will happily interoperate with my clients. (In fact, I realised this advantage when I migrated from using a VIOM to several more compact UDIN controllers. I only needed to change the names of the relays for open/close on my blinds/curtains after each set of wires was migrated to the new hardware.)

The schema, as described in the protocol documentation, support a type of inheritance. A message schema that inherits from another must include all of the mandatory fields of the other, in the same order, so clients can support the basic function even if they don’t understand the additional sub-schema fields. However, since optional vendor-specific fields are also allowed without inheritance (giving a simple duck typing approach), the inheritance mechanism doesn’t really give much benefit.

Take hbeat messages, for example. There are two types, basic:

xpl-stat
{
hop=1
source=vendor-device.host
target=*
}
hbeat.basic
{
interval=5
}

and, inherited from basic, app:

xpl-stat
{
hop=1
source=vendor-device.host
target=*
}
hbeat.app
{
interval=5
port=43438
remote-ip=198.51.100.2
}

Why do the messages need the .basic or .app suffix? So that a client knows to expect the port and remote-ip fields? Why does a client need to know to expect them? A client can just parse the message and, if the fields are present - i.e. no } following the interval=5 line - treat the message like an .app message, and like a .basic message if not. The suffix is simply redundant.
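
The duck typing approach can be sketched as follows. This is an illustration only: the function name classify_hbeat is made up, and it assumes the body has already been parsed into a dictionary of strings.

```python
def classify_hbeat(body):
    """Classify a heartbeat body by the fields it has, not by the schema suffix.

    Returns (kind, interval, endpoint); endpoint is None for a basic heartbeat.
    """
    interval = int(body["interval"])
    if "port" in body and "remote-ip" in body:
        # The extra fields are present, so treat it like an hbeat.app message.
        return "app", interval, (body["remote-ip"], int(body["port"]))
    # Otherwise fall back to the hbeat.basic behaviour.
    return "basic", interval, None


basic = classify_hbeat({"interval": "5"})
app = classify_hbeat({"interval": "5", "port": "43438", "remote-ip": "198.51.100.2"})
```

Nothing in this function looks at whether the schema line said hbeat.basic or hbeat.app, which is the point.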

The only obviously valid use of a suffix is the .confirm suffix, but this would probably be better implemented as a fourth message type such as xpl-ack or xpl-conf.

The wildcard target=* line is also redundant since a lack of explicit target=vendor-device.host could just as easily be used to distinguish the wildcard case.

These superfluous elements mean that the protocol isn’t really as “lite” as the “Lite on the wire, by design” subtitle of the xPL Protocol specification suggests, or as it could be (and IMHO should be).

Another confusing aspect of the xPL Protocol is the notion of a device. In the protocol spec, for instance in the Device Groups section, it is clear that the target field is addressing an individual device - a controller for a single curtain/drape. This is fine in theory, but in practice, individual devices are generally addressed using fields in the body. For instance, control.basic, remote.basic, sensor.basic, x10.basic, and x10.security use device, homeasy.basic uses the combination of address and unit, zwave.basic uses node, etc. This is because xPL clients are typically controllers/gateways for several devices. For instance, I have a 1-wire sensor/relay network with dozens of devices attached to it but one xPL client opens the 1-wire USB interface and manages all of these devices based on the device field in control.basic and sensor.basic messages.
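
To make the gateway pattern concrete, here is a small sketch of one client managing several devices by the device field in the body, using the x10.basic fields from the example at the top of this post. The class name X10Gateway and the in-memory state table are made up for illustration, and the sketch assumes only on and off commands; a real client would talk to hardware and handle more.

```python
class X10Gateway:
    """One xPL client fronting several devices, selected by the body's device field."""

    def __init__(self, devices):
        # Hypothetical state table: device name -> last command applied.
        self.state = {name: "off" for name in devices}

    def handle(self, schema, body):
        """Apply an x10.basic command; return True if a device was updated."""
        if schema != "x10.basic":
            return False
        device = body.get("device")
        command = body.get("command")
        if device not in self.state or command not in ("on", "off"):
            return False  # not one of ours, or a command this sketch ignores
        self.state[device] = command
        return True


gateway = X10Gateway(["c3", "c4"])
handled = gateway.handle("x10.basic", {"command": "on", "device": "c3"})
```

Note that the header target plays no part in choosing the device here; the whole gateway is the target and the body does the real addressing.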

The protocol would be simpler to use if the addressing of devices were more consistent. In fact, devices (rather than clients) should be first-class citizens on the “network”. That is, they should be discoverable (like clients are today with hbeat requests) and probably should be aliasable (so that they can be given memorable names).

I like the xPL Protocol; it has been my protocol of choice for more than 5 years and most of these issues are not a big deal. The broadcast message bus is simple (if you ignore hubs) and reliable but the inefficiency of waking all clients on every message is starting to become an issue for me. So, in the next few posts, I’ll write about some of the alternatives I am considering.