BEEP: The Definitive Guide

ISBN: 0596002440
Author(s): Marshall T. Rose
March 2002

BEEP gives network developers what they’ve long needed: a standard toolkit for building protocols quickly and conveniently. Written by BEEP’s creator, this book demonstrates how to use the BEEP implementation in Java, C, and Tcl. You’ll learn to build several working applications that use BEEP as a transport, including an implementation of the reliable SYSLOG protocol and an implementation of a BEEP transport for SOAP.


Copyright O'Reilly & Associates, Inc.. Used with permission.

Chapter 3


This chapter talks about how BEEP sessions get lawyered-up. The term "lawyered-up" is just an expression that some of us use for adding security to a plain vanilla BEEP session. When a BEEP session starts out, you get whatever security properties are provided by the underlying transport service. In most cases, this means that your traffic is unencrypted and unauthenticated. Now, maybe that's okay for your environment, and if that's the case, make sure that you're dosing properly with your meds. If not, then BEEP gives you a way of fixing that. It's called "tuning," which is the official term for the process of giving a newly-created BEEP session the security properties you want, as shown in Figure 3-1.

Figure 3-1. The tuning precept


In BEEP, sessions are tuned for two things:

  • Transport privacy

  • User authentication

Sometimes you can accomplish both of these simultaneously; in other cases, you have to take care of privacy before authentication. It's all a function of the security technologies you have available. Don't worry, we'll explain the details later on. The one thing you must understand is that BEEP's view of security is entirely protocol-centric--you're still responsible for what happens to the data before and after it gets sent. (In other words, tuning doesn't help with sloppy coding such as buffer overflows.)

Before we talk about the details of tuning, we have to talk a little bit about two related topics:

  • How BEEP peers greet each other at the start of a session

  • How channels are managed

The Greeting

In BEEP, as soon as a session is started, both peers send a greeting, as shown in Figure 3-2.

Figure 3-2. The greeting precept


The purpose of the greeting is three-fold. It allows each peer to:

  • Advertise the profiles it supports

  • Specify the preferred languages for diagnostics

  • Indicate which optional features it supports

We're not going to talk about BEEP's optional features (there haven't been any standardized); instead, let's look at the other two in turn.

Supported Profiles

If you recall from Chapter 2, the way a channel gets created is:

  • One peer makes a request with one or more possible profiles.

  • The other peer responds by saying which profile it's going to use for the channel.

So, one of the things that a greeting is good for is to let the other peer know what the possible choices are, right?

It turns out that, depending on the kind of peer you're talking to, the greeting you see is going to have a different kind of "feel." If you're using a sophisticated API for BEEP, you probably don't need to appreciate the different feel for the greeting, but it's still worthwhile to explain it.

Let's start with the simplest case. If you're looking at a traditional client/server scenario, then--independent of what's in the server's greeting--the client's greeting is probably going to look like this:

<greeting />

It doesn't get any simpler than this--the greeting is empty.

When you see this, it's telling you not to bother trying to start any profiles--the peer who sent the empty greeting will be the one who decides what gets started and when. If you're just a server, this is just the way you want it. However, as a server, what's your greeting going to look like?

The first choice is whether you require that the session be tuned prior to doing any real work. If so, then you send a greeting that contains only tuning profiles, such as:

<greeting>
    <profile uri='' />
</greeting>

Think of this as saying "I'm not saying anything more until I lawyer-up."

If you've been paying attention, you should have a very important question at this point: if only tuning profiles are in the greeting, how does any real work get done? The answer is that after a tuning profile makes a session "secure" (i.e., it starts encrypting traffic so third parties can't see what's going on), both peers send another greeting. We'll explain why later, in the section "The TLS Profile," but for now, just keep in mind that tuning a session may result in another greeting.

Of course, the third kind of greeting you might see has both types of profiles:

<greeting>
    <profile uri='' />
    <profile uri='' />
</greeting>

This leaves the choice of tuning up to the client.

Hopefully, your API will handle all of these details for you, letting you specify the tuning policy you want (e.g., "privacy first"), and then transparently handling the greeting mechanics for you. If not, you hopefully now understand the way it works.
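
If your API doesn't hide the greeting from you, telling the kinds of greeting apart is mostly a matter of sorting the advertised profiles into two piles. Here's a minimal sketch in Python. The profile URIs are hypothetical placeholders (substitute whatever URIs your peer really advertises), and the "exchange-only" result is an extra case added for completeness rather than one discussed above.

```python
import xml.etree.ElementTree as ET

# Hypothetical placeholder URIs, for illustration only; replace them with
# the URIs of the tuning profiles your peers actually advertise.
TUNING_PROFILES = {
    "http://example.org/profiles/tuning-privacy",
    "http://example.org/profiles/tuning-auth",
}


def classify_greeting(xml_text):
    """Return 'empty', 'tuning-only', 'exchange-only', or 'mixed'."""
    greeting = ET.fromstring(xml_text)
    uris = [profile.get("uri") for profile in greeting.findall("profile")]
    if not uris:
        return "empty"          # the peer will decide what gets started
    tuning = [uri for uri in uris if uri in TUNING_PROFILES]
    if len(tuning) == len(uris):
        return "tuning-only"    # "I'm not saying anything more until I lawyer-up"
    if not tuning:
        return "exchange-only"  # not covered in the text; included for completeness
    return "mixed"              # the choice of tuning is left to the other peer
```

An empty greeting, for example, is classified with classify_greeting("<greeting />").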

Localization (L10N)

Sending diagnostic information in English isn't universally helpful. In the golden age of application protocol design, error messages contained two parts:

  • A three-digit reply code

  • A textual diagnostic

History has shown the combination of a machine-readable reply code with human-readable text to be a good choice. (See the section "Reporting" in the Appendix.) A reply code consists of three digits:

Completion (the first digit)
Explains whether the request succeeded, failed, or didn't complete, and is one of:

Positive preliminary (1)
The request is ready to be performed, pending further confirmation or rejection.

Positive completion (2)
The request has succeeded.

Positive intermediate (3)
The request is ready to be performed, pending further data.

Transient negative (4)
The request wasn't performed, but if retried later, it may very well succeed.

Permanent negative (5)
The request wasn't performed, and some explicit action must be taken before it could ever succeed.

Category (the second digit)
Explains why the request succeeded, failed, or didn't complete, and is one of:

Syntax (0)
The reply deals with syntax issues, such as errors in syntax, or unrecognized commands.

Informational (1)
The reply contains useful information.

Connection (2)
The reply deals with the session or transport connection.

Security (3)
The reply deals with the security subsystem.

Application-specific (5)
The reply deals with the application itself, e.g., something specific to the BEEP profile that generated the reply code.

Instance (the third digit)
Distinguishes between different situations having the same completion and category values.

Although the application protocol designer is responsible for indicating what reply code gets used in each situation, most programs need to be able to make decisions based on the first digit only.
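
To make the three digits concrete, here's a small Python sketch that splits a reply code into its parts using the tables above. The function names (decode_reply_code, should_retry) are made up for this illustration and don't come from any BEEP API.

```python
# Meanings of the first digit (completion), as listed above.
COMPLETION = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}

# Meanings of the second digit (category), as listed above.
CATEGORY = {
    "0": "syntax",
    "1": "informational",
    "2": "connection",
    "3": "security",
    "5": "application-specific",
}


def decode_reply_code(code):
    """Split a three-digit reply code into completion, category, and instance."""
    code = str(code)
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("a reply code is exactly three digits: %r" % code)
    return {
        "completion": COMPLETION.get(code[0], "unknown"),
        "category": CATEGORY.get(code[1], "unknown"),
        "instance": int(code[2]),
    }


def should_retry(code):
    """Decide on the first digit only: a transient negative may succeed later."""
    return str(code)[0] == "4"
```

Note that should_retry looks at nothing but the first digit, which is all most programs need.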

BEEP uses the code/diagnostic pair whenever it needs to convey an error. For example, in Chapter 2 when we talked about creating a channel, you might recall that the server peer either replies with the identity of the profile that is going to be used on the channel, or it refuses and signals an error.

Here's an example:

<error code='500'>none of the profiles are supported</error>

However, there is this little matter of picking the natural language to use for the text. Historically, the choice has been English (or rather, "geeklish"). More recently, it has been growing more common to allow each peer to advertise its preferences, e.g.:

<greeting localize='en-US fr-CA'>

which asks for the U.S. variant of English, and, if that's not possible, Canadian French.

The only real question is where "language tags" such as en-US come from. The answer is that BEEP refers the reader to RFC 3066, which in turn refers the reader to ISO standards 639 and 3166. In practice, the rules are pretty simple:

  • Start with the two-letter abbreviation for the language from part one of ISO 639.

  • Append a hyphen and the two-letter abbreviation for the country from ISO 3166.

There's actually a lot more flexibility than that, and if you use it, I have every confidence that you'll get exactly what you deserve.
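
Honoring the other peer's preference is just a first-match search over the localize attribute. The following Python sketch assumes the attribute is a whitespace-separated list in preference order, as in the example above, and compares tags without regard to case. The function name and the fallback parameter are assumptions made for this illustration.

```python
def pick_language(localize, supported, fallback=None):
    """Return the first tag in the peer's preference list that we support.

    localize  -- value of the greeting's localize attribute, e.g. 'en-US fr-CA'
    supported -- language tags this peer can produce diagnostics in
    fallback  -- what to return when nothing matches (an assumption of this sketch)
    """
    # Language tags are compared case-insensitively, so index by lowercase.
    by_lower = {tag.lower(): tag for tag in supported}
    for wanted in localize.split():
        match = by_lower.get(wanted.lower())
        if match is not None:
            return match
    return fallback
```

With the greeting shown earlier, a peer that can only produce fr-CA and de-DE diagnostics would settle on fr-CA.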

Channel Management

In BEEP, as soon as a session is started, both peers send a greeting. But how can a greeting be sent if there aren't any channels to send it on?

The answer is that a newly-created BEEP session always comes with one channel, channel zero, already created. Channel zero's sole role is channel management, which means three things:

  • Creating new channels

  • Destroying existing channels

  • Releasing the entire session

These are shown in Figure 3-3; let's look at each in turn.

Figure 3-3. The channel zero precept


Channel Creation

Earlier, in the section , we talked about BEEP's "suggest many, accept one" philosophy and how this was used, among other things, for channel creation.

To recap, after they exchange greetings, when one peer wants to open a channel it might suggest:

<start number='1'>
    <profile uri='' />
    <profile uri='' />
</start>

and, if the other peer decides to start channel number 1, it will indicate which of these two profiles it selected.

It turns out that there were two nuances that we left out earlier:

  • Piggybacking initial data

  • Requesting a "virtual host"

The piggyback

BEEP provides a latency-reduction mechanism that lets you create a channel and perform its first exchange at the same time.

The basic idea is to remove one round-trip time from the process. Instead of having to wait a round-trip to find out if the channel creation is successful before performing the first exchange, the exchange gets "piggybacked" on the messages that perform the channel creation.

Here's how it works: when a channel is started, both peers can include a string of octets intended for the channel. When you try to start a channel, you can include your first message; if the channel is created, your peer processes the message and includes the corresponding reply when you're told that the channel is successfully created.
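
As a rough illustration, the Python sketch below builds a start request whose profile element carries the piggybacked octets as its content. That placement is an assumption of this sketch (check the specification or your API's documentation for the exact encoding), and the function name and profile URI are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_start(channel_number, profile_uri, piggyback=None):
    """Build a <start> request, optionally carrying a piggybacked first message.

    Assumption of this sketch: the piggybacked octets travel as the content
    of the <profile> element.
    """
    start = ET.Element("start", {"number": str(channel_number)})
    profile = ET.SubElement(start, "profile", {"uri": profile_uri})
    if piggyback is not None:
        # ElementTree escapes markup characters when it serializes the text.
        profile.text = piggyback
    return ET.tostring(start, encoding="unicode")


# Hypothetical profile URI and first message, for illustration only.
request = build_start(1, "http://example.org/profiles/demo", piggyback="<ready />")
```

The saving is one round trip: the first message rides along with the channel-creation request instead of waiting for its answer.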

Virtual hosting

It's fairly common in today's Internet for a physical server to be known by several logical host names. In HTTP 1.1, the client signals this by including the Host: header in its request.

In BEEP, this is done using the serverName attribute for the first successful channel creation, e.g.:

<start number='1' serverName=''>
    <profile uri='' />
</start>

If the channel isn't created, then a different serverName value may be used on the next request. Once a channel with a serverName is created, any serverName attributes used to create future channels are ignored.
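
The "first success" rule is easy to get wrong, so here's a small Python sketch of the bookkeeping a peer might do. The class name and the host names are hypothetical, and the sketch follows the wording above literally: the name is fixed by the first channel that is successfully created with a serverName.

```python
class ServerNameTracker:
    """Remember the serverName fixed by the first successful channel creation."""

    def __init__(self):
        self.server_name = None

    def on_start(self, requested_name, created):
        """Record one channel-creation attempt; return the name now in effect."""
        if self.server_name is None and created and requested_name:
            # The first channel created with a serverName fixes it for the session.
            self.server_name = requested_name
        # A failed attempt fixes nothing; later serverName values are ignored.
        return self.server_name


tracker = ServerNameTracker()
tracker.on_start("a.example.com", created=False)  # refused: nothing is fixed yet
tracker.on_start("b.example.com", created=True)   # first success: name is fixed
tracker.on_start("c.example.com", created=True)   # ignored from here on
```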

The use of the serverName attribute is particularly important in tuning, not only because of the "first success" rule, but because the peer you're talking to may have different certificate and authorization databases for each of its virtual hosts. How do you know what value to use?

The answer depends on context. If your program is dereferencing a URL that maps onto a service that uses BEEP, the answer is self-evident (e.g., soap.beep:// If your program isn't URL-driven, but you started with a fully qualified domain name, just use that. If not, then--in the absence of some other information--don't include a serverName attribute at all.

Channel Destruction

After you create and presumably use a channel, BEEP lets you close it.

Most BEEP usage is of the form:

  1. Start a session by establishing the underlying transport connection.

  2. Perhaps tune the session (using one channel).

  3. Create and use one, or maybe two, channels for exchange.

  4. Release the session (which implicitly closes all channels).

This makes it hard to understand why anyone would bother closing channels explicitly.

The reasoning is rather subtle--in some usage scenarios, you may have very long-lived sessions where you want to close a channel prior to a period of inactivity. By doing so, you free whatever application-specific resources are being used by that channel. Of course, only certain kinds of applications need this kind of behavior; for those that don't, simply releasing the session does the trick. (In other words, this is an example of BEEP letting you decide exactly what you want to get and pay for.)

Session Release

You release the session by explicitly closing channel zero.

This brings up the one fun part about closing a channel: it involves a round-trip negotiation. What this means is that if one peer is still busy working on something, it can come back and say "no." Of course, the peer that wants the session to go away now can always just drop the underlying transport connection. In this way, BEEP gives you the tools you need to avoid any ambiguity as to whether both sides are ready to close, but in an emergency, you can just blow the bolts.

Now that we've talked about greetings and channel management, we can get to the actual tuning. Let's first talk about transport security and user authentication.

The TLS Profile

TLS is the IETF's standardized successor to version 3 of SSL. For our purposes, Transport Layer Security (TLS, RFC 2246) provides:

  • Certificate-based authentication of one or both peers

  • Cryptographic protection against passive eavesdropping

  • Cryptographic detection of alteration, duplication, and reordering of traffic

The way TLS does this is outside the scope of this book. If you really want to know how it all works, get a copy of Eric Rescorla's seminal reference SSL and TLS: Designing and Building Secure Systems.

However, the key thing to understand about TLS is that the cryptographic certificates and algorithms that it uses are both configurable. Security people delight in unseemly and incomprehensible fights as to what kind of algorithms and key lengths should be used; as a BEEP person, you just don't care--look at the documentation for the API for BEEP that you're using, and it should tell you how to find out what's available in the TLS tuning profile it uses.

To use TLS with BEEP, you start a channel with the profile identified as Once you've started the channel, the TLS negotiation process begins when you send a message to the other peer.

Recall from an earlier example that you can use the serverName attribute to signal the other peer as to the credentials you're looking for:

<start number='1' serverName=''>
    <profile uri='' />
</start>

The only tricky thing to understand about using the TLS profile (or any tuning profile that does transport security) is what happens immediately before and after the underlying negotiation process.

  • Immediately before: channel zero is reset and all other channels are closed.

  • Immediately after: both peers send a greeting, regardless of whether the negotiation was successfully completed or not.

This is called a "tuning reset."

There are two reasons why BEEP has the concept of a tuning reset: the first is for practicality; the second is for correctness.

First, using a transport security profile inserts a new layer immediately between BEEP and the underlying transport service. You don't want any other BEEP messages unexpectedly showing up; it would be a nightmare trying to straighten it all out. So, just before the TLS engine is invoked to do its voodoo, all channels are closed.

Second, until the session is made tamper-evident, it's possible for someone to alter BEEP's messages in transit. When a tuning reset occurs, both peers reset all state from the session; this means that the first thing that both sides do is send a new greeting.
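
Seen from one peer, a tuning reset amounts to throwing away the session state and waiting for fresh greetings. The Python sketch below models just that bookkeeping. The Session class and its method names are invented for this illustration and are not taken from any BEEP library.

```python
class Session:
    """Toy model of the session state that a tuning reset discards."""

    def __init__(self):
        self.channels = {0: "channel management"}
        self.greeted = False   # have greetings been exchanged since the last reset?
        self.privacy = False

    def exchange_greetings(self):
        self.greeted = True

    def start_channel(self, number, profile):
        if not self.greeted:
            raise RuntimeError("greetings must be exchanged first")
        if number in self.channels:
            raise ValueError("channel %d already exists" % number)
        self.channels[number] = profile

    def tuning_reset(self, negotiation_succeeded):
        # Every channel is closed and channel zero starts over.
        self.channels = {0: "channel management"}
        # Both peers must greet again, whether or not the negotiation worked.
        self.greeted = False
        self.privacy = self.privacy or negotiation_succeeded


session = Session()
session.exchange_greetings()
session.start_channel(1, "TLS")
session.tuning_reset(negotiation_succeeded=True)
```

After the reset, only channel zero remains, and nothing else can be started until greetings are exchanged again.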

The SASL Family of Profiles

SASL is the best thing to happen to application protocols since the reply code.

Unlike TLS, the Simple Authentication and Security Layer (SASL, RFC 2222) isn't a protocol. Instead, SASL is a framework like BEEP. SASL's goal is to provide a set of rules that allow application protocols to support multiple security mechanisms.

Earlier, back in , we saw why SASL came about. Basically, an application's security requirements may be different, depending on where it's provisioned, and may change over time, even in the same environment. Further, security technologies have different price-points for strength, scalability, and ease of deployment.

The practical upshot of this is that we need a flexible way to accommodate different security technologies. SASL defines a set of rules for how security technologies have their data carried by an application protocol. If you're a security engineer, and you follow SASL's rules, your technology is called a SASL mechanism and it plugs into any SASL-capable application protocol. This is the genius of SASL: it defines one generic hook that accommodates a wide range of different mechanisms.

At a minimum, each SASL mechanism provides user authentication. Of course, the "strength" of that authentication is dependent on the algorithms used by the mechanism. Most SASL mechanisms allow you to convey two identities:

  • An authentication identity, which tells who you are

  • An authorization identity, which tells who you're acting on behalf of (if you're a proxy)

Some of the mechanisms and their attributes are shown in Figure 3-4.

Figure 3-4. Some SASL mechanism precepts


The Internet Assigned Numbers Authority (IANA) maintains a registry of SASL mechanisms. You can find the list at the IANA's web site ( Although there are a lot of choices, there are really only six of interest:

ANONYMOUS
This logs so-called "trace" information. It's not authenticated, just informational--like when you provide your email address to an anonymous FTP server. If you're interested in the details, see RFC 2245.

PLAIN
This is used when you've already encrypted at the transport layer, and you want to send the traditional username and password. This mechanism provides an upgrade path for systems that use a one-way function to store their passwords. For more information, see RFC 2595.

CRAM-MD5
This is the dual of the PLAIN mechanism--it uses a lightweight challenge/response over a plaintext session to a server that stores passwords in plaintext form. This mechanism provides an upgrade path for systems that store their passwords in the clear. See RFC 2195 for more details.

DIGEST-MD5
A replacement for the CRAM-MD5 mechanism, which avoids a serious security weakness. This mechanism also provides mutual authentication and is highly scalable for busy servers. If you want to know more, see RFC 2831.

OTP
This uses a one-time password (suitable for use at untrusted devices such as kiosks), in which the server can one-time authenticate the user without knowing the user's password. Further, at the outcome of a successful authentication, the client can incrementally modify (i.e., update) its passphrase. RFC 2444, RFC 2289, and RFC 2243 have all the details.

EXTERNAL
This is used when you've already authenticated at the network or transport layer, and you just want to tell the server what authorization identity you'd like to use.

Of course, there are many other SASL mechanisms, and some may be available to you. For example, there's a SASL mechanism for version 4 of Kerberos (see RFC 2222). Similarly, if your organization uses SecurID®, there's a SASL mechanism for it too (see RFC 2808). To put this into greater context, Chris Newman has developed an informal taxonomy of SASL mechanisms, which, with his permission, I've condensed into Figure 3-5.[1]

Figure 3-5. A SASL taxonomy


So, it should now be clear why we always say "the SASL family of profiles"--every time someone registers a SASL mechanism (e.g., XXX) a corresponding tuning profile is automatically defined, e.g.,

In addition, some SASL mechanisms also provide a security layer, which makes the session tamper-evident, and may also provide privacy. In the latter case, the SASL mechanism provides the same kind of functionality that TLS does. DIGEST-MD5 is an example of a mechanism that does both the "SA" part of SASL (simple authentication) and (optionally) the "SL" part (security layer) too.

Finally, it's likely that the SASL specification (RFC 2222) will be revised in calendar year 2002. If so, although some of the details may change, no changes should be necessary from the application designer/programmer's perspective.

Tuning in Practice

Tuning is a lot simpler in practice than in theory. Let's go straight to "ideal" practice:

  1. See if the underlying transport or network service is already authenticated and encrypted; if so, tune using the SASL EXTERNAL profile, and you're done.

  2. Otherwise, decide whether you want encryption. If you do, tune using a profile that does transport privacy.

  3. Then, decide whether you want authentication. If you do:

      • If you already tuned for transport privacy, and if authentication took place, then tune using the SASL EXTERNAL profile.

      • Otherwise, tune using a profile that does user authentication.
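
The "ideal" practice above boils down to a short decision procedure. Here's one way to sketch it in Python. The function and parameter names are hypothetical, and the returned strings are descriptive labels rather than profile URIs.

```python
def tuning_plan(transport_already_secure, want_encryption, want_authentication,
                privacy_tuning_authenticated=False):
    """Return the tuning steps, in order, suggested by the 'ideal' practice.

    privacy_tuning_authenticated -- whether authentication took place while
    tuning for transport privacy (e.g., through certificates).
    """
    if transport_already_secure:
        # Already authenticated and encrypted underneath: one step and done.
        return ["SASL EXTERNAL"]
    steps = []
    if want_encryption:
        steps.append("transport privacy profile")
    if want_authentication:
        if want_encryption and privacy_tuning_authenticated:
            steps.append("SASL EXTERNAL")
        else:
            steps.append("user authentication profile")
    return steps
```

An empty list means no tuning at all, in which case the first channel you start is bound to an exchange profile.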

Note that you don't have to tune at all. If your application doesn't need to be provisioned for security, then the first channel you start is an exchange profile to do useful work.

BEEP defines a lot of different tuning profiles, and they each have their own sweet spot. So, what tuning profiles should you use? It depends, of course, on what your requirements are. Having said that, here's what the reliable syslog specification (RFC 3195) says:

  • If you want user authentication, tune with the SASL DIGEST-MD5 profile for authentication only.

  • If you also want tamper-detection, tune with the SASL DIGEST-MD5 profile for both authentication and integrity protection.

  • Otherwise, if you want privacy, tune with the TLS profile.

The reason comes down to scaling: tuning with DIGEST-MD5 has a lot less overhead than using TLS, but TLS supports stronger encryption algorithms.

This policy is probably a pretty good middle ground. Of course, a security maven will tell you that there's no such thing. They're right that an application operating in a given environment has its own set of unique requirements, but, in practice, this level of granularity is largely irrelevant (unless you have the term "sigint" in your job description).

But, what if your server is sitting on top of a legacy password database? In that case, you can't use the SASL DIGEST-MD5 profile, and you're not going to get your users to install client-side certificates, so you can't tune using the TLS and EXTERNAL profiles.

This isn't a problem; here's the "legacy" practice: tune using the TLS profile (only the server need authenticate itself), and then tune using the SASL PLAIN profile.

The only trick here is to make sure that your server advertises the SASL PLAIN profile only after transport privacy is in effect.
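
On the server side, that trick is a one-line policy applied each time a greeting is built. The Python sketch below assumes a server that insists on privacy before anything else. The function name is hypothetical, and the strings are labels standing in for the real profile URIs.

```python
def greeting_profiles(privacy_in_effect, exchange_profiles=()):
    """Choose which profiles a 'legacy' server advertises in its greeting.

    Assumption of this sketch: the server requires transport privacy before
    it offers anything else, so its first greeting is tuning-only.
    """
    if not privacy_in_effect:
        # Never offer PLAIN while passwords would cross the wire in the clear.
        return ["TLS"]
    # Sent after the tuning reset, once traffic is encrypted.
    return ["SASL PLAIN", *exchange_profiles]
```

Because a tuning reset always triggers a new greeting, the server gets a natural moment to start advertising SASL PLAIN.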

Tuning Profiles Versus Exchange Profiles

Finally, what's the real difference between a profile used for tuning and one used for exchange? There are two differences: one is a rule, the other a convention.

First, as a rule, BEEP demands that once you create a channel with a tuning profile, you can't create another tuning channel until you finish with the first one. This is because tuning channels muck around with the global properties of a BEEP session, and it's too confusing for most implementations to keep track of more than one. Actually, the rules are even stricter--BEEP allows you to authenticate at most once during a session; similarly, once you turn on transport privacy, there's no turning it off or negotiating something else. In contrast, you can have more than one channel created with an exchange profile running at the same time. In fact, you can even have multiple channels bound to the same exchange profile.
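
If you ever have to enforce these rules yourself, they amount to three checks made before a tuning channel is allowed to start. The Python sketch below is one hypothetical way to write them down. The class and its methods are invented for this illustration and aren't part of any BEEP API.

```python
class TuningState:
    """Track the session-wide rules that apply to tuning channels."""

    KINDS = ("privacy", "authentication")

    def __init__(self):
        self.in_progress = None     # kind of the tuning channel now running
        self.privacy = False
        self.authenticated = False

    def begin(self, kind):
        if kind not in self.KINDS:
            raise ValueError("unknown kind of tuning: %r" % kind)
        if self.in_progress is not None:
            raise RuntimeError("finish the current tuning channel first")
        if kind == "authentication" and self.authenticated:
            raise RuntimeError("a session authenticates at most once")
        if kind == "privacy" and self.privacy:
            raise RuntimeError("transport privacy can't be renegotiated")
        self.in_progress = kind

    def finish(self, succeeded):
        if self.in_progress is None:
            raise RuntimeError("no tuning channel is in progress")
        kind, self.in_progress = self.in_progress, None
        if succeeded and kind == "privacy":
            self.privacy = True
        elif succeeded:
            self.authenticated = True
```

Channels bound to exchange profiles need no such bookkeeping, since any number of them can run at the same time.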

Second, as a convention, first you tune, then you exchange. It doesn't make a lot of sense to intermix the activities of the two. (If you can think of a scenario in which it would make sense, drop me a note!)

Beyond these two differences, there aren't any more: anyone is free to define as many profiles as they want, and they can be profiles used for tuning or data exchange. Of course, between TLS and the SASL family of mechanisms, the BEEP folks think the bases are covered, but there are other things you can do with tuning. (For an example, take a look at .)

The Lifecycle of a Session

To sum all of this up, let's take a look at a "typical" session as shown in Figure 3-6.

Figure 3-6. The lifecycle of a "typical" session


Consider the typical session shown here:

  • It begins when a transport connection is established, which creates an untuned BEEP session along with channel zero.

  • The first thing that happens on the session is the exchange of greetings between the peers on channel zero.

  • After this, a channel bound to the TLS profile is started, which ultimately results in a tuning reset, implicitly closing both channels.

  • Assuming the underlying TLS negotiation is successful, the session is now tuned for privacy. Regardless, we have a new channel zero, and the usual exchange of greetings.

  • Next, one of the peers starts a channel bound to the SASL PLAIN profile, and authenticates itself. Assuming the authentication is successful, the session is now tuned for authentication (in one direction). Further, once the authentication is complete, this channel could be closed, but it's not necessary.

  • Next, a channel bound to the SOAP profile is started, and a SOAP message exchange is begun.

  • This exchange seems to be taking a while, so another channel is started, and, for the rest of the session, SOAP messages are exchanged over both of them. Note that although the messages exchanged on each channel are processed serially, the two channels are running independently of each other.

  • Finally, when we're ready to wind things up, channel zero is used to release the session, implicitly closing all channels.

1. Note to security gurus: apologies in advance if you start twitching uncontrollably when I place the terms "secure" and "best" in close proximity. Note to everyone else: any security guru will tell you that a table with a single column labeled "Secure" is vastly oversimplified.
