This chapter talks about how BEEP sessions get lawyered-up. The term "lawyered-up" is just an expression
that some of us use for adding security to a plain vanilla BEEP session. When
a BEEP session starts out, you get whatever security properties are provided
by the underlying transport service. In most cases, this means that your
traffic is unencrypted and unauthenticated. Now, maybe that's okay for your
environment, and if that's the case, make sure that you're dosing properly
with your meds. If not, then BEEP gives you a way of fixing that. It's called
"tuning," which is the official term for the process of giving a newly-created
BEEP session the security properties you want, as shown in Figure 3-1.
Figure 3-1. The tuning precept
In BEEP, sessions are tuned for two things:
- Transport privacy
- User authentication
Sometimes you can accomplish both of these simultaneously; in
other cases, you have to take care of privacy before authentication. It's all
a function of the security technologies you have available. Don't worry, we'll
explain the details later on. The one thing you must understand is that BEEP's
view of security is entirely protocol-centric--you're still responsible for
what happens to the data before and after it gets sent. (In other words,
tuning doesn't help with sloppy coding such as buffer overflows.)
Before we talk about the details of tuning, we have to talk a
little bit about two related topics:
- How BEEP peers greet each other at the start of a session
- How channels are managed
In BEEP, as soon as a session is started, both peers send a greeting, as shown in Figure 3-2.
Figure 3-2. The greeting precept
The purpose of the greeting is three-fold. It allows each peer to:
- Advertise the profiles it supports
- Specify the preferred languages for diagnostics
- Indicate which optional features it supports
We're not going to talk about BEEP's optional features (there
haven't been any standardized); instead, let's look at the other two in turn.
If you recall from Chapter 2, the way a channel gets created is:
- One peer makes a request with one or more possible profiles.
- The other peer responds by saying which profile it's
going to use for the channel.
So, one of the things that a greeting is good for is to let the
other peer know what the possible choices are, right?
It turns out that, depending on the kind of peer you're talking
to, the greeting you see is going to have a different kind of "feel." If
you're using a sophisticated API for BEEP, you probably don't need to
appreciate the different feel for the greeting, but it's still worthwhile to understand.
Let's start with the simplest case. If you're looking at a
traditional client/server scenario, then--independent of what's in the
server's greeting--the client's greeting is probably going to look like this:
<greeting />
It doesn't get any simpler than this--the greeting is empty.
When you see this, it's telling you not to bother trying to
start any profiles--the peer who sent the empty greeting will be the one who
decides what gets started and when. If you're just a server, this is just the
way you want it. However, as a server, what's your
greeting going to look like?
The first choice is whether you require
that the session be tuned prior to doing any real work. If so, then you send a
greeting that contains only tuning profiles, such as:
<greeting>
    <profile uri='http://iana.org/beep/TLS' />
</greeting>
Think of this as saying "I'm not saying anything more until the session is tuned."
If you've been paying attention, you should have a very
important question at this point: if only tuning profiles are in the greeting,
how does any real work get done? The answer is that after a tuning profile
makes a session "secure" (i.e., it starts encrypting traffic so third parties
can't see what's going on), both peers send another greeting. We'll explain
why later, in the section "The TLS Profile," but for now, just keep in mind
that tuning a session may result in another greeting.
Of course, the third kind of greeting you might see has both
types of profiles:
<greeting>
    <profile uri='http://iana.org/beep/TLS' />
    <profile uri='http://iana.org/beep/syslog/COOKED' />
</greeting>
This leaves the choice of tuning up to the client.
Hopefully, your API will handle all of these details for you,
letting you specify the tuning policy you want (e.g., "privacy first"), and
then transparently handling the greeting mechanics for you. If not, you
hopefully now understand the way it works.
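If your API doesn't sort greetings into these kinds for you, the check is easy to sketch yourself. The following Python is only an illustration: the function name classify_greeting is made up, and it assumes that the TLS profile and anything under the SASL family prefix are the only tuning profiles you care about.

```python
import xml.etree.ElementTree as ET

# Assumption: only the TLS profile and the SASL family count as tuning profiles.
TLS_PROFILE = "http://iana.org/beep/TLS"
SASL_PREFIX = "http://iana.org/beep/SASL/"


def is_tuning_profile(uri):
    """Return True if the URI names a tuning profile under the assumption above."""
    return uri == TLS_PROFILE or uri.startswith(SASL_PREFIX)


def classify_greeting(xml_text):
    """Return 'empty', 'tuning-only', 'exchange-only', or 'mixed'."""
    greeting = ET.fromstring(xml_text)
    uris = [p.get("uri", "") for p in greeting.findall("profile")]
    if not uris:
        return "empty"
    tuning = [u for u in uris if is_tuning_profile(u)]
    if len(tuning) == len(uris):
        return "tuning-only"
    if not tuning:
        return "exchange-only"
    return "mixed"
```

A client could use the result to decide whether it has to tune before it is allowed to ask for an exchange profile.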
Sending diagnostic information in English isn't universally
helpful. In the golden age of application protocol design, error messages
contained two parts:
- A three-digit reply code
- A textual diagnostic
History has shown the combination of a machine-readable reply
code with human-readable text to be a good choice. (See the section
"Reporting" in the Appendix.) A reply code consists of three digits:
- Completion (the first digit): Explains whether the request succeeded, failed, or
didn't complete, and is one of:
  - Positive preliminary (1): The request is ready to be performed, pending
    further confirmation or rejection.
  - Positive completion (2): The request has succeeded.
  - Positive intermediate (3): The request is ready to be performed, pending
    further information.
  - Transient negative (4): The request wasn't performed, but if retried
    later, it may very well succeed.
  - Permanent negative (5): The request wasn't performed, and some explicit
    action must be taken before it could ever succeed.
- Category (the second digit): Explains why the request succeeded, failed, or didn't
complete, and is one of:
  - Syntax (0): The reply deals with syntax issues, such as errors
    in syntax, or unrecognized commands.
  - Informational (1): The reply contains useful information.
  - Connection (2): The reply deals with the session or transport.
  - Security (3): The reply deals with the security subsystem.
  - Application-specific (5): The reply deals with the application itself, e.g.,
    something specific to the BEEP profile that generated the reply code.
- Instance (the third digit): Distinguishes between different situations having the
same completion and category values.
Although the application protocol designer is responsible for
indicating what reply code gets used in each situation, most programs need to
be able to make decisions based on the first digit only.
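To make the digit-by-digit reading concrete, here is a small Python sketch that splits a reply code into its three parts. The names decode_reply_code and should_retry are invented for this example; the tables simply restate the values listed above.

```python
COMPLETION = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative",
    "5": "permanent negative",
}

CATEGORY = {
    "0": "syntax",
    "1": "informational",
    "2": "connection",
    "3": "security",
    "5": "application-specific",
}


def decode_reply_code(code):
    """Split a three-digit reply code into completion, category, and instance."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("reply code must be exactly three digits: %r" % code)
    return {
        "completion": COMPLETION.get(code[0], "unknown"),
        "category": CATEGORY.get(code[1], "unknown"),
        "instance": int(code[2]),
    }


def should_retry(code):
    """Decide on the first digit alone: only transient failures are worth retrying."""
    return decode_reply_code(code)["completion"] == "transient negative"
```

The second function shows the point made above: a program can usually act on the first digit and leave the rest for the log file.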
BEEP uses the code/diagnostic pair whenever it needs to convey
an error. For example, in Chapter 2 when we talked about creating a channel,
you might recall that the server peer either replies with the identity of the
profile that is going to be used on the channel, or it refuses and signals an
error. Here's an example:
<error code='500'>none of the profiles are supported</error>
However, there is this little matter of picking the natural
language to use for the text. Historically, the choice has been English (or
rather, "geeklish"). More recently, it has been growing more common to allow
each peer to advertise its preferences, e.g.:
<greeting localize='en-US fr-CA'>
which asks for the U.S. variant of English, and, if that's not
possible, Canadian French.
The only real question is where "language tags" such as
en-US come from. The answer is that BEEP refers the
reader to RFC 3066, which in turn refers the reader to ISO standards 639 and
3166. In practice, the rules are pretty simple:
- Start with the two-letter abbreviation for the language
from part one of ISO 639.
- Append a hyphen and the two-letter abbreviation for the
country from ISO 3166.
There's actually a lot more flexibility than that, and if you
use it, I have every confidence that you'll get exactly what you deserve.
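Picking the language for a diagnostic then comes down to walking the peer's preference list in order. The sketch below is an assumption about how you might do it, not something BEEP mandates: choose_language is a hypothetical helper, tags are compared without regard to case, and no fallback language is chosen when nothing matches.

```python
def choose_language(localize, available):
    """Return the first tag from the peer's localize list that we can serve.

    localize is the whitespace-separated attribute value from the greeting;
    available is the list of language tags we have diagnostics for.
    Returns None when there is no overlap.
    """
    supported = {tag.lower(): tag for tag in available}
    for tag in localize.split():
        match = supported.get(tag.lower())
        if match is not None:
            return match
    return None
```

With the greeting shown earlier, a peer that only has Canadian French and German text available would answer in fr-CA.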
In BEEP, as soon as a session is started, both peers send a
greeting. But how can a greeting be sent if there aren't any channels to send it on?
The answer is that a newly-created BEEP session always comes
with one channel, channel zero, already created. Channel zero's sole role is
channel management, which means three things:
- Creating new channels
- Destroying existing channels
- Releasing the entire session
These are shown in Figure
3-3; let's look at each in turn.
Figure 3-3. The channel zero precept
Earlier, we talked about BEEP's "suggest many,
accept one" philosophy and how this was used, among other things, for channel
creation.
To recap, after they exchange greetings, when one peer wants to
open a channel it might suggest:
<profile uri='http://iana.org/beep/SASL/DIGEST-MD5' />
<profile uri='http://iana.org/beep/SASL/OTP' />
and, if the other peer decides to start channel number 1, it
will indicate which of these two profiles it selected.
It turns out that there were two nuances that we left out:
- Piggybacking initial data
- Requesting a "virtual host"
BEEP provides a latency-reduction mechanism that lets you create
a channel and perform its first exchange at the same time.
The basic idea is to remove one round-trip time from the
process. Instead of having to wait a round-trip to find out if the channel
creation is successful before performing the first exchange, the exchange gets
"piggybacked" on the messages that perform the channel creation.
Here's how it works: when a channel is started, both peers can
include a string of octets intended for the channel. When you try to start a
channel, you can include your first message; if the channel is created, your
peer processes the message and includes the corresponding reply when you're
told that the channel is successfully created.
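Here is a rough Python sketch of what building such a request might look like. It assumes the convention from the BEEP core specification (RFC 3080) that the piggybacked data travels as the content of the profile element; the helper name build_start and the sample first message are illustrative only.

```python
import xml.etree.ElementTree as ET


def build_start(number, profiles):
    """Serialize a start request for channel zero.

    profiles is a list of (uri, piggyback) pairs. piggyback is the text of the
    first message for that profile, or None to send no initial data.
    """
    start = ET.Element("start", {"number": str(number)})
    for uri, piggyback in profiles:
        profile = ET.SubElement(start, "profile", {"uri": uri})
        if piggyback is not None:
            # ElementTree escapes any markup in the text; a CDATA section
            # would carry the same characters.
            profile.text = piggyback
    return ET.tostring(start, encoding="unicode")
```

For example, build_start(1, [("http://iana.org/beep/TLS", "<ready />")]) asks for the TLS profile and sends that profile's first message along for the ride, saving a round-trip if the channel is created.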
It's fairly common in today's Internet for a physical server to
be known by several logical host names. In HTTP 1.1, the client signals this
by including the Host: header in its request.
In BEEP, this is done using the serverName attribute for the first successful
channel creation:
<start number='1' serverName='mosquiton.example.com'>
    <profile uri='http://iana.org/beep/TLS' />
</start>
If the channel isn't created, then a different
serverName value may be used on the next request. Once a
channel with a serverName is created, any serverName attributes used to
create future channels are ignored.
The use of the serverName attribute
is particularly important in tuning, not only because of the "first success"
rule, but because the peer you're talking to may have different certificate
and authorization databases for each of its virtual hosts. How do you know
what value to use?
The answer depends on context. If your program is dereferencing
a URL that maps onto a service that uses BEEP, the answer is self-evident
(e.g., soap.beep://mosquiton.example.com/). If your
program isn't URL-driven, but you started with a fully qualified domain name,
just use that. If not, then--in the absence of some other information--don't
use the serverName attribute at all.
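Those three cases are simple enough to sketch in Python. The helper name server_name_for is hypothetical, and the test for "fully qualified" (the name contains a dot) is a deliberately crude assumption; substitute whatever rule your environment actually uses.

```python
from urllib.parse import urlsplit


def server_name_for(target):
    """Return a serverName value for a URL or host name, or None to omit it."""
    if "://" in target:
        # URL-driven: the host part of the URL is the answer.
        return urlsplit(target).hostname
    name = target.rstrip(".")
    # Assumption: a bare name with a dot in it is treated as fully qualified.
    if "." in name:
        return name.lower()
    # No better information, so don't send a serverName attribute at all.
    return None
```

A return value of None means the start request should simply leave the attribute out.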
After you create and presumably use a channel, BEEP lets you close it.
Most BEEP usage is of the form:
- Start a session by establishing the underlying transport connection.
- Perhaps tune the session (using one channel).
- Create and use one, or maybe two, channels for data exchange.
- Release the session (which implicitly closes all channels).
This makes it hard to understand why anyone would bother closing a channel.
The reasoning is rather subtle--in some usage scenarios, you may
have very long-lived sessions where you want to close a channel prior to a
period of inactivity. By doing so, you free whatever application-specific
resources are being used by that channel. Of course, only certain kinds of
applications need this kind of behavior; for those that don't, simply
releasing the session does the trick. (In other words, this is an example of
BEEP letting you decide exactly what you want to get and pay for.)
You release the session by explicitly closing channel zero.
This brings up the one fun part about closing a channel: it
involves a round-trip negotiation. What this means is that if one peer is
still busy working on something, it can come back and say "no." Of course, the
peer that wants the session to go away now can always just drop the underlying
transport connection. In this way, BEEP gives you the tools you need to avoid
any ambiguity as to whether both sides are ready to close, but in an
emergency, you can just blow the bolts.
Now that we've talked about greetings and channel management, we
can get to the actual tuning. Let's first talk about transport security.
The TLS Profile
TLS is the IETF's version of version 3 of SSL. For our purposes,
Transport Layer Security (TLS, RFC 2246) provides:
- Certificate-based authentication of one or both peers
- Cryptographic protection against passive eavesdropping
- Cryptographic detection of alteration, duplication, and
reordering of traffic
The way TLS does this is outside the scope of this book. If you
really want to know how it all works, get a copy of Eric Rescorla's seminal
reference SSL and TLS: Designing and Building Secure Systems.
However, the key thing to understand about TLS is that the
cryptographic certificates and algorithms that it uses are both configurable.
Security people delight in unseemly and incomprehensible fights as to what
kind of algorithms and key lengths should be used; as a BEEP person, you just
don't care--look at the documentation for the API for BEEP that you're using,
and it should tell you how to find out what's available in the TLS tuning
profile it uses.
To use TLS with BEEP, you start a channel with the profile
identified as http://iana.org/beep/TLS. Once you've
started the channel, the TLS negotiation process begins when you send a
message to the other peer.
Recall from an earlier example that you can use the
serverName attribute to signal the other peer as to the
credentials you're looking for:
<start number='1' serverName='mosquiton.example.com'>
    <profile uri='http://iana.org/beep/TLS' />
</start>
The only tricky thing to understand about using the TLS profile
(or any tuning profile that does transport security) is what happens
immediately before and after the underlying negotiation process:
- Channel zero is reset and all other channels are closed.
- Both peers send a greeting, regardless of whether the
negotiation was successfully completed or not.
This is called a "tuning reset."
There are two reasons why BEEP has the concept of a tuning
reset: the first is for practicality; the second is for correctness.
First, using a transport security profile inserts a new layer
immediately between BEEP and the underlying transport service. You don't want
any other BEEP messages unexpectedly showing up; it would be a nightmare
trying to straighten it all out. So, just before the TLS engine is invoked to
do its voodoo, all channels are closed.
Second, until the session is made tamper-evident, it's possible
for someone to alter BEEP's messages in transit. When a tuning reset occurs,
both peers reset all state from the session; this means that the first thing
that both sides do is send a new greeting.
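If it helps to see the bookkeeping, here is a toy Python model of a session around a tuning reset. Everything in it (the Session class, its attribute names, the decision to raise an error when a channel is started before the new greeting) is an assumption made for illustration; a real BEEP library will organize this differently.

```python
class Session:
    """Toy model of what a peer remembers before and after a tuning reset."""

    def __init__(self):
        self.channels = {0: "channel-management"}
        self.greeting_exchanged = False
        self.peer_profiles = []
        self.private = False

    def exchange_greetings(self, peer_profiles):
        """Record the profiles advertised in the peer's greeting."""
        self.greeting_exchanged = True
        self.peer_profiles = list(peer_profiles)

    def start_channel(self, number, profile_uri):
        """Open a channel; only allowed once greetings have been exchanged."""
        if not self.greeting_exchanged:
            raise RuntimeError("no greeting exchanged since the last reset")
        self.channels[number] = profile_uri

    def tuning_reset(self, negotiation_succeeded):
        """Close every channel, reset channel zero, and forget the old greeting."""
        self.channels = {0: "channel-management"}
        self.greeting_exchanged = False
        self.peer_profiles = []
        # The reset happens either way; privacy only if the negotiation worked.
        self.private = self.private or negotiation_succeeded
```

Note that nothing learned from the old greeting survives the reset, which is the whole point: the profiles you act on are the ones advertised after the session became tamper-evident.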
The SASL Family of Profiles
SASL is the best thing to happen to application protocols since
the reply code.
Unlike TLS, the Simple Authentication and
Security Layer (SASL, RFC 2222) isn't a protocol. Instead, SASL is a
framework like BEEP. SASL's goal is to provide a set of rules that allow
application protocols to support multiple security mechanisms.
Earlier, we saw why SASL came about. Basically, an
application's security requirements may be different, depending on where it's
provisioned, and may change over time, even in the same environment. Further,
security technologies have different price-points for strength, scalability,
and ease of deployment.
The practical upshot of this is that we need a flexible way to
accommodate different security technologies. SASL defines a set of rules for
how security technologies have their data carried by an application protocol.
If you're a security engineer, and you follow SASL's rules, your technology is
called a SASL mechanism and it plugs into any
SASL-capable application protocol. This is the genius of SASL: it defines one
generic hook that accommodates a wide range of different mechanisms.
At a minimum, each SASL mechanism provides user authentication.
Of course, the "strength" of that authentication is dependent on the
algorithms used by the mechanism. Most SASL mechanisms allow you to convey two
identities:
- An authentication identity, which tells who you are
- An authorization identity, which tells who you're
acting on behalf of (if you're a proxy)
Some of the mechanisms and their attributes are shown in Figure 3-4.
Figure 3-4. Some SASL mechanism precepts
The Internet Assigned Numbers Authority
(IANA) maintains a registry of SASL mechanisms. You can find the list at the
IANA's web site (http://www.iana.org/).
Although there are a lot of choices, there are really only six of interest:
- ANONYMOUS: This logs so-called "trace" information. It's not
authenticated, just informational--like when you provide your email address
to an anonymous FTP server. If you're interested in the details, see the RFC
that defines this mechanism.
- PLAIN: This is used when you've already encrypted at the
transport layer, and you want to send the traditional username and password.
This mechanism provides an upgrade path for systems that use a one-way
function to store their passwords. For more information, see RFC 2595.
- CRAM-MD5: This is the dual of the PLAIN mechanism--it uses a
lightweight challenge/response over a plaintext session to a server that
stores passwords in plaintext form. This mechanism provides an upgrade path
for systems that store their passwords in the clear. See RFC 2195 for more
information.
- DIGEST-MD5: A replacement for the CRAM-MD5 mechanism, which
avoids a serious security weakness. This mechanism also provides mutual
authentication and is highly scalable for busy servers. If you want to know
more, see RFC 2831.
- OTP: This uses a one-time password (suitable for use at
untrusted devices such as kiosks), in which the server can one-time
authenticate the user without knowing the user's password. Further, at the
outcome of a successful authentication, the client can incrementally modify
(i.e., update) its passphrase. RFC 2444, RFC 2289, and RFC 2243 have all the
details.
- EXTERNAL: This is used when you've already authenticated at the
network or transport layer, and you just want to tell the server what
authorization identity you'd like to use.
Of course, there are many other SASL mechanisms, and some may be
available to you. For example, there's a SASL mechanism for version 4 of
Kerberos (see RFC 2222). Similarly, if your organization uses SecurID®,
there's a SASL mechanism for it too (see RFC 2808). To put this into greater
context, Chris Newman has developed an informal taxonomy of SASL mechanisms,
which, with his permission, I've condensed into Figure 3-5.
Figure 3-5. A SASL taxonomy
So, it should now be clear why we always say "the SASL family of
profiles"--every time someone registers a SASL mechanism (e.g.,
XXX) a corresponding tuning profile is automatically defined, e.g., http://iana.org/beep/SASL/XXX.
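The mapping between a mechanism name and its tuning profile is mechanical, as this small Python sketch shows. The function names are made up for the example, and upper-casing the mechanism name is an assumption based on how registered mechanisms are conventionally written.

```python
SASL_PROFILE_BASE = "http://iana.org/beep/SASL/"


def sasl_profile_uri(mechanism):
    """Return the tuning profile URI for a SASL mechanism name."""
    # Assumption: mechanism names are written in upper case.
    return SASL_PROFILE_BASE + mechanism.upper()


def mechanism_from_uri(uri):
    """Return the mechanism name for a SASL family profile URI, else None."""
    if uri.startswith(SASL_PROFILE_BASE) and len(uri) > len(SASL_PROFILE_BASE):
        return uri[len(SASL_PROFILE_BASE):]
    return None
```

So a greeting that advertises http://iana.org/beep/SASL/OTP is telling you that the peer is willing to run the OTP mechanism.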
In addition, some SASL mechanisms also provide a security layer,
which makes the session tamper-evident, and may also provide privacy. In the
latter case, the SASL mechanism provides the same kind of functionality that
TLS does. DIGEST-MD5 is an example of a mechanism that does both the "SA" part
of SASL (simple authentication) and (optionally) the "SL" part (security
layer).
Finally, it's likely that the SASL specification (RFC 2222) will
be revised in calendar year 2002. If so, although some of the details may
change, no changes should be necessary from the application's perspective.
Tuning in Practice
Tuning is a lot simpler in practice than in theory. Let's go
straight to "ideal" practice:
- See if the underlying transport or network service is
already authenticated and encrypted; if so, tune using the SASL EXTERNAL
profile, and you're done.
- Otherwise, decide whether you want encryption. If you
do, tune using a profile that does transport privacy.
- Then, decide whether you want authentication. If you do:
  - If you already tuned for transport privacy, and if
    authentication took place, then tune using the SASL EXTERNAL profile.
  - Otherwise, tune using a profile that does user authentication.
Note that you don't have to tune at all. If your application
doesn't need to be provisioned for security, then the first channel you start
is an exchange profile to do useful work.
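The "ideal" practice above boils down to a short decision procedure. Here is one way to sketch it in Python. The function name plan_tuning and its parameters are invented for illustration, TLS stands in for "a profile that does transport privacy," and DIGEST-MD5 is assumed as the default user-authentication profile only because it is discussed in this chapter.

```python
TLS = "http://iana.org/beep/TLS"
SASL_EXTERNAL = "http://iana.org/beep/SASL/EXTERNAL"
SASL_DIGEST_MD5 = "http://iana.org/beep/SASL/DIGEST-MD5"


def plan_tuning(transport_already_secure, want_encryption, want_authentication,
                privacy_profile_authenticates=False,
                auth_profile=SASL_DIGEST_MD5):
    """Return the tuning profiles to start, in order; an empty list means don't tune."""
    if transport_already_secure:
        # Already authenticated and encrypted underneath: just claim an identity.
        return [SASL_EXTERNAL]
    plan = []
    if want_encryption:
        plan.append(TLS)
    if want_authentication:
        if want_encryption and privacy_profile_authenticates:
            # Authentication happened while tuning for privacy.
            plan.append(SASL_EXTERNAL)
        else:
            plan.append(auth_profile)
    return plan
```

An empty result corresponds to the last case: no tuning at all, and the first channel you start is an exchange profile.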
BEEP defines a lot of different tuning profiles, and they each
have their own sweet spot. So, what tuning profiles should you use? It
depends, of course, on what your requirements are. Having said that, here's
what the reliable syslog specification (RFC 3195) recommends:
- If you want user authentication, tune with the SASL
DIGEST-MD5 profile for authentication only.
- If you also want tamper-detection, tune with the SASL
DIGEST-MD5 profile for both authentication and integrity protection.
- Otherwise, if you want privacy, tune with the TLS profile.
The reason comes down to scaling: tuning with DIGEST-MD5 has a
lot less overhead than using TLS, but TLS supports stronger encryption.
This policy is probably a pretty good middle ground. Of course,
a security maven will tell you that there's no such thing. They're right that
an application operating in a given environment has its own set of unique
requirements, but, in practice, this level of granularity is largely
irrelevant (unless you have the term "sigint" in your job description).
But, what if your server is sitting on top of a legacy password
database? In that case, you can't use the SASL DIGEST-MD5 profile, and you're
not going to get your users to install client-side certificates, so you can't
tune using the TLS and EXTERNAL profiles.
This isn't a problem; here's the "legacy" practice: tune using
the TLS profile (only the server need authenticate itself), and then tune
using the SASL PLAIN profile.
The only trick here is to make sure that your server advertises
the SASL PLAIN profile only after transport privacy is in effect.
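On the server side, that trick is just a matter of choosing what goes into each greeting. The Python below is a sketch under stated assumptions: advertised_profiles is a hypothetical helper, and it assumes the server insists on TLS before anything else and lists its exchange profiles alongside PLAIN once privacy is on.

```python
TLS = "http://iana.org/beep/TLS"
SASL_PLAIN = "http://iana.org/beep/SASL/PLAIN"


def advertised_profiles(privacy_in_effect, exchange_profiles):
    """Return the profile URIs to put in the server's next greeting."""
    if not privacy_in_effect:
        # Before the tuning reset: offer TLS and nothing that exposes a password.
        return [TLS]
    # After the tuning reset: passwords now travel over an encrypted session.
    return [SASL_PLAIN] + list(exchange_profiles)
```

Because a tuning reset forces a fresh greeting, the server gets a natural point at which to switch from the first list to the second.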
Tuning Profiles Versus Exchange Profiles
Finally, what's the real difference between a profile used for
tuning and one used for exchange? There are two differences: one of which is a
rule, the other a convention.
First, as a rule, BEEP demands that once you create a channel
with a tuning profile, you can't create another tuning channel until you
finish with the first one. This is because tuning channels muck around with
the global properties of a BEEP session, and it's too confusing for most
implementations to keep track of more than one. Actually, the rules are even
stricter--BEEP allows you to authenticate at most once during a session;
similarly, once you turn on transport privacy, there's no turning it off or
negotiating something else. In contrast, you can have more than one channel
created with an exchange profile running at the same time. In fact, you can
even have multiple channels bound to the same exchange profile.
Second, as a convention, first you tune, then you exchange. It
doesn't make a lot of sense to intermix the activities of the two. (If you can
think of a scenario in which it would make sense, drop me a note!)
Beyond these two differences, there aren't any more: anyone is
free to define as many profiles as they want, and they can be profiles used
for tuning or data exchange. Of course, between TLS and the SASL family of
mechanisms, the BEEP folks think the bases are covered, but there are other
things you can do with tuning. (For an example, take a look at .)
The Lifecycle of a Session
To sum all of this up, let's take a look at a "typical" session
as shown in Figure 3-6.
Figure 3-6. The lifecycle of a "typical" session
Consider the typical session shown here:
- It begins when a transport connection is established,
which creates an untuned BEEP session along with channel zero.
- The first thing that happens on the session is the
exchange of greetings between the peers on channel zero.
- After this, a channel bound to the TLS profile is
started, which ultimately results in a tuning reset, implicitly closing both
channels.
- Assuming the underlying TLS negotiation is successful,
the session is now tuned for privacy. Regardless, we have a new channel
zero, and the usual exchange of greetings.
- Next, one of the peers starts a channel bound to the
SASL PLAIN profile, and authenticates itself. Assuming the authentication is
successful, the session is now tuned for authentication (in one direction).
Further, once the authentication is complete, this channel could be closed,
but it's not necessary.
- Next, a channel bound to the SOAP profile is started,
and a SOAP message exchange is begun.
- This exchange seems to be taking a while, so another
channel is started, and, for the rest of the session, SOAP messages are
exchanged over both of them. Note that although the messages exchanged on
each channel are processed serially, the two channels are running
independently of each other.
- Finally, when we're ready to wind things up, channel
zero is used to release the session, implicitly closing all channels.