ICAO

Aeronautical Communications Panel

Working Group C

Montreal, 19-23 April 2004

Agenda Item 6 – New Technologies

Packetized Voice:

A New Communications Concept for Air Traffic Control

Presented by

Alvin H. Burgemeister

B-twelve Associates, Inc.

Connexion by Boeing

ICCAIA

Summary

This paper introduces a new concept in air traffic voice communications, which takes advantage of the inherent packet nature of push-to-talk two-way voice communications to significantly improve the efficiency of network bandwidth and provide numerous other potential benefits. Although described primarily for air/ground radio for air traffic control, the concept can be used wherever push-to-talk communications is appropriate.

Table of Contents

1. Executive Summary

2. Introduction

2.1. Current ATC Voice Radio

2.2. Basic Communication Concepts

2.3. Characteristics of ATC Voice Communications

2.4. Transit Delay

2.5. The Data Link Concept of Routing

3. Proposed Packet Voice Concept

3.1. Human Factors

3.1.1. Microphone and Headset/Speaker Interface

3.1.2. Address Selection

3.1.3. Annunciation of Support Data

3.1.4. Party Line Effect

3.2. Voice Compression and Decompression

3.3. Routing of Voice Packets

3.4. Channel Loading of Radio Networks

3.5. Other Uses for the Packet Voice Concept

4. Conclusion

1. Executive Summary

Voice communications over two-way radio, such as is used for air traffic control, can be accurately described as a packet service. That is, each transmission by a pilot or controller consists of a single packet of information. The way that the pilot and controller gain access to the radio channel can be described as a carrier-sense multiple-access (CSMA) protocol. The humans involved provide the protocol sensing and logic, which unfortunately adds to their workload, causes errors, and is inefficient.

This paper describes a voice communications concept that makes use of the packet service nature of two-way voice communications, improves channel utilization, and reduces the error and workload experienced with the current system. This communications concept is described as a data communication application, which allows it to capture many of the positive aspects of data link, such as automatic routing over the best available air-ground link, and transparency of link selection to the pilot and controller.

Although the dialog between the flight crew and air traffic control is used to explain the concept, it applies equally to other aeronautical voice communications, such as between the pilots and dispatch, pilots and maintenance, or cabin crew and catering. Since addressing of the communications is independent of the medium between the aircraft and the ground, multiple virtual dialogs can be established and multiplexed over common aircraft-ground media.

It should be noted that the packet voice concept is different from the Voice over Internet Protocol (VoIP) service offered as an alternative to conventional telephone voice networks. VoIP provides a full-duplex virtual telephone circuit over the IP medium, whereas the packet voice concept provides a half-duplex packet service over an inter-network, matching the design with the requirement. The result is an efficient use of valuable air-ground radio frequency spectrum, with better service than is possible with current or other proposed services.

2. Introduction

Numerous studies have shown that the air traffic services (ATS) community is running out of VHF communication channels. This is especially true in the Northeast corridor of the United States and in Western Europe. The European solution has been to sub-divide the 25 kHz VHF voice channels into 8.33 kHz channels. The solution chosen by the United States is to multiplex up to four digital channels onto a single 25 kHz VHF channel, which can be used for either digital voice or data.

In oceanic and other remote parts of the world, the available voice communications for ATS is over HF SSB, which suffers from noise and variable propagation problems. Some of these problems have been mitigated for the air traffic controller by using radio operators to serve as a buffer between the controller and the radio. The airplane pilots, however, must continue to bear the burden of communicating in this bad environment. In other than polar regions, satellite voice has been developed, although it has not been put into regular use.

This paper describes a new concept of communication in an environment where push-to-talk half duplex communications is most appropriate. Some basic concepts of communication are first introduced and some common, but erroneous, assumptions about communications are discussed. The packet voice concept is then introduced. Some of the advantages, system design issues, and other considerations are raised.

It should be noted that, although this paper is expressed in terms of pilot-controller communication over a radio air/ground link, the concept may very well be equally applicable for communication between other entities, over either radio or wired links. Communication between the aircraft crew and ground-based airline personnel is a prime example.

2.1. Current ATC Voice Radio

The pilot and controller normally communicate via VHF analog voice radio. A VHF channel is assigned to a particular air traffic control (ATC) sector. When an aircraft approaches that sector, the pilot is directed to tune the aircraft radio to the channel assigned to the sector.

When a pilot or controller needs to transmit a message, (s)he first listens to the channel and waits for a quiet period in the traffic on that channel. For most channels, the transmissions are short and traffic is light enough that the wait is relatively short. In some highly congested channels, however, the wait can be 30 seconds or more. During this time the pilot or controller must hold the message in a mental queue and give partial attention to the voice traffic. Judgment and experience are applied to determine whether a silence is the appropriate break for the sender’s own message or only a turn-around in an ongoing dialog. At the appropriate time, the pilot or controller presses the push-to-talk (PTT) switch and states the message as rapidly and succinctly as possible, to minimize channel occupancy. Procedure dictates that the call sign of the intended receiver, followed by the call sign of the originator, be included at the beginning of the message, e.g., “Seattle Center, United 234…” (S)he then listens to the channel for an indication that the message was received without corruption. Failure to hear a valid acknowledgement might mean that two transmitters were activated at the same time and both failed to communicate, or it may mean that the receiving person failed to hear the message. In either case the message must be retransmitted.

In addition to the air traffic control channel, pilots need to communicate with other ground entities. U.S. carriers are required[1] to have communication capability with their dispatch center. Other carriers have similar requirements, either by law or by standard operating procedure. General aviation flights also have the need to contact ground service entities for weather and other operational messages. As a result, most aircraft are equipped with a minimum of two VHF communication radios and many have more radios to support their routes and operational needs.

HF radio does not have effective squelch, so the background noise is always present. The VHF radio for operational communications (AOC) has low usage for any one aircraft but traffic for other aircraft is present. In order to allow the flight crew to monitor these additional radios without the additional aural workload, a form of tone annunciation called SELCAL is provided. A call on a channel guarded by SELCAL requires that the originator enter the aircraft SELCAL address before making the initial call. A light and/or a tone annunciates an incoming call. The flight crew, when they see or hear the annunciation, activates the receiver audio and responds.

2.2. Basic Communication Concepts

The communication community has defined two basic methods of communication. Circuit mode communication describes the case where a physical or virtual circuit is established between two nodes, such as two telephone handsets. The full resources of that circuit are dedicated to carrying information between those two nodes, whether the information is speech or music or the source is silent. By comparison, packet mode communication takes a group of information—a packet or message—and sends it from the origin to the destination, independent of any previous or subsequent packets.

Another key concept in communication is that of full-duplex vs. half-duplex communication. A telephone conversation is conducted over a full-duplex channel. Both ends of the conversation can talk and listen simultaneously, at least technically. By comparison, two-way radio communication such as used by ATC operates in half-duplex mode. Since both sources use a common channel, only one source can be transmitting at a time. Optimally, the sources alternate.

Since two, and typically more, sources need to send information over a channel, the concept of a multiple access protocol is also important. With a carrier sense multiple access (CSMA) protocol, a source desiring to send a message listens to the channel and, when it determines the channel is free, sends the message. Another scheme for allocating the single channel to multiple senders is a token passing protocol, where a sender is allowed to send information only when in possession of a token. A time division multiple access (TDMA) protocol can be viewed as a special form of a token-passing scheme, where a virtual token is “passed” to each sender on a regular schedule and the sender only holds the token for a specific period of time. There are various schemes to assign the channel to those in need while not burdening the channel with unused time slots or with excessive overhead to handle the token. Frequency division multiple access (FDMA) and code division multiple access (CDMA) are other schemes that allow multiple users to share a common radio band. In any case, each protocol includes a method of acknowledgement to ensure the message was received error-free.
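The listen-before-talk behaviour of CSMA can be illustrated with a minimal sketch. It assumes an idealized, collision-free channel divided into time slots, with every message occupying the same number of slots; the function name, the call signs, and the slot values are hypothetical and serve only to show how waiting time accumulates on a busy channel.

```python
def listen_before_talk(ready_slots, duration):
    """Return {station: start_slot} for an idealized listen-before-talk channel.

    ready_slots: {station: slot at which its message is ready to send}
    duration:    number of slots each transmission occupies (assumed equal)
    A station that finds the channel busy waits until it becomes free.
    Collisions are not modelled; ties are broken by station name.
    """
    channel_free_at = 0
    starts = {}
    for station, ready in sorted(ready_slots.items(), key=lambda item: (item[1], item[0])):
        start = max(ready, channel_free_at)  # wait for a quiet channel
        starts[station] = start
        channel_free_at = start + duration   # channel is busy until this slot
    return starts


ready = {"UAL234": 0, "DAL51": 1, "N123AB": 2}
starts = listen_before_talk(ready, duration=4)
waits = {station: starts[station] - ready[station] for station in ready}
```

In this example the third station waits six slots even though its message was ready two slots after the first, which is the queuing effect a loaded channel produces.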

2.3. Characteristics of ATC Voice Communications

Two-way radio communications, such as used for ATC, can be described using many of these concepts.

ATC voice is inherently a packet communication protocol. That is, the pilot or controller has a discrete message to send when (s)he presses the transmit switch. During the course of the relationship between the aircraft and the control sector there may be a number of individual messages, but each message is independent of any other, related only by the address pair of the sender and receiver. A circuit mode communications link provides no value to the process, especially if a circuit setup would add to workload and/or time delay for messaging. (A connection-oriented transport layer or other mechanism may be provided for the purposes of efficiency and integrity, but that is transparent to the users at both ends.) Also, as has been illustrated by development of SATCOM voice, establishing a telephone-type circuit between the pilot and controller consumes precious (and expensive) resources. The voice compression algorithms of the aviation codecs proposed thus far are based on telephone industry techniques, which are optimized for circuit mode voice. They may (or may not) be optimum for a packet mode voice service.

ATC voice is inherently a half-duplex service. There is no requirement for the pilot and controller to talk simultaneously. A full duplex service would waste the unused bandwidth. On the other hand, either of the parties may need to send a priority message in spite of the fact that the other is currently transmitting. A pilot may declare an emergency or a controller may recognize an unsafe situation and need to make a transmission to resolve it. As will be seen, there are more appropriate ways than full duplex service for such an eventuality.

ATC voice communication, as practiced today, uses a CSMA protocol. This protocol is not implemented in hardware or software, however. Rather, the people controlling the push-to-talk button must be the protocol engines. The pilot or controller must listen for a break in the traffic on the channel and then transmit his message, hoping that it will go through without error or interference. In high-intensity airspace, such as the terminal airspace of major airports, the channel may be so loaded that a pilot will have to wait up to 30 seconds before he can make a transmission. The voice protocol includes error detection methods, including clearance read-back, acknowledgements such as Roger and Wilco, and the feedback of a clearance seen by the controller when compliance is monitored on the radar screen. All of these protocol elements are error-prone, require mental workload that could be better used elsewhere, and contribute to message delay.

2.4. Transit Delay

The amount of time required for a message to be delivered is called transit delay. To compare alternate communication concepts, transit delay should be measured from the time a message is ready to be sent until the time the receiver hears or reads the message. As shown above, transit delay in a high-intensity environment can extend to as much as 30 seconds.

The MASPS for VHF Digital Link Mode 3 (VDL-3) states that the maximum delay for voice from the microphone to the headset/loudspeaker should be 236 ms.[2] This number was derived from a) the value technically achievable by the VDL-3 design and b) the value determined by human factors experts to be acceptable in order to minimize inadvertent conflict between two stations transmitting nearly at the same time (the CSMA problem). This is normally a small subset of the total transit delay.

The minimum acceptable transit delay independent of the CSMA problem has not been defined by industry. RTCA SC-189 and EUROCAE WG-53, together with the ICAO OPLINK Panel, have defined the delay parameters but have not set the values required for those parameters. Transit delay requirements are a function of the control sector environment and of the tempo of operations in that sector and comprise one of the parameters in Required Communication Performance (RCP).

2.5. The Data Link Concept of Routing

The current air traffic voice communication concept defines the voice path to be used between the aircraft and the controller, which is set by the pilot to the technology parameters (VHF or HF frequency and/or channel) published for communication with a particular control sector. This concept creates some very heavily loaded channels, while other channels are more lightly loaded.[3] The current concept also mandates that each sector suite must be assigned a channel for each technology to be served (e.g., VHF and UHF) and that an aircraft is barred from a sector unless equipped to talk on a published channel of that sector (e.g., 25 kHz DSB-AM, 8.33 kHz DSB-AM, VDL-3).

By contrast, data link in general and the ATN in particular is defined to use the best available path, independent of the end users. An airborne VDL-2 or ACARS station might be directed to another, more lightly loaded, frequency in order to prevent overloading of a channel. The ground station of that new frequency may be co-located with the original ground station or it may be anywhere within radio range of the aircraft. Data link routing also enables transparent migration between technologies, moving between ACARS, VDL, SATCOM, HFDL, and any new technologies, as circumstances require.

3. Proposed Packet Voice Concept

The general idea of the packet voice concept is that the pilot (or controller) would press the microphone key and speak the message when ready, without needing to ensure that an open voice radio channel is available. After the pilot or controller releases the microphone key (or sooner in the case of a longer message) the system would digitize the voice signal, compress it, and then send the message over the airplane-ground channel. The system, not the humans, would be responsible for finding an appropriate channel; sending the packet(s) of data as determined by the protocol of that channel; and re-sending the packets, or the entire message, if the receiver fails to acknowledge receipt.
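The send-and-retry responsibility that the concept moves from the humans to the system might be sketched as follows. This is an illustration under stated assumptions, not a protocol definition: `encode` stands in for whatever digitizing and compression step is chosen, `transmit` is a hypothetical callable that returns True when the receiver acknowledges the packet, and the retry limit is arbitrary.

```python
def send_voice_message(samples, encode, transmit, max_attempts=3):
    """Encode a spoken message and re-send it until it is acknowledged.

    samples:  the digitized voice message
    encode:   hypothetical callable turning samples into packet bytes
    transmit: hypothetical callable; returns True if receipt was acknowledged
    Returns the number of attempts used, or raises TimeoutError.
    """
    packet = encode(samples)
    for attempt in range(1, max_attempts + 1):
        if transmit(packet):
            return attempt
    raise TimeoutError("voice message was not acknowledged")


# A simulated link that loses the first transmission and acknowledges the second.
link_outcomes = iter([False, True])
attempts = send_voice_message([0, 12, 25, 12], bytes, lambda packet: next(link_outcomes))
```

The speaker is not involved in the second attempt; from the cockpit or the workstation the message is simply spoken once.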

The digitized and compressed voice message would include the address of the source and the destination, appropriate security authentication information, priority assertions, and any other required information.
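One possible shape for such a message is sketched below. The field names, the three priority levels, and the example addresses are assumptions made for illustration; the paper does not define a packet format.

```python
from dataclasses import dataclass
from enum import IntEnum


class Priority(IntEnum):
    """Hypothetical priority levels; a higher value is more urgent."""
    NORMAL = 0
    URGENT = 1
    DISTRESS = 2


@dataclass(frozen=True)
class VoicePacket:
    source: str         # address of the sending station
    destination: str    # address of the intended receiver
    priority: Priority  # asserted priority of the message
    auth_tag: bytes     # authentication information for the source
    payload: bytes      # digitized and compressed voice


packet = VoicePacket(
    source="UAL234",
    destination="SEATTLE-CENTER",
    priority=Priority.NORMAL,
    auth_tag=b"",
    payload=b"\x00\x01\x02",
)
```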

3.1. Human Factors

Any technical solution is useless if we fail to consider how the humans it was designed to support might use the system. The following are descriptions of the unique human interface characteristics of this concept.

3.1.1. Microphone and Headset/Speaker Interface

The microphone and speaker interface are identical to today’s system, both in the aircraft and at the controller’s workstation. The difference is that the pilot/controller does not need to wait for a break in the incoming messages to key the microphone and begin talking. When ready to talk, the pilot/controller just presses the microphone key and begins talking. The task of ensuring that the message is sent without interruption is performed by the system. In the case of an emergency, the pilot/controller can transmit immediately with confidence that the message will be appropriately processed and delivered to the destination.

In the cockpit, the only voice dialog is typically with the controller for the sector the aircraft is operating in. Therefore, identification of whether a particular message is meant for that aircraft is not an issue. The common cockpit conversation “Was that call for us?” will fade into history.

One side benefit of the packet voice concept is the capability to easily review previous incoming messages. This is especially useful when receiving a long message, such as a departure clearance, but is useful under other circumstances, as well. Such a capability is available now[4] for general aviation aircraft, but is more difficult to implement in an analog environment.

At the controller’s workstation, a way must be found to queue the incoming voice messages from the aircraft in the controller’s sector (and perhaps calls from adjacent controllers, if this concept is extended to that function) so they may be processed by the controller in order.[5] Incoming voice messages could be queued to the controller’s headset, with appropriate time gaps between messages. During the time the controller is keying his own microphone, incoming messages would wait in queue. A similar capability would be useful in the cockpit, although possibly not as critical.
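A minimal sketch of such a queue follows. It holds incoming messages while the controller's microphone is keyed and releases them in arrival order. The priority argument, which lets a more urgent message play first, is an added assumption rather than something the paragraph above specifies, and all names are hypothetical.

```python
import heapq
import itertools


class PlaybackQueue:
    """Holds incoming voice messages until the headset is free to play them."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # preserves arrival order
        self.mic_keyed = False

    def receive(self, sender, priority=0):
        # Higher priority plays first; equal priority plays in arrival order.
        heapq.heappush(self._heap, (-priority, next(self._arrival), sender))

    def next_to_play(self):
        """Return the next sender to play, or None while keyed or empty."""
        if self.mic_keyed or not self._heap:
            return None
        return heapq.heappop(self._heap)[2]


queue = PlaybackQueue()
queue.receive("UAL234")
queue.receive("DAL51")
queue.receive("N123AB", priority=2)  # for example, a declared emergency

queue.mic_keyed = True
held = queue.next_to_play()          # nothing plays while the controller talks
queue.mic_keyed = False
order = [queue.next_to_play() for _ in range(3)]
```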

3.1.2. Address Selection

Every aircraft and every ground station would have a unique address. The flight crew could select the address of the controller in a method somewhat similar to how they do it today, by entering an alphanumeric string into a keyboard. Entering such a worldwide-unique address would be workload-intensive, however. A preferable solution would be to uplink the address of the next sector controller, such as is already defined within the CPDLC specification. In addition to the machine-readable address, the human-readable name of the particular air traffic control center or sector should be available to the flight crew.

At the controller’s workstation, each aircraft would be uniquely identified by the voice station address. This could easily be coordinated with the call sign of the aircraft and/or flight number and the ICAO 24-bit address. With this information, a simple and potentially effective addition to the current controller’s display might be an annunciation on the aircraft data block while the controller is hearing the voice. Since initial contact is made by the aircraft the controller would not need to enter aircraft addresses.

3.1.3. Annunciation of Support Data

Since the packet voice concept uses a data communication technique to move packets of voice, any of the data communication concepts already being considered may be applicable in support of packet voice. Authentication of the source of a message is considered by many to be the minimum level of security that should be applied to future communications. Another potential benefit might be to annunciate if the asserted priority of the voice message is anything but normal.
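As an illustration of source authentication, the sketch below attaches a keyed hash to each message and checks it on receipt. It assumes a secret key shared between the two stations and uses HMAC-SHA-256 from the Python standard library; the paper does not name an authentication mechanism, so this is only one possible choice, and the key and addresses are placeholders.

```python
import hashlib
import hmac


def sign(key, source, payload):
    """Compute an authentication tag binding the payload to its source address."""
    return hmac.new(key, source.encode() + b"|" + payload, hashlib.sha256).digest()


def is_authentic(key, source, payload, tag):
    """Check the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(key, source, payload), tag)


key = b"shared-secret-for-illustration"
tag = sign(key, "UAL234", b"compressed voice bytes")

genuine = is_authentic(key, "UAL234", b"compressed voice bytes", tag)
spoofed = is_authentic(key, "DAL51", b"compressed voice bytes", tag)
```

A message whose claimed source does not match the tag could be flagged to the receiver instead of being played as normal traffic.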

Another way to look at the convergence of packet voice and data link is that packet voice is just one more application of a data communications system.

3.1.4. Party Line Effect

The current voice environment, with multiple aircraft listening on the same channel, has advantages and disadvantages. It has generally been considered an advantage that pilots can hear the clearances given to aircraft ahead of them and can anticipate a similar clearance. Pilots are also able to hear any weather observations made by fellow pilots in the sector.

The party line effect has disadvantages as well. The conversations of multiple aircraft with the controller add to aural workload in the cockpit. Calls for an aircraft are often mistaken, resulting in a flight crew responding to a clearance given to another aircraft or failing to respond to a clearance to their own aircraft. The confusion in the cockpit is typified by the standard phrase “Was that for us?” Flight crews sometimes anticipate their clearance based on the clearance given to the preceding aircraft, to the point that they follow the expected clearance rather than the clearance actually received.

The current standard protocol of stating the called station, followed by the calling station, may be redundant in the addressed communication environment of a packet voice system or it may be considered a valuable cross-check.

3.2. Voice Compression and Decompression

Analog voice is converted to a digital signal by first sampling and digitizing the signal, then by applying the resulting digital bit stream to a coder/decoder (codec), where the digital bandwidth is reduced. At the receiving end, the signal is run through the codec to recover the bit stream and is then converted back to an analog form.

There are, in general, two forms of codec. Source coders try to compress the signal from knowledge of how speech is created. In this form of coding the creation of speech is modeled and parameterised. Generally, this type of coding results in synthetic-sounding speech, and its performance is largely dependent on the characteristics of the individual’s speech. Using this method it is possible to get transmission rates lower than 4 kbps. Waveform coders, on the other hand, are concerned with replicating the original signal waveform. Consequently this type of coder is much more robust than the source coders and produces more natural-sounding results. Two examples of this type of coder are Adaptive Differential PCM (ADPCM) and Continuously Variable Slope Delta (CVSD) modulation.

Almost all of the current voice digitizing technology assumes a continuous stream of information, whether the field of endeavor is telephony or computer audio and video. Packet voice, on the other hand, has the added characteristic that the message is normally short and concise. Therefore, batch file compression techniques, such as LZW[6] and ZIP, may provide a way to more tightly compress the digital packet for transmission while retaining all of the original information. Research could be performed to determine more efficient ways of compressing the packet voice signal.
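The batch approach can be tried with a short sketch that compresses a complete message in one pass and recovers it exactly. It uses the `zlib` module from the Python standard library, which implements DEFLATE (the method used by ZIP) rather than LZW. The input is a synthetic tone standing in for digitized speech, which is an assumption of the example: a pure tone is far more repetitive than real speech, so the ratio obtained here says nothing about what speech would achieve.

```python
import math
import struct
import zlib

SAMPLE_RATE = 8000  # samples per second (assumed)
TONE_HZ = 400       # synthetic test tone, not real speech

# One second of 16-bit samples packed as little-endian PCM.
samples = [int(8000 * math.sin(2 * math.pi * TONE_HZ * n / SAMPLE_RATE))
           for n in range(SAMPLE_RATE)]
pcm = struct.pack("<%dh" % len(samples), *samples)

packed = zlib.compress(pcm, 9)      # compress the whole message at once
restored = zlib.decompress(packed)  # lossless: every sample is recovered
ratio = len(pcm) / len(packed)
```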

One of the advantages of using data communications technology is that one or more standard technology codecs can be specified for packet voice and additional codecs can be provided as the state of the art advances. The data communication protocols provide the capability to negotiate services such as compression based on the best technology available in the corresponding pair of stations. For example, a quick check of the author’s computer showed eleven different audio codecs, in addition to video codecs. Progress does not have to stagnate based on available technology at the time of the previous revision of SARPs.
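Negotiation of this kind can be as simple as choosing the first codec in the local order of preference that the other station also supports. The sketch below assumes each station advertises its codecs by name; the codec names are placeholders, not real products or standards.

```python
def negotiate_codec(local_preference, remote_supported):
    """Return the most preferred local codec that the peer also supports.

    local_preference: codec names, best first
    remote_supported: set of codec names advertised by the peer
    Returns None if the two stations have no codec in common.
    """
    for codec in local_preference:
        if codec in remote_supported:
            return codec
    return None


aircraft = ["newer-codec", "interim-codec", "baseline-codec"]
ground = {"interim-codec", "baseline-codec"}
chosen = negotiate_codec(aircraft, ground)
```

A station fitted with a newer codec falls back to an older common one when talking to a station that has not been upgraded, so new codecs can be introduced without a simultaneous fleet-wide change.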

Compression of digital voice provides the very real possibility that a voice packet can be transmitted over the air-ground medium more rapidly than the words are spoken into the microphone or heard on the headset, minimizing the time of channel occupancy. This provides another example of how the packet voice concept improves channel use compared to current or proposed voice service. On the other hand, experience has shown that HFDL is capable of transporting data, albeit at slow bit rates, over radio channels that are incapable of use for analog voice due to propagation problems. In that case, the voice packet may take longer than it took to say the message but the HFDL protocol will ensure that it arrives reliably and will be intelligible.
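The arithmetic behind both cases is short enough to write out. The figures used below are illustrative assumptions, not values from this paper: a 5 second message, a codec producing 4.8 kbps, a VHF data link carrying 31.5 kbps, and an HF data link carrying 1.8 kbps. Protocol overhead, retransmission, and channel access delay are ignored.

```python
def channel_occupancy_s(speech_s, codec_bps, link_bps):
    """Seconds of link time needed to carry speech_s seconds of encoded speech."""
    return speech_s * codec_bps / link_bps


message_s = 5.0  # length of the spoken message (assumed)
vhf_s = channel_occupancy_s(message_s, codec_bps=4800, link_bps=31500)
hf_s = channel_occupancy_s(message_s, codec_bps=4800, link_bps=1800)
```

Under these assumptions the VHF link is occupied for well under a second, while the HF link needs more than twice the speaking time to deliver the same message.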

3.3. Routing of Voice Packets

The packet voice application is another data communication application. Therefore, any of the routing capability provided for data communication is also available for voice communication. In typical domestic en route and terminal airspace, VHF Data Link is presumed to be the most appropriate medium for voice and data communication. The C-band data communication channels being discussed by the aeronautical community are assumed to provide excellent performance over a shorter range than VHF. Therefore, as the aircraft approaches the airport (and during the initial departure from an airport), the C-band radio could be used since it provides improved quality of service. For flight in oceanic and remote airspace, satellite or HF radio links might provide the only available communication path. While the aircraft is parked at the gate, a short-range GateLink might be used, further reducing the load on longer-range channels. Unlike the current system, the medium over which the voice packets are transported is independent of the location of the person with whom the aircraft crew is communicating. Selection of the medium is done automatically, without intervention on the part of either the flight crew or the controller.

If the concept is extended beyond air traffic and aircraft operations (safety services), some air-ground links may be inappropriate for non-safety services. The ATN routing definition already provides for the selection of the appropriate air-ground link for each message. The routing concept also provides for the capability to restrict safety service messages to only authorized air-ground links, if that is required.
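Automatic link selection with a safety restriction might look like the following sketch. The cost metric, the link names, and the approval flags are hypothetical and do not represent the ATN routing policy itself; the point is only that the choice is made per message, without crew or controller involvement.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    name: str
    cost: int              # lower is preferred (hypothetical metric)
    safety_approved: bool  # may this link carry safety service messages?


def select_link(available, safety_message):
    """Pick the lowest-cost available link that the message is allowed to use."""
    candidates = [link for link in available
                  if link.safety_approved or not safety_message]
    return min(candidates, key=lambda link: link.cost, default=None)


links = [
    Link("cabin-broadband", cost=1, safety_approved=False),
    Link("vhf", cost=2, safety_approved=True),
    Link("satcom", cost=5, safety_approved=True),
]
atc_link = select_link(links, safety_message=True)
catering_link = select_link(links, safety_message=False)
```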

3.4. Channel Loading of Radio Networks

Selection of the particular channel for packet voice over VHF, C-band, or HF would be independent of the controller location, just as it is today for ACARS, VDL-2, and HFDL. The radio channel protocol is based on optimum use of channel bandwidth and is not constrained by air traffic sectorization. Selection of a particular channel is based on the need to share the load among all available channels, optimizing the performance for all. Selection of a particular ground station is also made to optimize the path between the aircraft and the ground, unbounded by the identity of the control sector or other ground entity exchanging voice packets with the aircraft.
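A load-sharing choice of this kind can be sketched as picking the least-loaded channel among the ground stations within radio range, with no reference to the control sector. The station names, frequencies, and load figures are invented for the example, and the load measure (fraction of channel time in use) is an assumption.

```python
def pick_channel(channels, stations_in_range):
    """Return the least-loaded (station, frequency_mhz, load) tuple in range.

    channels:          list of (station, frequency_mhz, load) with load in 0.0..1.0
    stations_in_range: set of ground stations the aircraft can currently reach
    Returns None if no listed channel belongs to a station in range.
    """
    usable = [channel for channel in channels if channel[0] in stations_in_range]
    return min(usable, key=lambda channel: channel[2], default=None)


channels = [
    ("GS-A", 136.975, 0.80),
    ("GS-A", 136.900, 0.35),
    ("GS-B", 136.925, 0.10),  # lightly loaded but out of range
    ("GS-C", 136.950, 0.20),
]
best = pick_channel(channels, stations_in_range={"GS-A", "GS-C"})
```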

3.5. Other Uses for the Packet Voice Concept

Although air traffic control is a primary user of VHF and HF communication bandwidth in the aeronautical spectrum, there are other users that can also benefit from this concept. The flight crew can talk with company dispatch or maintenance personnel on the ground by selecting the address of the appropriate ground station. The cabin crew can similarly talk with peer service entities on the ground. In both cases, selection of the necessary address is all that is required. The need to “patch me through to maintenance” would be eliminated. There is no need to select a particular radio or a specific frequency on that radio. The cabin crew, for instance, would not need to be aware of whether the aircraft is within direct line-of-sight of a ground station because the voice packets would be routed by the best available path. Similarly, a ground entity could contact an aircraft without foreknowledge of the location of that aircraft.

Because the packet voice concept does not require dedication of a radio channel to one ground entity, it is possible to multiplex packet voice services with multiple ground services. An airline flight crew can communicate with maintenance or dispatch while continuing to monitor air traffic control. Similarly, a general aviation aircraft can call for weather or report on weather while remaining in contact with air traffic control. A queuing system similar to that described above for controller workstations would ensure no message was missed.
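Because every packet carries its own addresses, separating several dialogs that share one air-ground link is a matter of grouping by source. The sketch below assumes packets arrive as (source, message) pairs; the station names and message text are illustrative only.

```python
from collections import defaultdict


def demultiplex(packets):
    """Group (source, message) pairs from one shared link into per-source dialogs."""
    dialogs = defaultdict(list)
    for source, message in packets:
        dialogs[source].append(message)  # arrival order is kept within a dialog
    return dict(dialogs)


received = [
    ("ATC", "descend and maintain flight level 240"),
    ("DISPATCH", "gate change on arrival"),
    ("ATC", "contact next sector"),
]
dialogs = demultiplex(received)
```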

The packet voice service is similar in some ways to the iDEN radio service being offered to North American airlines by ARINC.[7] The ARINC service is primarily directed toward baggage handlers and maintenance personnel on the ramp, who may need to talk with support personnel either at the local station or at another airport. The same service is provided to police, fire, and ordinary subscribers in North America as the NEXTEL Direct Connect service. This service is, of course, provided over only one (sometimes two) radio band(s) and has other aspects unlike the packet voice concept. As with all cellular telephone services, radio channel selection is transparent to the users.

4. Conclusion

The meeting is requested to consider the packet voice concept as a potential new technology to provide improved air traffic communications.

-----------------------

[1] U.S. 14 CFR 121.99

[2] RTCA DO-224A, Change 2. The earlier DO-224A stated the value as 200 ms.

[3] It should be noted that VDL-3 data link suffers from a similar problem. A single voice channel is to be assigned to a sector. In addition, a data link channel on the same frequency would be assigned for operation while within that sector. A sector that is heavily loaded in voice might be presumed to also be heavily loaded in data, while another frequency connected to a nearby ground station might be lightly loaded. Transfer to a new voice channel at the sector boundary, unless it is also on that same frequency, will cause termination of both the voice and the data link and require establishment of a new voice and a new data link upon arrival at the new frequency.

[4] One of the more recent offerings is in the Garmin G1000 avionics suite, where the function is called “clearance playback”.

[5] Many airport public address systems have solved a similar problem. The agent presses the appropriate buttons on a telephone and speaks the announcement into the handset. Some time later, when any existing announcements over the local loudspeakers are concluded, the announcement is heard in the designated area.

[6] The lossless compression file formats of GIF and TIFF are derived from the Lempel-Ziv-Welch (LZW) algorithm.

[7]
