Operation and maintenance:Queuing theory

Queuing theory

79.1 Purpose

The purpose of this chapter is to provide the fundamental concepts and computations required for the analysis of any system that features a waiting line (queue). Examples of enterprises where these ideas are traditionally applied include grocery stores, airline ticket counters, fast-food restaurants, retail stores, auto license agencies, and banks. Queuing theory may also be used where “customers” are inanimate objects, such as production processes, though sometimes the mathematics becomes quite complex in these situations. In the context of information systems, queuing theory is commonly used to help plan, design, and reconfigure communication networks.

79.2 Strengths, weaknesses, and limitations

Queuing analysis is a mathematical technique for analyzing waiting lines. In a simple waiting line, a customer arrives, joins a queue, is serviced, and leaves. That pattern is consistent with messages flowing through a network, disk access transactions, interrupt processing, and several other activities common to computers. Consequently, queuing theory is an excellent tool for modeling such activities. Given the correct statistical values, a queuing analysis can be completed rapidly using only an electronic calculator or a relatively simple computer-based model.

For some combinations of arrival pattern, service time distribution, and number of servers, measures of performance are limited or unavailable. Queuing theory assumes exponential arrival and service rates and, consequently, should not be used if either rate is clearly not exponential. Other inhibiting factors are prioritization of customers, a service rate that varies by size of the waiting line, a restriction on queue length, and complexity of system design. However, a computer simulation model (Chapter 19) can be constructed easily to produce estimates of system performance measures in most of these situations.

79.3 Inputs and related ideas

Queuing theory is commonly used to support planning, designing, and reconfiguring a network. Relevant network concepts are introduced in Chapters 52, 53, and 54. Hardware interface design is introduced in Chapter 42. Memory queues are described in Chapter 43. Simulation, another mathematical tool used to support similar applications, is discussed in Chapter 19. Performance analysis is discussed in Chapter 78 and system maintenance is discussed in Chapter 81.

79.4 Concepts

The purpose of this chapter is to provide the fundamental concepts and computations required for the analysis of any system that features a waiting line (queue). In a simple waiting line, a customer arrives, joins a queue, is serviced (by a server of some kind), and leaves. Queuing theory uses the arrival rate and the service rate to quickly compute such statistics as the time a customer spends in the queue and in the system, the expected line length, and the expected number of customers in the system.

In the context of information systems, queuing theory is commonly used to help plan, design, and reconfigure communication networks, with each message representing a customer and each server, router, or other device that holds and forwards or otherwise manipulates messages representing a server. Other applications include disk access, printer spooling, interrupt management, process queuing, and so on. In all these activities, messages (customers) arrive at random intervals and are held briefly for processing by a server of some kind.

79.4.1 Bottleneck analysis

One of the more common information system applications of queuing theory is bottleneck analysis, the study of choke points or bottlenecks in a network. Bottleneck analysis is an important network routing tool (Chapter 54).

As messages move from node to node across a network, they are held and forwarded by one or more nodes. As message volume increases, the number of messages waiting for a given server can grow exponentially, creating a bottleneck that slows or even halts traffic. Queuing theory can help predict such bottlenecks by allowing the network analyst to mathematically identify the network nodes most sensitive to message volume. When a bottleneck occurs, the network analyst is typically faced with numerous alternative solutions. Using queuing theory, the analyst can often quickly eliminate all but a few of those alternatives by making the changes mathematically (in a matter of seconds) and studying the impact on queue length and average wait time. Note that queuing theory does not necessary yield the answer. Instead, it helps the analyst quickly narrow the set of feasible solutions to a workable number.

79.4.2 Terminology

The fundamental result of a probabilistic experiment is an outcome. Examples in queuing analysis are number of customers in line and interarrival time. The set, or collection, of all possible outcomes is known as a sample space, which may be either finite or infinite according to whether the number of outcomes is countable. If there were a physical or practical limit on queue length, using six as an example, the finite sample space would be comprised of the integers 0, 1, 2, 3, 4, 5, and 6. Interarrival times form an infinite sample space consisting of the positive real numbers since time may be subdivided indefinitely.

An event is a set of outcomes. The probability associated with event A is the ratio of the number of outcomes inA to the number of outcomes in the sample space. The interpretation of probability for the purposes herein islong run relative frequency. That is to say, if A denotes “two customers in line” and prob(A) = 0.27, then an observer of the queue over a very long period would see event A occur 27 percent of the time (assuming the process is stable).

A random variable is a symbolic representation of an outcome. Discrete random variables are associated with finite sample spaces, their continuous counterparts with infinite sample spaces. A probability mass function (pmf) is the mathematical relationship between the various values that a discrete random variable may assume and their probabilities of occurrence. A hypothetical probability mass function for the queue length (denoted N_q) example above is shown in Table 79.1. A continuous random variable is defined by a probability density function (pdf) specified on an interval, such as

The probability that x lies on the interval a to b is the definite integral of f(x) evaluated between those numerical values.

The rth percentile of a probability distribution for random variable x is a number x_r such that prob(x ≤x_r) =r/100. In the distribution for N_q above, the 90th percentile is 5 because prob(N_q ≤5) = 0.94. [Prob(N_q ≤4) = 0.84 makes this queue length incapable of being the 90th percentile.] For the continuous random variable above, the 90th percentile, 2.735, may be obtained by applying the calculus and algebra.

A random variable (call it y) has two important descriptive parameters, E(y) (expected value or mean), a measure of central tendency, and Var(y) (variance), a measure of dispersion about the mean. For the discrete case, the formulas are Analogous formulas using the calculus exist for continuous random variables. The mean and variance of N_q based on the distribution above are 2.74 and 2.4524, respectively. The continuous random variable x has mean and variance 1.333 and 0.889, respectively.

79.4.3 General ideas

In the simplest of queuing systems, operating characteristics may be derived from knowledge of:

1. The customer arrival process.

2. The service process.

3. The number of servers available.

Some situations featuring greater complexity are discussed briefly in Sections 79.4.5 and 79.4.6, below.

The exponential distribution is of fundamental importance in queuing analysis. Its probability density function is:

f(t) = λe^-λt,
t > 0,

= 0,
otherwise,

where t stands for time and λ is the rate of the process under study. The rate is always specified with reference to a time unit, such as 3 per min, 5 per h, and 12 per d. The mean of an exponential distribution is the reciprocal of λ, and remarkably the variance is precisely the square of the mean. The graph of this distribution (Figure 79.1) is asymmetrical, descending rapidly from a height of λ on the vertical axis, then curving suddenly and moving slowly toward, but never reaching, the horizontal axis. Of the values on an exponential random variable, 63 percent lie below the mean (in contrast to 50 percent for the normal distribution).

Notation for the simplest of queuing systems is of the form a/b/c, where a identifies the interarrival time (IAT) distribution, b the service time (ST) distribution, and c the number of servers. In the next section, formulas and an example for the M/M/1 queue are given; “M” is the designation for exponential distribution in queuing parlance. The M/M/1 designation indicates exponential arrivals, exponential service times, and 1 server.

79.4.4 The M/M/1 Model

In this system, the arrival process draws customers at rate λ from an unlimited population. Customers move from the queue into service via a first come, first served discipline, and the exponential service times have mean 1/µ(implying also rate µ). Note that both the arrival rate (λ) and the service rate (µ) are exponentially distributed. For steady state probabilities to exist, or to ensure that the queue will not grow to infinite length, the traffic intensity (=λ/µ) must be less than 1.

Steady state probabilities for the M/M/1 model are calculated as follows, where n refers to the number of customers in the system (queue plus service):

Figure 79.1 An exponential distribution.

Important descriptive operating characteristics include the expected time in queue, time in system, line length, and number (of customers) in the system:

where W is expected time in system [e.g., (4)], ln stands for natural logarithm, and ρ is the system’s traffic intensity. These percentiles give likelihoods that large waiting times and times in system will occur, knowledge of which is a useful adjunct to means for decision-making purposes.

As an example, consider a single chair barber shop operated by one Harry Schaive. On an ordinary morning, customers arrive at the rate of 4 per h, and the mean time for a haircut is 12 min. Based on extended observation of the shop’s operation, Harry’s brother, Klaus, an industrial engineer, has concluded that an M/M/1 model describes the system adequately.

The information provided show that λ = 4 per h and µ = (1 per 12 min) = 5 per h. With ρ = 4/5 = 0.8, the formulas above yield:

p₀ = 0.20,
p₁ = 0.16,
p₂ = 0.128,
p₃ = 0.1024,

for the probabilities that 0, 1, 2, and 3 people are in the system. Thus Harry is idle (0 people in the queue) 20 percent of the day and an arriving customer has a 0.8 probability of having to wait before having his (or her) hair cut. Note also that prob(n ≤3) = 0.5904 (by addition), so there is a substantial chance of finding four or more customers in the shop. The following quatities [equations (3) - (6)] reveal the congested condition of this system:

W = 1 h,
W_q = 0.8 h = 48 min,

L = 4 customers,
L_q = 3.2 customers.

The ratio of mean time in queue to mean service time is four (48/12). Interest in this number is based on the idea that a customer’s tolerance for waiting is related to the time already committed to service. A high ratio translates to discontent over the delay involved. Finally, computing the high percentile times in the queue and in the system [equations (7) and (8)]:

π_q[90] = 2.079 h = 2 h 5 min,

π_w[90] = 2.303 h = 2 h 18 min,

reveals the astonishing fact that 10 percent of Harry’s customers must endure a time in line exceeding 2 h 5 min!

In a network model, the arrival rate (λ) might be several messages per second and the service rate (µ) might be 1000 or more messages per second, but the computations are much the same.

79.4.5 Other models

One solution to the congestion in Harry’s shop is to hire another barber for a second chair. This would produce an M/M/2 model, a special case of the general M/M/c queuing system. Solving the equations:

p₀ = 0.429,
p₁ = 0.343,
p₂ = 0.137,
p₃ = 0.055,

W = 14.29 min,
W_q = 2.29 min,

L = 0.952 customers
L_q = 0.152 customers,

π_q = 8.27 min,

suggests a vast improvement (at least from the perspective of the customers).

If Harry chooses not to increase his serving capacity, it is possible that customers will start to balk (leave without joining the queue) if the queue is sufficiently long. This would give rise to the M/M/1/K model, where K is the limit on number in system (imposed either by physical operating conditions or by customer behavior). For example, the number of bytes in a memory queue imposes an absolute limit on the length of the queue. The M/M/1/K/Kqueue is known as the machine repair problem. The population consists of K machines that break down randomly to be repaired by a lone technician. Naturally, there is a limit of K “customers” in this system. Steady-state formulas exist for these models and their multi-server counterparts. The Erlang-k random variable, whose distribution is denoted E_k, is the sum of k identical exponential random variables. Operating characteristics for the commonly observed M/E_k/1 and E_k/M/1 models can be readily computed.

79.4.6 Ancillary issues

Whether to add a barber in the example above would depend on cost considerations. The following simple formula may be used to determine the total cost (TC) of operating a queuing system:

TC=c_wL_q + c_sk,
(79.9)

where c_w is the cost of one customer’s waiting time per hour, c_s is the cost per hour of employing a server, and kequals the number of servers. If it is believed that there is a cost attached to the customer’s time in service, thenL_q (the mean line length) should be replaced by L (the mean number of customers in the system).

Close examination of equations (3) – (6) reveals the following relationships which apply to all queuing systems:

L = λW,
(79.10a)

L_q = λW_q,
(79.10b)

where λ is the arrival rate, W is the expected time in the system, and W_q is the expected time in the queue. These equations comprise Little’s rule, named for the distinguished MIT professor who discovered them. Since W = W_q+ (1/µ), the last term being expected service time, knowledge of λ, µ, and any one of the queue characteristics therein allows computation of the other three.

There are many waiting-line situations that cannot be analyzed using formulas for any particular model. Simulation (Chapter 19) is a methodology capable of handling such problems, as queuing systems in series (unless interarrival time and service time are exponentially distributed), prioritized customers, jockeys (line changes), reneges (departures from the queue without receiving service), bulk arrivals (customers appear in groups of 2, 3, 4, . . .), bulk service, service times dependent on line length, and probabilistic balks.

79.5 Key terms

Arrival rate —: The number of arrivals per unit of time.
Balk —: To leave without joining the queue.
Bottleneck analysis —: The study of choke points or bottlenecks in a network.
Event —: A set of outcomes.
Interarrival time —: The elapsed time between arrivals of successive customers in a queuing system.
Model —: An abstract, mathematical representation of a physical system.
Outcome —: The fundamental result of a probabilistic experiment.
Probability mass function (pmf) —: The mathematical relationship between the various values that a discrete random variable may assume and their probabilities of occurrence.
Queue —: A waiting line.
Random variable —: A symbolic representation of an outcome.
Sample space —: The set, or collection, of all possible outcomes.
Service rate —: The number of customers served per unit of time.
Service time —: The time a customer spends receiving service in a queuing system.
Steady state —: A condition representative of a system’s long run behavior; for example, an assembly line starting without parts in process will be in a transient state until such time that the various stations are being utilized at approximately their expected levels, at which point steady state has been achieved.
Stochastic process —: A process organized into states in which movement from state to state is governed by probabilities; examples include the number of customers in a queuing system, levels of inventory on hand, and brands purchased by consumers.
Waiting time —: The elapsed time between a customer’s arrival and the beginning of service.

79.6 Software

SAS, MiniTab, and numerous other statistical programs support queuing theory. The necessary computations can also be performed using spreadsheets and most standard programming languages.

79.7 References

1. Allen, A. O., Probability, Statistics, and Queueing Theory with Computer Science Applications,2nd ed., Academic, San Diego, CA, 1990.

2. Gross, D. and Harris, C. M., Fundamentals of Queueing Theory, 2nd ed., John Wiley & Sons, New York, 1985.

3. Hall, R. W., Queueing Methods for Services and Manufacturing, Prentice-Hall, Englewood Cliffs, NJ, 1991.

4. Hillier, F. S. and Lieberman, G. T., Introduction to Operations Research, 6th ed., McGraw-Hill, New York, 1995.

Search This Blog

information systems course