Constraint modeled in stochastic terms

I am referring to the note on page 65, which is a very interesting angle.

The probability distribution of completing a task definitely plays a role, especially when you have an “order” that is made up of multiple successive tasks.

The smaller the average task duration, the worse the end result will be. Short task durations can be modeled as a Poisson distribution with a small lambda value.

The compounding effect of these long-tail distributions negatively affects the final outcome.

i.e. the time lost on a previous task is nearly impossible to regain in a subsequent task when the average task duration is small.

However, if the average duration of the tasks is longer (higher lambda values), the Poisson distribution approximates a normal distribution, and the final outcome will be closer to the sum of the means of each task.

(This is the corollary of the dice game in chapter 14 of “The Goal” for variability of task durations).
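The long-tail intuition can be checked in a few lines of Python (a rough sketch added for illustration; the helper names are made up). For a Poisson distribution the coefficient of variation is 1/√λ, so small-λ tasks are relatively far more volatile and far more likely to overrun twice their mean:

```python
import math

def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson random variable with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def tail_beyond_twice_mean(lam: float) -> float:
    """P(X > 2*lam): chance a task overruns twice its mean duration."""
    return 1.0 - sum(poisson_pmf(k, lam) for k in range(int(2 * lam) + 1))

def coefficient_of_variation(lam: float) -> float:
    """For Poisson(lam), CV = sqrt(lam) / lam = 1 / sqrt(lam)."""
    return 1.0 / math.sqrt(lam)

# Short tasks (lam = 1) vs longer tasks (lam = 10):
for lam in (1, 10):
    print(f"lam={lam}: CV={coefficient_of_variation(lam):.2f}, "
          f"P(overrun 2x mean)={tail_beyond_twice_mean(lam):.4f}")
```

The CV drops from 1.0 to about 0.32 and the chance of a 2× overrun drops sharply, which is the penalty of many short tasks in a nutshell.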

I am not sure this is useful, or that it can even be considered a “constraint”, even though at first glance, it would appear as if the final result was constrained.


Thanks for your thoughts, @bvautier. I am sure that there is a lot of TOC that can be reformulated in more rigorous mathematical / statistical terms. And it would be an interesting exercise to do, especially if that could reconcile in an indisputable way the premises of Little’s Law with those of TOC. I’d love to have some mathematician / statistician look into this. (My stats-fu was lost at a young age… ). In the meantime, I’m happy to keep a pragmatic angle on this, and simply note that the improvements you can achieve via Kanban get a real explanation through those pages I wrote. Kanban folks are happy to report their successes; but they cannot explain why they happen. That explanation, I think, is one of the major contributions of the book!


What I found interesting about your note on page 65, is the concept of defining a constraint in terms of statistics, which challenges the concept that a constraint must be tangible or physical.

It opens the door to investigate intangibles such as “behavioral” and “variability” patterns that can end up reducing the throughput of a system more than a tangible one.

Look forward to reading more about this in the coming chapters.


Yes, the idea of identifying the constraint statistically is new. The first instance of this is when I identify the Constraint in the Work Process, by looking at the longest average in-state Flow Times. So that’s a first very practical application.

The idea in that note, however, is not developed any further. It is more a research direction right now; and the objective would be to reconcile the “geometric” reasoning in Chapter 4 with the statistical/analytical foundation of Little’s Law.


Finished chapter 4 and spent a fair bit of time thinking about it. (Very thought-provoking! :slight_smile: )

Even though I really wanted to read it through the lens of mathematical variability I could not escape the similarities with manufacturing.

At first sight, knowledge-work and manufacturing appear different, however these differences could be resolved if we could uncover and better understand the (hidden) assumptions we use when analysing each domain.

There are some assumptions I had to make when reading chapter 4 that I would like to discuss with you. Please correct them if they are wrong.

There are two in particular:

1. Multitasking: What does it really mean to multitask and what makes it so bad?

The “context switching” cost for humans seems very similar to the cost of “setup” or “changeover” durations in manufacturing when different products need to be processed on the same machine.

We can have multiple demand orders from different customers, which can be broken down into multiple tasks. These tasks will need to be processed on various machines.

The question is: Is it still multitasking if we process tasks from OrderA and OrderB at the same time (i.e. in the same “batch”), if it does not result in any delays?

This is important because we can potentially group tasks that do not require any setups or changeovers together, in order to achieve an Economic Batch Quantity (EBQ).

If this is the case, there are some interesting similarities that could be drawn from Chapter 4 of “What is this thing called Theory of Constraints and how should it be implemented?” concerning EBQs.
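For reference, this is the classic EBQ (Economic Production Quantity) formula from inventory theory; it is the standard textbook formula, not something from the book, and the numbers below are purely illustrative:

```python
import math

def economic_batch_quantity(demand: float, setup_cost: float,
                            holding_cost: float, production_rate: float) -> float:
    """Classic EBQ / Economic Production Quantity:
    Q* = sqrt(2*D*S / (H * (1 - d/p))), where D = demand per period,
    S = setup/changeover cost per batch, H = holding cost per unit
    per period, and d/p = ratio of demand rate to production rate."""
    return math.sqrt(2 * demand * setup_cost /
                     (holding_cost * (1 - demand / production_rate)))

# Illustrative numbers only: 1000 units/year demand, $50 per setup,
# $2/unit/year holding cost, 2000 units/year production capacity.
print(economic_batch_quantity(1000, 50, 2, 2000))  # ~316 units per batch
```

As the setup cost shrinks (through setup-time reduction), Q* shrinks with it, which is why cheap changeovers allow smaller batches.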

It appears that “multitasking” as well as “batching” are words that have multiple meanings depending on the context.

Assumption: There are multiple projects in WIP, all of which have different tasks. Switching between tasks within a project as well as switching tasks between projects will incur a “time cost”.

Manufacturing assumption: A machine is only able to do a particular type of work. It may require some downtime to “set things up” or “clean things down”, but essentially the machine does a similar “type” of work. (With a reasonably well known rate of production).

Knowledge-work assumption: When it comes to humans, things get a lot more blurry. In knowledge-work, assigning work is different. We generally assign work to people who have “about the right” skills. Unlike machines, we are not designed and purpose-built to do only one type of task. (Having the right skills can be very blurry and open to interpretation, whereas with machines it is not.)

The problem with knowledge-work, unlike machines, is that we do not have a good grasp of how long a task will take to complete. i.e. we do not have a good grasp of our rate of delivery. Especially if the characteristics of the tasks are wildly different, and require different skills.

Manufacturing is better equipped to assign tasks to the right machines, because there is limited choice, and things are better defined. With knowledge-work, we have more flexibility to assign tasks to people with different skills. What makes things worse in knowledge-work is that we may not always be certain of the skills required to complete a task (which can result in a lot of lost time and effort), whereas in manufacturing this happens a lot less.

2. Wait time and multitasking time

It appears from the geometric diagrams that the wait time is directly proportional to the WIP that has been removed from the system.

To be clearer, are you assuming the system is simultaneously working on multiple projects at the same time, and that if you stopped working on one project (removed it from the WIP) then the wait time affecting the other project will be reduced by the time it would have taken to complete the removed project on its own?

My confusion comes from the graph on page 76, which seems to refer to the time lost to multitasking (i.e. the black bars) as “setup” or “context switching” time (i.e. no productive work is being done).

Let me try and explain my interpretation of the graph on page 76, to see if I understand it properly.
A project should take 100 units of time (let’s assume minutes) to complete, if we were only working on this project. (I know the chart is in percentages, but let’s use real numbers to make this easier.)

If we have two projects that took 100 minutes each to complete, we could finish both projects in 220 minutes. (i.e. 20 minutes to context-switch), if we focused on one project at a time.

However, if we chose to multitask, and switched projects every 40 mins, it would take us 280 minutes to finish both projects.

If we have 5 projects that took 100 minutes each to complete, we could finish them all in 580 mins, if we focused on one project at a time.

However, if we chose to multitask, and switched projects every 4 mins, it would take us 2,420 mins to complete all 5 projects.

The strategy we use to complete the projects is important.
In one case we can complete 5 projects in 580 mins.
In the other case we complete 5 projects in 2,420 mins.
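The arithmetic above can be sketched in code. Note this is my own counting, which charges the 20-minute switch cost at every transition between work segments; it reproduces the focused totals (220 and 580 minutes) exactly, while the multitasking totals come out somewhat higher than the figures quoted, because they depend on exactly which switches are counted. The conclusion is the same either way.

```python
import math

def total_time(num_projects: int, project_duration: float,
               switch_cost: float, quantum=None) -> float:
    """Elapsed time to finish every project.

    quantum=None: 'focused' mode - finish a project, then switch once.
    Otherwise: round-robin, working `quantum` minutes per turn and
    paying switch_cost at every transition between segments.
    """
    work = num_projects * project_duration
    if quantum is None:
        switches = num_projects - 1
    else:
        segments = num_projects * math.ceil(project_duration / quantum)
        switches = segments - 1
    return work + switches * switch_cost

print(total_time(2, 100, 20))             # focused: 220 minutes
print(total_time(5, 100, 20))             # focused: 580 minutes
print(total_time(5, 100, 20, quantum=4))  # round-robin: 2980 minutes
```

Shrinking the quantum multiplies the number of switches, so the total explodes while the amount of real work stays fixed at 500 minutes.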

What I am not entirely sure about is which assumptions concerning strategies you are using on page 80, and how many projects are currently in WIP.

What I find interesting is that the adopted strategy effectively acts as a constraint.

Strategy in manufacturing is hugely important (as demonstrated by ToC), but it seems it could be even more important in knowledge-work, mainly because it is one thing to have variability in product mix, and quite another to have variability in capability mix. It creates infinitely more possibilities and options to get things wrong.



Very interesting thoughts … namely these two:

This is important because we can potentially group tasks that do not require any setups or changeovers together, in order to achieve an Economic Batch Quantity (EBQ).

I would like to transpose those into a Throughput Accounting perspective for decision making …

EBQ is of prime importance at the constraint. But the non-constrained resources should never use EBQ, so as to not overproduce (subordinate to the constraint) and increase WIP in the system. That would be contrary to what we are aiming for. Non-constrained resources only have to produce to support the constraint; never to ‘dump’ WIP into the system because EBQ is considered a best practice.

So, in that context, non-constrained processes can have as many ‘setups’ as required by the constraint to produce what is strictly sufficient to maintain the heartbeat of the constraint. Setting up 5 or 6 times instead of once as per EBQ is the preferred solution for all non-constrained steps. (Obviously non-constrained resources must find ways to reduce setup times to be very adaptive to the constraint’s needs. That would be a nice use of slack capacity.)

Very nice thoughts in your post. I enjoyed reading it.

“The time lost on a previous task is nearly impossible to be regained in a subsequent task when the average task duration is small.” This seems to be a variant on the Gambler’s Fallacy. If Task A takes longer than expected, then there is no reason to believe that Task B will somehow be shorter to compensate.

If one flips a coin three times and gets heads each time, then there is no effect on the fourth flip. The fourth flip is still just as likely to be heads as it ever was.

Having longer tasks does not provide the capability for the work to be completed more quickly. Longer tasks can only provide hiding places for wait delays. It is these hidden wait delays that can be squeezed to make it appear that the task has somehow been completed faster. In normal operations, however, the organization lives with the added wait delay.


Hello @WayneMack! Welcome here!



The effect of Multitasking in knowledge-work is far worse than in manufacturing. In manufacturing, as you say, it is mostly about switching and setup times.

In knowledge-work, even an innocent 30-second interruption can make thoughts that took hours to build up in your mind evaporate into nothing. That’s roughly the sense of Gerald Weinberg’s diagram.

But don’t be fooled by that diagram. The “black” segments are NOT Wait Time (as we refer to it in the chapter) for the Work in Process. The black segments are “unproductive” time for the worker. So if I am context switching for 1 minute while juggling 5 items, that one minute of mine will translate into 5 minutes of Wait Time across all the work.

Note: you are referring to page numbers… but in what format? They seem to be off by 2 to the PDF I am using. I believe that what you see on page 80 is the complicated diagram where I try to explain how a 4X is possible. The key point of that diagram is understanding how the presence of Herbie multiplies (positively or negatively) the presence/absence of Wait Time. So it is not exposing any strategy in particular, but illustrating the impact of the Constraint on Wait Time.

Yes, the policies used to manage work (“strategy” is maybe overkill here! :slight_smile: ) are absolutely fundamental, both in manufacturing and in knowledge-work. In knowledge-work they are more difficult to handle, primarily because of biases and psychological factors; not because the “physics” is any different - but the “psychic” is!


@WayneMack and @tendon

It has more to do with Kingman’s equation.

Let’s assume we are in a position where we can control the coefficient of variation for arrivals so that it is essentially equal to 0. (i.e. we have a gate keeper that closely regulates what work is allowed to enter WIP, so we can keep the standard deviation of arrivals at 0.)

The average wait time will be directly proportional to the square of the coefficient of variation for service times (for a given utilisation).

What this means is that the total wait time across ten tasks with lambda = 1 is 10 times longer than the wait time for a single task with lambda = 10.

(It is easy to work with a Poisson distribution because the mean and variance are the same.)

This is significant (to say the least). The implication derived from Kingman’s equation is that shorter task durations that follow a Poisson distribution naturally incur a significant penalty. And this has nothing to do with multitasking.

We are just looking at a single-server queue where tasks are processed on a “first come, first served” basis.

What gets interesting is that if we look at the “flow time” (= wait time + mean service time), we get a different ratio depending on the utilisation.

If the utilisation is 0.5 then the time to complete 10 tasks with a lambda of 1 will take 1.43 times longer than 1 task with a lambda of 10.

However, as the utilisation gets closer to 1, the 10 tasks with an average duration of 1 unit of time will take nearly 10 times longer than 1 task with an average duration of 10 units of time. (Constraints typically have a utilisation close to 1.)
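Here is how those ratios fall out of Kingman's approximation W ≈ [ρ/(1−ρ)] · [(Ca² + Cs²)/2] · τ, with Ca = 0 and Poisson service (Cs² = 1/λ, τ = λ). A sketch, with illustrative function names:

```python
def kingman_wait(utilisation: float, ca2: float, cs2: float,
                 mean_service: float) -> float:
    """Kingman's (VUT) approximation of mean queueing wait:
    W ~= [rho/(1-rho)] * [(Ca^2 + Cs^2)/2] * tau."""
    return utilisation / (1 - utilisation) * (ca2 + cs2) / 2 * mean_service

def flow_time_ratio(utilisation: float) -> float:
    """Flow time for 10 tasks of mean 1 vs 1 task of mean 10,
    with Ca = 0 and Poisson service: Cs^2 = 1/lam, tau = lam."""
    def wait(lam: float) -> float:
        return kingman_wait(utilisation, 0.0, 1.0 / lam, lam)
    ten_small = 10 * (wait(1) + 1)   # each small task waits, then runs
    one_large = wait(10) + 10
    return ten_small / one_large

print(flow_time_ratio(0.5))    # ~1.43, as above
print(flow_time_ratio(0.999))  # ~9.8, approaching 10
```

Interestingly, under these assumptions the per-task wait is identical for λ = 1 and λ = 10 (Cs² · τ = 1 in both cases); the 1.43 and near-10 ratios come entirely from paying that wait ten times over.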

Again we haven’t even looked at multitasking yet.

What makes this even worse is that the coefficient of variation for humans is significantly worse than that of machines.

Humans tend to exaggerate their skills and capabilities (variability in capability mix), whereas production rates for machines are generally better known and more reliable.

All I am suggesting is that good improvements could be achieved by matching tasks with the right skills, making sure the tasks do not have too much variation (so we can get a reliable number for the service rate of the person/task-type combination), and making those tasks as long as possible, to stack the odds in our favour. (i.e. A person should specialise in one type of work and only do that type of work. Even more so if they are the constraint, as their utilisation will be close to 1.)


Yes, the book seems to be 2 pages out from the pdf document. When I mentioned page 80, it corresponds to page 78 in the pdf.

I am not sure how the WIP reduction on page 80 (in the book) reduces the wait time.
Is it because with less WIP in the system, there is less chance of wasting time due to context switching (multitasking)?

LOL :slight_smile: true.


Referring specifically to “The time lost on a previous task is nearly impossible to be regained in a subsequent task when the average task duration is small.” The time and variation in completing work ABCDEF is unaffected by the points where measurements are made. The duration from start of A to completion of F is the same as the sum of the durations of A, B, C, D, E, and F. If A takes longer than the nominal time to complete, then BCDEF does not somehow compensate and take less time to complete and division into smaller components does not change this.

There are other considerations that affect the decomposition of work that may have much larger significance. Is each of the divisions valuable on its own? Do smaller divisions lead to the injection of wait buffers? Do larger divisions encourage multi-tasking and loss of focus?

Use of larger work divisions does not create a recovery buffer. Use of smaller divisions does not reduce the total variability in the overall work.
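This invariance is easy to state in code (the per-task durations are purely illustrative):

```python
# Hypothetical per-task durations for A..F, in minutes.
durations = {"A": 14, "B": 9, "C": 22, "D": 7, "E": 18, "F": 11}

# Measured as one work item ABCDEF (start of A to completion of F):
as_one_item = sum(durations.values())            # 81 minutes

# Measured as six separate pieces:
as_six_pieces = sum(durations[t] for t in "ABCDEF")

assert as_one_item == as_six_pieces  # measurement points change nothing

# And if A overruns by 6 minutes, nothing downstream compensates:
overrun = 6
shifted = dict(durations, A=durations["A"] + overrun)
assert sum(shifted.values()) == as_one_item + overrun
```

The total shifts by exactly the overrun, whether the work is tracked as one item or as six.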


Yes I agree there are many assumptions and parameters that can have a profound impact on the end result.

My assumptions are:

  • We are discussing concepts within the context of queuing theory
  • Task arrivals have no variability
  • The service time follows a Poisson distribution (where lambda is the expected mean duration)

All other things being equal, what I wrote before holds true.

Which assumption do you disagree with?


@bvautier, you are best positioned to understand your analysis of the model. One statement, the mention of task arrivals, does give me pause.

In order to compare two different models, one must ensure that similar data sets are used in both models. I suggest the following two data points that might be used in evaluating your model.

Assume the arrival of work item ABCDEF that may also be decomposed into groups of A - F. The arrival time for A is the same in both cases and there is no arrival time delay for A - F. The queue in either case consists of A - F and the difference is when work may be moved off of the queue. The service time and variation only affects the latter and the full delivery time is the same, but the decomposed group allows A - E to be delivered sooner. This is a common decomposition pattern. An example is that an order for 30 items arrives and the order could be delivered as a single 30 item delivery or as six 5 item deliveries.

An alternative is to have 6 work items, A - F, that may also be combined into a single work item ABCDEF. In the first case, there will be arrival time delays between each item. In the combined case, the arrival delays are consolidated and A has the total arrival delay while the delay in A - F has been collapsed to zero. In the individual item case, service time variations are (at least partially) masked by the arrival delays while in the combined case, the service time variations are aggregated. This is a common pattern seen in project or stage-gated work models. An example would be consolidating six orders of 5 items to a single batch run of 30 items.

Does this help explain my reasoning?


Yes, to a degree. I think it helps me understand your frame of reference.

I believe we can reach full agreement if we flesh out more assumptions:

  • There is a steady amount of work arriving at the system. We can call these projects.
  • These projects can wait in a queue before they are introduced into WIP (i.e. this is a high level queue which is different to the WIP queue).
  • A project is made up of many different tasks.
  • The system has a “gate keeper” or “controller” that decides when the system is allowed to start on a project. It officially starts a project by releasing at least one task into WIP.


  • While a project is waiting in the “system” queue, no tasks have been started and the customer could “pull the project” at any time. That is the risk of leaving it too long in the system queue. (Once a customer knows their project has officially started and entered into WIP they are less likely to pull the project.)

The following topics are all worth discussing:

  • How to manage customer expectations while their project is in the system queue. (So they feel progress is being made without it officially entering WIP. Do they even need to know when their project enters WIP? The system queue is just a construct anyway…)
  • How we break a project into tasks.
  • Which task(s) we release into WIP first.
  • How we group tasks together.
  • How we allocate tasks to workers.

This is not my model. I am only making an observation based on Kingman’s equation.

Kingman’s equation is also known as the VUT equation because it can be broken up into three distinct areas:

  • V: Variability (both in arrivals and servicing)
  • U: Utilisation
  • T: Mean service Time

To set the scene, let’s assume the gate keeper (GK) has a discussion with a worker that goes something like this:

  • GK: I am about to release a new project into WIP. I see you are not currently working on anything.
  • Worker: correct
  • GK: I have looked at the tasks and have broken them up so that they only take 1 unit of time each to complete.
  • Worker: OK
  • GK: I will give you a new task at every 1 unit of time interval. (This is the critical part of the discussion.)
  • Worker: OK

This discussion is loaded with assumptions…

  • How do we know the worker has the skills to do the tasks?
  • How does the GK know how to break up tasks? (by skill and duration)
  • Why are we even breaking up tasks?

From Kingman’s equation, the assumptions are:

  • The coefficient of variation for arrivals into the workers queue will be 0. They always arrive at exactly the same time interval. There is no variation.
  • The worker will not always finish each task within 1 unit of time. We assume this is the mean. Some tasks will be finished faster, others will take longer.
  • This means the coefficient of variation for services has a positive value.
  • Because each task is meant to take 1 unit of time, and the worker takes on average 1 unit of time to complete a task, the utilisation is close to 1.

My assumptions:

  • In order to make the calculations easy I am assuming the worker will complete the tasks according to a Poisson distribution. The mean and variance are the same so calculating the coefficient of variation for services is easy.
  • It also means we can compare different results based on the average duration of a task. Is it better to have short tasks or long tasks? (all things being equal)

The observation is that the tasks in the worker’s queue will take 10 times longer to complete if the average task duration is 1 compared to 10. (i.e. from the time they start the first task to the time they complete the 10th task. The big difference is mainly due to the Poisson distribution: smaller lambdas have a relatively longer tail.)
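A minimal discrete-event sketch of this worker queue (deterministic arrivals every 1 time unit, Poisson service sampled with Knuth's algorithm; the names and the seed are illustrative):

```python
import math
import random

def poisson_sample(lam: float, rng: random.Random) -> int:
    """Knuth's algorithm for drawing a Poisson(lam) variate."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def fifo_finish_times(arrivals, services):
    """Single-server FIFO queue: each task starts at the later of its
    own arrival and the previous task's finish."""
    finishes, prev_finish = [], 0.0
    for arrival, service in zip(arrivals, services):
        prev_finish = max(arrival, prev_finish) + service
        finishes.append(prev_finish)
    return finishes

rng = random.Random(42)  # seed chosen arbitrarily
# Ten unit tasks released on a fixed 1-unit cadence...
arrivals = [float(i) for i in range(10)]
services = [poisson_sample(1, rng) for _ in range(10)]
small_finishes = fifo_finish_times(arrivals, services)
# ...versus one big task with the same total mean work.
big_finish = fifo_finish_times([0.0], [poisson_sample(10, rng)])[0]
print(small_finishes[-1], big_finish)
```

Running this over many seeds and tallying the completion times is a direct way to sanity-check the Kingman-based ratios without the approximation.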

There can be some good discussion around:

  • Whether a Poisson distribution is appropriate (when it applies and when it doesn’t).
  • Can we really assume the coefficient of variation is 0? What happens if the GK did a bad job and released a task that is actually not equal to 1 unit of time but 2? We can’t assume the coefficient of variation is 0 anymore. Things have suddenly gotten worse. (What if there were hidden complexities and the task is really worth 3 or 4 units of time, and the worker does not even have the skills to finish it?)
  • Why is the GK releasing a task at a set interval in the first place? Can’t we just wait for the worker to finish a task first before giving him a new one? (OK, we can’t use Kingman’s equation then. But it is useful to give us some insights.)
  • Compare knowledge-work to manufacturing, where we have well defined:
      • Bills of Materials
      • Recipes (with precise manufacturing steps)
      • Production routes
      • Machine capabilities
    Why does this matter, and could we learn anything from manufacturing?

Sorry, this is getting too long…

In summary:

  • We have 3 queues (system, WIP and worker).
  • I am only analysing the worker’s queue.
  • Tasks are arriving at a predefined interval.
  • The real problem is not the variability of arrivals (it is 0) but the variability in completing each task.
  • In a Poisson distribution, a small lambda value has a long tail.
  • We haven’t even considered multitasking yet.

@bvautier, the model used will determine how equations are applied. There are two assumptions that I will revisit later:

  • The nature of the work does not change with segmentation size
  • There is a separation into a system queue and a worker queue.

There are two conditions within the model that I suggest introduce bias:

  • Addition of work into the worker queue based on a time cadence
  • Only evaluating the time to completion of the tenth item.

Looking at one extreme, where all 10 items are completed in less than the nominal time, then in the small batch scenario there is an injected wait time to start, imposed by the work-addition cadence. The mean time to complete individual items becomes equal to the addition cycle time, with zero variation, without regard to the actual nature of the work. For the 10-item batch version, there are no imposed wait times, and the mean and variation are based on the nature of the work.

Looking at the other extreme, where each work item takes longer than the nominal time, then in both the small batch and large batch scenarios, the start time of each successive item is based on the completion of the previous items. Assuming no difference in the nature of the work, then the individual item start times will be the same.

Looking at completion times, first assuming each work item takes longer than nominal, then the mean time to complete items 1-10 in the small batch scenario is approximately one-half the time in the large batch scenario. Assuming each work item takes less than nominal, then the mean time in the small batch scenario would be about 5 times the item addition cycle time, which, under reasonable assumptions, would still be less than the completion time of the large batch scenario.
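The small-batch vs large-batch mean completion times can be checked with a toy calculation (illustrative numbers: ten items, each taking 12 minutes against a 10-minute nominal):

```python
n = 10
actual = 12.0  # every item overruns its 10-minute nominal by 20%

# Small batches: each item is delivered as soon as it finishes.
small_batch_completions = [actual * (i + 1) for i in range(n)]
mean_small = sum(small_batch_completions) / n   # 66.0 minutes

# Large batch: nothing is delivered until all ten items are done.
mean_large = actual * n                          # 120.0 minutes

# Mean completion ratio: (n + 1) / (2 * n) = 0.55, roughly one-half.
print(mean_small / mean_large)
```

The ratio (n + 1)/(2n) tends to one-half as n grows, matching the "approximately one-half" claim above.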

Returning to the initial assumptions, one of the axioms of the small batch approach is that by providing focus, unneeded work is avoided leading to reduced mean times and variation in the individual items and in the aggregate. So, the nature of the work may change for the better due to small batch size.

The assumption of a system queue, however, may be the more significant one. This moves the constraint on work from the worker queue back to the system queue. One of the assertions from the Theory of Constraints is that optimization outside of the constraint provides no benefit. Thus, in this model, concerns with the worker queue are moot. The system focus should be on the system queue.
