On Tuesday, May 27, 2008, the Government Accountability Office (GAO) released a new report on Supply Chain Security in the Customs-Trade Partnership Against Terrorism (C-TPAT) program. This program is one part of the SAFE Port Act, approved by the House on a 421-2 vote, by the Senate 98-0, and signed into law by President Bush in October 2006, just before the mid-term elections. In the report, requested by Senator Susan Collins (R-Maine), one of the 80 co-sponsors of the bill, the GAO identifies significant problems with the current implementation of this program. The US Customs and Border Protection (CBP) division of the Department of Homeland Security is responsible for overseeing this program, and has “concurred with each of the recommendations.”

The issue I would like to raise has nothing to do with port security or terrorism, per se. Instead, it is this: Why do bureaucracies botch critical things that anyone with moderate expertise in the pertinent science, technology or policy issue would consider basic?

One specific critique in the report caught my eye:

CBP has taken steps to improve the security validation process, but still faces challenges in verifying that C-TPAT members’ security practices meet minimum criteria. CBP has sought to strengthen the validation process by providing appropriate guidance and developing a portable, electronic instrument to help ensure that validation information is consistently collected, documented, and uniformly applied to decisions regarding the awarding of benefits to C-TPAT members. However, the usefulness of the instrument is limited due to its default “no” responses. Specifically, if a response is marked “no,” it is unclear whether a security specialist, who has the discretion to answer or not answer individual questions, intentionally answered the question or if the response was an automatic default. This factor limits the ability of CBP to validate security practices at member companies.

Here’s what the report says: The CBP wanted to create a device to “help ensure that validation information is consistently collected, documented, and uniformly applied…”. The computer-based instrument they developed contains some 329 questions that might be asked of CBP’s business partners. Responses to many of these questions, according to the GAO, are optional, and so possible answers include “Yes,” “No” or “No response.”

Unless I missed it, the report doesn’t list any of the specific questions, so let’s make up an example. I might want to ask a member, “Have you terminated your relationship with any of your suppliers because of possible terrorist links?” The follow-up required for an answer of “No” would obviously be very different from the reaction to “No response.” But the device doesn’t have an option to record “No response,” and the default reply is “No.” So the output from the validation session cannot distinguish between these two very different answers. According to the GAO report, this “limits the ability of CBP to validate security practices at member companies.” So it seems to me that the entire first two years of an effort to develop a joint government-business program to better secure our ports has been seriously undermined by this failure.
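To make the ambiguity concrete, here is a minimal Python sketch. It is not the CBP instrument, whose internals the report does not describe; the question IDs and the run_validation function are invented for illustration. The only thing it borrows from the GAO's description is that every question starts out at a default “no.”

```python
# Hypothetical model of the flaw: a yes/no field that starts out as "no".
# Question IDs are stand-ins; the real instrument has some 329 questions.
QUESTION_IDS = ["Q1", "Q2", "Q3"]


def run_validation(answers_given):
    """Return the saved record for one validation session.

    answers_given maps a question ID to True ("yes") or False ("no"),
    and contains only the questions the specialist actually answered.
    """
    record = {qid: False for qid in QUESTION_IDS}  # every question defaults to "no"
    record.update(answers_given)
    return record


# Session A: the specialist deliberately answers "no" to Q2.
session_a = run_validation({"Q1": True, "Q2": False})

# Session B: the specialist never answers Q2 at all.
session_b = run_validation({"Q1": True})

# The two saved records are identical, so a reviewer cannot tell
# a deliberate "no" from a question that was simply skipped.
print(session_a == session_b)
```

Once the record is saved, the information about which answers were deliberate is gone, and no amount of later analysis can bring it back.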

On the most basic project management or computer programming level, this represents simply egregious malfeasance (many other expletive-laden descriptions could easily follow). If this were an isolated incident it probably wouldn’t bother me as much, but I don’t believe that it is. For example, a couple of months ago, the brouhaha on the breach of passport records within the State Department raised similar management/programming issues and questions.

Any reasonably competent computer programmer I know would, while setting up such a device, have questioned a specification that called for a “yes-or-no” only response with a default of “no.” Because designers are rarely perfect, programmers learn to question basic logic, lest they be blamed for problems that might arise. A reasonably competent project manager or designer would specify that the device be able to account for all possible responses, including “No response,” and would not leave responsibility for determining such basic design factors to programmers, because few programmers are perfect either. In the commercial world, most such projects are subjected to beta-testing, where data are collected in the real world and the output is evaluated, in order to find such flaws before the product is released.
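What “account for all possible responses” might look like is sketched below in Python. This is my own illustration under stated assumptions, not anything taken from the report or from CBP: the Answer type, the NOT_ASKED placeholder and the helper functions are all hypothetical names. The idea is simply that a question nobody has touched is stored as its own explicit state and never as “no.”

```python
from enum import Enum


class Answer(Enum):
    """Every state a question can be in (hypothetical design)."""
    YES = "yes"
    NO = "no"
    NO_RESPONSE = "no response"  # specialist chose not to answer
    NOT_ASKED = "not asked"      # question has not been touched yet


def new_record(question_ids):
    """Start a session with every question explicitly marked NOT_ASKED."""
    return {qid: Answer.NOT_ASKED for qid in question_ids}


def record_answer(record, qid, answer):
    """Store one answer, rejecting unknown questions and free-form values."""
    if qid not in record:
        raise KeyError(f"unknown question: {qid}")
    if not isinstance(answer, Answer):
        raise TypeError("answer must be a member of Answer")
    record[qid] = answer


def untouched(record):
    """List the questions nobody has dealt with, for review before saving."""
    return [qid for qid, ans in record.items() if ans is Answer.NOT_ASKED]


record = new_record(["Q1", "Q2", "Q3"])
record_answer(record, "Q1", Answer.YES)
record_answer(record, "Q2", Answer.NO_RESPONSE)

print(record["Q2"].value)  # a deliberate skip is stored as such
print(untouched(record))   # questions still waiting for a decision
```

With a record like this, a deliberate “no,” a deliberate “no response” and a question that was never reached all stay distinguishable in the saved output, which is all the validation process needed.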

Despite these and many other safeguards in product development, complex computer-based projects often still contain bugs when they are released. Most bugs of any significance are found by users relatively quickly, and are corrected by programmers via updates. In this case, however, the software is not complex: It needs to ask 329 questions and save 329 responses. (I would bet that most 12-year-olds could program this correctly.) Yet this electronic instrument has been in use for some time, collecting useless data, and the flaw has not been corrected.

The first of the GAO’s recommendations is to

Continue to improve the consistency with which validations are conducted and documented by revising the electronic instrument used in validations to include appropriate response options and eliminate the use of default “no” responses.

In Homeland Security’s response letter of April 8, 2008, they “concur” with the recommendation, and note that

CBP is developing a second generation automated tool which will eliminate the use of the default “no” response and will address all security criteria.

They will do this using a “[p]hased in approach on a sector-by-sector basis over the next 12 months.” So the device was apparently built without any way to update it. And it will thus take another year to correct a problem which thwarts the entire validation process.

In my opinion, the problems here go deeper than a debate on government vs private enterprise would help to explain. I’ve seen similar fundamental (and extremely expensive) failures in large companies on a couple of occasions. I can construct reasonable explanations, step by step, for how these types of mistakes get made, particularly at the lowest levels of large organizations. But I do not understand how such systems are put into actual use without anyone (apparently) discovering that they do not work as intended or (certainly) taking any steps to resolve their problems. Are the failures the result of a lack of accountability? Are government salaries too low at some levels to attract people who can be expected to think critically? It seems to me that the cost of competent low- to middle-level project managers would be minimal compared to the cost of such fiascos. Are there other explanations? Are multiple factors involved?

Obviously, the GAO has personnel who are competent to evaluate systems. But, equally obviously, it shouldn’t take a GAO audit to discover such fundamental problems. I will continue to think about this issue. Meanwhile, do any of you have ideas?