Error and Doubt in Artificial Intelligence Management

The hype around AI appears to be subsiding, but whereas previous cool-downs produced “AI Winters,” periods of despair and scarce funding, today we may instead be entering an “AI Autumn,” in which expectations shift from novel breakthroughs to practical implementation. In this transitional phase, as powerful AI tools are adopted by private and public sector actors, careful consideration must be given to how these tools will be managed and to the organizational environments in which they are deployed. Rapid adoption of new technologies without investment in risk avoidance can amplify the magnitude of mistakes down the line. As a result, management must develop a much greater preoccupation with error.

The changes AI necessitates from management stem from how it is deployed in real-world contexts. In Prediction Machines: The Simple Economics of Artificial Intelligence, economists Ajay Agrawal, Joshua Gans, and Avi Goldfarb provide a schema for understanding AI’s impact on organizations. They argue that existing AI tools powered by the Deep Learning revolution, which “learn” to extract meaningful correlations from large data sets, function fundamentally as prediction technologies. From translation, where AI predicts the sentence structure in the output language, to drug discovery, where it predicts interactions between new chemical compounds and human biology, to autonomous vehicles, where it predicts the actions of human drivers and the obstacles in the environment, prediction is at the core of this technology. As AI becomes applicable in more domains and the cost of deployment falls, managers will notice a rapid drop in the cost of prediction. And, as economics tells us, when something gets cheaper, more of it will be done.

Increasing the prevalence of prediction increases the value of its vital complement, judgement, since prediction and judgement are the two ingredients of decision-making. When managers make strategic decisions within their organizations, they first run through “If-Then” scenarios, predicting what would happen if a certain course of action were taken. They then exercise judgement in choosing which action to pursue. As more predictions are made, more “If-Then” scenarios will be tried, and the importance of good judgement increases.

This is where the worry appears. Judgement, being the qualitative and value-laden concept that it is, is often de-emphasized both in management education and in real-world business practice. An obsession with metrics, process, and optimization under the dominant managerial paradigms of bureaucratic and scientific management makes all tasks out to be ones of prediction. The risk is that the same process by which AI cycles through scenarios will also output decisions that go unquestioned. Prediction becomes prescription in the absence of human judgement.

Insofar as AIs perform well, this may yield short-term returns, but it presents real risks over the longer term. One of these risks is excessive automation. Unless managers appreciate the qualitative value of human judgement, they are likely to reduce jobs to their technical tasks, believing those tasks can be automated more rapidly than they really can. In doing so, they expose themselves to the risk not only of prematurely laying off workers, but of failing to perform core functions. Tesla’s major investments in automation, which slowed production of the Model 3, serve as a cautionary tale in this regard.

A risk that operates on a much larger level is that AI predictions do not accurately capture the causality of phenomena. Extracting patterns from data provides correlations, ones that should command a high degree of belief given the size of the data sets used in most deep learning algorithms today. But spurious correlations remain a problem, especially when the underlying data set is biased. Should the algorithm prove beneficial at smaller scales of deployment, managers may defer more and more important decisions to the AI, opening the door to large-scale failure when a spurious correlation breaks down.

Avoiding these risks requires managers to care more about failure than success. The research programme on high reliability organizations (HROs), which grew out of the study of nuclear accidents and has had a major impact on military management, provides a guide. At the core of this programme is a preoccupation with failure, which requires embracing complexity rather than reducing it to easily quantifiable traits, appreciating the tacit knowledge employees hold in their respective roles, building resilience in the face of obstacles, and maintaining fluid structures that do not prioritize metrics over performance.

Preoccupation with error at the organizational level may be easier to incentivize if the AI itself is made more error-prone. A safety mechanism advocated by AI pioneer Stuart Russell, error-prone AI forces humans to pay attention to the output of algorithms rather than defer to them, and it also makes algorithms better by preventing overconfidence in the patterns found in limited data sets. In experiments, error-prone AIs have been found to increase cooperation in human teams, improving performance without making employees complacent. An example of this form of AI appears in the film Interstellar, where TARS, the AI system that assists the crew, has its honesty setting fixed at 90%, since a system that makes occasional mistakes was found to be more useful for human interaction.

The promise of AI industrialization is enormous, with hopes that it will lift developed economies out of the productivity slump they have been experiencing. It is nonetheless important that its deployment proceeds with an eye to the limitations of these prediction systems, and that managers remain preoccupied with the potential risks.

Ryan Khurana is a Catalyst Policy Fellow, Executive Director of the Institute for Advancing Prosperity, and a tech policy fellow at Young Voices.