Thoughts from our CTO: Establishing Standards for AI Models Will Drive Innovation & Adoption

If industry standards add £8.2bn of GDP annually, is a lack of standards for artificial intelligence (AI) holding back the technology?

Despite AI having existed as a concept for 70 years and as a commercial product for some 20 years, and despite GPTs (generative pre-trained transformers) becoming ubiquitous from 2023, it was only on 16th January 2024 that the first globally accepted standard relating to AI was published, courtesy of the British Standards Institution (BSI). BS ISO/IEC 42001 defines a certifiable AI management system framework within which AI products can be developed as part of an AI assurance ecosystem.

What impact can we expect such standards to have on the evolution of AI?

To answer that question, this article will review the historical impact that standards have had on factors such as innovation and adoption, to gauge whether AI is likely to benefit from or be hindered by the emergence of standards.

UK consumer law says that "products should only be sold if their compliance with product safety regulations has been demonstrated appropriately". Further, "the General Product Safety Regulations 2005 (GPSR) require all products to be safe in their normal or reasonably foreseeable usage and enforcement authorities have powers to take appropriate action when this obligation isn’t met." (Source: https://www.gov.uk/guidance/product-safety-advice-for-businesses)

It is clear that the intent is to protect consumers from low-quality, even dangerous, products. Compliance with these product standards is compulsory, with the regulator able to issue fines where those standards are not met. However, as we will see, product standards do not require regulatory backing to be beneficial, nor are they only of benefit to consumers.

In 1903 the BSI published its first report, BS1, titled Rolled Steel Sections for Structural Purposes. This standard was developed to promote interoperability, allowing engineers and suppliers to communicate accurately about what was required and how it would be measured. Engineers could invite multiple suppliers to compete for a supply contract that would meet this standard and have confidence that the steel section supplied would fit, thus promoting efficiency and innovation in several ways:

Suppliers knew that by meeting the BS1 standard they could access a pre-existing market for steel section.
Competition amongst suppliers would encourage them to improve their processes in order to supply at better prices or to a higher standard.
De-risking the supply of steel section, speeding it up and reducing its cost enabled engineers to push their own boundaries, which in turn increased demand.

Fast forward almost 50 years to 1950 and the concept of AI begins to emerge.

Alan Turing, a British mathematician, defined a logical framework for developing and testing intelligent machines. However, it wasn't until 1955 that the first proof of concept was developed by Allen Newell, Cliff Shaw and Herbert Simon. The cost of compute power was a significant impediment to progress and it was only in the 1980s that significant investment was made, with the Japanese government investing $400M between 1982 and 1990. Despite this investment the Japanese government did not realise the anticipated success.

In 1997 IBM's Deep Blue defeated the world chess champion, Garry Kasparov, and Dragon Systems released speech recognition software. From that point forward, as compute power and storage costs fell, the limits of AI possibilities have been continuously stretched and applied in areas such as advertising, natural language processing, image classification, biometrics and more recently generative AI such as ChatGPT.

Despite this rapid advance in AI capability over the last 25 years, and despite much being written in recent times on the topic of AI regulation, there were no globally agreed standards for AI models and their related systems to adhere to.

The EU have tabled the EU AI Act and the US too are defining their own regulatory framework, with key concerns centring around issues such as accuracy, fairness, privacy and transparency, but there are no globally accepted standards to which those concerns can be anchored.

AI technology has proven many times over that it can solve a broad spectrum of real-world problems, often achieving greater accuracy and efficiency than a human counterpart. With Moore's law having held remarkably well, the falling cost of compute and storage has facilitated not only the development of effective AI models but also their adoption, as corporate cost-benefit models now find that adopting AI is a value-adding exercise and not just a vanity project.

But as things stand, AI developers do not have a library of established standards that they can adhere to and reference to assure quality and build trust in their solutions. Company execs cannot reference standards when procuring AI systems, leaving them exposed to a variety of risks. This lack of standards creates space for misinformation to be easily propagated. On the one hand this space allows inferior products to grab market share by being cheaper than higher-quality competitors. On the other hand it means fear-driven, exaggerated and misleading commentary can spread quickly, tarnishing in aggregate the reputation of fundamentally good technologies.

Facial biometrics is a particularly interesting case study in this regard. Such technologies power applications such as facial recognition and age estimation. These technologies have been improving and advancing rapidly in recent years, yet despite this progress, there are still no universally agreed upon standards in relation to accuracy and fairness.

This lack of standards presents a risk to adoption and innovation.

We have the technology and the scientists ready to refine AI models and develop new ones, but an AI model by itself is only one piece of the jigsaw. The AI must be integrated into a wider system and that system must be deployed to a particular use case for value to be created, regardless of whether that value comes from pounds in the bank, lives saved or efficiency gained.

Imagine being the chief exec of a large corporation, or a hospital, or a school. You have seen how AI can benefit your organisation, but you also know that AI is not perfect - it cannot be and was never expected to be. In the absence of standards, how do you manage the risk that results from imperfection? What are the worst outcomes you could reasonably expect due to those imperfections? How do you determine which AI tech is appropriate and safe for the use case you have in mind?

US-based Rite Aid, a chain of pharmacies, attempted to identify likely shoplifters using AI-based facial recognition technology between 2012 and 2020. In the absence of applicable standards, Rite Aid appears to have procured a facial recognition system that expressly "makes no representations or warranties as to the accuracy and reliability of the product in the performance of its facial recognition capabilities".

The Federal Trade Commission, FTC, a US government agency, alleges:

Rite Aid failed to consider the risks that false positives posed to consumers, including risks of misidentification based on race or gender.
Rite Aid failed to test the system for accuracy.
Rite Aid failed to enforce image quality controls.
Rite Aid failed to monitor, test, or track the accuracy of results.

The FTC alleges, among other things, that these failings injured consumers due to humiliation and being removed from stores without being allowed to collect prescription medications. The FTC's proposed settlement includes an order that Rite Aid should be banned from operating facial recognition technology for five years. (Source: https://www.ftc.gov/business-guidance/blog/2023/12/coming-face-face-rite-aids-allegedly-unfair-use-facial-recognition-technology)

The emergence of such cases makes it even more difficult for companies to adopt these new technologies in the absence of standards. They demonstrate that claims can and will be brought if a company is found to use technology that is not "good enough", where "good enough" is not clearly defined and is therefore left to the courts to decide.

Apprehension about deploying these technologies ultimately results in slower adoption and less money flowing through to their R&D, and that puts the brakes on innovation.

In 2022 the Department for Business, Energy & Industrial Strategy published a report on The Role of Standardisation in Support of Emerging Technologies in the UK. The report examines the potential positive and negative side effects of standardisation, and events observed since it was written add to the respective arguments. On the beneficial impacts of standardisation the report says:

"Defining minimum levels of quality – that can help to create trust among early adopters of emerging technologies and avoid incidents that undermine trust in new products, as well as avoid Gresham’s Law (where low-quality products can drive out high quality goods in markets where there are high levels of information asymmetry)" (Source: https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1080614/role-of-standardisation-in-support-of-emerging-technologies-uk.pdf)

The FTC case against Rite Aid will almost certainly have a negative impact on trust in facial biometric technologies, and minimum quality levels could have prevented that. It is also quite easy to see how low-quality products could drive out high quality products in the facial biometric space. One of the most challenging and expensive aspects of delivering facial biometric systems that work accurately and fairly is curating accurate, diverse, classified datasets at scale.

Facial biometric products with less diverse and less accurate datasets could be offered to the market at a price that superior competitors would not find viable. In the absence of standards that would facilitate product comparisons, a CEO procuring such a system would have no easy way to compare the relative performance of the alternative offerings, making it more difficult to justify paying a higher price. As in Akerlof's "market for lemons", higher-quality AI products can be pushed out of the market by lower-quality products because sellers know more about quality than buyers do. In the extreme case, if low-quality products become prevalent, performance will naturally be substandard and eventually the market will fail because public trust in the technology will be lost.
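
The unravelling described above can be illustrated with a toy model. The sketch below is a minimal illustration, not a market forecast: the three vendors, their quality scores, their minimum viable prices and the buyer_premium multiplier are all hypothetical figures chosen for the example. It assumes a buyer who cannot tell the products apart and so offers a price based on the average quality still on sale.

```python
# Toy "market for lemons" model. All names and numbers are hypothetical.

# (vendor, quality score, minimum price at which the vendor stays in the market)
VENDORS = [
    ("A", 90, 100),  # diverse, well-curated dataset: best product, highest cost
    ("B", 60, 65),
    ("C", 30, 30),   # small, unrepresentative dataset: cheapest to build
]


def viable_with_comparison(vendors, buyer_premium):
    """Vendors that survive when buyers can measure each product's quality."""
    return [name for name, quality, min_price in vendors
            if buyer_premium * quality >= min_price]


def unravel(vendors, buyer_premium):
    """Vendors that survive when buyers only know the market's average quality.

    Each round the buyer offers buyer_premium * (average quality of the
    products still on sale). Vendors who cannot cover their costs at that
    price withdraw, which lowers the average for the next round.
    """
    active = list(vendors)
    offers = []
    while active:
        average_quality = sum(q for _, q, _ in active) / len(active)
        offer = buyer_premium * average_quality
        offers.append(round(offer, 2))
        staying = [v for v in active if v[2] <= offer]
        if len(staying) == len(active):
            break  # nobody else withdraws: the market has settled
        active = staying
    return [name for name, _, _ in active], offers


informed = viable_with_comparison(VENDORS, buyer_premium=1.2)
survivors, offers = unravel(VENDORS, buyer_premium=1.2)
print("Viable when quality can be compared:", informed)
print("Remaining when it cannot:", survivors, "after offers of", offers)
```

With these illustrative figures all three vendors are viable when buyers can compare quality, but only the lowest-quality vendor remains when they cannot. A standard that makes performance comparable is what moves a market from the second case to the first.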

The view that standards create value has empirical support. In 2015 the Centre for Economics and Business Research (CEBR) compiled a report on behalf of the BSI titled The Economic Contribution of Standards to the UK Economy. The report states that:

"On our calculations, standards appear to have contributed towards 37.4% of annual productivity growth. As an illustration, in 2013, that would translate to an extra £8.2 billion of GDP emanating from the proper use of standards." (Source: https://www.bsigroup.com/siteassets/pdf/en/about-us/bsi-the-economic-contribution-of-standards-to-the-uk-economy-uk-en.pdf)

Of particular relevance in light of the Rite Aid findings, the report states that "In terms of the effect of standards on quality, 70% of respondents stated that standards had contributed to improving their supply chain by improving the quality of supplier products and services".

As AI regulation gathers pace it is important to consider that while standards and regulation are complementary, they are very different tools. A particular market, say the UK, EU or US, may create regulatory requirements for companies to adhere to certain standards. However, even in the absence of a regulator, voluntarily adhering to standards still has the benefit of establishing trust, reducing risk and encouraging adoption.

The BSI report examines some of the mechanisms through which standards improve the competitiveness of business and finds that "The most important mechanism is the contribution that standardization has for enhancing the status of firms, which was cited by 84% of respondents. Standards can contribute to businesses’ competitive edge by demonstrating to the market that their products and services are of a high quality. This mechanism was even more significant to large firms with 92% reporting this was a factor, relative to 83% of SMEs."

In terms of the impact standards can have on innovation, we need look no further than Tim Berners-Lee, who invented the World Wide Web (WWW) in 1989, opened it to the public in 1991 and founded the W3C (World Wide Web Consortium) in 1994. It is the W3C that develops the standards on which the web is built. While not driven by regulators, these standards, defined early in the evolution of the web, allowed companies and individuals all around the globe to create interoperable web pages, then web applications, then smart devices, and so on, allowing multiple huge industries to spring up that would change the face of modern economies.

Yet it was not until 2010, when the UK's Equality Act came into force and prevented website owners from discriminating on the basis of disability, that the BSI published BS8878, which built on the W3C's Web Accessibility Initiative and defined how to adopt a web accessibility strategy within an organisation. Not only did this standard improve online experiences for disabled groups such as the blind, but people across the board enjoyed better online experiences as routine tasks such as tabbing through web forms and reading text became standardised and easier for all. Today, even complex applications like spreadsheets and accounting software are routinely deployed via web-based interfaces.

With standards in place, innovation moving rapidly and billions of people globally using the internet to go about their day-to-day lives, regulators begin to enforce these standards on behalf of those people who are at risk of being left behind by the advancements. At this point the standards shift from voluntary codes of practice to legal requirements. Those who have been at the forefront of defining and adopting these standards watch their non-compliant competitors attract regulatory scrutiny.

In the UK for example, the Department for Work and Pensions was ordered to pay £7,000 damages for discriminating against registered blind and sight impaired people by failing to communicate in an accessible manner. Similarly, the Student Loans Company were ordered to pay £5,000 damages for failing to provide a loan form accessible to blind people. In the US, Disney, Netflix and Target have been the subject of class action lawsuits with some costs running into millions of dollars.

As the drafting of AI regulatory guidelines accelerates across Europe, the United States and globally, it is worth scrutinising the maturity of relevant standards for AI. As demonstrated by the rapid evolution of the WWW, early development of robust and globally accepted yet voluntary standards was enormously powerful. Those standards informed the regulatory evolution that would emerge many years later.

With AI, this process risks being executed in reverse, with the standards landscape still barren but regulatory development accelerating. This presents a risk of bad regulation being drafted that makes bad assumptions because there are no established, tested and refined standards to inform those assumptions.

The challenge for AI standards development is often going to come down to data. AI models require huge amounts of data to train and test. For many providers this can be a large enough challenge in its own right, but the challenge becomes more difficult still when attempting to independently verify an AI model against those standards using unseen data. This requires a further dataset, held by an independent third party, of appropriate size, accuracy and diversity, meaning that this type of independent conformity test is not cheap.
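
To make the idea of an independent conformity test concrete, the following sketch checks a facial matching system's results on a held-out dataset against two limits: a ceiling on the false positive rate for every demographic group, and a ceiling on how far the worst group may lag behind the best. It is a minimal illustration under stated assumptions. The Trial record, the group labels, the sample counts and the max_fpr and max_ratio limits are all hypothetical and are not taken from BS ISO/IEC 42001 or any other published standard.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Trial:
    group: str       # demographic group label held by the independent tester
    is_match: bool   # ground truth: do the two images show the same person?
    predicted: bool  # system output: did it declare a match?


def false_positive_rates(trials):
    """Return {group: false positive rate} over the non-matching pairs."""
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for t in trials:
        if not t.is_match:
            negatives[t.group] += 1
            if t.predicted:
                false_positives[t.group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items()}


def conforms(trials, max_fpr, max_ratio):
    """Check accuracy (worst-group rate) and fairness (worst/best ratio)."""
    rates = false_positive_rates(trials)
    if not rates:
        raise ValueError("the test set contains no non-matching pairs")
    worst, best = max(rates.values()), min(rates.values())
    accurate = worst <= max_fpr
    # If the best group has no false positives, only a perfect worst group is fair.
    fair = worst <= max_ratio * best if best > 0 else worst == 0
    return accurate and fair, rates


def make_trials(group, negatives, false_positives):
    """Build hypothetical non-matching trials for one group."""
    return [Trial(group, False, i < false_positives) for i in range(negatives)]


# Hypothetical held-out results: 100 non-matching pairs per group.
trials = make_trials("A", 100, 1) + make_trials("B", 100, 4)
passed, rates = conforms(trials, max_fpr=0.05, max_ratio=2.0)
print("Per-group false positive rates:", rates)
print("Conforms to the illustrative limits:", passed)
```

In this example every group is under the headline accuracy limit, yet the system still fails because one group sees four times as many false positives as the other. That is the kind of finding a single aggregate accuracy figure would hide, and it is why the independent test set needs to be diverse as well as large.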

Using history as our guide, we should maximise AI innovation and adoption by promoting investment in AI standards development and high-quality dataset curation. What is reassuring is that the EU AI Act, while arriving before those standards, seems to agree: "Standardisation should play a key role to provide technical solutions to providers to ensure compliance with this Regulation, in line with the state of the art, to promote innovation as well as competitiveness and growth in the single market".