 

Is it possible to certify an AI-based solution for safety-critical systems? [closed]

First, I read this. But I would like to expand. To summarize:

  1. When designing safety-critical systems, a designer has to evaluate certain metrics to gain confidence that the system will work as expected. It is, in a way, a mathematical argument with low enough complexity to be accessible to a human being. It has to do with accountability, reliability, auditability, etc.

  2. On the other hand, at this point, AI is a black box that seems to work very well, but most of the time we do not have a proof of its correctness (mainly because what goes on inside the box is too complex to analyze). It is more a statistical confidence: we trained the system and it performed well on all the tests.

So, some questions:

Q1. Do these two vague thoughts make sense nowadays?

Q2. Is it possible to use AI in a safety-critical system and be sure of its performance? Can we have certainty about the deterministic behavior of AI? Any references?

Q3. I guess there are already some companies selling safety-critical systems based on AI, in the automotive realm for example. How do they manage to certify their products for such a restrictive market?

EDIT

About Q1: Thanks to Peter, I realized that, in the automotive case for example, there is no requirement for total certainty. ASIL D, the most restrictive level for automotive systems, only requires an upper bound on the probability of failure (see the rough sketch at the end of this edit). The same applies to the other ISO 26262 levels. So I would refine the question:

Q1. Is there any safety standard in system design, at any level/subcomponent, in any field/domain, that requires total certainty?

About Q2: Even if total certainty is not required, the question still holds.

About Q3: Now I understand how they would be able to achieve certification. Anyhow, any reference would be very welcome.
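To make the "upper bound, not certainty" point concrete, here is a rough back-of-the-envelope sketch. It assumes the commonly cited ASIL D target for random hardware failures of about 10^-8 failures per hour (10 FIT); treat the figure as illustrative rather than a quote from the standard.

    import math

    # Illustrative ASIL D-style bound: failure rate <= 1e-8 per hour (10 FIT).
    # Assumed figure for illustration; consult ISO 26262-5 for the normative targets.
    failure_rate_per_hour = 1e-8

    # Probability of at least one failure over a mission time, assuming a simple
    # exponential failure model: P = 1 - exp(-lambda * t).
    mission_hours = 10_000  # roughly a vehicle's operating life, as an example

    p_failure = 1 - math.exp(-failure_rate_per_hour * mission_hours)
    print(f"Bounded probability of failure over {mission_hours} h: {p_failure:.1e}")
    # ~1.0e-04 -- a bounded, non-zero risk, not total certainty.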

asked Jun 03 '17 by Fusho

1 Answer

No solution or class of technology actually gets certified for safety-critical systems. When specifying the system, hazards are identified, requirements are defined to avoid or mitigate those hazards to an appropriate level of confidence, and evidence is provided that the design, and then the implementation, meet those requirements. Certification is simply sign-off that, within the context of the particular system, appropriate evidence has been provided to justify a claim that the risk (the product of the likelihood of some event occurring and the adverse impact if that event occurs) is acceptably low.

At most, a set of evidence is provided or developed for a particular product (in your case an AI engine), which will be analysed in the context of the other system components (for which evidence also needs to be obtained or provided) and the means of assembling those components into a working system. It is the system that receives certification, not the technologies used to build it. The evidence provided with a particular technology or subsystem might well be reused, but it will be analysed in the context of the requirements of each complete system the technology or subsystem is used in.

This is why some technologies are described as "certifiable" rather than "certified". For example, some real-time operating systems (RTOS) have versions that are delivered with a pack of evidence that can be used to support acceptance of a system they are used in. However, those operating systems are not certified for use in safety-critical systems, since the evidence must be assessed in the context of the overall requirements of each total system in which the RTOS is employed.

Formal proof is advocated as a way to provide the required evidence for some types of system or subsystem. Generally speaking, formal proof approaches do not scale well (the complexity of the proof grows at least as fast as the complexity of the system or subsystem), so approaches other than proof are often employed. Regardless of how evidence is provided, it must be assessed in the context of the requirements of the overall system being built.
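To make "formal proof as evidence" a little more concrete, here is a minimal sketch using the z3-solver Python package (my choice of tool, not something mandated by any standard). It proves that a small clamp function can never produce an out-of-range output, for every possible integer input, which is the kind of exhaustive argument testing alone cannot give. The scaling problem is also visible: this is easy for a tiny function and becomes much harder as the subsystem grows.

    # Minimal sketch: proving a tiny property with an SMT solver (pip install z3-solver).
    # Illustrative only; real safety cases use qualified tools and documented processes.
    from z3 import If, Int, Or, Solver, unsat

    def clamp(x, lo, hi):
        # Symbolic clamp: lo if x < lo, hi if x > hi, otherwise x.
        return If(x < lo, lo, If(x > hi, hi, x))

    x = Int("x")
    y = clamp(x, 0, 100)

    s = Solver()
    s.add(Or(y < 0, y > 100))  # ask for a counterexample to 0 <= y <= 100

    if s.check() == unsat:
        print("Proved: output is within [0, 100] for every integer input")
    else:
        print("Counterexample:", s.model())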

Now, where would an AI fit into this? If the AI is to be used to meet requirements related to mitigating or avoiding hazards, it is necessary to provide evidence that it does so appropriately in the context of the total system. If the AI fails to meet those requirements, it will be necessary for the system as a whole (or for other subsystems affected by that failure) to contain or mitigate the effects, so that the system as a whole meets its complete set of requirements.

If the presence of the AI prevents delivery of sufficient evidence that the system as a whole meets its requirements, then the AI cannot be employed. This is equally true whether it is technically impossible to provide such evidence or real-world constraints prevent delivery of that evidence in the context of the system being developed (e.g. constraints on available manpower, time, and other resources affecting the ability to deliver the system and provide evidence that it meets its requirements).

For a sub-system with non-deterministic behaviour, such as the learning of an AI, any inability to give repeatable results over time will make it more difficult to provide the required evidence. The more gaps there are in the evidence provided, the more it is necessary to provide evidence that OTHER parts of the system mitigate the identified hazards.
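As a small illustration of the repeatability point, here is a sketch (all names and numbers invented) of an inference step whose parameters are frozen and whose computation has no run-to-run randomness, so the same input always gives the same output and previously collected test evidence stays meaningful:

    # Sketch: a repeatable inference step -- frozen parameters, no sampling.
    import numpy as np

    # Frozen, versioned parameters; in a real system these would come from a
    # controlled release artifact, not from online learning.
    WEIGHTS = np.array([[0.2, -0.1], [0.5, 0.3]])
    BIAS = np.array([0.05, -0.02])

    def infer(features: np.ndarray) -> np.ndarray:
        """Deterministic forward pass: identical inputs give identical outputs."""
        return WEIGHTS @ features + BIAS

    x = np.array([1.0, 2.0])
    assert np.array_equal(infer(x), infer(x))  # repeatable across calls
    print(infer(x))
    # If the component kept learning in service, this repeatability (and the
    # evidence built on it) would no longer hold.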

Generally speaking, testing on its own is considered a poor means of providing evidence. The basic reason is that testing can only establish the presence of a deficiency against requirements (if the test results demonstrate one) but cannot provide evidence of the absence of a deficiency (i.e. a system passing all its test cases does not provide evidence about anything it was not tested for). The difficulty is providing a justification that testing gives sufficient coverage of the requirements. This introduces the main obstacle to using an AI in a system with safety-related requirements - work at the system level must provide evidence that the requirements are met, because providing sufficient test-based evidence for the AI itself will be very expensive.
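To illustrate the coverage argument, here is a small sketch (requirement IDs and test names are invented) of a traceability check that flags safety requirements with no test evidence linked to them; passing every existing test says nothing about the uncovered requirement:

    # Sketch of a requirements-to-test traceability check (invented example data).
    REQUIREMENTS = {
        "SR-001": "Commanded torque shall never exceed the configured limit",
        "SR-002": "A braking request shall be issued within 50 ms of detection",
        "SR-003": "Loss of sensor input shall lead to a safe state",
    }

    TEST_TRACE = {  # which requirements each test case claims to cover
        "test_torque_clamped_at_limit": ["SR-001"],
        "test_brake_latency_under_load": ["SR-002"],
    }

    covered = {req for reqs in TEST_TRACE.values() for req in reqs}
    for req in sorted(set(REQUIREMENTS) - covered):
        print(f"No test evidence traced to {req}: {REQUIREMENTS[req]}")
    # Output: SR-003 is untested -- the tests that pass say nothing about it.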

One strategy that is used at the system level is partitioning: the interaction of the AI with other sub-systems is significantly constrained. For example, the AI will probably not interact directly with actuators that can cause a hazard, but will instead make requests to other subsystems. The burden of evidence is then placed on how well those other subsystems meet requirements, including the manner in which they interact with actuators. As part of providing that evidence, the other subsystems may check all the data or requests coming from the AI and ignore any that would cause an inappropriate actuation (or any other breach of overall system requirements).

As a result, the AI itself may not actually meet any safety-related requirements at all - it might simply take information from or provide information to other subsystems, and those other subsystems contribute more directly to meeting the overall system requirements. Given that the developers of an AI probably cannot provide all the needed evidence, it is a fair bet that system developers will try to constrain the effects an AI - if employed - can have on the behaviour of the total system.
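A minimal sketch of that partitioning idea (all names and limits invented): the AI only requests actuation, and a small, separately analysable guard enforces the safety envelope before anything reaches the actuator.

    # Sketch: the AI requests actuation; a simple guard enforces the envelope.
    # Names and limits are invented for illustration.
    from dataclasses import dataclass

    MAX_TORQUE_NM = 150.0  # limit assumed to come from the system-level hazard analysis

    @dataclass
    class ActuationRequest:
        torque_nm: float

    class SafetyGuard:
        """Small, analysable subsystem between the AI and the actuator."""

        def filter(self, request: ActuationRequest) -> ActuationRequest:
            # Clamp (or reject) anything outside the verified envelope.
            torque = max(-MAX_TORQUE_NM, min(MAX_TORQUE_NM, request.torque_nm))
            return ActuationRequest(torque_nm=torque)

    def actuate(request: ActuationRequest) -> None:
        print(f"actuator command: {request.torque_nm:.1f} Nm")

    ai_request = ActuationRequest(torque_nm=900.0)  # untrusted output from the AI
    actuate(SafetyGuard().filter(ai_request))       # bounded to 150.0 Nm by the guard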

Another strategy is to limit the learning opportunities for the AI. For example, evidence will be provided with each training set - in the context of the AI itself - that the AI behaves in a predictable manner. That evidence will need to be provided in full every time the training set is updated, and the analysis for the system as a whole will then need to be redone. That is likely to be a significant undertaking (i.e. a long and expensive process), so the AI or its training sets will probably not be updated at a particularly high rate.
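One way such update control might be implemented (sketched here with invented file names and a placeholder fingerprint) is to tie the released build to a fingerprint of the exact training data and model the evidence was produced for, so any change forces the evidence campaign to be redone:

    # Sketch: gate a release on the training data / model the evidence covers.
    # File names and the approved fingerprint are placeholders.
    import hashlib
    from pathlib import Path

    def fingerprint(*paths: str) -> str:
        """SHA-256 over the listed artifacts (training set, model weights, ...)."""
        h = hashlib.sha256()
        for p in paths:
            h.update(Path(p).read_bytes())
        return h.hexdigest()

    APPROVED_FINGERPRINT = "..."  # recorded when the evidence campaign was completed

    current = fingerprint("training_set.bin", "model_weights.bin")
    if current != APPROVED_FINGERPRINT:
        raise SystemExit("Training data or model changed: re-run the safety "
                         "evidence campaign before releasing this build.")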

answered Nov 15 '22 by Peter