Anti-adversarial machine learning defenses start to take root

Much of the anti-adversarial research has focused on the potential for minute, largely undetectable alterations to images (researchers generally refer to these as “noise perturbations”) that cause AI’s machine learning (ML) algorithms to misidentify or misclassify the images. Adversarial tampering can be extremely subtle and hard to detect, even at the level of individual pixels. If an attacker can introduce nearly invisible alterations to image, video, speech, or other data for the purpose of fooling AI-powered classification tools, it will be difficult to trust this otherwise sophisticated technology to do its job effectively.
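To see why such tiny changes matter, consider a toy illustration (my own sketch, not drawn from the research described here): for a linear classifier, nudging every pixel by a barely perceptible amount in the direction indicated by the model’s weights shifts the decision score by the sum of all those nudges, which can be enough to flip the predicted class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy linear "image classifier": positive score => class "stop sign".
d = 32 * 32                          # a 32x32 grayscale image, flattened
w = rng.normal(size=d)               # hypothetical learned weights
x = rng.uniform(0.0, 1.0, size=d)    # a legitimate image, pixel values in [0, 1]
b = 1.0 - w @ x                      # bias chosen so the clean score is exactly +1.0

clean_score = w @ x + b              # +1.0: the model (narrowly) says "stop sign"

# Worst-case perturbation under an L-infinity budget eps: nudge every pixel by
# at most eps in the direction that lowers the score (the sign of each weight).
eps = 0.02                           # a 2% change per pixel, visually negligible
delta = -eps * np.sign(w)
adv_score = w @ np.clip(x + delta, 0.0, 1.0) + b

# Each pixel moved by at most eps, but the score shifted by roughly
# eps * sum(|w|), which is more than enough to cross the decision boundary.
print(f"clean score:       {clean_score:+.2f}")   # +1.00
print(f"adversarial score: {adv_score:+.2f}")     # strongly negative
print(f"max pixel change:  {np.abs(delta).max():.3f}")
```

Real image classifiers are nonlinear, but this budget-versus-aggregate-effect arithmetic is exactly what gradient-based attacks exploit.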

Growing threat to deployed AI apps

This is no idle threat. Eliciting false algorithmic inferences can cause an AI-based app to make incorrect decisions, such as when a self-driving vehicle misreads a traffic sign and then turns the wrong way or, in a worst-case scenario, crashes into a building, vehicle, or pedestrian. Though the research literature focuses on simulated adversarial ML attacks conducted in controlled laboratory environments, general knowledge that these attack vectors exist will almost certainly tempt terrorists, criminals, and other malicious parties to exploit them.

Although high-profile adversarial attacks did not appear to affect the ML that powered this year’s U.S. presidential campaign, we cannot rule out such attacks in future electoral cycles. Throughout this pandemic-wracked year, adversarial attacks on ML platforms have continued to intensify in other sectors of our lives.

This year, the National Vulnerability Database (part of the U.S. National Institute of Standards and Technology) issued its first CVE for an ML component in a commercial system. Also, the Software Engineering Institute’s CERT Coordination Center issued its first vulnerability note flagging the extent to which many operational ML systems are vulnerable to arbitrary misclassification attacks.

Late last year, Gartner predicted that during the next two years, 30 percent of all cyberattacks on AI apps would use adversarial tactics. Sadly, it would be premature to say that anti-adversarial best practices are taking hold within the AI community. A survey by Microsoft found that few industry practitioners are taking the threat of adversarial machine learning seriously at this point or using tools that can mitigate the risks of such attacks.

Even if it were possible to identify adversarial attacks in progress, targeted organizations would find it challenging to respond to these assaults in all their dizzying diversity. And there’s no saying whether ad-hoc responses to new threats will coalesce into a pre-emptive anti-adversarial AI “hardening” strategy anytime soon.

To help address this gap, Microsoft, MITRE, and several other organizations recently released the Adversarial ML Threat Matrix. This is an open, extensible framework, structured like MITRE’s widely adopted ATT&CK knowledge base, that helps security analysts classify the most common adversarial tactics that have been used to disrupt and deceive ML systems.

The initiative’s published case studies show how real-world attacks on deployed ML systems can be analyzed using this framework.

As discussed in the framework, adversaries rely on several principal tactics for compromising ML apps, including the following.

Functional extraction involves unauthorized recovery of a functionally equivalent ML model by iteratively querying a deployed model with arbitrary inputs and recording its responses. From those responses, the attacker can infer and build a high-fidelity offline copy of the model, which can then guide further attacks against the deployed production model.
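As a rough illustration of how little an attacker needs, here is a hedged sketch in Python using scikit-learn; the victim model, the query_victim helper, and the probe inputs are hypothetical stand-ins for a real prediction API.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Data the victim's owner used for training; the attacker never sees it.
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_query, y_train, _ = train_test_split(X, y, test_size=0.5, random_state=0)

# The "victim": a deployed model the attacker can query but cannot inspect.
victim = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

def query_victim(inputs):
    """Stand-in for the victim's public prediction API."""
    return victim.predict(inputs)

# The attacker sends arbitrary inputs, records the returned labels, and fits a
# local surrogate model on those (input, prediction) pairs.
stolen_labels = query_victim(X_query)
surrogate = LogisticRegression(max_iter=1000).fit(X_query, stolen_labels)

# Fidelity check: how often the offline copy agrees with the deployed model
# on fresh probe inputs that the attacker generates itself.
probes = np.random.default_rng(1).normal(size=(1000, 20))
agreement = (surrogate.predict(probes) == query_victim(probes)).mean()
print(f"surrogate agrees with victim on {agreement:.1%} of probe queries")
```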

Model evasion occurs when attackers iteratively craft inputs with subtle alterations, such as pixel-level changes to images. The changes are practically undetectable to human senses but cause vulnerable ML models to misclassify the doctored images or other content.
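The canonical example of this tactic in the research literature is the fast gradient sign method (FGSM), which perturbs each pixel by a small, bounded amount in the direction that increases the model’s loss. Below is a minimal PyTorch sketch; the untrained toy CNN is a placeholder, so it demonstrates the mechanics rather than a guaranteed misclassification, but against a real trained classifier a perturbation this small often flips the prediction.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

model = nn.Sequential(                       # stand-in for a deployed image classifier
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(16, 10),
)
model.eval()

x = torch.rand(1, 3, 32, 32)                 # a "clean" image, pixels in [0, 1]
y_clean = model(x).argmax(dim=1)             # the class the model currently assigns

# FGSM: take one step of size eps in the direction that increases the loss
# for the currently predicted class, keeping every pixel within +/- eps.
eps = 8 / 255
x_adv = x.clone().requires_grad_(True)
loss = F.cross_entropy(model(x_adv), y_clean)
loss.backward()
x_adv = (x_adv + eps * x_adv.grad.sign()).clamp(0, 1).detach()

y_adv = model(x_adv).argmax(dim=1)
print("clean prediction:      ", y_clean.item())
print("adversarial prediction:", y_adv.item())
print("max pixel change:      ", (x_adv - x).abs().max().item())
```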

To defend against these tactics, the threat matrix’s publishers recommend a set of development and operational practices that includes several critical countermeasures.

Secure coding practices would reduce exploitable adversarial vulnerabilities in ML programs and enable other engineers to audit source code. In addition, security-compliant code examples in popular ML frameworks would help spread adversarially hardened ML apps. So far, TensorFlow is the only ML framework that provides consolidated guidance on traditional software attacks along with links to tools for testing against adversarial attacks. The framework’s authors also recommend exploring whether containerizing ML apps can help isolate uncompromised ML systems from those that have been adversarially compromised.
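As one concrete example of what secure coding could look like at the serving layer, here is a hedged sketch of defensive input validation. It is not taken from TensorFlow’s guidance or the threat matrix, and names such as validate_image and the shape and range constants are illustrative.

```python
import numpy as np

EXPECTED_SHAPE = (32, 32, 3)        # what the deployed model was trained on
PIXEL_MIN, PIXEL_MAX = 0.0, 1.0

def validate_image(batch: np.ndarray) -> np.ndarray:
    """Reject malformed or out-of-range inputs before they reach the model."""
    if batch.ndim != 4 or batch.shape[1:] != EXPECTED_SHAPE:
        raise ValueError(f"unexpected input shape {batch.shape}")
    if not np.isfinite(batch).all():
        raise ValueError("input contains NaN or infinite values")
    if batch.min() < PIXEL_MIN or batch.max() > PIXEL_MAX:
        raise ValueError("pixel values outside the expected range")
    # Clipping and re-quantizing to 8 bits removes some low-amplitude
    # perturbations; it is a mitigation, not a guarantee.
    return np.round(np.clip(batch, PIXEL_MIN, PIXEL_MAX) * 255) / 255

# Usage inside a serving handler (illustrative):
# model_input = validate_image(decode_request(request))
# prediction = model.predict(model_input)
```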

Code analysis tools help detect potential adversarial weaknesses in ML apps, either statically in the code itself or dynamically as the apps execute particular code paths. ML tools such as cleverhans and secml support varying degrees of static and dynamic ML code testing. The Adversarial ML Threat Matrix’s publishers call for such tools to be integrated with full-featured ML development toolkits to support fine-grained code assessment before ML apps are committed to the code repository. They also recommend integrating dynamic code-analysis tools for adversarial ML into CI/CD pipelines, which would support automation of adversarial ML testing in production ML apps.
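The CI/CD recommendation might look something like the following pytest sketch, in which a robustness gate fails the build when accuracy under bounded perturbations drops too far. The eps budget, the 0.70 threshold, and the load_model and load_eval_batch hooks are placeholders, and in practice the perturbed batch would come from an attack library such as cleverhans or secml rather than random noise.

```python
import pytest
import torch

EPS = 8 / 255
MIN_ROBUST_ACCURACY = 0.70

@pytest.mark.skip(reason="illustrative sketch; supply load_model/load_eval_batch first")
def test_model_withstands_bounded_perturbations():
    model = load_model()          # placeholder: the candidate model artifact
    x, y = load_eval_batch()      # placeholder: a fixed, labeled evaluation batch
    model.eval()

    # Cheap stand-in for a real attack: random L-infinity noise within +/- EPS.
    x_perturbed = (x + EPS * torch.empty_like(x).uniform_(-1, 1).sign()).clamp(0, 1)

    robust_acc = (model(x_perturbed).argmax(dim=1) == y).float().mean().item()
    assert robust_acc >= MIN_ROBUST_ACCURACY, (
        f"accuracy under perturbation {robust_acc:.2f} is below {MIN_ROBUST_ACCURACY}"
    )
```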

System auditing and logging tools support runtime detection of adversarial and other anomalous processes executing on ML systems. The matrix’s publishers call for ML platforms to use these tools to monitor, at a minimum, for the attacks listed in the curated repository. This would enable tracing adversarial attacks back to their sources and exporting anomalous event logs to security information and event management (SIEM) systems. They propose that detection methods be written in a format that facilitates easy sharing among security analysts. They also recommend that the adversarial ML research community register adversarial vulnerabilities in a trackable system such as the National Vulnerability Database in order to alert affected vendors, users, and other stakeholders.
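A minimal version of such runtime auditing might look like the following sketch, which logs every prediction request as a structured event and flags clients whose query volume resembles model-extraction probing. The thresholds, field names, and record_query helper are illustrative, not part of the threat matrix.

```python
import json
import logging
import time
from collections import defaultdict, deque

QUERY_RATE_LIMIT = 500          # queries per client per 60-second window
WINDOW_SECONDS = 60

audit_log = logging.getLogger("ml.audit")
logging.basicConfig(level=logging.INFO, format="%(message)s")

_recent_queries = defaultdict(deque)   # client_id -> timestamps of recent queries

def record_query(client_id: str, model_name: str, predicted_class: int) -> None:
    """Emit a structured audit event; flag clients with anomalous query rates."""
    now = time.time()
    window = _recent_queries[client_id]
    window.append(now)
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()

    event = {
        "ts": now,
        "client_id": client_id,
        "model": model_name,
        "prediction": predicted_class,
        "queries_last_window": len(window),
        "suspected_extraction": len(window) > QUERY_RATE_LIMIT,
    }
    # JSON lines are easy to ship to a SIEM or other log aggregator.
    audit_log.info(json.dumps(event))

# Usage inside a serving handler (illustrative):
# record_query(client_id=request.client, model_name="traffic-sign-v3",
#              predicted_class=int(prediction))
```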

A growing knowledge base

The new anti-adversarial framework’s authors provide access, through their GitHub repository, to what they call a “curated repository of attacks.” Every attack documented in this searchable resource includes a description of the adversarial technique, the type of advanced persistent threat that has been observed using the tactic, recommendations for detecting it, and references to publications that provide further insight.

As they become aware of new adversarial ML attack vectors, AI and security professionals should register those in this repository. This way the initiative can keep pace with the growing range of threats to the integrity, security, and reliability of deployed ML apps.

Going forward, AI application developers and security analysts should also:

  • Assume the possibility of adversarial attacks on all in-production ML applications.
  • Perform adversarial threat assessments prior to writing or deploying vulnerable code.
  • Generate adversarial examples as a standard risk-mitigation activity in the AI training pipeline (a sketch of this step appears after this list).
  • Test AI apps against a wide range of adversarial inputs to determine the robustness of their inferences.
  • Reuse adversarial-defense knowledge, such as that provided by the new Adversarial ML Threat Matrix, to improve AI resilience against bogus input examples.
  • Update ongoing adversarial attack defenses throughout the lifecycle of deployed AI models.
  • Ensure data scientists have sophisticated anti-adversarial methodologies to guide them in applying these practices throughout the AI development and operationalization lifecycle.
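To make the adversarial-example recommendation above concrete, here is a hedged PyTorch sketch of adversarial training, one common way to fold generated adversarial examples into the training pipeline. The toy model, the random placeholder data, and the FGSM step size are illustrative rather than prescribed by the threat matrix.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 10))  # toy classifier
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
EPS = 8 / 255

def fgsm_batch(model, x, y, eps):
    """Generate one-step L-infinity adversarial examples for a labeled batch."""
    x = x.clone().requires_grad_(True)
    F.cross_entropy(model(x), y).backward()
    return (x + eps * x.grad.sign()).clamp(0, 1).detach()

for step in range(100):                       # stands in for a real training loop
    x = torch.rand(64, 3, 32, 32)             # placeholder data; use your own dataset
    y = torch.randint(0, 10, (64,))
    x_adv = fgsm_batch(model, x, y, EPS)      # generate adversarial examples on the fly
    optimizer.zero_grad()                     # clear gradients left over from the attack
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
```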

For further information on the new Adversarial ML Threat Matrix, check out the initiative’s GitHub repository and the Carnegie Mellon SEI/CERT announcement of the framework. Other useful resources for security analysts developing their own anti-adversarial strategies include Microsoft’s guidance on threat modeling ML systems and on systematically triaging attacks on ML systems.

