AI Risk · Security
Model Evasion Attack
Model Evasion attacks manipulate inputs at inference time so that a model produces the output the attacker wants.
📋 Description
In Model Evasion attacks, malicious actors craft adversarial data to circumvent a system. For example, spam emails or malware can be designed to look inconspicuous so that they do not trigger a negative label. Adversarial examples are typically derived programmatically, either by probing for the decision boundary between classes and pushing an input across it (e.g., padding a message with a large number of "not spam-like" words) or by adding "noise" that confuses the model.
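To make the "not spam-like" words example concrete, here is a minimal sketch against a toy linear spam filter. Everything in it is hypothetical: the `WEIGHTS` table, the `THRESHOLD`, and the `good_word_attack` helper are illustrative assumptions, not a real filter or a specific published attack. It also assumes the attacker can read the weights directly; in practice an attacker would more likely have to estimate them by querying the system.

```python
# Hypothetical per-word weights for a toy linear spam filter.
# Positive values are spam-like, negative values are "not spam-like".
WEIGHTS = {
    "free": 2.0, "winner": 2.5, "prize": 1.5, "click": 1.0,
    "meeting": -1.5, "agenda": -1.2, "invoice": -0.8, "thanks": -1.0,
}
THRESHOLD = 0.0  # assumed rule: a score above this is labeled spam


def spam_score(text):
    """Sum the weights of the known words in the message."""
    return sum(WEIGHTS.get(word, 0.0) for word in text.lower().split())


def is_spam(text):
    return spam_score(text) > THRESHOLD


def good_word_attack(text, max_added=20):
    """Greedily append the most 'not spam-like' word until the label flips."""
    best_word = min(WEIGHTS, key=WEIGHTS.get)  # word with the most negative weight
    added = 0
    while is_spam(text) and added < max_added:
        text += " " + best_word
        added += 1
    return text, added


original = "free prize winner click"
evasive, n_added = good_word_attack(original)
print(is_spam(original), is_spam(evasive), n_added)
```

The spam payload is left unchanged; only benign padding is added until the score crosses the decision boundary. This is why a filter that relies on a simple additive score can be easy to evade.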
These attacks are often feasible when an actor can query the system many times and learn which inputs succeed. Mitigating them means limiting access (for example, capping how often a client can query the model), monitoring for anomalous behavior, and testing how easily the model can be fooled.
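As one illustration of limiting access, the sketch below caps how many queries a single client can make within a sliding time window. The `QueryMonitor` class, its parameters, and the client identifier are hypothetical names chosen for this example. A real deployment would likely combine a cap like this with logging and anomaly detection rather than rely on it alone.

```python
from collections import defaultdict, deque


class QueryMonitor:
    """Sliding-window cap on queries per client (illustrative only)."""

    def __init__(self, max_queries=5, window_seconds=60.0):
        self.max_queries = max_queries
        self.window_seconds = window_seconds
        self._history = defaultdict(deque)  # client_id -> accepted query timestamps

    def allow(self, client_id, now):
        """Return True if the query is allowed at time `now` (in seconds)."""
        timestamps = self._history[client_id]
        # Discard timestamps that have fallen out of the window.
        while timestamps and now - timestamps[0] >= self.window_seconds:
            timestamps.popleft()
        if len(timestamps) >= self.max_queries:
            return False  # too many recent queries: possible probing
        timestamps.append(now)
        return True


monitor = QueryMonitor(max_queries=3, window_seconds=10.0)
decisions = [monitor.allow("client-a", t) for t in (0.0, 1.0, 2.0, 3.0, 11.0)]
print(decisions)
```

Timestamps are passed in explicitly so the behavior is easy to test. In production code, `time.monotonic()` would be a natural source for `now`.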