Strategic Sophistication vs Adversarial Machine Learning

Throughout our research, a recurring question from other researchers has been how strategic sophistication differs from adversarial machine learning and from reward poisoning in reinforcement learning. In this brief blog post, I aim to highlight the differences between these approaches and to situate strategic sophistication within the broader academic landscape.

Before discussing adversarial machine learning, it’s important to highlight a fundamental difference between the two domains. In adversarial machine learning, the defender and the adversary typically operate in a purely competitive environment, usually modeled as a zero-sum game. In contrast, our setting applies to any general-sum game, encompassing both competitive and cooperative scenarios.

The adversarial machine learning literature analyzes attack and defense strategies in scenarios where machine learning algorithms, acting as predictors, are confronted with adversaries. The primary objective of this field is to develop robust algorithms capable of withstanding such malicious attacks. Attacks are typically classified by their timing: causative attacks target the learning algorithm, while exploratory attacks target the trained model. Generally, causative attacks involve the adversary manipulating the training dataset, whereas exploratory attacks occur online, after the training phase. In the context of reinforcement learning, causative attacks are often referred to as poisoning attacks, in which the adversary perturbs the rewards provided to the learning agent during offline learning. Exploratory attacks in reinforcement learning instead manipulate the rewards and observed states received by the agent after training.
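To make the reward-poisoning case concrete, here is a minimal sketch of a causative attack on tabular Q-learning. The toy environment, the perturbation rule, and all constants are illustrative assumptions rather than a method from any specific paper: the adversary shifts each training reward within a fixed budget so that a suboptimal action looks strictly better to the learner.

```python
import numpy as np

# Minimal sketch of a causative (reward-poisoning) attack on tabular
# Q-learning. Environment, perturbation rule, and constants are all
# illustrative assumptions.

rng = np.random.default_rng(0)
n_states, n_actions = 4, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma, eps = 0.1, 0.9, 0.1

def true_reward(s, a):
    # Hypothetical environment: action 0 is optimal in every state.
    return 1.0 if a == 0 else 0.0

def poison(r, a, budget=0.75):
    # The adversary perturbs the training reward within a budget so the
    # suboptimal action 1 appears strictly better to the learner.
    return r + budget if a == 1 else r - budget

s = 0
for _ in range(20_000):
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    r = poison(true_reward(s, a), a)      # the learner only sees poisoned rewards
    s_next = int(rng.integers(n_states))  # toy random transitions
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("learned greedy policy:", Q.argmax(axis=1))  # all 1s: the attack succeeded
```

Because the learner never observes the true rewards during training, its greedy policy converges to the action the adversary prefers, which is exactly what a causative attack exploits.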

In the context of strategic sophistication in multi-agent reinforcement learning, we consider an agent with complete knowledge of the underlying game and of the learning dynamics of the other agents. This agent can be likened to a white-box attacker. A key distinction of our framework is that the strategic agent’s actions fit neither the causative nor the exploratory category. In our setting, a naive (learning) agent and a strategic agent repeatedly play a game. Since the naive agent does not employ any offline learning algorithm, there is no training dataset to poison, so their interaction cannot be a causative attack. Furthermore, the strategic agent cannot directly manipulate rewards or states, since both are determined by the structure of the game, making exploratory attacks infeasible. In essence, the strategic agent’s potential “attacks” are constrained to the game’s action set. While the adversarial machine learning literature often leverages game-theoretic approaches, the games there are typically formulated as models for analytical convenience; in our case, the game is predefined. Strategic sophistication in multi-agent reinforcement learning is therefore more constrained than adversarial machine learning in its means of influence, yet more general in the class of interactions it covers.
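To illustrate the contrast, here is a minimal sketch of the strategic setting under illustrative assumptions: a hypothetical 2x2 general-sum game, a naive agent that best-responds to the empirical frequency of its opponent’s past actions (fictitious play), and a strategic agent that knows both payoff matrices and this learning rule. The strategic agent’s only lever is a legal action in the game, yet by forgoing its myopically dominant row it steers the learner toward a more profitable outcome.

```python
import numpy as np

# Row player = strategic agent, column player = naive learner.
# Hypothetical payoffs: row 0 is myopically dominant for the strategic
# agent, but steering the learner toward column 1 pays more long-run.
A = np.array([[2.0, 4.0],    # strategic agent's payoffs
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],    # naive agent's payoffs
              [0.0, 1.0]])

counts = np.ones(2)  # naive agent's counts of the strategic agent's rows

def naive_response(counts):
    # Fictitious play: best response to the empirical row frequency.
    freq = counts / counts.sum()
    return int((freq @ B).argmax())

def strategic_action(counts):
    # The only "attack" channel is a legal game action: evaluate each
    # row by the response it induces in the learner next round
    # (a one-step proxy for the full horizon optimization).
    scores = []
    for i in range(2):
        c = counts.copy()
        c[i] += 1
        scores.append(A[i, naive_response(c)])
    return int(np.argmax(scores))

total = 0.0
for _ in range(200):
    i = strategic_action(counts)
    j = naive_response(counts)
    total += A[i, j]
    counts[i] += 1

print("average payoff:", total / 200)  # approaches 3, above the Nash value 2
```

In this toy game, myopically best-responding would earn the strategic agent the Nash payoff of 2; by playing the dominated row and shaping the learner’s beliefs, it secures roughly 3 per round, all without ever touching rewards, states, or training data.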

In conclusion, while adversarial machine learning primarily focuses on malicious attacks against machine learning systems, strategic sophistication addresses a broader range of strategic behaviors in multi-agent settings.