Multiple Choices and Reputation in Multiagent Interactions

Siang Chong; Xin Yao

doi:10.1109/TEVC.2006.890544

Computer Science

Research output: Contribution to journal › Article

26 Citations (Scopus)

Abstract

Coevolutionary learning provides a framework for modeling more realistic iterated prisoner's dilemma (IPD) interactions and for studying how and why certain behaviors (e.g., cooperation) in a complex environment can be learned through an adaptation process guided by strategic interactions. The coevolutionary learning of cooperative behaviors can be attributed to the mechanism of direct reciprocity (e.g., repeated encounters). However, for the more complex IPD game with more choices, it is not known precisely why direct reciprocity is less effective in promoting the learning of cooperative behaviors. Our study suggests that the evolution of defection may result from strategies effectively having more opportunities to exploit others when there are more choices. We note that strategies are less able to resolve the intention behind an intermediate choice, e.g., whether it is a signal to engender further cooperation or a subtle exploitation. A likely consequence is that strategies that cannot resolve the intention of opponents adapt to lower-cooperation plays that offer higher payoffs in the short term. However, cooperation in complex human interactions may also involve indirect interactions rather than direct interactions only. Following this, we study the coevolutionary learning of the IPD with more choices and reputation. Here, current behavioral interactions depend not only on choices made in previous moves (direct interactions), but also on choices made in past interactions, as reflected by reputation scores (indirect interactions). The coevolutionary learning of cooperative behaviors is possible in the IPD with more choices when strategies use reputation as a mechanism to estimate the behaviors of future partners and to elicit mutual cooperation right from the start of interactions. In addition, we study how accurately different implementations of reputation estimation reflect strategy behaviors, and why this accuracy is important for the evolution of cooperation. We show that the accuracy is related to how memory of games from previous generations is incorporated to calculate reputation scores and how frequently reputation scores are updated.
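The abstract does not state the payoff function or how reputation scores are computed, so the following is only an illustrative sketch under stated assumptions. It assumes a linear-interpolation payoff that is common in the multi-choice IPD literature (cooperation levels in [-1, 1], reducing to the payoffs 4, 5, 1, 0 at the extremes), and a hypothetical reputation score defined as the mean of an agent's past cooperation levels. The names `payoff`, `choice_levels`, and `reputation` are invented for illustration and are not taken from the paper.

```python
def payoff(own: float, other: float) -> float:
    """Payoff to the player choosing `own` against `other`.

    Assumed linear interpolation of the classic 2x2 payoffs, with
    -1.0 meaning full defection and +1.0 meaning full cooperation.
    """
    if not (-1.0 <= own <= 1.0 and -1.0 <= other <= 1.0):
        raise ValueError("cooperation levels must lie in [-1, 1]")
    return 2.5 - 0.5 * own + 2.0 * other


def choice_levels(n: int) -> list[float]:
    """Return n evenly spaced cooperation levels from -1.0 to +1.0."""
    if n < 2:
        raise ValueError("need at least two choices")
    return [-1.0 + 2.0 * i / (n - 1) for i in range(n)]


def reputation(past_choices: list[float]) -> float:
    """Hypothetical reputation score: mean of past cooperation levels.

    Returns 0.0 (neutral) when no history is available.
    """
    if not past_choices:
        return 0.0
    return sum(past_choices) / len(past_choices)


if __name__ == "__main__":
    # With four choices, an intermediate play against a full cooperator
    # pays slightly more than full mutual cooperation, which is one way
    # to picture the "subtle exploitation" the abstract mentions.
    levels = choice_levels(4)
    print("levels:", levels)
    print("mutual cooperation:", payoff(1.0, 1.0))
    print("intermediate vs. cooperator:", payoff(levels[2], 1.0))
    print("reputation of a mixed history:", reputation([1.0, levels[2], -1.0]))
```

Under these assumptions, every step away from full cooperation raises the short-term payoff against a cooperative partner, while a reputation score built from past choices gives a partner something to condition on before the first move.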

Original language	English
Pages (from-to)	689-711
Number of pages	23
Journal	IEEE Transactions on Evolutionary Computation
Volume	11
Issue number	6
DOIs	https://doi.org/10.1109/TEVC.2006.890544
Publication status	Published - 1 Jan 2007

Keywords

coevolutionary learning
evolutionary computation
evolutionary games
reputation
iterated prisoner's dilemma (IPD)

Market Based Control of Complex Computational Systems
Yao, X.
Engineering & Physical Science Research Council
1/10/04 → 31/03/10
Project: Research Councils

Cite this

@article{4db382a5d8574f4483a224f7ff62bdf6,

title = "Multiple Choices and Reputation in Multiagent Interactions",

abstract = "Coevolutionary learning provides a framework for modeling more realistic iterated prisoner's dilemma (IPD) interactions and to study conditions of how and why certain behaviors (e.g., cooperation) in a complex environment can be learned through an adaptation process guided by strategic interactions. The coevolutionary learning of cooperative behaviors can be attributed to the mechanism of direct reciprocity (e.g., repeated encounters). However, for the more complex IPD game with more choices, it is unknown precisely why the mechanism of direct reciprocity is less effective in promoting the learning of cooperative behaviors. Here, our study suggests that the evolution of defection may be a result of strategies effectively having more opportunities to exploit others when there are more choices. We note that strategies are less able to resolve the intention of an intermediate choice, e.g., whether it is a signal to engender further cooperation or a subtle exploitation. A likely consequence is strategies adapting to lower cooperation plays that offer higher payoffs in the short-term view when they cannot resolve the intention of opponents. However, cooperation in complex human interactions may also involve indirect interactions rather than direct interactions only. Following this, we study the coevolutionary learning of IPD with more choices and reputation. Here, current behavioral interactions depend not only on choices made in previous moves (direct interactions), but also choices made in past interactions that are reflected by their reputation scores (indirect interactions). The coevolutionary learning of cooperative behaviors is possible in the IPD with more choices when strategies use reputation as a mechanism to estimate behaviors of future partners and to elicit mutual cooperation play right from the start of interactions. 
In addition, we study the impact of the accuracy of reputation estimation in reflecting strategy behaviors of different implementations and why it is important for the evolution of cooperation. We show that the accuracy is related to how memory of games from previous generations is incorporated to calculate reputation scores and how frequently reputation scores are updated.",

keywords = "coevolutionary learning, evolutionary computation, evolutionary games, reputation, iterated prisoner's dilemma (IPD)",

author = "Siang Chong and Xin Yao",

year = "2007",

month = jan,

day = "1",

doi = "10.1109/TEVC.2006.890544",

language = "English",

volume = "11",

pages = "689--711",

journal = "IEEE Transactions on Evolutionary Computation",

publisher = "Institute of Electrical and Electronics Engineers (IEEE)",

number = "6",

}

TY - JOUR

T1 - Multiple Choices and Reputation in Multiagent Interactions

AU - Chong, Siang

AU - Yao, Xin

PY - 2007/1/1

Y1 - 2007/1/1

N2 - Coevolutionary learning provides a framework for modeling more realistic iterated prisoner's dilemma (IPD) interactions and to study conditions of how and why certain behaviors (e.g., cooperation) in a complex environment can be learned through an adaptation process guided by strategic interactions. The coevolutionary learning of cooperative behaviors can be attributed to the mechanism of direct reciprocity (e.g., repeated encounters). However, for the more complex IPD game with more choices, it is unknown precisely why the mechanism of direct reciprocity is less effective in promoting the learning of cooperative behaviors. Here, our study suggests that the evolution of defection may be a result of strategies effectively having more opportunities to exploit others when there are more choices. We note that strategies are less able to resolve the intention of an intermediate choice, e.g., whether it is a signal to engender further cooperation or a subtle exploitation. A likely consequence is strategies adapting to lower cooperation plays that offer higher payoffs in the short-term view when they cannot resolve the intention of opponents. However, cooperation in complex human interactions may also involve indirect interactions rather than direct interactions only. Following this, we study the coevolutionary learning of IPD with more choices and reputation. Here, current behavioral interactions depend not only on choices made in previous moves (direct interactions), but also choices made in past interactions that are reflected by their reputation scores (indirect interactions). The coevolutionary learning of cooperative behaviors is possible in the IPD with more choices when strategies use reputation as a mechanism to estimate behaviors of future partners and to elicit mutual cooperation play right from the start of interactions. 
In addition, we study the impact of the accuracy of reputation estimation in reflecting strategy behaviors of different implementations and why it is important for the evolution of cooperation. We show that the accuracy is related to how memory of games from previous generations is incorporated to calculate reputation scores and how frequently reputation scores are updated.

KW - coevolutionary learning

KW - evolutionary computation

KW - evolutionary games

KW - reputation

KW - iterated prisoner's dilemma (IPD)

U2 - 10.1109/TEVC.2006.890544

DO - 10.1109/TEVC.2006.890544

M3 - Article

VL - 11

SP - 689

EP - 711

JO - IEEE Transactions on Evolutionary Computation

JF - IEEE Transactions on Evolutionary Computation

IS - 6

ER -
