Article Catalog (Journal Indexes)


Title:
Multiagent reinforcement learning using function approximation
Author:
Abul, O.; Polat, F.; Alhajj, R.
Addresses:
Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey; Amer Univ Sharjah, Dept Math & Comp Sci, Sharjah, U Arab Emirates
Journal Title:
IEEE TRANSACTIONS ON SYSTEMS MAN AND CYBERNETICS PART C-APPLICATIONS AND REVIEWS
issue: 4, volume: 30, year: 2000,
pages: 485 - 497
SICI:
1094-6977(200011)30:4<485:MRLUFA>2.0.ZU;2-A
Source:
ISI
Language:
ENG
Keywords:
adaptive behavior; multiagent learning; reinforcement learning;
Document Type:
Article
Nature:
Periodical
Subject Area:
Engineering, Computing & Technology
Citations:
38
Review:
Reprint Address:
Abul, O., Middle E Tech Univ, Dept Comp Engn, TR-06531 Ankara, Turkey
Citation:
O. Abul et al., "Multiagent reinforcement learning using function approximation", IEEE SYST C, 30(4), 2000, pp. 485-497

Abstract

Learning in a partially observable and nonstationary environment is still one of the challenging problems in the area of multiagent (MA) learning. Reinforcement learning is a generic method that suits the needs of MA learning in many aspects. This paper presents two new multiagent-based, domain-independent coordination mechanisms for reinforcement learning; multiple agents do not require explicit communication among themselves to learn coordinated behavior. The first is the perceptual coordination mechanism, where other agents are included in state descriptions and coordination information is learned from state transitions. The second is the observing coordination mechanism, which also includes other agents in state descriptions and additionally observes the rewards of nearby agents from the environment. The observed rewards and the agent's own reward are used to construct an optimal policy; this way, the latter mechanism tends to increase region-wide joint rewards. The selected experimental domain is the adversarial food-collecting world (AFCW), which can be configured as both a single-agent and a multiagent environment. Function approximation and generalization techniques are used because of the huge state space. Experimental results show the effectiveness of these mechanisms.
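The abstract's use of function approximation for reinforcement learning over a huge state space can be illustrated with a minimal sketch. This is not the paper's implementation: the one-hot features, learning rate, and discount factor below are illustrative assumptions, and a real multiagent setup would fold other agents (and, for the observing mechanism, their observed rewards) into the state features.

```python
# Minimal sketch of Q-learning with linear function approximation.
# All names and parameters here are illustrative assumptions, not the
# paper's actual setup.

def features(state, action, n_states, n_actions):
    """One-hot (state, action) feature vector; a stand-in for the
    generalizing features a real approximator would use over a large
    state space."""
    vec = [0.0] * (n_states * n_actions)
    vec[state * n_actions + action] = 1.0
    return vec

def q_value(weights, phi):
    """Approximate Q(s, a) as the dot product w . phi(s, a)."""
    return sum(w * f for w, f in zip(weights, phi))

def td_update(weights, phi, reward, q_next_max, gamma=0.9, alpha=0.1):
    """One TD(0) gradient-style update: w <- w + alpha * delta * phi,
    where delta is the temporal-difference error."""
    delta = reward + gamma * q_next_max - q_value(weights, phi)
    return [w + alpha * delta * f for w, f in zip(weights, phi)]
```

With one-hot features this reduces to tabular Q-learning; the point of the linear form is that richer, overlapping features let learned values generalize across the state space, which is what makes the approach tractable in a domain like AFCW.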

ASDD Departmental and Documentary Systems Area, Università di Bologna, Catalog of journals and other periodicals
Document generated on 24/09/20 at 05:01:23