Fourth International Conference on Hybrid Intelligent Systems (HIS'04)
…
In this paper, we describe how certain aspects of the biological phenomenon of stigmergy can be imported into multiagent reinforcement learning (MARL), with the purpose of better enabling coordination of agent actions and speeding up learning. In particular, we detail how these stigmergic aspects can be used to define an inter-agent communication framework.
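A minimal sketch of what such a stigmergy-inspired communication channel could look like is given below; the grid structure, deposit/evaporation rates, and selection rule are illustrative assumptions rather than the framework defined in the paper.

```python
import random
from collections import defaultdict

class PheromoneGrid:
    """Shared environment layer that agents read and write instead of messaging each other directly."""
    def __init__(self, evaporation=0.1):
        self.levels = defaultdict(float)   # cell -> pheromone intensity
        self.evaporation = evaporation

    def deposit(self, cell, amount=1.0):
        self.levels[cell] += amount

    def evaporate(self):
        # Decay keeps stale information from dominating future decisions.
        for cell in list(self.levels):
            self.levels[cell] *= (1.0 - self.evaporation)

    def read(self, cell):
        return self.levels[cell]

def choose_next_cell(grid, neighbours, epsilon=0.2):
    """Greedy-in-pheromone action selection with a small exploration rate."""
    if random.random() < epsilon:
        return random.choice(neighbours)
    return max(neighbours, key=grid.read)

# Example: two agents coordinate only through the shared grid.
grid = PheromoneGrid()
grid.deposit((1, 0), 2.0)                         # agent A marks a promising cell
print(choose_next_cell(grid, [(0, 1), (1, 0)]))   # agent B is biased toward (1, 0)
grid.evaporate()
```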
Computers in Industry, 2004
This paper describes and discusses a novel design and a prototype implementation for manufacturing control systems, aimed at handling changes and disturbances. This novel design utilizes the concept of a multi-agent system. Agents in this system use an indirect coordination mechanism, called stigmergy. Stigmergy is a class of mechanisms that mediate animal-animal interactions: it consists of indirect communication between individuals of an insect society through local modifications that these insects make to their environment. The coordination mechanism in this paper is based on a technique used by food-foraging ants. These ants provide the inspiration through the manner in which they spread information and make global information available locally; thus, an ant agent only needs to observe its local environment in order to account for nonlocal concerns in its decisions. A prototype was built to test the coordination technique. The prototype comprises a flexible manufacturing system model/emulation that has dynamic order arrival, probabilistic processing times, and some general perturbations such as machine breakdowns. The prototype served to investigate a specific research question: is it possible to create short-term forecasts based on the intentions of the agents? It has been intentionally kept simple to facilitate the understanding of what is happening in the system. The size and complexity of the prototype implementation are being increased gradually in ongoing research.
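The forecasting idea can be illustrated with a small sketch in which order agents register their intentions at a machine and stale intentions evaporate; the class, field names, and time-out rule below are assumptions for illustration, not the prototype's actual design.

```python
class Machine:
    """Workstation that aggregates intention 'pheromones' into a short-term load forecast."""
    def __init__(self, name):
        self.name = name
        self.intentions = {}   # order_id -> (expected_arrival, processing_time, expires_at)

    def register_intention(self, order_id, arrival, proc_time, now, ttl=10):
        # Intentions evaporate: if the order agent stops refreshing them, they time out.
        self.intentions[order_id] = (arrival, proc_time, now + ttl)

    def forecast_load(self, horizon, now):
        # Drop expired intentions, then sum the work expected inside the forecast window.
        self.intentions = {k: v for k, v in self.intentions.items() if v[2] > now}
        return sum(proc for arrival, proc, _ in self.intentions.values()
                   if now <= arrival <= now + horizon)

mill = Machine("milling-1")
mill.register_intention(order_id="O-17", arrival=3, proc_time=5, now=0)
mill.register_intention(order_id="O-18", arrival=8, proc_time=2, now=0)
print(mill.forecast_load(horizon=10, now=0))   # expected workload over the next 10 time units -> 7
```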
… on Artificial Multi-Agent Learning, 2004
We survey the recent work in AI on multi-agent reinforcement learning (that is, learning in stochastic games). We then argue that, while exciting, this work is flawed. The fundamental flaw is unclarity about the problem or problems being addressed. After tracing a representative sample of the recent literature, we identify four well-defined problems in multi-agent reinforcement learning, single out the problem that in our view is most suitable for AI, and make some remarks about how we believe progress is to be made on this problem.
2021
Abmarl is a package for developing Agent-Based Simulations and training them with MultiAgent Reinforcement Learning (MARL). We provide an intuitive command line interface for engaging with the full workflow of MARL experimentation: training, visualizing, and analyzing agent behavior. We define an Agent-Based Simulation Interface and Simulation Manager, which control which agents interact with the simulation at each step. We support integration with popular reinforcement learning simulation interfaces, including gym.Env (Brockman et al., 2016) and MultiAgentEnv (Liang et al., 2018). We leverage RLlib’s framework for reinforcement learning and extend it to more easily support custom simulations, algorithms, and policies. We enable researchers to rapidly prototype MARL experiments and simulation design and lower the barrier for pre-existing projects to prototype Reinforcement Learning (RL) as a potential solution.
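As a rough illustration of the dict-keyed, per-agent reset/step contract shared by interfaces such as RLlib's MultiAgentEnv, here is a toy simulation; the class and method bodies are hypothetical and are not Abmarl's actual API.

```python
import random

class SimpleMultiAgentSim:
    """Toy agent-based simulation exposing the usual dict-per-agent reset/step contract."""
    def __init__(self, agent_ids):
        self.agent_ids = list(agent_ids)
        self.t = 0

    def reset(self):
        self.t = 0
        return {aid: 0.0 for aid in self.agent_ids}          # observation per agent

    def step(self, actions):
        self.t += 1
        obs     = {aid: random.random() for aid in actions}  # next observation per agent
        rewards = {aid: float(a)        for aid, a in actions.items()}
        dones   = {aid: self.t >= 5     for aid in actions}
        dones["__all__"] = all(dones.values())               # episode ends when every agent is done
        return obs, rewards, dones, {}

sim = SimpleMultiAgentSim(["agent_0", "agent_1"])
obs = sim.reset()
obs, rew, done, info = sim.step({aid: 1 for aid in obs})
```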
IEEE Transactions on Systems, Man, and Cybernetics, 2008
Multiagent systems are rapidly finding applications in a variety of domains, including robotics, distributed control, telecommunications, and economics. The complexity of many tasks arising in these domains makes them difficult to solve with preprogrammed agent behaviors. The agents must, instead, discover a solution on their own, using learning. A significant part of the research on multiagent learning concerns reinforcement learning techniques. This paper provides a comprehensive survey of multiagent reinforcement learning (MARL). A central issue in the field is the formal statement of the multiagent learning goal. Different viewpoints on this issue have led to the proposal of many different goals, among which two focal points can be distinguished: stability of the agents' learning dynamics, and adaptation to the changing behavior of the other agents. The MARL algorithms described in the literature aim, either explicitly or implicitly, at one of these two goals or at a combination of both, in a fully cooperative, fully competitive, or more general setting. A representative selection of these algorithms is discussed in detail in this paper, together with the specific issues that arise in each category. Additionally, the benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied. Finally, an outlook for the field is provided.
ArXiv, 2021
Multi-agent reinforcement learning has received a lot of attention in recent years and has applications in many different areas. Existing methods involving centralized training and decentralized execution attempt to train the agents towards learning a pattern of coordinated actions to arrive at an optimal joint policy. However, if some agents are stochastic to varying degrees, these methods often fail to converge and provide poor coordination among agents. In this paper we show how this stochasticity of agents, which could be a result of malfunction or aging of robots, can add to the uncertainty in coordination and thereby contribute to unsatisfactory global coordination. In this case, the deterministic agents have to understand the behavior and limitations of the stochastic agents while arriving at an optimal joint policy. Our solution, DSDF, tunes the discount factor for the agents according to uncertainty and uses the values to update the utility networks of i...
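The core idea of scaling each agent's discount factor by its estimated stochasticity can be sketched as follows; the linear schedule and the normalised uncertainty values are illustrative assumptions, not the DSDF update from the paper.

```python
def discounted_factor(base_gamma, uncertainty, min_gamma=0.5):
    """Shrink the effective horizon of agents whose behaviour is unreliable.

    uncertainty is assumed to be normalised to [0, 1]; 0 means a fully deterministic agent.
    """
    return max(min_gamma, base_gamma * (1.0 - uncertainty))

# Deterministic agents keep a long horizon, stochastic ones learn more myopically.
gammas = {agent: discounted_factor(0.99, u)
          for agent, u in {"robot_0": 0.0, "robot_1": 0.4, "robot_2": 0.9}.items()}
print(gammas)   # {'robot_0': 0.99, 'robot_1': 0.594, 'robot_2': 0.5}
```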
This paper presents a novel method for on-line coordination in multiagent reinforcement learning systems. In this method a reinforcement-learning agent learns to select its action by estimating system dynamics in terms of both the natural reward for task achievement and the virtual reward for cooperation. The virtual reward for cooperation is ascertained dynamically by a coordinating agent, who estimates it from the change in the degree of cooperation of all agents using a separate reinforcement learning process. This technique provides adaptive coordination, requires less communication, and ensures that agents remain cooperative. The validity of virtual rewards for convergence in learning is verified, and the proposed method is tested on two different simulated domains to illustrate its significance. The empirical performance of the coordinated system compared to the uncoordinated system illustrates its advantages for multiagent systems.
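A hedged sketch of the reward shaping described here: each learner updates on the sum of its natural task reward and a cooperation bonus supplied by a coordinating agent. The tabular Q-learning update and the example bonus value are assumptions made for illustration.

```python
from collections import defaultdict

def q_update(Q, state, action, natural_r, virtual_r, next_state,
             actions, alpha=0.1, gamma=0.95):
    """One tabular Q-learning step on the combined natural + virtual reward."""
    reward = natural_r + virtual_r
    best_next = max(Q[(next_state, a)] for a in actions)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

Q = defaultdict(float)
actions = ["left", "right"]
# The coordinating agent would set virtual_r from the observed change in cooperation.
q_update(Q, state=0, action="right", natural_r=1.0, virtual_r=0.3,
         next_state=1, actions=actions)
```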
2004
In order to cope with today's dynamic environment, the described manufacturing control system is designed as a self-organising multi-agent system. The design of this novel system implements the PROSA reference architecture [1]. Coordination among agents is achieved indirectly through a pheromone-based dissipative field, as social insects do when coordinating their behaviour. In this case, our agents act as social insects, interpreting the pheromones deposited by the others in the environment.
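One way to picture a pheromone-based dissipative field is a grid in which deposits evaporate and diffuse to neighbouring cells each step; the rates and grid size below are illustrative assumptions, not the values used in the described system.

```python
import numpy as np

def dissipate(field, evaporation=0.05, diffusion=0.2):
    """One step of a dissipative pheromone field: evaporation plus diffusion to the 4 neighbours."""
    out = field * (1.0 - evaporation - diffusion)
    share = field * diffusion / 4.0
    out[1:, :]  += share[:-1, :]   # spread downward
    out[:-1, :] += share[1:, :]    # spread upward
    out[:, 1:]  += share[:, :-1]   # spread rightward
    out[:, :-1] += share[:, 1:]    # spread leftward
    return out

field = np.zeros((5, 5))
field[2, 2] = 1.0                  # an agent deposits a pheromone at the centre
for _ in range(3):
    field = dissipate(field)       # the signal decays and spreads, so neighbours can sense it locally
```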
Expert Systems with Applications, 2011
Learning automata (LA) were recently shown to be valuable tools for designing multi-agent reinforcement learning algorithms and are able to control stochastic games. In this paper, the concepts of stigmergy and entropy are imported into learning-automata-based multi-agent systems with the purpose of providing a simple framework for interaction and coordination in multi-agent systems and speeding up the learning process. The multi-agent system considered in this paper is designed to find optimal policies in Markov games. We consider several dummy agents that walk around in the states of the environment, activate the local learning automata, and carry information so that the involved learning automata can update their local states. The entropy of the probability vector of the learning automata of the next state is used to determine the reward or penalty for the actions of the learning automata. The experimental results show that, in terms of the speed of reaching the optimal policy, the proposed algorithm has better learning performance than other learning algorithms.
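A small sketch of the entropy-driven reinforcement this describes, using a standard linear reward-inaction automaton update; the normalisation of entropy into a reward signal beta is an assumption made for illustration.

```python
import numpy as np

def entropy(p):
    """Shannon entropy of a probability vector."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-(p * np.log(p)).sum())

def la_update(probs, chosen, beta, lr=0.1):
    """Linear reward-inaction update: reinforce the chosen action in proportion to beta in [0, 1]."""
    probs = np.asarray(probs, dtype=float).copy()
    probs += lr * beta * (np.eye(len(probs))[chosen] - probs)
    return probs / probs.sum()

# Reward signal derived from the entropy of the next state's automaton, as the abstract describes:
# a low-entropy (confident) successor yields a reward closer to 1.
next_state_probs = [0.8, 0.1, 0.1]
beta = 1.0 - entropy(next_state_probs) / np.log(len(next_state_probs))
probs = la_update([1/3, 1/3, 1/3], chosen=0, beta=beta)
```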
Proceedings of the National Conference on Artificial Intelligence, 1998
Reinforcement learning can provide a robust and natural means for agents to learn how to coordinate their action choices in multiagent systems. We examine some of the factors that can influence the dynamics of the learning process in such a setting. We first distinguish reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts. We study (a simple form of) Q-learning in cooperative multiagent ...
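The distinction between learners that ignore other agents and those that learn over joint actions can be made concrete with a toy cooperative matrix game; the payoff matrix and learning rate below are assumed for illustration.

```python
from collections import defaultdict

ALPHA = 0.1   # learning rate for the repeated one-shot matrix game

def independent_update(Q, my_action, reward):
    """Independent learner: other agents are invisible, folded into the reward noise."""
    Q[my_action] += ALPHA * (reward - Q[my_action])

def joint_action_update(Q, my_action, other_action, reward):
    """Joint-action learner: values are indexed by the full action profile."""
    Q[(my_action, other_action)] += ALPHA * (reward - Q[(my_action, other_action)])

payoff = {(0, 0): 10, (1, 1): 10, (0, 1): 0, (1, 0): 0}   # cooperative coordination game
Qi, Qj = defaultdict(float), defaultdict(float)
independent_update(Qi, my_action=0, reward=payoff[(0, 1)])            # sees only its own action
joint_action_update(Qj, my_action=0, other_action=1, reward=payoff[(0, 1)])
```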