A Standard Model of Mind: AAAI Technical Report FS-17-05

Anatomy of a Task: Towards a Tentative Taxonomy of the Mind

David Sekora,¹ Samuel Barham,¹ Justin Brody,² Don Perlis¹
¹ Department of Computer Science, University of Maryland - College Park
² Department of Computer Science, Goucher College

Abstract

A standard model of mind will involve not only an architecture but also a set of capabilities. Ideally, the two should inform one another at a deep level, as an architecture is what both enables and constrains capabilities. In that spirit, we consider in some detail a routine and (deceptively) simple robotic task. From it, we build out a substantial list of capabilities that appear essential for a general-purpose execution of the task. We argue that this type of exercise is an indispensable step toward the establishment of a baseline for the comparison of cognitive architectures, and that the resulting taxonomy can inform the synthesis of a standard model of the mind.

Introduction

The idea that the cognitive sciences might have developed sufficiently, at long last, to justifiably produce a unified theory of cognition — i.e., one that describes a “single set of mechanisms for all of cognitive behavior” — is not new at all; it can be traced back at least to (Newell 1990). The notion, on the other hand, that there might be consensus enough among cognitive scientists to begin to compile a standard model of mind is far more surprising, and it is just this potential for consensus that forms the basis for the forthcoming (Laird, Lebiere, and Rosenbloom 2017).

We jump into the discussion with our own ideas about what sorts of abilities a standard model of mind must subsume in order to be truly general and human-like, as well as how one might use this as a framework to assess cognitive architectures. We focus primarily on a story about a robot and a task she has been given — a task that turns out to be more complicated than it at first appears. The story is annotated with the capabilities an intelligent agent would need to accomplish the feats therein. After the story, we collect these capabilities into a unified list, which we believe can serve as the basis for a general list of desiderata for intelligent agents (and hence their cognitive architectures). Finally, we briefly discuss how one might evaluate existing and future cognitive architectures in light of our list, using our own ALMA system as an example.

Currently, we are unaware of any system that possesses all of these capabilities to any advanced degree, nor of any system that could successfully navigate a scenario like the one we here lay out; we don’t believe this to be a coincidence.

Robbie — A Day in the Life

This story is inspired by a similar story featured in (Perlis et al. 2013), but here we flesh it out considerably. We use a bracket notation to show where items from our list of desiderata (presented in a later section) appear in the story. For example, the marker [1a] would mean that the preceding sentence features perception of the outside world.

Robbie the robot is in trouble. She has been tasked with retrieving a book for us from room 128, but everything has gone wrong!

Already, we’re hinting at one of our major themes — in the real world, things often go wrong, and an intelligent agent will need to be able to cope. This functionality is commonly referred to as perturbation tolerance.
We made our request at 11:30AM, asking that Robbie bring us the book by noon. Robbie ran a simulation[3a] and determined that, given the distance she had to cover to reach room 128 and the motions involved in picking up and carrying a book, this was well within her capabilities[4e]. So, she planned out a sufficient route and set off[1e,3b]. Once her initial planning for the current task was done, she allocated the necessary resources to navigate and watch for anomalies, opting to use the rest of her processing power to continue working on a math problem we interrupted her in the middle of solving[3h].

Here, we see that Robbie has some interesting cognitive features. She can conduct physics simulations and make multi-step plans. Implicitly, she must have been able to break up a complex task into sub-parts (“retrieve the book” becomes “go to the book,” “pick up the book,” and “return here while still holding the book”). She has an understanding of her own capabilities and limitations. Subtly, she stops at a “sufficient route” to complete the task within the time limit, rather than continuing to search until she has found the “optimal route” — this requires her to understand that the actions of reasoning and planning themselves take time! Finally, the last sentence makes it clear that Robbie is more than just an object retrieval bot. As a persistent, general agent, she has more to do and think about than just an immediate command.

The trouble starts about halfway through the journey. One of the hallways Robbie had planned to use was closed for maintenance! Robbie realizes that she will not be able to complete her original plan, and computes a new, slightly longer route using a different hallway[3b]. She determines that she will still be able to finish in time, and so sets off again.

Now Robbie has run into her first obstacle. She could hardly be called an “intelligent agent” if she broke down or quit when the first problem came up. She has the ability to learn new information, incorporate it into her database, identify that this will prevent her from completing her current plan, and make a new plan, all while keeping her original goal in mind.
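The notions of a “sufficient” (rather than optimal) route under a deadline, and of replanning around a closed hallway, can be caricatured in a few lines. The following sketch is purely illustrative and not part of any system described in this paper; the map, function names, and numbers are all hypothetical. The planner returns the first route it finds that still fits the deadline, charges its own deliberation time against that deadline, and replans by excluding the closed hallway.

```python
# Minimal, hypothetical sketch: "sufficient" route planning under a deadline.
import time
from collections import deque

HALLWAYS = {                       # node -> {neighbor: traversal time in seconds}
    "lab":     {"h1": 30},
    "h1":      {"lab": 30, "h2": 40, "h3": 55},
    "h2":      {"h1": 40, "room128": 35},
    "h3":      {"h1": 55, "room128": 50},
    "room128": {"h2": 35, "h3": 50},
}

def plan_route(start, goal, deadline_secs, closed=frozenset()):
    """Return the first route found that still fits the deadline, rather than
    searching on for an optimal one; deliberation time counts against the deadline."""
    t0 = time.monotonic()
    queue = deque([(start, [start], 0.0)])
    visited = {start}
    while queue:
        node, path, travel = queue.popleft()
        if node == goal:
            remaining = deadline_secs - (time.monotonic() - t0)
            return (path, travel) if travel <= remaining else (None, None)
        for nxt, secs in HALLWAYS[node].items():
            if nxt in closed or nxt in visited:
                continue                      # skip closed or already-explored hallways
            visited.add(nxt)
            queue.append((nxt, path + [nxt], travel + secs))
    return None, None

print(plan_route("lab", "room128", deadline_secs=30 * 60))
# Halfway along, hallway h2 turns out to be closed for maintenance: replan.
print(plan_route("h1", "room128", deadline_secs=25 * 60, closed={"h2"}))
```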
Robbie finally reaches room 128 — or, to be precise, the room marked on her internal map as room 128[5c]. The problem? Before opening the door, she notices that the room number says 123, not 128[1a,4d]! Here, a contradiction arises: the room in front of her appears to be both room 123 and 128, but she knows that a room in this building can only have one number.

For an intelligent agent to deal with problems, the agent must first be able to detect those problems. This particular example requires some degree of visual processing, but any such problem will at a minimum require the agent to have a set of expectations, a means of checking those expectations against reality, and strategies for resolving any conflicts. A lesser agent might give up, but luckily Robbie has strategies for dealing with apparent contradictions.

Robbie reasons that either the room in front of her is room 123, the room in front of her is room 128, or one of her base beliefs about how room numbers work is flawed[2h]. Robbie has a record of her past reasoning[2j], and so she knows that she has only limited evidence for the room being either 123 or 128. Therefore, she decides to trust her base beliefs for the moment, and directs her efforts towards obtaining more evidence[3b] for the two more likely possibilities.

Here we see that Robbie is able to identify a plausible set of beliefs that could be mistaken, and furthermore able to leverage her memory to estimate how reliable each belief is. Once she narrows it down to two suspicious beliefs, she applies a common contradiction resolution strategy: gather more data!
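As a rough illustration of this strategy (and not a description of any existing implementation), the sketch below records each belief together with its support, treats the least-supported members of a clashing set as the suspects, and turns them into an evidence-gathering goal. All names and data are hypothetical.

```python
# Hypothetical sketch: identify suspect beliefs in a contradiction and
# generate an evidence-gathering goal for the weakest ones.
from dataclasses import dataclass, field

@dataclass
class Belief:
    claim: str
    support: list = field(default_factory=list)   # observations/derivations backing it

KB = {
    "door_is_123": Belief("the room in front of me is room 123",
                          support=["one glance at a worn number plate"]),
    "door_is_128": Belief("the room in front of me is room 128",
                          support=["internal building map"]),
    "one_number":  Belief("each room in this building has exactly one number",
                          support=["background knowledge", "every room seen so far"]),
}

CONTRADICTIONS = [("door_is_123", "door_is_128", "one_number")]   # mutually incompatible sets

def handle_contradictions(kb):
    goals = []
    for clash in CONTRADICTIONS:
        involved = [kb[name] for name in clash]
        # The least-supported beliefs are the prime suspects; well-supported
        # base beliefs are trusted for now rather than retracted.
        suspects = sorted(involved, key=lambda b: len(b.support))[:2]
        goals.append("gather more evidence about: " +
                     " / ".join(b.claim for b in suspects))
    return goals

for goal in handle_contradictions(KB):
    print(goal)
# e.g. gather more evidence about: the room in front of me is room 123 / ...128
```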
After some thought, Robbie finds a potentially relevant fact in her knowledge base: “Room numbers appear in sequence.”[2f]. This looks promising; there may be an even better method for resolving this situation buried deep in her knowledge base somewhere, but she has a strict time constraint and so cannot afford to be picky[4a]. She quickly makes and executes subplans to read the room numbers of the adjacent rooms, and finds that the rooms on either side are numbered 126 and 130. This evidence supports the hypothesis that the mystery room is in fact room 128. As she approaches the room once more, she reflects on how it might be that she saw 123 instead of 128[3d]. She does some quick simulations[3a] and realizes that, if you rub away the left part of an 8, it looks like a 3. She makes note of this fact for future room-identification scenarios[2g].

An intelligent agent needs to be able to identify new goals and make plans on the fly. Time continues to be an important consideration — as soon as a viable plan that doesn’t take too much time is found, Robbie springs into action. Once the plan is executed, simulation capabilities are once again highlighted, along with some nominal curiosity and learning ability.

Robbie still isn’t 100% sure that the room is room 128, but now she has much more evidence for it, and time is of the essence[4a]. So, she pushes on the door — but it won’t budge! She consults her knowledge base, but — alas — to no avail: this door has a handle on it, and Robbie hasn’t learned anything about handles yet[4e]!

“General purpose cognitive agent” does not, of course, mean “omniscient robot.” There will be many times when a cognitive agent (no matter how general) simply does not understand its environment, or its interactions with it. In such cases, the agent must have other methods of coping. Robbie must be able to ask for advice and learn from it.

So Robbie does what anyone in over their head should do — she asks for help[3c]! Robbie phones her researcher and briefly explains the situation:

ROBBIE: I am requesting assistance. I attempted to open the door using my strategy of “push on the surface of the door,” but this did not work. I am sending you a picture of the door[1b,1c].

RESEARCHER: Ah, yes. I see what’s happened — this type of door has a handle, so you’re going to have to use your arm to turn it before you can open the door. Here, I’ll send you a video of a person opening a door. Hold on just a second. When you get it, watch it and try to learn from it[1d,2g].

While it may be possible that communication skills are not strictly necessary for an agent to qualify as “intelligent,” the only intelligent agents we currently know about (humans) have sophisticated forms of communication, and the two concepts certainly give the appearance of being inextricably linked. In any case, for an intelligent agent to be able to competently interact with humans, it will need at least a rudimentary understanding of language — both how to interpret it, and how to produce it.

The process by which Robbie learns how to open the door may need to be a bit more involved than simply being told or watching a video. That would be ideal, since humans are able to do these types of advice-based and single-example learning quite well, but the current trend in AI research is more in the realm of training on hundreds or thousands of examples (which is of course also very important — humans also make good use of this type of learning). At any rate, we aren’t particularly concerned with the specifics here — the important thing is that an intelligent agent must have some mechanisms in place for learning new facts and skills.
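One very simple way to picture such a mechanism, without any commitment to it being the right one, is a skill library into which a procedure distilled from the researcher’s advice gets filed ahead of the old default. The sketch below is a toy under our own assumptions; every name in it is hypothetical.

```python
# Toy, hypothetical sketch of advice-based skill acquisition.
SKILLS = {
    "open_door": [
        {"name": "push_door", "preconditions": [], "steps": ["push door surface"]},
    ],
}

def learn_skill(task, name, steps, preconditions=()):
    """File a newly learned procedure away for future use ([2g] in the list)."""
    SKILLS.setdefault(task, []).insert(0, {
        "name": name,
        "preconditions": list(preconditions),
        "steps": list(steps),
    })

def applicable(task, observed_features):
    """Return the first procedure whose preconditions match what is observed."""
    for proc in SKILLS.get(task, []):
        if all(p in observed_features for p in proc["preconditions"]):
            return proc
    return None

# Advice from the researcher, paraphrased into a procedure:
learn_skill("open_door", "turn_handle_then_push",
            steps=["grasp handle", "rotate handle", "push door surface"],
            preconditions=["door has handle"])

print(applicable("open_door", {"door has handle"})["name"])   # turn_handle_then_push
print(applicable("open_door", set())["name"])                 # push_door
```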
Robbie successfully opens the door, and files this newfound skill away for future use[2g]. She scans the interior, and finds to her dismay (or whatever it is that robots feel when their expectations aren’t met) that it’s a mess! The floor is strewn with hundreds of books; perhaps there was an earthquake, or maybe it was just those crazy graduate students again. She does some quick math to figure out how long it will take her to sort through the mess to find the requested book[3a], and determines it to be highly unlikely that she should find it in time[4a]. Robbie realizes things are not going well, and so phones her researcher once more for advice[3c,4f]. This time, he agrees that it’s likely impossible for her to get the book in time, and tells her to call the whole thing off. Robbie obliges, and heads back to the lab.

A perhaps underrated strategy for dealing with problems is to know when to give up. This will involve at least two abilities. The first is the ability to dynamically prioritize live tasks on the basis of a cost-benefit analysis; the second, implied by the first, is the ability to predict one’s probability of success (or lack thereof) in a given task-instance.
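These two abilities can be caricatured in a few lines. The sketch below is purely illustrative, with made-up utility numbers and a deliberately crude success-probability estimate: continuing is worthwhile only while the expected gain outweighs the expected cost of the time spent.

```python
# Hypothetical sketch of the "know when to give up" check.
def probability_of_success(time_needed, time_left):
    """Crude estimate: near-certain with ample slack, hopeless with none."""
    if time_left <= 0:
        return 0.0
    return max(0.0, min(1.0, time_left / time_needed - 0.1))

def should_continue(value_of_goal, cost_per_minute, time_needed, time_left):
    p = probability_of_success(time_needed, time_left)
    expected_gain = p * value_of_goal
    expected_cost = min(time_needed, time_left) * cost_per_minute
    return expected_gain > expected_cost, p

# Sorting through hundreds of scattered books (~90 min) with 10 minutes left:
go_on, p = should_continue(value_of_goal=50, cost_per_minute=1.0,
                           time_needed=90, time_left=10)
print(go_on, round(p, 2))   # False 0.01 -- better to call the researcher and give up
```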
Back in the lab, Robbie doesn’t just shut off, even though she no longer has a task. Instead, she returns to working on that math problem[3h].

A persistent agent must be, well, persistent. For the AI researcher, this is likely a satisfying ending to the story. However, consider briefly an alternative ending, which illustrates yet another important cognitive ability — namely, the ability to imagine novel goals, or to infer goals that other agents are likely to possess; and, in response, to invent, prioritize, and initiate novel tasks. Watch:

[...] Robbie obliges, and starts to head back to the lab. However, as she approaches the door, a subprocess realizes that it would be helpful in the future for the room to be in order, should she or someone else be tasked with finding a book again[3d,3f]. She infers that this goal also seems desirable to her researcher[3e], and so assigns herself the task of organizing the books in the room, marking it as the active goal but noting that it can be interrupted if she is requested elsewhere. The only other source of intrigue at the moment is that math problem from before, but it has relatively lower priority[3g]. So, she gets to work cleaning the room with all the alacrity characteristic of her programming.

Intelligent agents need to be able to plan for long-term benefit. Plus, a robot might as well make itself useful!

List of Desiderata for Intelligent Agents

We now present a list of capabilities, primarily derived from the story, which a general-purpose intelligent agent might reasonably be expected to possess. We do not claim this to be a complete list, only that a “smart” robot would likely need most of these. For convenience, we have grouped the list items into rough categories, although the category headings are meant more as guidelines than rigid laws.

So, without further ado, we believe it safe to say that an intelligent agent should be able to:

1. Interaction
(a) Perceive the outside world;
(b) Bring attention to spatial entities (e.g. pointing);
(c) Generate simple language;
(d) Understand simple language;
(e) Move self and other objects;

2. Knowledge and Learning
(a) Identify objects;
(b) Keep track of real-valued quantities (such as counts);
(c) Learn new objects and how they behave (individuals and classes);
(d) Deliberately affect its perceptions (e.g. move to get a better viewing angle);
(e) Track own actions and processing in real time (e.g. efference copy, as in (Brody, Perlis, and Shamwell 2015));
(f) Maintain a knowledge base (KB);
(g) Update the KB with new information (“learning”);
(h) Make inferences based on the KB;
(i) Maintain information about others’ knowledge;
(j) Keep a detailed history of own activity and perception;

3. Goals, Planning, and Acting
(a) Simulate behavior in imagination (for use in vision/planning);
(b) Make and execute plans to achieve goals, including backup plans where appropriate;
(c) Ask for help effectively (knowing whom and how to ask);
(d) Identify new goals, including ones for future or long-term benefit;
(e) Identify needs of others;
(f) Be helpful (as appropriate);
(g) Keep track of priorities and rearrange them as necessary;
(h) Seek knowledge as a general goal, when consistent with other goals;
(i) Identify overly complex plans, and have strategies for dealing with them (prune, get help, give up);

4. Real-World Considerations
(a) Control activities (including inference) to respect real-time constraints;
(b) Forget and relearn when necessary;
(c) Possess and apply contextual awareness;
(d) Detect anomalies in the world and in reasoning, and have strategies for dealing with them;
(e) Know its own capabilities and shortcomings;
(f) “Take stock”: how are things going overall in the short/medium/long term?;

5. Special Category Distinctions
(a) Distinguish self from other;
(b) Distinguish parts from wholes;
(c) Distinguish appearance and thought from reality.

If you read the story carefully, you may notice that not all of these capabilities are directly referenced. For the most part, these are additional capabilities, such as [4c] “Possess and apply contextual awareness”, which underlie much of the thought and action taking place throughout, even though they aren’t the central focus of any single paragraph.

Notable Omissions

It may seem that some obvious things are missing from this list. Take, for example, “communication between submodules.” Minds are complex systems with lots of specialized parts, and to navigate the world successfully an agent will almost certainly need to be able to use several of them together at once. However, this is the advantage of our approach. Perhaps submodules need to be able to communicate, or maybe specialized submodules aren’t even required at all — it makes no difference here! These are architectural considerations, whereas we (at least in this paper) are concerned purely with capabilities.

Of course, we still have likely omitted something that should be on the list. Again, this list is not meant to be complete or final — to the contrary, we encourage discussion and iteration on both the content and structure of the list. In our view, the development of such a list as this represents a significant step on the path to a standard model of mind, and can inform future work on intelligent agents and cognitive architectures.
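Before turning to existing systems, we note that the list lends itself to a simple machine-readable form. The sketch below is one hypothetical encoding, offered only to show how an architecture’s coverage of the desiderata might be tabulated and compared; the item codes mirror the bracket annotations in the story, and the example scores are entirely made up.

```python
# Hypothetical encoding of the checklist for tabulating an architecture's coverage.
DESIDERATA = {
    "1. Interaction":               ["1a", "1b", "1c", "1d", "1e"],
    "2. Knowledge and Learning":    ["2a", "2b", "2c", "2d", "2e", "2f", "2g", "2h", "2i", "2j"],
    "3. Goals, Planning, Acting":   ["3a", "3b", "3c", "3d", "3e", "3f", "3g", "3h", "3i"],
    "4. Real-World Considerations": ["4a", "4b", "4c", "4d", "4e", "4f"],
    "5. Special Distinctions":      ["5a", "5b", "5c"],
}

def coverage(assessment):
    """Summarize per-category coverage given a dict of item -> score in [0, 1]."""
    report = {}
    for category, items in DESIDERATA.items():
        scores = [assessment.get(item, 0.0) for item in items]
        report[category] = sum(scores) / len(scores)
    return report

# A purely illustrative (made-up) self-assessment, in the spirit of the case
# study in the next section: 1.0 = implemented, 0.5 = partial, 0.0 = absent.
example = {"2e": 1.0, "2f": 1.0, "2g": 1.0, "2h": 1.0, "2j": 1.0,
           "3b": 0.5, "4a": 1.0, "4b": 0.5, "4d": 0.5, "5a": 0.5}
for category, score in coverage(example).items():
    print(f"{category}: {score:.2f}")
```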
Existing Systems and Architectures

Of course, there are many systems and architectures that have been thought up and implemented prior to the creation of this list of desiderata. Originally, we had intended to evaluate a number of established and/or recent systems, architectures, and models (such as SOAR (Laird 2012), ACT-R (Anderson et al. 2004), MIDCA (Cox 2013), DIARC (Schermerhorn et al. 2006), and HoA (Chaouche et al. 2014)) with respect to our list. However, realistic cognitive architectures like the ones we mentioned are tremendously complex, so a “complete” analysis of each one would draw too much focus from the rest of the paper, and anything less would be doing the creators of these systems a disservice. Instead, we invite these groups to examine their own work in the context of our list. To demonstrate how this might be done, we will take a look at our own ALMA (Purang 2001) system.

Case Study: ALMA

ALMA (short for Active Logic MAchine) is a general-purpose reasoner which implements the titular active logic formalism. Active logic is a form of first-order logic derived from step-logic (Elgot-Drapkin, Miller, and Perlis 1991), which is built to accommodate reasoning situated in time.

1. Interaction
ALMA is a reasoning system, not a full embodied agent, so many of the items in this section don’t really apply. However, systems have been built (e.g. (Josyula, Anderson, and Perlis 2004)), and are being built in our current work, that incorporate ALMA as a core component and that are capable of all of these.

2. Knowledge and Learning
Being a reasoning system, these items are the bread and butter of ALMA. [2f]–[2h] are central components of almost any reasoning system, and [2e] and [2j] are central features that distinguish ALMA from other systems. [2a], [2c], and [2i] are again more in the domain of a larger, integrated system. [2b] is a current research focus for the core of ALMA, and [2d] is an important piece of a demo currently under construction in which ALMA is being used as the control center for a robot.

3. Goals, Planning, and Acting
ALMA has very rudimentary planning facilities like [3b], specifically for executing tasks at future times, and [3a] and [3g] are active research topics. There is still much to be desired on the items in this section, although again this is mitigated by the fact that ALMA is meant to be combined with other systems (which would handle planning and execution).

4. Real-World Considerations
[4a] is the defining feature of ALMA. [4b] and [4d] have been written about extensively in papers relating to active logic (Miller and Perlis 1993), and this work has been partially implemented. There has been some work on [4e] (Nirkhe et al. 1997), but it has not been implemented. Not much has been done about [4f].

5. Special Category Distinctions
[5a] is arguably covered — an instance of ALMA is aware of its own existence as an entity, although this hasn’t been used for very much. Much has been written about how to handle [5c] in papers such as (Miller and Perlis 1993), although implementation has lagged behind here.

Verdict

Many of the items on our list of desiderata are represented or at least considered; this is perhaps to be expected, as the system and its underlying logic have been worked on by some of the very same minds as this paper.
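To give a flavor of what “reasoning situated in time” means in practice, here is a toy, step-wise inference loop in the spirit of active logic. It is emphatically not ALMA’s implementation; the predicates and rules are invented for the door scenario. It only illustrates that the current step is itself a belief, that each inference consumes a step, and that a contradiction becomes something to reason about rather than a fatal event.

```python
# Toy, hypothetical step-wise inference loop in the spirit of active logic.
RULES = [
    # (premises, conclusion)
    (("map_says(here, 128)",), "room_is(here, 128)"),
    (("sign_says(here, 123)",), "room_is(here, 123)"),
    (("room_is(here, 123)", "room_is(here, 128)"), "contra(room_number(here))"),
]

def reason_step(beliefs, t):
    """Apply every rule once to the beliefs available at step t."""
    derived = {f"now({t + 1})"}
    for premises, conclusion in RULES:
        if all(p in beliefs for p in premises):
            derived.add(conclusion)
    return (beliefs - {f"now({t})"}) | derived, t + 1

beliefs, t = {"now(0)", "map_says(here, 128)", "sign_says(here, 123)"}, 0
for _ in range(3):
    beliefs, t = reason_step(beliefs, t)
print(sorted(beliefs))
# The clash surfaces as contra(room_number(here)) one step after both room_is
# conclusions appear, and is flagged for repair rather than trivializing the KB.
```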
Other Lists

Various other lists exist which can, at first blush, seem very similar to ours. Here, we’ll pick out a couple of representatives to demonstrate how our list distinguishes itself.

Constraints on Mind

In 1980, Allen Newell briefly presented a set of 13 constraints on the human mind (Newell 1980), a set which was later expanded upon by Anderson and Lebiere (Anderson and Lebiere 2003). Our list mostly subsumes this one, barring the last few items (in the 2003 formulation): “acquire capabilities through development”, “arise through evolution”, and “be realizable within the brain”. This is because that list is about constraints on the human mind, whereas we are looking at the more general class of intelligent minds, and so are not concerned with how the mind arises or whether it matches up with a particular physical system.

Architecture

Orthogonal to papers such as (Laird, Lebiere, and Rosenbloom 2017), which center more on the cognitive system architecture (the “how”), we focus on the possible types of thought (the “what”). We intend this list to serve as a complement to the more technically-minded papers in this collection, as well as to future papers — in order to design an architecture, you first have to have some idea of what you want that architecture to be able to do.

Conclusion

We began by working through a scenario of our design and considering what cognitive capabilities would be necessary to successfully negotiate it. We then collated these capabilities, and extended and generalized the resulting list into its current form. This list is not meant to be a final, definitive list of all the capabilities an agent must have to be considered intelligent in a general sense. However, it can serve as the prototype for such a list; as such, we welcome discussion and modification. If the research community is able to reach a consensus about what the contents of this list should be, then in turn it can serve as a foundation for a standard model of the mind.

References

Anderson, J. R., and Lebiere, C. 2003. The Newell test for a theory of cognition. Behavioral and Brain Sciences 26(5):587–601.

Anderson, J. R.; Bothell, D.; Byrne, M. D.; Douglas, S.; Lebiere, C.; and Qin, Y. 2004. An integrated theory of the mind. Psychological Review 111(4):1036–1060.

Brody, J.; Perlis, D.; and Shamwell, J. 2015. Who’s talking - efference copy and a robot’s sense of agency. AAAI 2015 Fall Symposium.

Chaouche, A.-C.; Seghrouchni, A. E. F.; Illi, J.-M.; and Sadouni, D. E. 2014. A higher-order agent model, with contextual planning management for ambient systems. LNCS Transactions on Computational Collective Intelligence XVI:146–169.

Cox, M. T. 2013. MIDCA: A metacognitive, integrated dual-cycle architecture for self-regulated autonomy. UMIACS Technical Report No. UMIACS-TR-2013-03.

Elgot-Drapkin, J.; Miller, M.; and Perlis, D. 1991. Memory, reason, and time: the step-logic approach. In Cummins, and Pollock, eds., Philosophy and AI: Essays at the Interface. Cambridge, MA: MIT Press.

Josyula, D. P.; Anderson, M. L.; and Perlis, D. 2004. Domain-independent reason-enhanced controller for task-oriented systems - director. In Proceedings of the National Conference on Artificial Intelligence (AAAI-04), 1014–1015.

Laird, J. E.; Lebiere, C.; and Rosenbloom, P. S. 2017. A standard model of the mind: Toward a common computational framework across artificial intelligence, cognitive science, neuroscience, and robotics. AI Magazine.

Laird, J. E. 2012. The Soar Cognitive Architecture. The MIT Press.

Miller, M., and Perlis, D. 1993. A view of one’s past and other aspects of reasoned change in belief. (dissertation).

Newell, A. 1980. Physical symbol systems. Cognitive Science 4(2):135–183.

Newell, A. 1990. Unified Theories of Cognition. Cambridge, Massachusetts: Harvard University Press.

Nirkhe, M.; Kraus, S.; Miller, M.; and Perlis, D. 1997. How to (plan to) meet a deadline between now and then. Journal of Logic and Computation 7:109–109.

Perlis, D.; Cox, M. T.; Maynord, M.; McNany, E.; Paisner, M.; Shivashankar, V.; Hand, E.; Shamwell, J.; Oates, T.; Du, T.; Josyula, D.; and Caro, M. 2013. A broad vision for intelligent behavior: Perpetual real-world cognitive agents. 2013 Annual Conference on Advances in Cognitive Systems: Workshop on Metacognition in Situated Agents.

Purang, K. 2001. ALMA/CARNE: Implementation of a time-situated meta-reasoner. In Proceedings of the International Conference on Tools with Artificial Intelligence, 103–110.

Schermerhorn, P.; Kramer, J.; Middendorff, C.; and Scheutz, M. 2006. DIARC: A testbed for natural human-robot interaction. AAAI.