
A Design Rationale for NASA TileWorld

1991


Andrew B. Philips*, Keith J. Swanson, Mark E. Drummond*, John L. Bresina*†

NASA Ames Research Center
Mail Stop: 244-17
Moffett Field, CA 94035

Artificial Intelligence Research Branch
Technical Report FIA-91-04
April 1991

Abstract

Automated systems that can operate in unrestricted real-world domains are still well beyond current computational capabilities. This paper argues that isolating essential problem characteristics found in real-world domains allows for a careful study of how particular control systems operate. By isolating essential problem characteristics and studying their impact on autonomous system performance, we should be able to more quickly deliver systems for practical real-world problems. For our research on planning, scheduling, and control we have selected three particular domain attributes to study: exogenous events, uncertain action outcome, and metric time. We are not suggesting that studies of these attributes in isolation are sufficient to guarantee the obvious goals of good methodology, brilliant architectures, or first-class results; however, we are suggesting that such isolation facilitates the achievement of these goals. To study these three attributes, we have developed the NASA TileWorld. In this document, we describe the NASA TileWorld simulator in general terms, present an example NASA TileWorld problem, and discuss some of our motivations and concerns for NASA TileWorld.

*Affiliated with Sterling Federal Systems
†Also affiliated with the Computer Science Department at Rutgers University

Introduction
The world around us is replete with complicated task domains that may be characterized as "dynamic", "unpredictable", "resource limited", "time critical", etc. Such characteristics are intrinsic to many activities such as remote exploration and process control. Robust, reliable, intelligent automated systems that can operate in real-world domains are still well beyond current computational capabilities. Isolating essential problem characteristics found in such domains allows for a careful but systematic study of how particular control systems operate.

For our research on planning, scheduling, and control, we have selected three particular domain attributes to study: exogenous events, uncertain action outcome, and metric time. By exogenous events, we mean those events that are not under the direct control of the system. By uncertain action outcome, we mean that the effects of an action taken by the system cannot be uniquely identified. By metric time, we refer to the temporal properties of a domain or task. Working on a real-world problem has obvious benefits, and we are not suggesting that studies of these attributes in isolation are sufficient to guarantee the obvious goals of good methodology, brilliant architectures, or first-class results; however, we are suggesting that such isolation facilitates the achievement of these goals.

To study these three attributes, we have developed the NASA TileWorld [2]. In this document, we describe the NASA TileWorld simulator in general terms, present an example NASA TileWorld problem, and discuss some of our motivations and concerns for NASA TileWorld.

Domain Attributes

During a brainstorming session at the 1990 Workshop on Benchmarks and Metrics [1], numerous problem attributes were suggested by the workshop participants, such as multiple-agency, incomplete knowledge, sensor/effector reliability, optimality, predictability, informability, geometric reasoning, inter-agent communication, opportunities for learning, versatility, and goals. The workshop discussion amply demonstrated that there is no coherent descriptive terminology for precisely describing a domain or task. The lack of a precise language aggravates the already difficult task of attempting to characterize what makes a problem hard, and thus, on what area to focus our research.

We offer no general solution to this problem. Our approach is to isolate a small number of domain attributes and, in service of this approach, to give them a simple and parsimonious description with a precise semantics. NASA TileWorld allows an experimenter to study three attributes: exogenous events, action outcome uncertainty, and metric temporal properties. Certainly, these first definitions provided by NASA TileWorld will undergo significant change; however, we see no other way to eventually settle on a coherent terminology: generate-and-test appears to be our only search strategy at this early stage.
NASA TileWorld

NASA TileWorld represents a spectrum of domains involving a grid of cells, a set of tiles, and a set of agents which can grasp and move tiles. The domains vary in terms of grid topology, tile characteristics, single vs. multi-agent, agent capabilities, underlying physics, etc. The NASA TileWorld simulator is historically related to the sliding tile domain developed by N.S. Sridharan, C.F. Schmidt, and J.L. Bresina (reported in [5]). Points along the spectrum of domains were sketched by Bresina, Mark Drummond, and Mark Boddy in the Summer of 1989; this sketch was refined by Bresina, Drummond, and Philips to form the initial specifications of the NASA TileWorld simulator. The design and implementation of the specifications was carried out by Philips. Though similar in name, the SRI tileworld [4] and Sutton's gridworld [6] were developed independently and are rather different in nature. NASA TileWorld is written in Franz Allegro Common Lisp and is available for public use. Email requests for copies of the code or manual should be sent to "[email protected]".

The simulated domain is a two-dimensional grid of cells populated with movable tiles, "obstructions", and a single mobile agent (see Figure 2). The grid is oriented with compass directions, e.g., North as up, East as right, etc. The agent can grasp tiles in adjacent cells in four compass directions, release a grasped tile, and move one cell at a time in a given compass direction. The agent can sense its location, whether it is grasping a tile, the current simulated time, and the contents of any cell regardless of distance or line-of-sight. The table in Figure 1 summarizes the agent's sensors and effectors.

Sensors                        Effectors
attached compass-direction     grasp compass-direction
in x, y                        release compass-direction
my-location                    move-agent compass-direction
world-time

Figure 1: NASA TileWorld Agent

The simulator has three types of commands: display, customization, and control. Display commands allow the experimenter to have access to and modify the presentation of the NASA TileWorld display. Customization commands allow the experimenter to create a particular domain instance and tune simulator parameters. Control commands are for interaction between the simulator and an agent controller; there are two types of control commands: requests to sense the world state and requests to operate an effector.

As mentioned above, NASA TileWorld has been created to permit study of three domain attributes: exogenous events, uncertain action outcome, and metric time. The experimenter can adjust the behavior of exogenous events (e.g., the frequency of winds), introduce uncertainties in the outcome of actions (e.g., to make the agent sometimes "veer off course" or "drop" a tile), and tune metric characteristics of the simulated world (e.g., the agent's movement speed).

Exogenous events in NASA TileWorld are gusts of wind. A wind acts on a single column or row, has a range and a period, and blows from the borders towards the interior. When a wind blows a tile along its path, the tile stops when it either encounters another object or reaches the limit of that wind's range (see the next section for an example).

Actions in NASA TileWorld can have uncertain outcomes, specified by a probabilistic model of alternative outcomes. It is possible to say, for instance, that when the agent attempts to grasp a tile, 80% of the time the action has the intended effect and the tile is grasped, 19% of the time the action has no effect, and 1% of the time an alternative outcome is realized. Because the calls to the effectors return no information about the success or failure of an action, the controller needs to sense the world to determine the outcome of a command.

Metric time is an aspect of both the problems that can be posed and the simulator's operation. In our study of the temporal properties of tasks, we have concentrated on an agent's goals. Examples of such goals include: achieve (or destroy) some property by a time deadline; maintain (or prevent) some property over an interval of time. These temporal goals are in contrast with the goals of "achievement" without deadlines typically used in classical planning. To allow a controller to evaluate its progress with respect to temporal goals, the simulator provides a "clock" sensor. Since the simulator characterizes the evolution of an environment over time, an experimenter should also be able to influence its metric temporal properties. In NASA TileWorld, an experimenter can tune the following temporal properties: the duration of an action, a wind's velocity, and a wind's period. (There is currently no facility for making the errors of action execution vary as a function of time.)

NASA TileWorld provides a simple but effective means for testing the behavior of controllers or problem-solvers on "real-time" problems. There is considerable debate on what constitutes a real-time problem, and we do not propose to resolve the issue here. However, it seems clear that exogenous events, action outcome uncertainty, and temporal dependence all play an important role in real-time problems.
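The grasp example in the text (80%, 19%, and 1%) suggests a simple weighted draw over alternative outcomes. The following Python sketch illustrates that idea only; it is not the simulator's Lisp implementation, and the outcome names, the dictionary-based world, and the functions `sample_outcome`, `attempt_grasp`, and `sense_attached` are all hypothetical. As in the text, the effector call returns nothing, so the controller has to sense the world afterwards.

```python
import random

# Hypothetical outcome table for a grasp attempt, using the 80% / 19% / 1%
# split from the text. The outcome names are illustrative, not simulator API.
GRASP_OUTCOMES = [
    ("intended", 0.80),     # the tile is grasped
    ("no-effect", 0.19),    # nothing changes
    ("alternative", 0.01),  # some other outcome is realized
]


def sample_outcome(outcomes, rng):
    """Draw one outcome name according to its probability."""
    names = [name for name, _ in outcomes]
    weights = [prob for _, prob in outcomes]
    return rng.choices(names, weights=weights, k=1)[0]


def attempt_grasp(world, direction, rng):
    """Apply a sampled outcome to the world. Returns nothing, so the caller
    learns the result only by sensing."""
    outcome = sample_outcome(GRASP_OUTCOMES, rng)
    if outcome == "intended":
        world["attached"] = direction
    # Simplifying assumption: every other outcome leaves the world unchanged.


def sense_attached(world):
    """Sensor stand-in: the direction of the grasped tile, or None."""
    return world.get("attached")


if __name__ == "__main__":
    rng = random.Random(0)
    world = {}
    attempt_grasp(world, "north", rng)
    print("grasping:", sense_attached(world))
```

Keeping the outcome table as data rather than code mirrors the idea that the experimenter, not the controller, chooses the uncertainty model.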
The Windy Maze

[Figure 2: The Windy Maze]

Figure 2 shows a NASA TileWorld graphics window which contains a sample problem we call the Windy Maze. The agent is located in the lower-left corner of the grid; tiles are distributed throughout the grid. The arrows positioned along the borders of the image describe winds that can blow tiles along the row or column to which they are pointing. Consider the arrow in the upper-left corner. The 3 indicates that the wind has a range of three from the border cell and moves a tile to the fourth cell in the top row, provided that there are no obstructions. So, a tile in the upper-left corner of the grid would be blown 3 grid cells to the right by that wind. Note that the octagon marked X would only be blown one cell to the right by that wind, due to the wind's range of 3. A wind's period is the time in seconds between successive gusts. In this example, the period of the wind fluctuates randomly within the [5..20] interval.
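The range rule described for the wind in the top row can be made concrete with a small sketch. This is a minimal Python illustration under stated assumptions, not the simulator's code: a row is a list with `None` for an empty cell, the wind blows from index 0 towards higher indices, obstructions never move, and a tile already at or beyond the range limit is unaffected. The names `gust`, `next_gust_delay`, and `OBSTRUCTION` are hypothetical, and drawing the period as a real number from [5, 20] is also an assumption.

```python
import random

OBSTRUCTION = "#"  # hypothetical marker for an immovable cell


def gust(row, wind_range):
    """Return a new row after one gust blowing from index 0 rightwards.

    row: list where None is an empty cell and any other value is an object.
    wind_range: number of cells past the border cell that the wind reaches.
    """
    cells = list(row)
    limit = min(wind_range, len(cells) - 1)
    # Move the tiles nearest the limit first, so that tiles behind them
    # stop when they run into one that has already come to rest.
    for start in range(limit - 1, -1, -1):
        if cells[start] is None or cells[start] == OBSTRUCTION:
            continue
        pos = start
        while pos < limit and cells[pos + 1] is None:
            pos += 1
        if pos != start:
            cells[pos], cells[start] = cells[start], None
    return cells


def next_gust_delay(rng, low=5.0, high=20.0):
    """Seconds until the next gust, drawn at random from [low, high]."""
    return rng.uniform(low, high)


if __name__ == "__main__":
    # A tile in the border cell is blown three cells, to the fourth cell.
    print(gust(["X", None, None, None, None, None], 3))
    # A tile two cells in is blown only one cell by the same wind.
    print(gust([None, None, "X", None, None, None], 3))
    print(round(next_gust_delay(random.Random(1)), 2))
```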
In this problem, the agent must move from its current location in the lower-left corner (0, 0) to the upper-right corner (5, 5) without using grasp or release commands. At first glance it may appear to be impossible to achieve this goal, since there is no clear route from the agent's current position to the goal. When the winds are taken into account, we see that two different "paths" are possible. Once the winds are set in motion, X slides back and forth in the top row, and Y slides up and down in the far right column, going no higher than one row below the top. Given these exogenous events, there are two possible paths by which the agent can move to the goal. The first option is to temporarily park between tiles H and I when X is above I. Once tile X has blown west, the agent has a clear route to the goal. The second option is to move around the D tile when Y slides out of the way.

Our experience with this problem has indicated that it is beyond the capability of most systems. While the problem is apparently simple for the average person, we feel that it is not a "puzzle" in the classical sense of the word. When a person is given the problem, they almost immediately bring common sense knowledge to bear and reason that they must interleave their actions with the occurrence of exogenous events.

The behavior of the agent depends on the specific agent controller, and many types of agent controller can be imagined. The controller is free to allocate the available time for reasoning and acting in whatever manner it deems fit. But how can we judge how "well" the controller performed? By what metric can controllers be compared? For example, let us assume that the current time is 8:22:49 and that the controller is given the goal of having the agent at location (5, 5) by 8:27:00, four minutes and eleven seconds later. One possible method to judge performance is to see if the agent is in cell (5, 5) at 8:27:00. If not, the agent controller simply fails to satisfy the goal. Although this metric is easy to evaluate, its binary distinction does not supply very much scientific information about how a given controller "solves" the problem. No matter what specific method the controller uses to accomplish the given goal, what is important is how the controller's performance is evaluated. Clearly, the specific metric used reflects an experimenter's objectives. The selection of metrics is a complicated and interesting issue and a source of future work.
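The binary deadline metric from the example (agent in cell (5, 5) at 8:27:00, starting from 8:22:49) can be written down in a few lines. The Python sketch below is illustrative only: the trace format, a time-ordered list of (time, cell) pairs, and the helpers `clock`, `location_at`, and `deadline_met` are assumptions made for this example and are not part of NASA TileWorld.

```python
from datetime import timedelta


def clock(hms):
    """Parse 'H:MM:SS' into a time of day represented as a timedelta."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)


def location_at(trace, when):
    """Last recorded cell at or before `when`.

    trace: time-ordered list of (time, cell) pairs (hypothetical format).
    """
    cell = None
    for stamp, where in trace:
        if stamp > when:
            break
        cell = where
    return cell


def deadline_met(trace, goal_cell, deadline):
    """Binary metric: True only if the agent is in goal_cell at the deadline."""
    return location_at(trace, deadline) == goal_cell


if __name__ == "__main__":
    start, deadline = clock("8:22:49"), clock("8:27:00")
    print("time budget:", deadline - start)
    on_time = [(start, (0, 0)), (clock("8:26:30"), (5, 5))]
    too_late = [(start, (0, 0)), (clock("8:27:05"), (5, 5))]
    print(deadline_met(on_time, (5, 5), deadline))
    print(deadline_met(too_late, (5, 5), deadline))
```

The function returns a single bit per run, which makes the limitation noted in the text visible: two controllers that both arrive on time, or both arrive late, are indistinguishable under this metric.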
Discussion

Selecting relevant domain attributes facilitates careful analysis of problems containing those attributes. Simple domain attributes with a precise semantics allow systematic study and discussion. NASA TileWorld facilitates the specification of "interesting" problems while still retaining some of what might be considered essential, challenging attributes found in many real-world tasks; NASA TileWorld problems exhibit a selected subset of those attributes.

The simplicity of the NASA TileWorld simulator facilitates analysis of the underlying reasons for the performance of a controller. Most real-world problems have so many interconnected attributes that it is hard to isolate and analyze the reasons for success and failure.

While NASA TileWorld provides a simplified environment for studying specific domain attributes, it can appear overly simplistic. However, a simulation that appears more realistic (e.g., a simulated domain of autonomous agents responding to forest fires [3]) may, in fact, be even more dangerously simplistic. Usually, only the designer of the simulator has a solid grasp of its real complexities. Given a description of a system's success with the simulated version of a task, it is easy for the unwary reader to infer that the system "solves" the simulation's real-world analog when, in fact, such is not the case.

An additional worry about our approach is that by isolating attributes and removing information, we might significantly alter the original problem. By removing essential information, the simplification may lead to artificially hard problems; that which was simple in the original version becomes difficult in the abstracted version. It can also be argued that removing information makes a problem artificially easy; for instance, in the classical blocks world domain, the myriad problematic attributes that make a real-world task difficult to manage disappear in the sterile abstracted version. Identifying the attributes that are altered in some significant way by abstraction is a task for scientific study, and it is our belief that we will make progress on it by exploring variations of a given problem.

Conclusions

There is an important role for simple simulators and environments, and we have designed NASA TileWorld with this in mind. NASA TileWorld allows one to study certain selected domain attributes while jettisoning irrelevant baggage.
The NASA TileWorld simulator that we have implemented allows one to study problems that involve exogenous events, action outcome uncertainty, and metric time. There are many types of problems that cannot be expressed in this simulator. However, we feel that this simulator can be viewed as a single element in an array of available tools; thus, it represents a simple and useful mechanism for systematically studying the underlying reasons for system performance. The NASA TileWorld domain is easy to describe and modify, and can thus facilitate communication among researchers. We are the first to admit that NASA TileWorld is not the last word in comprehensive simulator construction. However, we expect that this simulator, and others like it, will help in the construction of a common vocabulary of problems and domain attributes, and also help foster more precise empirical comparison between various approaches.

Acknowledgements

Thanks to Smadar Kedar and Rich Levinson for help with the design of NASA TileWorld. Thanks to Smadar, Rich, and John Allen for comments on a draft of this paper.

References

[1] Drummond, M., & Kaelbling, L. 1990. Integrated Agent Architectures: Benchmark Tasks and Evaluation Metrics. Proceedings of the DARPA Workshop on Innovative Approaches to Planning, Scheduling and Control, San Diego, CA. pp. 408-411. Morgan-Kaufmann, San Mateo, CA.

[2] Philips, A.B., & Bresina, J.L. 1991. NASA TileWorld Manual (NASA Technical Report TR-FIA-91-04). Moffett Field, CA: NASA Ames Research Center, Code FIA.

[3] Cohen, P.R., Greenberg, M.L., Hart, D.M., & Howe, A.E. 1989. Trial By Fire: Understanding the Design Requirements for Agents in Complex Environments. AI Magazine, 10(3): 32-48.

[4] Pollack, M.E., & Ringuette, M. 1990. Introducing the Tileworld: Experimentally Evaluating Agent Architectures. Proceedings of the Eighth National Conference on Artificial Intelligence (pp. 183-189), Menlo Park, CA: AAAI Press.

[5] Sridharan, N.S., & Bresina, J.L. 1984. Exploration of Problem Reformulation and Strategy Acquisition - A Proposal (Rutgers Technical Report RU-LCSR-TR-53; RU-CBM-TR-137). New Brunswick, NJ: Rutgers University, Department of Computer Science.

[6] Sutton, Richard S. 1990. Integrated Architectures for Learning, Planning, and Reacting Based on Approximating Dynamic Programming. Proceedings of the Seventh International Conference on Machine Learning (pp. 216-224), San Mateo, CA: Morgan Kaufmann Publishers.