Pyomo is an open source Python package for modeling and solving mathematical programs. It allows users to define abstract and concrete optimization problems and solve them with standard solvers. Pyomo provides modeling capabilities similar to algebraic modeling languages like AMPL but within a full-featured Python environment, allowing for more flexibility and customization compared to commercial tools. Pyomo leverages the Coopr library which provides interfaces to various solvers and supports parallel execution.
Pyomo is an open source Python package for modeling and solving mathematical programs. It allows users to define abstract and concrete optimization problems and solve them with standard solvers. Pyomo provides modeling capabilities similar to algebraic modeling languages like AMPL but within a full-featured Python environment, allowing for more flexibility and customization compared to commercial tools. Pyomo leverages the Coopr library which provides interfaces to various solvers and supports parallel execution.
Pyomo is an open source Python package for modeling and solving mathematical programs. It allows users to define abstract and concrete optimization problems and solve them with standard solvers. Pyomo provides modeling capabilities similar to algebraic modeling languages like AMPL but within a full-featured Python environment, allowing for more flexibility and customization compared to commercial tools. Pyomo leverages the Coopr library which provides interfaces to various solvers and supports parallel execution.
Pyomo is an open source Python package for modeling and solving mathematical programs. It allows users to define abstract and concrete optimization problems and solve them with standard solvers. Pyomo provides modeling capabilities similar to algebraic modeling languages like AMPL but within a full-featured Python environment, allowing for more flexibility and customization compared to commercial tools. Pyomo leverages the Coopr library which provides interfaces to various solvers and supports parallel execution.
Pyomo: Modeling and Solving Mathematical Programs in Python William E. Hart Jean-Paul Watson David L. Woodru Received: December 29, 2009. First revision received: February 25, 2011. Abstract We describe Pyomo, an open source software package for modeling and solving mathematical programs in Python. Pyomo can be used to dene abstract and concrete problems, create problem instances, and solve these instances with standard open-source and commercial solvers. Pyomo provides a capability that is commonly associated with algebraic modeling languages such as AMPL, AIMMS, and GAMS. In contrast, Pyomos modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries. Pyomo leverages the capabilities of the Coopr software library, which together with Pyomo is part of IBMs COIN-OR open-source initiative for operations research software. Coopr integrates Python packages for dening optimizers, modeling optimization applications, and man- aging computational experiments. Numerous examples illustrating advanced scripting applications are provided. Keywords Python algebraic modeling language optimization open source optimization software William E. Hart Sandia National Laboratories Data Analysis and Informatics Department PO Box 5800, MS 1318 Albuquerque, NM 87185 E-mail: [email protected] Jean-Paul Watson Sandia National Laboratories Discrete Math and Complex Systems Department PO Box 5800, MS 1318 Albuquerque, NM 87185 E-mail: [email protected] David L. Woodru Graduate School of Management University of California Davis Davis, CA 95616-8609 E-mail: [email protected] 2 1 Introduction The Python Optimization Modeling Objects (Pyomo) software package supports the denition and solution of optimization applications using the Python scripting lan- guage. Python is a powerful high-level programming language that has a very clear, readable syntax and intuitive object orientation. Pyomo includes Python classes for dening sparse sets, parameters, and variables, which can be used to formulate alge- braic expressions that dene objectives and constraints. Pyomo can be used to represent linear, mixed-integer, non-linear, and non-linear mixed-integer models for large-scale, real-world problems that involve thousands of constraints and variables. The introduction of Pyomo was motivated by a variety of factors that impact op- timization applications at Sandia National Laboratories, a large US Department of Energy Laboratory at which two of the co-authors are employed. Sandias discrete mathematics group has successfully used AMPL [3, 19] to model and solve large-scale integer programs for many years. Our application experience highlighted the value of Algebraic Modeling Languages (AMLs) for solving real-world optimization applica- tions, and AMLs are now an integral part of operations research solutions at Sandia. Our experience with these applications has also highlighted the need for more ex- ible AML frameworks. For example, direct integration with a high-level programming language is needed to allow modelers to leverage modern programming constructs, ensure cross-platform portability, and access the broad range of functionality found in standard software libraries. AMLs also need to support extensibility of the core modeling language and associated solver interfaces, since complex applications typi- cally require some degree of customization. Finally, open-source licensing is required to manage costs associated with the deployment of optimization solutions, and to fa- cilitate the integration of modeling capabilities from a broader technical community. Pyomo was developed to provide an open-source platform for developing optimiza- tion models that leverages Pythons rich high-level programming environment to fa- cilitate the development and deployment of optimization capabilities. Pyomo is not intended to facilitate modeling better than existing, primarily commercial AML tools. Instead, it supports a dierent modeling approach in which the software is designed for exibility, extensibility, portability, and maintainability. At the same time, Pyomo incorporates the central ideas in modern AMLs, e.g., dierentiation between abstract models and concrete problem instances. Pyomo is a component of the Coopr software library, a COmmon Optimization Python Repository [12]. Coopr includes a exible framework for applying optimizers to analyze Pyomo models, interfaces to well-known linear, mixed-integer, and non- linear solvers, and provides an architecture that supports parallel solver execution. Coopr also includes an installation utility that automatically installs the diverse set of Python packages that are used by Pyomo. Coopr is hosted both as part of IBMs COIN-OR open-source software initiative for operations research [10] and at Sandia, and it is actively developed and maintained by Sandia and its collaborators. The remainder of this paper is organized as follows. Section 2 describes the mo- tivation and design philosophy behind Pyomo, and Section 3 discusses why Python was chosen for the design of Pyomo. Section 4 describes related high-level modeling approaches, and Pyomo is briey contrasted with other Python-based modeling tools. In Section 5, we discuss the novelty and impact of Pyomo, answering the related ques- tions Why another AML? and Why Pyomo?. Section 6 describes the use of Pyomo on a simple application that is also described in AMPL for comparison, introduces a 3 number of advanced features of Pyomo, and discusses Pyomo run-time performance. Section 7 describes how Pyomo leverages the broader solver capabilities in Coopr. Sec- tion 8 illustrates the use of Pyomo by discussing a number of advanced case studies. Section 9 provides information for getting started with Coopr, and Section 10 describes future work that is planned for Pyomo and Coopr. 2 Design Goals and Requirements The following sub-sections describe the design goals and requirements that have guided the development of Pyomo. The design of Pyomo has been driven by two dierent types of projects at Sandia. First, Pyomo has been used by research projects that require a exible framework for customizing the formulation and evaluation of optimization models. Second, projects with external customers frequently require that optimization modeling techniques be deployed without the need for commercial licenses. 2.1 Open Source Licensing A key goal of Pyomo is to provide an open source optimization modeling capabil- ity. Although open source optimization solvers are widely available in packages like COIN-OR [10], comparatively few open source tools have been developed to model optimization applications. The open source requirement for Pyomo is motivated by several factors: Transparency and Reliability: When managed well, open source projects facil- itate transparency in software design and implementation. Because any developer can study and modify the software, bugs and performance limitations can be iden- tied and resolved by a wide range of developers with diverse software experience. Consequently, there is growing evidence that managing software as open source can improve its reliability and that open source software exhibits similar defect evolution patterns as proprietary software [4, 56]. Flexible Licensing: A variety of signicant operations research applications at Sandia have required the use of a modeling tool with a non-commercial and non- infectious license. There have been many dierent reasons for this requirement, including the need to support open source analysis tools, limitations for software deployment on classied computers, and licensing policies for commercial partners (e.g., who are motivated to minimize the cost of deploying an application model internally within their company). The Coopr software library, which contains Py- omo, is licensed under the BSD [9]. BSD has fewer restrictions for commercial use than alternative open source licenses like the GPL [24], and is non-infectious. The use of an open source software development model is not a panacea; ensuring high reliability of the software still requires careful management and a committed de- veloper community. However, there is increasing recognition that open source software provides many advantages beyond simple cost savings [11], including supporting open standards and avoiding being locked into a single vendor. 4 2.2 Customizable Capability A key limitation of proprietary commercial modeling tools is customization of the modeling components or optimization processes. Pyomos open source project model allows a diverse range of developers to prototype new capabilities. Thus, developers can customize the software for specic applications, and can prototype capabilities that may eventually be integrated into future software releases. More generally, Pyomo is designed to support a stone soup development model in which each developer scratches their own itch. A key element of this design is the plugin framework that Pyomo uses to integrate software components like optimizers, optimizer managers, and optimizer model format converters. The plugin framework manages the registration of components, and it automates the interaction of these components through well-dened interfaces. Thus, users can customize Pyomo in a modular manner without the risk of destabilizing core functionality. 2.3 Solver Integration Modeling tools can be roughly categorized into two classes based on how they integrate with optimization solvers. Tightly coupled modeling tools directly access optimization solver libraries (e.g., via static or dynamic linking). In contrast, loosely coupled modeling tools apply external optimization executables (e.g., through the use of system calls). Of course, these options are not exclusive, and a goal of Pyomo is to support both types of solver interfaces. This design goal has led to a distinction in Pyomo between model formulation and optimizer execution. Pyomo uses a high level (scripting) programming language to formulate a problem that can be solved by optimizers written in low-level (compiled) languages. This two-language approach leverages the exibility of the high-level lan- guage for formulating optimization problems and the eciency of the low-level language for numerical computations. 2.4 Abstract Models and Concrete Instances A requirement of Pyomos design is that it support the denition of abstract models in a manner similar to that of AMPL [3, 19] and AIMMS [2, 47]. An abstract model separates the declaration of a model from the data used to generate a specic model instance; the advantages of this approach are widely appreciated, e.g., see Fourer et al. [18, p.35]. This separation provides an extremely exible modeling capability, which has been leveraged extensively in optimization applications developed at Sandia. To mimic this capability, Pyomo uses a symbolic representation of data, variables, constraints, and objectives. Model instances are then generated from external data sources using construction routines that are provided by the user when dening sets, parameters, and other modeling components. Further, Pyomo is designed to use data specied in the AMPL format to facilitate the translation of models between AMPL and Pyomo. Pyomo can also create concrete instances directly from specic parameter data, bypassing the abstract modeling layer. This functionality is similar to that provided in 5 the open-source modeling tools PuLP and PyMathProg, in addition to capabilities in many commercial AMLs. 2.5 Flexible Modeling Language Another goal of Pyomo is to directly use a modern high-level programming language to support the denition of optimization models. In this manner, Pyomo is similar to tools like FlopC++ [17] and OptimJ [36], which support modeling in C++ and Java respectively. The use of a broad-based high-level programming language to develop optimization models has several advantages: Extensibility and Robustness: A widely used high-level programming language provides a robust foundation for developing and solving optimization models: the language has been well-tested in a wide variety of application contexts and de- ployed on a range of computing platforms. Further, extensions almost never require changes to the core language but instead involve the denition of additional classes and functional routines that can be immediately leveraged in the modeling process. Further, support of the language itself is not a long-term factor when managing software releases. Documentation: Modern high-level programming languages are typically well- documented, and there is often a large on-line community to provide feedback to new users. Standard Libraries: Languages like Java and Python have a rich set of libraries for tackling just about every conceivable programming task. For example, stan- dard libraries can support capabilities like data integration (e.g., working with spreadsheets), thereby avoiding the need to directly support such capabilities in a modeling tool. An additional benet of basing Pyomo on a general-purpose high-level programming language is that we can directly support modern programming language features, in- cluding classes, looping and procedural constructs, and rst-class functions all of which can be critical when dening complex models. By contrast, such features are not uniformly available in commercial AMLs. Pyomo is implemented in Python, a powerful dynamic programming language that has a very clear, readable syntax and intuitive object orientation. However, when com- pared with AMLs like AMPL, Pyomo has a more verbose and complex syntax. For ex- ample, declarations like set denitions can be expressed as inlined-functions in AMPL, but they require a more verbose syntax in Python because it supports a more generic computing model. Thus, a key issue with this approach concerns the target user com- munity and their level of comfort with standard programming concepts. Our examples in this paper compare and contrast AMPL and Pyomo models, which illustrate this trade-o. 2.6 Portability A requirement of Pyomos design is that it work on a diverse range of computing plat- forms. In particular, inter-operability between Microsoft Windows, Linux, and Macin- tosh platforms is a key requirement for many Sandia applications. For example, user 6 front-ends are often GUIs on a Windows platform, while the computational back-end may reside on a Linux cluster. The main impact of this requirement has been to limit the choice of the high-level programming language used to develop Pyomo. In particu- lar, the Microsoft .Net languages were not considered for the design of Pyomo due to portability considerations. 3 Why Python? We chose to develop Pyomo using the Python [44] programming language for a variety of reasons. First, Python meets the criteria outlined in the previous section: Open Source License: Python is freely available, and it has a liberal open source license that allows users to modify and distribute a Python-based application with few restrictions. Features: Python has a rich set of native data types, in addition to support for object oriented programming, namespaces, exceptions, and dynamic module load- ing. Support and Stability: Python is stable, widely used, and is well supported through newsgroups, web forums, and special interest groups. Documentation: Users can learn about Python from both extensive online doc- umentation and a number of excellent books. Standard Library: Python includes a large number of useful modules, providing capabilities for (among others) data persistence, interprocess communication, and operating and le system access. Extensibility and Customization: Python has a simple model for loading Python code developed by a user. Additionally, compiled code packages (e.g., NumPy and SciPy) that optimize computational kernels can be easily integrated. Python in- cludes support for shared libraries and dynamic loading, so new capabilities can be dynamically integrated into Python applications. Portability: Python is available on a wide range of compute platforms, so porta- bility is typically not a limitation for Python-based applications. We considered several other popular programming languages prior to developing Pyomo. However, in most cases Python appears to have distinct advantages: .Net: As mentioned earlier, the Microsoft .Net languages are not portable to Linux platforms, and thus they were not suitable for Pyomo. Ruby: At the moment, Python and Ruby appear to be the two most widely recom- mended scripting languages that are portable to Linux platforms, and comparisons suggest that their core functionality is similar. Our preference for Python is based in part on the fact that it has a nice syntax that does not require users to enter obtuse symbols (e.g., $, %, and @). Thus, we expect Python will be a more natu- ral language for expressing optimization models. Further, the Python libraries are more integrated and comprehensive than those for Ruby. Java: Java has many of the same strengths as Python, and it is arguably as good a choice for the development of Pyomo. However, Python has a powerful interactive interpreter that allows real-time code development and encourages experimentation with Python software. Thus, users can work interactively with Pyomo models to become familiar with these objects and to diagnose bugs. 7 C++: Models formulated with the FlopC++ [17] package are similar to models developed with Pyomo. Specically, the models are specied in a declarative style using classes to represent model components (e.g., sets, variables, and constraints). However, C++ requires explicit compilation to execute code, and it does not sup- port an interactive interpreter. Thus, we believe that Python will provide a more exible language for users. Finally, we note that run-time performance was not a key factor in our decision to use Python. Recent empirical comparisons suggest that scripting languages oer reasonable alternatives to languages like C and C++, even for tasks that must handle fair amounts of computation and data [38]. Further, there is evidence that dynamically typed languages like Python allow users to be more productive than with statically typed languages like C++ and Java [53, 45]. It is widely acknowledged that Pythons dynamic typing and compact, concise syntax facilitates rapid software development. Thus, it is not surprising that Python is widely used in the scientic community [34]. Large Python projects like SciPy [30] and SAGE [50] strongly leverage a diverse set of Python packages to perform complex numerical calculations. 4 Background A variety of dierent strategies have been developed to facilitate the formulation and solution of complex optimization models. For restricted problem domains, optimiz- ers can be directly interfaced with application modeling tools. For example, modern spreadsheets like Excel integrate optimizers that can be applied to linear programming and simple nonlinear programming problems in a natural way. Algebraic Modeling Languages (AMLs) are an alternative approach that allows ap- plications to be interfaced with optimizers that can exploit problem structure. AMLs are specialized high-level programming languages for describing and solving mathemat- ical problems, particularly optimization-related problems [31]. AMLs like AIMMS [2], AMPL [3, 19], and GAMS [21] provide programming languages with an intuitive mathe- matical syntax that support concepts like sparse sets, indices, and algebraic expressions. AMLs provide a mechanism for dening variables and generating constraints with a concise mathematical representation, which is essential for large-scale, real-world prob- lems that involve thousands or millions of constraints and variables. Standard programming languages can also be used to formulate optimization mod- els when used in conjunction with a software library that uses object-oriented design to support mathematical concepts. Although these modeling libraries sacrice some of the intuitive mathematical syntax of an AML, they allow the user to leverage the greater exibility of standard programming languages. For example, modeling tools like FlopC++ [17] and OptimJ [36] can be used to formulate and solve optimization models in C++ and Java respectively. A related strategy is to use a high-level programming language to formulate opti- mization models, which are then solved with optimizers written in low-level languages. This two-language approach leverages the exibility of the high-level language for for- mulating and manipulating optimization problems and the eciency of the low-level language for numerical computations. This approach is increasingly common in sci- entic computing tools, and the Matlab TOMLAB Optimization Environment [52] is probably the most mature optimization software using this approach. Python has also been used to implement a variety of optimization packages that use this approach: 8 APLEpy: A package that can be used to describe linear programming and mixed- integer linear programming optimization problems [5, 32]. CVXOPT: A package for convex optimization [14]. PuLP: A package that can be used to describe linear programming and mixed- integer linear programming optimization problems [40]. POAMS: A modeling tool for linear and mixed-integer linear programs that de- nes Python objects for symbolic sets, constraints, objectives, decision variables, and solver interfaces. PyMathProg: A package that includes PyGLPK, which encapsulates the func- tionality of the GNU Linear Programming Kit (GLPK) [41]. OpenOpt: A numerical optimization framework that is closely coupled with the SciPy scientic Python package [35]. Pyomo is closely related to APLEpy, PyMathProg, PuLP, and POAMS. All of these packages dene Python objects that can be used to express optimization models, but they can be distinguished according to the extent to which they support abstract models. Abstract models provide a data-independent specication of a mathematical program. Fourer and Gay [19] summarized the advantages of abstract models when presenting AMPL: The statement of the abstract model can be made compact and understandable The independent specication of an abstract model facilitates the specication of the validity of the associated data Data from dierent sources can be used with the abstract model, depending on the computing environment PuLP and PyMathProg do not support abstract models; the concrete models that can be constructed by these tools are driven by explicit data. APLEpy supports symbolic denitions of set and parameter data, but the objective and constraint specications are concrete. POAMS and Pyomo support abstract models, which can be used to generate concrete instances from various data sources. Pyomo also provides an automated con- struction process for generating concrete instances from an abstract model. Hart [26] provides Python examples that illustrate the dierences between PuLP, POAMS and Pyomo. 5 Why Pyomo: The Value Proposition Before introducing Pyomo and subsequently illustrating advanced capabilities and scripting features, we rst discuss the value proposition for Pyomo. In other words, we will address the related questions of Why another AML? and Why Pyomo in particular? In doing so, we argue that Pyomo holds a unique position in the eld of both commercial and open-source AMLs. In what follows, we provide forward ref- erences to subsequent sections containing examples that support our arguments. Our immediate goal in this section is simply to outline the arguments. The following features are what we view as the primary benets of Pyomo: Open-Source with Flexible Licensing: As we argued in Sections 2 and 3, a exible open-source license is often critical when deploying real-world applications. Further, open-source software is customizable and typically more extensible than commercial alternatives. Pyomo is obviously dierentiated from commercial AMLs 9 in this regard, and it is dierentiated from open-source AMLs such as PyMathProg that use restrictive licenses. Embedded in a High-Level, Full-Featured Programming Language: Com- mercial AMLs are embedded in proprietary languages, which typically lack the spectrum of language features found in modern high-level programming languages, e.g., functions, classes, and advanced looping constructs. For example, the lack of functions in AMPL makes programming many of the types of advanced applications shown in Section 8 very dicult. Other open-source AMLs share this advantage with Pyomo. Access to Extensive Third-Party Library Functionality: By embedding Py- omo in Python, users immediately gain access to a remarkable range of powerful and free third-party libraries, including: SciPy, NumPy, Matplotlib, Pyro, and various database / spreadsheet interfaces. Support for Abstract and Concrete Math Programming Models: As we argued in Section 2.4, the separation of model from data is extremely useful in practice. While commercial AMLs support this distinction, Pyomo is the only open- source AML to fully support the specication of both abstract and concrete math programming models. Support for Non-Linear Math Programming Models and Solvers: While many commercial AMLs provide capabilities to express and solve non-linear pro- grams, few open-source tools support non-linear capabilities; OpenOpt [35] is a notable counter-example. This Pyomo capability is discussed in Section 6.4. Integrated Support for Distributed Computation: Pyomo provides inte- grated support for distributed computation (see Section 7.2) by leveraging and extending capabilities in both Python and the Python library Pyro. This capabil- ity is unique in the context of open-source AMLs, and is uncommon in commercial AMLs. Support for distributed computation is becoming increasingly critical for application deployment, given the growing availability of cluster-based and cloud computing. Cross-Platform Deployment Capabilities: By embedding Pyomo in Python, we are able to rapidly deploy applications in cross-platform environments. Ex- amples of such environments are discussed in Section 7.2, drawn from real-world computing environments exercised by the co-authors and their collaborators. Simi- lar capabilities could be embedded in related open-source AMLs such as PuLP and APLEpy, but are not currently available. Integrated Support for Obtaining Data from External Sources: Pyomo provides a rich range of interfaces to extract data from structured text les, spread- sheets, and databases, driven by the requirements for real-world deployment of op- timization applications. As discussed in Section 6.5, this capability distinguishes Pyomo from other open-source AMLs. Extensibility via Component-Based Software Architecture: Drawing on best practices in commercial software design, Pyomo and Coopr are designed in a modular, component-based fashion. In particular, users can provide extensions without impacting the core software design, e.g., in the form of customized solvers and problem writers. Such functionality is discussed in Section 7.3, along with an example of a simple custom solver. The ability to seamlessly extend the core functionality is unique among open-source and commercial AMLs. Advanced Application Scripting: Perhaps the most important feature of Py- omo is the comparable ease with which applications requiring advanced scripting 10 can be developed. For complex optimization applications beyond those requiring a simple express model, generate model, solve model capability, such scripting is often integral to successful deployment. Of course, it is dicult to quantify ease in this context, and we make no attempt to do so here. Rather, we point to specic examples illustrated in Section 8. 6 Pyomo: An Overview Pyomo can be used to dene abstract models, create concrete problem instances (both directly and from abstract models), and solve those instances with standard solvers. Pyomo can generate problem instances and apply optimization solvers within a fully ex- pressive programming language. Pythons clean syntax allows Pyomo to express math- ematical concepts in a reasonably intuitive and concise manner. Further, Pyomo can be used within an interactive Python shell, thereby allowing a user to interactively interrogate Pyomo-based models. Thus, Pyomo has most of the advantages of both AML interfaces and modeling libraries. 6.1 A Simple Example In this sub-section, we illustrate Pyomos syntax and capabilities by demonstrating how a simple AMPL example can be replicated with Pyomo Python code. We focus here on the specication of symbolic or abstract models; concrete or directly specied models are discussed in Section 6.3. Consider the following AMPL model prod.mod, which is available from www.ampl. com/BOOK/EXAMPLES/EXAMPLES1: s e t P; param a { j in P}; param b ; param c { j in P}; param u { j in P}; var X { j in P}; maximize Tot al Pr of i t : sum { j in P} c [ j ] X[ j ] ; s ubj e c t to Time : sum { j in P} (1/ a [ j ] ) X[ j ] <= b ; s ubj e c t to Li mi t { j in P}: 0 <= X[ j ] <= u [ j ] ; To translate this AMPL model into Pyomo, the user must rst import the Pyomo Python package and create an Pyomo AbstractModel object: # Import s from coopr . pyomo import # Creat e t he model ob j e c t 11 model = AbstractModel ( ) This import assumes that Pyomo is present in the users Python path (see standard Python documentation for further details about the PYTHONPATH environment variable); the virtual Python executable installed by the Coopr installation script described in Section 9 automatically includes all requisite paths. Next, we create the sets and parameters that correspond to the data declarations used in the AMPL model. This can be done very intuitively using the Set and Param classes dened by Pyomo: # Set s model . P = Set ( ) # Parameters model . a = Param( model . P) model . b = Param( ) model . c = Param( model . P) model . u = Param( model . P) Note that the parameter b is a scalar, while parameters a, c, and u are arrays indexed by the set P. Further, we observe that all AML components in Pyomo (e.g., parameters, sets, variables, constraints, and objectives) are explicitly associated with a particular model. This allows Pyomo to automatically manage the naming of AML components, and multiple Pyomo models can be simultaneously dened and co-exist within a single application. Next, we dene the decision variables in the model using the Pyomo Var class: # Var i abl es model . X = Var ( model . P) Model parameters and decision variables are used in the denition of the objectives and constraints in the model. Parameters dene constants, and the variable values are determined via optimization. Parameter values can be dened in a separate data le that is processed by Pyomo (see Section 6.5), similar to the paradigm used in AMPL and AIMMS. Objectives and constraints are explicitly dened expressions in Pyomo. In abstract models, the Objective and Constraint classes require a rule keyword option that species how these expressions are to be constructed. A rule is a function that takes one or more arguments and returns an expression that denes the constraint (body and bounds) or objective (expression). The last argument in a rule is the model contain- ing the corresponding objective or constraint, and the preceding arguments are index values for the objective or constraint that is being dened. If only a single argument is supplied, the constraint and objectives are necessarily singletons, i.e., non-indexed. Using these constructs, we express the AMPL objective and constraints in Pyomo as follows (Note: The backslash (\) is the Python line continuation character): # Ob j e c t i v e def o bj e c t i v e r ul e ( model ) : return summation( model . c , model . X) model . Tot al Pr of i t = Obj ect i ve ( r ul e=o bj e c t i ve r ul e , \ 12 s ens e=maximize ) # Time Const rai nt def t i me r ul e ( model ) : return summation( model . X, denom=model . a ) <= model . b model . Time = Cons t r ai nt ( r ul e=t i me r ul e ) # Li mi t Const rai nt def l i mi t r u l e ( j , model ) : return ( 0 , model . X[ j ] , model . u [ j ] ) model . Li mi t = Cons t r ai nt ( model . P, r ul e=l i mi t r u l e ) This example illustrates several conventions for generating constraints using stan- dard Python language constructs. The function objective rule returns an algebraic expression that denes the objective and the function time rule returns a less-than-or- equal expression that denes an upper bound on the constraint body. They are created with Pyomos summation function, which concisely species a vector sum of one or more arguments. An alternative is to use the Python sum function. So, for example, the sum over i in model.P of X i /a i could be done using the summation function as shown or us- ing sum(model.X[j]/model.a[j] for j in model.P), which has the same eect. The function limit rule illustrates another convention that is supported by Pyomo; a rule can return a tuple that denes the lower bound, constraint body, and upper bound for a constraint. The Python value None can be supplied as one of the limit values if a bound is not enforced. Once an abstract Pyomo model has been created, it can be printed as follows: model . ppr i nt ( ) This command summarizes the information in the Pyomo model, but it does not print out explicit expressions. This is due to the fact that an abstract model needs to be in- stantiated with data to generate the variables, constraints, and objectives. An instance of the prod model can be generated and displayed as follows: i ns t anc e = model . c r e at e ( prod . dat ) i ns t anc e . ppr i nt ( ) The le prod.dat le contains set and param data commands that are consistent with AMPL data commands; this example le is also available from www.ampl.com/BOOK/ EXAMPLES/EXAMPLES1. Once a model instance has been constructed, an optimizer can be applied to nd an optimal solution. For example, the CPLEX mixed-integer programming solver can be used within Pyomo as follows: from coopr . opt import opt = Sol ver Fact or y ( cpl ex ) r e s u l t s = opt . s ol ve ( i ns t anc e ) This code fragment imports the solver interfaces associated with Pyomo, and creates an optimizer interface for the CPLEX executable. The Pyomo model instance is optimized, and the optimizer returns an object that contains the solution(s) generated during 13 optimization. Note that this optimization process is executed using other components of the Coopr library. The coopr.opt package manages the setup and execution of optimizers, and Coopr optimization plugins are used to manage the execution of specic optimizers. Finally, the results of the optimization can be displayed simply by executing the following command: r e s u l t s . wr i t e (num=1) Here, the num option indicates the maximum number of solutions from the results object to be written; some solvers (including CPLEX) can return multiple solutions that represent alternative optima, or other feasible points. 6.2 Advanced Pyomo Modeling Features The previous example provides a simple illustration of Pyomos symbolic modeling capabilities. Much more complex models can be developed using Pyomos exible modeling components. For example, multi-dimensional set and parameter data can be naturally represented using Python tuple objects. The dimen option species the dimensionality of set or parameter data, e.g., as follows: model . p = Param( model . A, dimen=2) In this example, the p parameter is an array of data values indexed by A for which each value is a 2-tuple. Additionally, set and parameter components can be directly constructed with data using the initialize option. In the simplest case, this option species data that is used in the component: model . s = Set ( i n i t i a l i z e =[ 1 , 3 , 5 , 7] ) For arrays of data, a Python dictionary can be specied to map data index values to set or parameter values: model . A = Set ( i n i t i a l i z e =[ 1 , 3 , 5 , 7] ) model . s = Set ( model . A, i n i t i a l i z e ={1: [ 1 , 2 , 3] , 5 : [ 1 ] } ) model . p = Param( model . A, i n i t i a l i z e ={1: 2 , 3: 4 , 5: 6 , 7: 8}) More generally, the initialize option can specify a function that returns data values used to initialize the component: def x i n i t ( model ) : return [ 2 i for i in range ( 0 , 10) ] model . x = Set ( i n i t i a l i z e=x i n i t ) Set and parameter components also include mechanisms for validating data. The within option species a set that contains the corresponding set or parameter data, for example as follows: model . s = Set ( wi t hi n=I nt e ge r s ) 14 model . p = Param( wi t hi n=model . s ) The validate option can also specify a function that returns a boolean indicating whether data is acceptable: def p val i d ( val , model ) : return val >= 0. 0 model . p = Param( val i dat e=p val i d ) Pyomo includes a variety of virtual set objects that do not contain data but which can be used to perform validation, including: Any - Any numeric or non-numeric value other than the Python None value PositiveReals - Positive real values NonNegativeIntegers - Non-negative integer values Boolean - Zero or one values Finally, there are many contexts where it is necessary to specify index sets that are more complex than simple product sets. To simplify model development in these cases, Pyomo supports set index rules that automatically generate temporary sets for indexing. For example, the following example illustrates how to index a variable on 2-tuples (i, j) for which i < j: model . n = Param( wi t hi n=Po s i t i ve I nt e g e r s ) def x i ndex ( model ) : return [ ( i , j ) for i in range ( 0 , model . n . val ue ) for j in range ( 0 , model . n . val ue ) i f i <j ] model . x = Var ( x i ndex ) 6.3 Constructing Concrete Models Concrete models are the most direct type of model that Pyomo supports. These mod- els can be formulated in the same way as abstract models using Pyomo components for variables, objectives, and (optionally) constraints. Additionally, a user can eas- ily leverage native Python data structures while constructing concrete models. The ConcreteModel class is used to represent concrete models whose data is provided as the model components are declared. In contrast to AbstractModel, a ConcreteModel immediately initializes model components as they are added to a instance. 6.4 Non-Linear Modeling Extensions A key dierentiating feature of Pyomo relative to other open-source (and some commer- cial) modeling languages is the ability to specify and generate non-linear optimization models. Currently, the non-linear model output format supported by Pyomo is NL, the native AMPL le format. We have chosen this format principally because it is used by a variety of non-linear solvers, including Ipopt [29] and BONMIN [8]. Further, the NL format supports the full range of non-linear operators. Examples of such operators, also provided in Pyomo, include: 15 pow, exp, and related operators (including sqrt and log10) sin, cos, tan, and related hyperbolic and inverse operators abs, the absolute value operator To illustrate the use of non-linear operators in Pyomo, consider the objective associated with the brownden CUTEr test problem instance [13]: def f ( model ) : expa = sum( [ ( ( model . x[ 1] +model . t [ i ] model . x[ 2] \ exp ( val ue ( model . t [ i ] ) ) ) 2 + \ ( model . x[ 3] +model . x [ 4 ] s i n ( val ue ( model . t [ i ] ) ) \ cos ( val ue ( model . t [ i ] ) ) ) 2 )2 for i in model . St ] ) return expa model . f = Obj ect i ve ( r ul e=f , s ens e=mi ni mi ze ) As the example illustrates, non-linear operators are naturally incorporated into the syntax for specifying algebraic expressions in both constraints and objectives. 6.5 Importing Data While commercial AMLs provide streamlined functionality for accessing external data sources, the same is not true of open-source AMLs. As discussed below in Section 6.7, Python-based AMLs other than Pyomo require the user to specify data via native Python constructs, e.g., in dictionaries or low-level input/output routines. The same holds for non-Python open-source AMLs such as FlopC++. While exible, this inter- face can be cumbersome and inecient for non-programmers. In the case of abstract models, where data varies across a single core abstraction, users must maintain inde- pendent data stores for each instance. Further, in deployed optimization applications, data is almost exclusively maintained in central data stores, e.g., Excel or a database. To address these concerns, Pyomo provides streamlined initialization of data from ex- ternal data sources. The data le format discussed in Section 6.1 is one example of an external method by which Pyomo data can be initialized, and is the most common mechanism for novice and academic Pyomo users. These les contain custom Pyomo data commands that are similar to those used by AMPL. To support data access from more general sources such as structured ASCII le, csv les, spreadsheets, and databases, Pyomo supports the import command. The import command directs Pyomo to load a relational table that represents set and parameter data. A relational table consists of a matrix of numeric string values, simple strings, and quoted strings. All rows and columns must have the same number of entries, and the rst row represents labels for the column data. We now discuss some aspects of the Pyomo import syntax using small illustrative examples. First, consider data stored in a simple text white-space delimited text le called Y.tab: A Y A1 3. 3 A2 3. 4 A3 3. 5 16 This le species the values of a parameter Y, which is indexed by set A. The following import command loads the parameter data: import Y. tab : [ A] Y ; The rst argument is the name of the le containing the data. The options after the : character indicate how the table data is mapped onto model data. The option [A] indicates that the set A is used as the index, and the option Y indicates the name of the parameter to be initialized. Similarly, set data can be loaded from an ASCII le named A.tab containing a single column: A A1 A2 A3 The format option must be specied to denote the fact that the relational data is to be interpreted as a set, as follows: import A. tab f ormat=s e t : A ; Simple extensions of this general syntax can be used to specify data tuples, or indexed sets; additional keywords are provided to guide how relational table data is interpreted by Pyomo. The same general syntax can be used to access relational table data in csv (comma-separated) and xls (Excel) les. Similarly, data can be extracted from relational databases by adding appropriate keyword specifying the database driver, query, and optional user name and password, for example: import ABCD. mdb us i ng=pyodbc query= SELECT FROM ABCD : \ Z=[A, B, C] Y=D ; This command creates a relational table using all of the columns in the database table ABCD, extracted using the pyodbc database interface. The using keyword indicates that data is to be extracted using an external Python package, the Python ODBC database interface. Other data partitioning and organizational commands are also supported in Pyomo, e.g., include statements that import nested .dat les and namespace commands that logically partition the contents of a single .dat le. Namespaces are particularly useful, allowing for the specication of data for multiple model instances in a single le. 6.6 The Pyomo Command While Pyomo-based Python code can be entered and executed directly from within the Python interpreter, Pyomo includes the pyomo command-line tool that can construct an abstract or concrete model, create a concrete instance from user-supplied data (if applicable), apply an optimizer, and summarize the results. For example, the following command line optimizes the AMPL prod model using the data in prod.dat: pyomo prod . py prod . dat 17 The pyomo command automatically executes the following steps: Create an abstract or concrete model Read the instance data (if applicable and specied) Generate a concrete instance from the abstract model and instance data (if appli- cable) Apply simple preprocessors to the concrete instance Apply a solver to the concrete instance Load the results into the concrete instance Display results The pyomo script supports a variety of command-line options to guide and provide information about the optimization process; documentation of the various available options is obtained by specifying the --help option. Options can control how much or even if debugging information is printed, including logging information generated by the optimizer and a summary of the model generated by Pyomo. Further, Pyomo can be congured to keep all intermediate les used during optimization, which facilitates debugging of the model construction process. 6.7 Comparison with Other Python-Based Modeling Languages We now briey contrast Pyomo with several open-source Python-based math program modeling packages providing related functionality. A comprehensive comparison is be- yond the scope of this paper. Instead, our goal here is to briey outline the primary dierences in the design and functionality of these packages. We focus rst and in most detail on a comparison of Pyomo and PuLP, as the latter is arguably the most widely used open-source AML. PuLP The PuLP package [40] is a widely used, light-weight Python package for ex- pressing and solving optimization problems. The biggest functional dierences between PuLP and Pyomo are: (1) PuLP can only express concrete model instances, (2) PuLP only allows for expression of linear and mixed-integer models, and (3) PuLP provides no built-in mechanism (other than what is available through Python) to load and save model data from external sources, e.g., text les, spreadsheets, and databases. As dis- cussed previously, the availability of these features is either critical for or extremely useful in the deployment of solutions to real-world applications. For concrete model specication, the PuLP and Pyomo syntax are qualitatively similar. However, some dierences are worth briey highlighting. Consider the following PuLP model: from pul p import prob = LpProblem( A s mal l concr et e model , LpMaximize ) x1 = LpVari abl e ( x1 , 5 , 10) # 5 <= x1 <= 10 x2 = LpVari abl e ( x2 , 10 , 30) # 10 <= x1 <= 30 prob += x1+x2 , The Obj ect i ve prob += 0. 5 x1 + 1. 5 x2 <= 50 , The Cons t r ai nt In this example, the variables x1 and x2 are not explicitly associated with the model. They can be accessed by scanning the set of model constraints, but this is in practice 18 expensive. In Pyomo, all model components (parameters, sets, variables, etc.) are con- sistently and explicitly associated with a specic model, and can be directly accessed by querying the model in a clean, object-oriented fashion. The mechanisms for expressing indexed components in particular variables and constraints are more natural and exible in Pyomo than in PuLP. Indexed variables in PuLP are specied by creating a dictionary (mapping of keys to values) of variables, e.g., as follows: i ndex a = [ 1 , 2 , 3] i ndex b = [ 4 , 5 , 6] i ndexed var = LpVari abl e . di c t s ( my vars , \ ( i ndex a , i ndex b ) , 0 , 1 , LpI nteger ) This construct is expressed in Pyomo as follows: i ndex a = [ 1 , 2 , 3] i ndex b = [ 4 , 5 , 6] model . i ndexed var = Var ( i ndex a , i ndex b , wi t hi n=Bi nary ) One minor limitation of the PuLP indexed variable syntax is that variable lower and upper bounds are assumed to be homogeneous. In contrast, Pyomo provides for index- specic bounds by supplying an arbitrary function via the bounds keyword in the Var constructor. Next, we consider an example in which constraints are imposed on a variable value on a per-index basis. In PuLP, an example of such a constraint is as follows: for x in i ndex a : for y in i ndex b : prob += i ndexed var [ x , y ] <= \ some param [ x , y ] ot he r var [ x , y ] , Note that the looping constructs must be manually specied, and that the constraints are not logically grouped within the model. In Pyomo, this constraint would be ex- pressed as follows: def s ome r ul e ( x , y , model ) : return model . i ndexed var [ x , y ] <= \ model . some param [ x , y ] model . ot he r var [ x , y ] model . my cons t r ai nt = Cons t r ai nt ( i ndex a , i ndex b , \ r ul e=s ome r ul e ) Contrasting the two approaches, we observe that the related constraints are automat- ically grouped in Pyomo, and the explicit (and potentially error-prone) looping con- structs in PuLP are avoided. Similarly, constraint names are assigned automatically in Pyomo (e.g., my constraint[1,3]), whereas the PuLP looping constructs require the user to explicitly form the constraint name (which is left unspecied for this reason in our example). PuLP is actively supported, and is distributed under an open, BSD-like license. PuLP was recently accepted into the COIN-OR project. 19 APLEpy The APLEpy package [5] is similar to PuLP in both design and function- ality, although there are some dierences worth highlighting. In contrast to PuLP, all model components are implicitly members of a single, globally accessible model object. As a consequence, maintaining and manipulating multiple model instances (e.g., as shown for Pyomo in Section 8) is not possible. The syntax for constructing constraints is very similar to that of PuLP. For example, explicit user looping is required in the case of indexed constraints. Like PuLP, APLEpy only supports concrete model speci- cation, provides no integrated facilities for accessing external data, and does support specication of indexed variables and constraints. Although still available for down- load, APLEpy does not appear to be actively supported. APLEpy is distributed under the Common Public License. PyMathProg and POAMS Other than PuLP and APLEpy, POAMS and PyMath- Prog are the two most similar open-source AMLs to Pyomo. Unfortunately, POAMS is not widely available for download and experimentation (there has been no formal release). PyMathProg [41] is associated with the GNU GLPK project, and as a conse- quence, only interfaces to the GLPK solver. Models are specied in a PuLP-like syntax; no functionality for specifying abstract models is provided, nor are integrated interfaces to external data sources. PyMathProg is distributed under the GPL license. Numberjack and Google OR-Tools A number of Python modeling and solver pack- ages have recently originated from the constraint programming community, in partic- ular Google OR tools [37] and Numberjack [28]. The goals of Numberjack are broader than Pyomo and related AMLs from the mathematical programming community, in that the objective is to support specication and solution of constraint programming, mixed-integer linear programming, and boolean satisability models. Consequently, a dierent set of solvers is supported, e.g., the SCIP mixed-integer solver, the Mistral constraint programming solver, and the MiniSat satisability solver. Like the other open-source AMLs discussed above, models may not be specied symbolically, and there is no integrated support for data input from external sources such as Excel. Non-linear operators for mathematical programming are not supported. The Number- jack syntax is very similar to that of PuLP, e.g., models are constructed using the += syntax and variables are specied externally to a model object. Numberjack is dis- tributed under the LGPL. The solver interfaces in Numberjack are similar in design spirit to those in Coopr. In contrast to Numberjack, Googles OR-tools package is pri- marily focused at the present time on specifying and solving constraint programs, and supports a single integrated solver. Like Numberjack, the OR-tools package does not allow specication of abstract models, and lacks integrated support for obtaining data from external data sources. Googles OR-tools package is distributed under the Apache license, while Numberjack is licensed under the LGPL. 6.8 Run-Time Performance Run-time performance was not the most signicant factor in our choice of Python as a modeling language, but it is clearly an important factor for many Pyomo users. Al- though the optimization solver run-time is typically dominant, in some applications the time required to construct complex models can be nontrivial. Thus, we have simplied and tuned the behavior of Pyomo modeling components in an attempt to minimize 20 Fig. 1 The real (wall clock) run time required to construct and write dense p-median model instances to an NL le with AMPL, Pyomo, and Pyomo with garbage collection disabled. the run-time required for model construction. This tuning eort is on-going, driven by experience with specic test applications. Figure 1 shows the run-time performance for Pyomo and AMPL on dense instances of the well-known p-median facility location problem. We have considerable experi- ence solving large-scale p-median problem instances arising in water security contexts [27], motivating our investigation of run-time scaling using this particular optimization problem. The three curves in this plot show the time, in seconds, that was required to generate and write (but not solve) a model with AMPL, with Pyomo, and with Py- omo when Pythons garbage collection mechanism was disabled. In all three cases, we write the resulting mixed-integer program in the NL problem format, i.e., the preferred format in AMPL. The x-axis captures the problem size N M for p-median problems where the number of customers, N, equals the number of facilities, M. Each point in this plot represents the average of 15 dierent trials, executed on a modern (2.93GHz Intel Xeon) multi-core workstation running Linux. This gure shows that there is a substantial performance gap between Pyomo and AMPL for large p-median instances. For the largest problems, Pyomo with garbage collection enabled (the default) was 62 times slower than AMPL, and 32 times slower when garbage collection is disabled. This dierence largely stems from the fact that Python is notoriously slow at constructing and destructing independent of garbage collection large numbers of small objects, e.g., such as parameter and variable val- ues appearing in expressions. Constraint and objective expressions in Pyomo are en- coded as trees, which represent each arithmetic operator and data value with a distinct Python object. Thus, constructing large expression trees involves the construction of many Python objects, yielding the observed performance dierences. We are currently 21 exploring alternative encodings of expression trees, specically to avoid this source of performance degradation. Other sources of performance bottlenecks include string (component label) manipulation, expression preprocessing in preparation to write the NL le, and expression verication and simplication. Our analysis of Pyomo proler output suggests that these bottlenecks can also be mitigated with a moderate develop- ment eort. Finally, we observe that the performance discrepancies vary signicantly, depending strongly on the model structure and the comparative baseline. For example, Dimitrov observed a 6x discrepancy between the performance of GAMS and Pyomo on a complex production/ transportation model [15]. The disabling of garbage collection was originally suggested by the experimental performance analysis of APLEpy [32]. Although this approach makes sense for simple analyses, this option is not broadly recommended, e.g, in the case of Pyomo scripts that generate many problem instances; machine memory limits can quickly be reached. The APLEpy experiments also employ the Psyco JIT compiler [39] to further improve APLEpy performance. However, we omit Psyco experiments with Pyomo since it can only be used on 32-bit computers, which are no longer commonly available. 7 The Coopr Optimization Package Much of Pyomos exibility and extensibility is due to the fact that Pyomo is inte- grated into Coopr, a COmmon Optimization Python Repository [12]. Coopr utilizes a component-based software architecture to dene plugins that modularize many as- pects of the optimization process. This allows Coopr to support a generic optimization process. Coopr components manage the execution of optimizers, including run-time detection of available optimizers, conversion to le formats required by an optimizer, and transparent parallelization of independent optimization tasks. 7.1 Generic Optimization Process Pyomo strongly leverages Cooprs ability to execute optimizers in a generic manner. For example, the following Python script illustrates how an optimizer is initialized and executed with Coopr: opt = Sol ver Fact or y ( sol ver name ) r e s u l t s = opt . s ol ve ( c onc r e t e i ns t anc e ) r e s u l t s . wr i t e ( ) This example illustrates Cooprs explicit segregation of problems and solvers into sep- arate objects. Such segregation promotes the development of tools like Pyomo that dene optimization applications. The results object returned by a Coopr optimizer includes information about the problem, the solver execution, and one or more solutions generated during optimization. This object supports a general representation of optimizer results that is similar in spirit to the results encoding scheme used by the COIN-OR OS project [20]. The main dierence is that Coopr represents results in YAML, a data serialization format that is both powerful and human readable [55]. For example, Figure 2 shows the results output after solving the AMPL example production planning problem described in Section 6. 22 # ========================================================== # = Sol ver Res ul t s = # ========================================================== # # Problem Inf ormat i on # Problem : Lower bound : i nf Upper bound : 192000 Number of o bj e c t i ve s : 1 Number of c o ns t r a i nt s : 6 Number of va r i a bl e s : 3 Number of nonzeros : 7 Sense : maximize # # Sol ver Inf ormat i on # Sol ver : St at us : ok Termi nati on c ondi t i on : OK Error r c : 0 # # Sol ut i on Inf ormat i on # Sol ut i on : number of s o l ut i o ns : 1 number of s o l ut i o ns di s pl ayed : 1 Gap: 0. 0 St at us : opti mal Obj ect i ve : f : Id : 0 Value : 192000 Var i abl e : X[ bands ] : Id : 0 Value : 6000 X[ c o i l s ] : Id : 1 Value : 1400 Cons t r ai nt : c u Li mi t [ bands ] : Id : 1 Dual : 4 c u Ti me : Id : 4 Dual : 4200 Fig. 2 Results output for the production planning model described in Section 6. 23 7.2 Solver Parallelization Coopr includes two components that manage the execution of optimization solvers. First, Solver objects manage the local execution of an optimization solver. For exam- ple, Coopr includes plugins for MIP solvers like CPLEX, GUROBI, and CBC. Second, SolverManager objects coordinate the execution of Solver objects in dierent environ- ments. The API for solver managers supports asynchronous execution of solvers as well as solver synchronization. Coopr includes solver managers that execute solvers serially, in parallel on compute clusters, and remotely with the NEOS optimization server [16]. Cooprs Pyro solver manager supports parallel execution of solvers using two key mechanisms: the standard Python pickle module and the Pyro distributed computing library [43]. The pickle module performs object serialization, which is a pre-requisite for distributed computation. With very few exceptions, any Python object can be pickled for transmission across a communications channel. This includes simple, built- in objects such as lists, and more complex Pyomo objects like AbstractModel and ConcreteModel instances. For example, the following code fragment illustrates the use of the pickle module to write and restore a Pyomo model instance via an intermediate le: import pi c kl e i ns t anc e = model bui l der ( ) # f unct i on t o b ui l d Pyomo i ns t ance pi c kl e . dump( i ns t ance , open ( t e mpf i l e , wb ) ) i ns t ance copy = pi c kl e . l oad ( open ( t e mpf i l e , r ) ) The simplicity of complex object serialization in Python is remarkable, especially rela- tive to low-level languages like C++ in which users must develop complex class-specic methods to achieve equivalent functionality. This issue is amplied by the presence of complex inter-relationships among objects, i.e., such as those present in Pyomo models. Pyro (Python Remote Objects) [43] is a mature third-party Python library that provides an object-oriented framework for distributed computing that is similar to Re- mote Procedure Calls. Pyro is cross-platform, such that dierent application compo- nents can execute on fundamentally dierent platform types, e.g., Windows and Linux. Inter-process communication is facilitated through a standard name server mechanism, and object serialization and de-serialization is performed via the Python pickle mod- ule. The standard Coopr distribution includes both the Pyro library and a number of solver-centric utilities to facilitate parallel solution of Pyomo models. All communica- tion is established through the Coopr name server, invoked via the coopr-ns script on some host node in a distributed system. A router process is then launched on a compute node via the Coopr dispatch srvr script. Finally, one or more solver pro- cesses are launched on various compute nodes via the Coopr pyro mip server script. Each solver process identies a dispatch server through the name server and noties the dispatch server that it is available for solving instances. Note that the various solver processes can execute on distinct cores of a single workstation, across multi- ple workstations, and even across multiple workstations running dierent operating systems, e.g., hybrid Windows/Linux clusters. Communication with the global name server is typically accomplished by setting the PYRO NS HOSTNAME environment variable to identify the name server host; in a non-distributed (e.g., SMP) environment, such communication is automatic. 24 Once initialized, the distributed solver infrastructure is accessed by the Pyro solver manager, which can be constructed using the SolverManagerFactory functionality: s ol ver manager = Sol verManagerFactory ( pyro ) The Pyro solver manager identies dispatch servers through the Coopr name server. Thus, a simple change in the argument name to the solver manager factory is sucient to access distributed solver resources in Coopr. We note that if no solver manager is specied (as is the case for examples shown early in this paper), a default serial solver manager is constructed automatically; this particular manager executes solves locally, in a sequential fashion. Given a Pyro solver manager, a Pyomo instance can be solved via the following code: i ns t anc e = model bui l der ( ) # f unct i on t o b ui l d Pyomo i ns t ance ah = s ol ver manager . queue ( i ns t anc e ) s ol ver manager . wa i t f or ( ah) r e s u l t s = s ol ver manager . g e t r e s u l t s ( ah) The ah object is known as an action handle, and informally serves as a tracking number through which users can interrogate the status of a submitted solve request. Figure 3 illustrates a typical architecture for distributed solver computation in Coopr. In this example, the user executes a solver script (e.g., the runbenders script described in Section 8.2) on his or her local machine. The name and dispatch server processes (coopr-ns and dispatch srvr) are congured as daemons on an administra- tive workstation, such that they are persistently available. This example network has two major compute resources: a Windows cluster and a multi-core Linux server. The Linux server has four CPLEX licenses, while the Windows cluster has the open-source CBC solver available on all of its 64 compute nodes. Four pyro mip server daemons are executing on the server (each is allocated two cores), while a pyro mip server daemon is executing on each compute node of the Windows cluster. In this particular example, all user-solver communication is performed via the sole dispatch server; in general, multiple dispatch servers can be congured to mitigate communication bottlenecks. 7.3 Solver Plugins A common object-oriented characteristic of open-source optimization software is the ability to use classes and class inheritance to develop extensible functionality. In con- trast, Coopr leverages the PyUtilib Component Architecture (PCA) to separate the declaration of component interfaces from their implementation [46], through a mecha- nism referred to as a plugin. A plugin species the required method and data associated with a specic interface, but does not require that all instances of this interface be de- rived from a particular base class. For example, in Coopr all optimization solvers are implemented by dening a distinct class encapsulating the associated functionality. However, these classes are not required to be sub-classes of a solver interface class. Instead, they are simply required to provide the same interface methods and data. Coopr uses plugin components to modularize the process workow commonly as- sociated with optimization. A component is a software package, module, or object that provides a specic functionality. Plugin components augment the standard optimization 25 Fig. 3 An example of a hybrid compute architecture and how it can be congured for dis- tributed solves using Pyro and Coopr. workow by implementing functionality that is exercised on demand. Component- based software with plugins is a widely recognized best practice for extending and evolving complex software systems in a reliable manner [48]. Component-based frame- works manage the interaction between components to promote adaptability, scalability, and maintainability in large software systems [51]. For example, with component-based software there is much less need for major releases because software changes can be encapsulated within individual components. Component architectures also encourage third-party developers to add value to software systems without risking destabilization of the core functionality. Coopr uses the PCA to dene interfaces for the following plugin components: Solvers, which perform optimization Solver managers, which manage the execution of solvers Converters, which translate between dierent optimization problem le formats Solution readers, which load optimization solutions from solver output les Problem writers, which create solver input les that specify optimization problems Transformations, which generate new models via reformulation of existing models Coopr also contains Pyomo-specic components for preprocessing Pyomo models before they are solved. Coopr includes a variety of plugins that implement these component interfaces, many of which rely on third-party software packages to provide key functionality. For example, solver plugins are available for the CPLEX, GUROBI, CBC, PICO, and GLPK mixed-integer linear programming solvers. These plugins rely on the availabil- 26 ity of binary executables for these solvers, which need to be installed separately. Sim- ilarly, Coopr includes plugins that convert between dierent solver input le formats; these plugins rely on binary executables built by the GLPK [23] and Acro [1] software libraries. Figure 4 illustrates the denition of a solver plugin that can be directly used by the pyomo command; this example is available in the standard Coopr distribution, in the directory coopr/examples/pyomo/p-median. The MySolver class implements the IOptSolver interface, which declares this class as a Coopr solver plugin. This plugin im- plements a solve method, which randomly generates solutions to a p-median optimiza- tion problem. The only other step required is to invoke Cooprs SolverRegistration function, which associates the solver name, random, with the plugin class, MySolver. Importing the Python module containing MySolver activates this plugin; all other registration is automated by the PCA. For example, if this plugin is contained within the le solver2.py, then the following Python script can be used to apply this solver to Cooprs p-median example model: import coopr . opt import pmedian import s ol ve r 2 i ns t anc e=pmedian . model . c r e at e ( pmedian . dat ) opt = coopr . opt . Sol ver Fact or y ( random ) r e s u l t s = opt . s ol ve ( i ns t anc e ) print r e s u l t s The pyomo script can also be used to apply a custom optimizer in a natural manner. The following command-line is used to solve the Cooprs p-median example with the CBC mixed-integer programming solver: pyomo s o l ve r=cbc pmedian . py pmedian . dat Applying the custom solver simply requires the specication of the new solver name, random, and an indication (via the load option) that the solver2.py le should be imported before optimization: pyomo s o l ve r=random l oad=s ol ve r 2 . py pmedian . py \ pmedian . dat Thus, users can develop custom solvers in Python modules, which can be executed and tested directly using the pyomo command. This example serves to illustrate the ease with which new solver interfaces can be prototyped and deployed using Pyomo and Coopr. Similar plugins could be easily implemented for a wide range of metaheuristic or problem-specic solvers; the user simply needs to extract the necessary information from a Pyomo instance, transfer it to the specic solver, and then extract and store the results of the solve in the canonical Coopr results format. Further, such plugins can be seamlessly integrated into existing optimization workows, e.g., via the pyomo command line utility. 27 # Imports from Coopr and PyUt i l i b from coopr . pyomo import from pyut i l i b . pl ugi n . cor e import from coopr . opt import import random import copy cl ass MySolver ( obj e c t ) : # Decl are t hat t h i s i s an I Opt Sol ver pl ugi n i mpl ements ( I OptSol ver ) # Sol ve t he s p e c i f i e d probl em and ret urn # a Sol ver Res ul t s ob j e c t def s ol ve ( s e l f , i ns t ance , kwds ) : print St ar t i ng random h e u r i s t i c val , s o l = s e l f . random( i ns t anc e ) n = val ue ( i ns t anc e . N) # Setup r e s u l t s r e s u l t s = Sol ve r Re s ul t s ( ) r e s u l t s . problem . name = i ns t anc e . name r e s u l t s . problem . s ens e = Probl emSense . mi ni mi ze r e s u l t s . problem . num cons t r ai nt s = 1 r e s u l t s . problem . num vari abl es = n r e s u l t s . problem . num obj ect i ves = 1 r e s u l t s . s o l ve r . s t at us = Sol ver St at us . ok s ol n = r e s u l t s . s ol ut i on . add ( ) s ol n . val ue = val s ol n . s t at us = Sol ut i onSt at us . f e a s i b l e for j in range ( 1 , n+1): s ol n . var i abl e [ i ns t anc e . y [ j ] . name ] = s o l [ j 1] # Return r e s u l t s return r e s u l t s # Perform a random search def random( s e l f , i ns t anc e ) : s o l = [ 0 ] val ue ( i ns t anc e . N) for j in range ( 0 , val ue ( i ns t anc e . P) ) : s o l [ j ] = 1 # Generate 100 random s ol ut i ons , and keep t he b e s t bes t = None be s t s o l = [ ] for kk in range ( 100) : random. s h u f f l e ( s o l ) # Compute val ue val =0.0 for j in range ( 1 , val ue ( i ns t anc e .M)+1): val += min( val ue ( [ i ns t anc e . d [ i , j ] ) for i in range ( 1 , val ue ( i ns t anc e . N)+1) i f s o l [ i 1] == 1 ] ) i f bes t i s None or val < bes t : bes t=val be s t s o l=copy . copy ( s o l ) return [ best , be s t s o l ] # Regi s t er t he s ol v e r wi t h t he name random Sol ve r Re gi s t r at i on ( random , MySolver ) Fig. 4 A simple customized Coopr solver for p-median problems. 28 8 Advanced Scripting and Algorithm Development In Section 6, we presented a straightforward use of Pyomo: to construct a concrete instance from an abstract model and a data le, and to subsequently solve the instance using a specic solver plugin. A generic optimization process is executed by the pyomo command, so the typical user does not need to understand the details of the function- ality present in most Pyomo and Coopr libraries. However, this command masks much of the power underlying Pyomo and Coopr, and it limits the degree to which Pythons rich set of libraries can be leveraged. An important consequence of the Python-based design of Pyomo and its integration with the Coopr environment is that modularity is fully supported over a range of ab- stractions. At one extreme, model elements can be manipulated explicitly by specifying their names and the values of their indexes. This sort of reference can be made more abstract, as is the case with other AMLs, by specifying various types of named sets and parameters so that the dimensions and details of the data can be separated from the specication of the model. Separation of an abstract model from the data specication is a hallmark of structured modeling techniques [22]. At the other extreme, elements of a optimization model can be treated in their fully canonical form as is supported by callable solver libraries. Methods can be written that access or manipulate, for example, objective functions or constraints in a fully general way. This capability is a fundamen- tal tool for general algorithm development and extension [33]. Pyomo provides the full continuum of abstraction between these two extremes to support modeling, scripting, and algorithm development. Furthermore, methods are extensible via overloading of all dened operations. Both modelers and developers can alter the behavior of a package or add new functionality. In the remainder of this section, we introduce a variety of example case studies high- lighting the relative ease of both scripting and algorithm development in Coopr. The example in Section 8.1 discusses a script implementing a hybrid MIP-NLP optimization algorithm. Section 8.2 describes the Pyomo-based implementation of a Benders-based decomposition algorithm for a simple production planning problem. Together, these two examples illustrate the use of Pyomo and Coopr to solve optimization models requir- ing some degree of algorithmic customization, emphasizing dierent coding approaches to achieve similar goals. Finally, in Section 8.3 we introduce an example highlighting features of Pyomo and Python that facilitate the development of generic algorithms. 8.1 Advanced Scripting: Hybrid Optimization Hybrid methods may be required to solve particularly dicult real-world optimization problems. Due to the deviation from the standard optimization workow process, in which a model is handed to an o-the-shelf solver, implementation of hybrid methods typically requires non-trivial scripting. One instance of such a hybrid algorithm implemented in Coopr/Pyomo was de- veloped by our collaborators in Texas A&Ms Department of Chemical Engineering. The specic problem of interest involved the development of a global optimization al- gorithm to solve a parameter estimation problem arising in the context of a model for childhood infectious disease transmission. For further information regarding both the model and its Pyomo implementation we refer to [25]. Below, we briey survey the 29 high-level solution strategy, and highlight key fragments of Pyomo code implementing the strategy. The parameter estimation model considered is a dicult non-convex, non-linear program for which no ecient solution algorithm exists. Consequently, to facilitate tractable solution, the problem is reformulated using a MIP under-estimator and an NLP over-estimator. Information is exchanged between the two formulations, and the process is iterated until the two solutions converge. The main script for this process is implemented in Pyomo as follows; some initial- ization code is omitted for clarity: d a t a f i l e = di s e as e dat a . dat r e s u l t s f i l e = g l o b a l o p t r e s u l t s # t he f unct i on i n i t i a l i z e d i c t s i s a u t i l i t y s p e c i f i c t o # t h i s exampl e , which e x t r a c t s dat a from t he i nput f i l e and # var i ous i nput arguments ( not shown ) , and r et ur ns two Python # d i c t i o na r i e s . mdl i nputs , dat a i nput s = i n i t i a l i z e d i c t s ( da t a f i l e , . . . . ) for i in range ( 1 , MAX ITERS+1): # de f i ne t he f u l l opt i mi z at i on model f or t h i s i t e r a t i o n . # dat a i s s i g n i f i c a n t l y changi ng each i t e r a t i o n . . . mstr mdl = di s eas e mdl ( mdl i nputs , dat a i nput s ) # cr eat e and s ol v e MIP overes t i mat or . i ns t , MI P r es ul t s = sol ve MIP( mstr mdl , MIP opti ons ) # cr eat e and s ol v e t he NLP underes t i mat or . i ns t , NLP resul ts = sol ve NLP( mstr mdl , MI P resul ts , \ NLP opti ons ) # l oad r e s ul t s , r epor t s t at us , and compute t he gap/ub . GAP, UB = out put r e s ul t s ( i ns t , . . . , MI P resul ts , \ NLP resul ts , . . . ) # use r e s u l t s t o det ermi ne paramet ers f or t he next # i t e r at i on , vi a updat es t o t he mdl i nput s # di c t i onar y . mdl i nputs , POINTS ADDED = updat e poi nt s ( mdl i nputs , i ns t , \ . . . MI P r es ul t s ) i f (UB != None ) and ( i == 1 ) : # perf orm s o l v e s t o s t r engt hen model . mdl i nputs = t i ght en bounds ( i ns t , mdl i nputs , dat a i nput s , \ UB, num l b poi nts , MIP opti ons ) i f (POINTS ADDED == 0) or (GAP <= MAXGAP) : break 30 In this example, user-dened Python functions are used to organize and modularize the optimization workow. In contrast to examples shown previously, the optimization model in this case is constructed via a function (disease model). A fragment of code for this function is as follows: def di s eas e mdl (INPUTS, DATA) : model = ConcreteModel ( ) # at t ach nonPyomo dat a v al ue s t o t he model i ns t ance . model . pts LS = INPUTS[ LSP ] model . pt s LS l ower = INPUTS[ LSPL ] model . pt s LI = INPUTS[ LIP ] model . pt s LI l owe r = INPUTS[ LIPL ] model . TIME = Set ( ordered=True , i n i t i a l i z e=DATA[ TIME ] ) model . BIRTHS = Param( model . TIME, i n i t i a l i z e=DATA[ BIRTHS ] ) # more parameter and s e t d e f i n i t i o n s . . . model . l ogS = Var ( model . TIME, bounds=l ogS bounds r ul e ) model . l o g I = Var ( model . TIME, bounds=l ogI bounds r ul e ) # more model v a r i a b l e s . . . model . obj = Obj ect i ve ( r ul e=o bj r ul e ) model . pn con = Cons t r ai nt ( model . TIME, r ul e=pn r ul e ) # more model c ons t r ai nt s . . . # aut omat i c al l y generat e , vi a a f unct i on , add i t i onal # c ons t r ai nt s as s oc i at e d wi t h l i ne a r i z a t i o n and add # them t o t he model . . . l i ne a r i z e e x p ( model . TIME, model . S , model . l ogS , \ model . pts LS , model . pt s LS l ower ) l i ne a r i z e e x p ( model . TIME, model . I , model . l ogI , \ model . pts LI , model . pt s LI l owe r ) return model Beyond the standard Pyomo model constructs, we observe the ability to dynamically augment Pyomo models with arbitrary data, e.g., the denition of the pts LS and pts LI attributes; these are not model components, but rather raw Python data types. Pyomo itself is unaware of of these attributes, but other components of a user-dened script can access and manipulate these attributes. Such a mechanism is invaluable when information is being propagated across disparate components of a complex, multi- step optimization process. We note the use of an auxiliary function (linearize exp, 31 introduced to modularize the code) to construct specic classes of constraint, which are then added to the input Pyomo model. Following instance construction, MIP and NLP variants of the master model in- stance are solved using the user-dened utility functions solve MIP and solve NLP. These functions simply activate the relevant model components (all model components in Pyomo can be independently activated and deactivated), solve the corresponding model, and return the results. The function output results reports and caches results, and computes the current gap and upper bound. Finally, the functions update points and tighten bounds perform computations (obtained both analytically and via MIP solves) that yield the input data required for the next iteration of the process. This example serves to illustrate, at a very high level, the scripting of a complex hybrid optimization algorithm using Pyomo and Coopr. Despite the complexity of the process (hidden in large part due to code modularization), the code (including that for the model denition) is relatively compact a total of approximately 650 lines of Python code, including white-space. The full code is available upon request by contacting one of the authors. 8.2 Benders Decomposition To further illustrate advanced scripting in Pyomo, we next consider the translation of an AMPL example involving the solution via Benders decomposition of a production planning problem formulated as a stochastic linear program. This example empha- sizes dierent aspects of Pyomo and Coopr than the hybrid optimization discussed in Section 8.1, specically in terms of how models are dened, manipulated, and solved. The production planning problem considered is dened as follows. Given a num- ber of product types and associated production rates, production limits, and inventory holding costs, the objective is to determine a production schedule over a number of weeks that maximizes the expected prot over a range of anticipated revenue scenarios. The problem is formulated as a two-stage stochastic linear program; rst-stage deci- sions include the initial production, sales, and inventory quantities, while second-stage decisions (given scenario-specic revenues) include analogous parameters for all time periods other than the initial week. We refer the reader to Bertsimas and Tsitsiklis [6] for a discussion of how general two-stage stochastic linear programs can be solved via Benders decomposition. The original AMPL example consists of the three les stoch2.run (an AMPL script), stoch2.mod (an AMPL model le), and stoch.dat (an AMPL data le), which are available from http://www.ampl.com/NEW/LOOP2. The translation of this AMPL example into Pyomo illustrates many of the more powerful capabilities of Pyomo and Coopr, including dynamic construction of model variables and constraints, as well as parallelization of sub-problem solves. The codes for this example are available in the Coopr distribution, in the directory coopr/examples/pyomo/benders. The rst step in the translation process involves cre- ation of the master and sub-problem abstract models, mirroring the process previously documented in Section 6; the resulting models are captured in the les master.py and subproblem.py. We observe that AMPL allows pick-and-choose selection of compo- nents from a single model denition le to construct sub-models. In contrast, Pyomo requires that distinct models be self-contained in their denition. 32 The Python code to execute Benders decomposition for this particular example is found in the le runbenders; the remainder of this section will explore key aspects of this code in more detail. As with the basic Pyomo example introduced in Section 6, the runbenders script begins by loading the necessary components of the Pyomo, Coopr, PyUtilib, and Python system libraries: import s ys from pyut i l i b . mi sc import i mpo r t f i l e from coopr . opt . base import Sol ver Fact or y from coopr . opt . p a r a l l e l import Sol verManagerFactory from coopr . opt . p a r a l l e l . manager import s o l v e a l l i n s t a n c e s from coopr . pyomo import The need for and role of these various modules will be explained below. The rst executable component of the runbenders script involves construction of the abstract models and concrete problem instances for both the master and second- stage sub-problems: # i n i t i a l i z e t he master i ns t ance . mstr mdl = i mpo r t f i l e ( master . py ) . model ms t r i ns t = mstr mdl . c r e at e ( master . dat ) # i n i t i a l i z e t he subprobl em i ns t anc e s . sb mdl = i mpo r t f i l e ( subprobl em . py ) . model s ub i ns t s = [ ] s ub i ns t s . append ( sb mdl . c r e at e ( name=Base SubProblem , \ f i l ename= base subprobl em . dat ) ) s ub i ns t s . append ( sb mdl . c r e at e ( name=Low SubProblem , \ f i l ename=l ow subprobl em . dat ) ) s ub i ns t s . append ( sb mdl . c r e at e ( name=High SubProblem , \ f i l ename= hi gh subprobl em . dat ) ) In this code fragment, the master.py and subproblem.py model denition les are loaded (via the PyUtilib function import file), dening the associated abstract mod- els; the runbenders script accesses the corresponding model objects in the respectively Python modules. Given an abstract model object, a concrete instance can be created by invoking its create method supplied with an argument specifying a data le. For reasons discussed below, the second stage sub-problems are gathered into a Python list. Next, we create the necessary solver and solver manager plugins: # i n i t i a l i z e t he s o l v e r and s o l v e r manager . s o l ve r = Sol ver Fact or y ( cpl ex ) i f s o l ve r i s None : print A CPLEX s o l ve r i s not a va i l a bl e on t hi s machine . s ys . e xi t ( 1) s ol ver manager = Sol verManagerFactory ( s e r i a l ) # s e r i a l #sol ver manager = Sol verManagerFact ory ( pyro ) # p a r a l l e l 33 In this example, we use CPLEX to solve concrete instances, as it provides the vari- able and constraint suxes needed for the Benders procedure (e.g., reduced-costs, as discussed below). In Coopr, the solver manager is responsible for coordinating the ex- ecution of any solver plugins. The code fragment above species two alternative solver managers, with serial execution enabled by default; see Section 7.2 for a discussion of parallel solver execution (via the Pyro solver manager). Benders decomposition is an iterative process; sub-problems are solved, cuts are added to the master problem, and the master problem is (re-)solved; the process repeats until convergence. In the master.py model, the set of cuts and the corresponding constraint set is dened as follows: model . CUTS = Set ( wi t hi n=Pos i t i ve I nt e ge r s , ordered=True ) model . Cut Defn = Cons t r ai nt ( model . CUTS) Initially, the CUTS set is empty. Consequently, no rule is required in the denition of the constraint set Cut Defn. Similarly, the pricing (i.e., reduced-cost and dual) information from the sub-problems is stored in the following parameters within master.py (such pricing information is integral in the denition of Benders cuts [6]): model . t i me pr i c e = Param( model .TWOPLUSWEEKS, model . SCEN, \ model . CUTS, de f aul t =0. 0) model . ba l 2 pr i c e = Param( model .PROD, model . SCEN, model . CUTS, \ de f aul t =0. 0) model . s e l l l i m p r i c e = Param( model .PROD, model .TWOPLUSWEEKS, \ model . SCEN, model . CUTS, \ de f aul t =0. 0) Again, because the index set CUTS is empty, these parameter sets are initially empty. The parameters are indexed by sets whose declaration is not shown here: PROD is the set of products, SCEN is the set of scenarios, and TWOPLUSWEEKS is the set of weeks beginning with week 2. Given these denitions, we now examine the main loop in the runbenders script. The rst portion of the loop body solves the second stage sub-problems as follows: s o l v e a l l i n s t a n c e s ( sol ver manager , s ol ver , s ub i ns t s ) The function solve all instances is a Coopr utility that performs three distinct op- erations: (1) queue the sub-problem solves with the solver manager, (2) solve each of the sub-problems, and (3) load the results into the sub-problem instances. This func- tion encapsulates the detailed logic of queuing, solver/solver manager interactions, and barrier synchronization; such detail can be exposed as needed, e.g., when sub-problems can be solved asynchronously. Next, the index set CUTS is expanded, and the pricing parameters are extracted from the sub-problem instances and stored in the master instance: ms t r i ns t . CUTS. add( i ) 34 for s in range ( 1 , l en ( subprobl ems )+1): i ns t = s ub i ns t s [ s 1] for t in ms t r i ns t .TWOPLUSWEEKS: ms t r i ns t . t i me pr i c e [ t , s , i ] = i ns t . Time [ t ] . dual for p in ms t r i ns t .PROD: ms t r i ns t . ba l 2 pr i c e [ p , s , i ] = i ns t . Bal ance2 [ p ] . dual for p in ms t r i ns t .PROD: for t in ms t r i ns t .TWOPLUSWEEKS: ms t r i ns t . s e l l l i m p r i c e [ p , t , s , i ] = i ns t . S e l l [ p , t ] . urc The rst line in this code block dynamically expands the size of the CUTS set, adding an element corresponding to the current Benders loop iteration counter i. The code for transferring pricing information from the sub-problems to the master instance is straightforward: access of pricing parameters with a sub-index equal to i in the master instance is legal given the dynamic update of the CUTS set. Additionally, we observe the availability in Pyomo of standard variable sux information (in this case constraint dual variables and variable upper reduced costs). With pricing information available in the master instance, we can now dene the new cut for Benders iteration i as follows: cut = sum( [ ms t r i ns t . t i me pr i c e [ t , s , i ] ms t r i ns t . a va i l [ t ] \ for t in ms t r i ns t .TWOPLUSWEEKS \ for s in ms t r i ns t . SCEN] ) + \ sum( [ ms t r i ns t . ba l 2 pr i c e [ p , s , i ] (ms t r i ns t . Inv1 [ p ] ) \ for p in ms t r i ns t .PROD \ for s in ms t r i ns t . SCEN] ) + \ sum( [ ms t r i ns t . s e l l l i m p r i c e [ p , t , s , i ] \ ms t r i ns t . market [ p , t ] \ for p in ms t r i ns t .PROD \ for t in ms t r i ns t .TWOPLUSWEEKS \ for s in ms t r i ns t . SCEN] ) \ ms t r i ns t . Mi n St age2 Pr of i t ms t r i ns t . Cut Defn . add( i , ( 0 . 0 , cut , None ) ) The expression for the cut is formed using the Python sum function in conjunction with Pythons list comprehension syntax. While somewhat more complex, the fundamen- tals of constraint generation shown above are qualitatively similar to the constraint rule generation examples presented in Section 6.1. Given the cut expression, the cor- responding new element of the Cut Defn constraint set is created via the add method of the Constraint class. Here, the method arguments respectively represent (1) the constraint index and (2) a tuple consisting of the constraint (lower bound, expression, upper bound). The remainder of the runbenders script is straightforward, involving a simple loop- ing construct and checks for convergence. This example further illustrates that sophis- ticated optimization strategies requiring direct access to Pyomos modeling capabili- ties and Cooprs optimization capabilities can be easily implemented. The runbenders 35 script has roughly the same complexity and length as the original AMPL script. Addi- tionally, this script supports parallelization of this solver in a generic and straightfor- ward manner. 8.3 Generic Algorithms: Generating Extensive Forms of Stochastic Programs Moving beyond model-specic scripting, we now briey discuss capabilities of Pyomo and Python that, when used in conjunction, can be used to develop generic, model- independent algorithms. For purposes of illustration, we consider the case of stochastic linear and mixed-integer programs. A stochastic program can be viewed in terms of a scenario tree, i.e., a tree representing the evolution of model parameter uncertainty. To simplify exposition, we consider a two-stage stochastic program in which a single root node has n children, each of which represents a leaf node. Associated with each node is a set of variables. The variables correpsonding to the root node are so-called rst-stage variables, and represent decisions that must be made before the actual realization of uncertainty is revealed. The values of variables associated with the leaf nodes (also known as second-stage variables) are determined once the corresponding parameter uncertainty is revealed. For an in-depth introduction to stochastic programming, we defer to [49]. Conceptually, an n-scenario stochastic program can be viewed as a collection of n in- dependent deterministic mathematical programs (each representing a single scenario), with one key augmentation: so-called non-anticipativity constraints are added to ensure equality of the rst-stage decision variables across all n scenarios. Non-anticipativity is required to avoid prescient decision-making. This conceptualization is useful in practice, in part because it facilitates the transition from deterministic to stochastic programs, and further because many decomposition-based solution strategies explicitly operate on such a conceptualization. We now briey discuss the implementation of a generic algorithm for constructing the mathematical program associated with this conceptu- alization, which is known as the extensive form. Further details concerning stochastic programming in the context of Pyomo and Coopr can be found in [54], which details the PySP module of Coopr. To generate the extensive form of any stochastic program using Pyomo/Coopr, the rst step is to construct the individual scenario model instances. This can be accomplished with any of the approaches detailed above in Sections 8.1 and 8.2. The more dicult step is to construct the non-anticipativity constraints, in a generic and model-independent fashion. To understand the underlying programmatic mechanism, we rst discuss how scenario trees are specied by users. In PySP, there is a canonical Pyomo model of the scenario tree structure, specied as follows: s c e nar i o t r e e mode l=AbstractModel ( ) s c e nar i o t r e e mode l . St ages=Set ( ordered=True ) s c e nar i o t r e e mode l . Nodes=Set ( ) s c e nar i o t r e e mode l . NodeStage=Param( \ s c e nar i o t r e e mode l . Nodes , \ wi t hi n=s c e nar i o t r e e mode l . St ages ) 36 s c e nar i o t r e e mode l . Chi l dren=Set ( s c e nar i o t r e e mode l . Nodes , \ wi t hi n=s c e nar i o t r e e mode l . Nodes , ordered=True ) s c e nar i o t r e e mode l . Condi t i onal Pr obabi l i t y=Param( \ s c e nar i o t r e e mode l . Nodes ) s c e nar i o t r e e mode l . Sce nar i os=Set ( ordered=True ) s c e nar i o t r e e mode l . Scenari oLeaf Node= Param( \ s c e nar i o t r e e mode l . Scenar i os , \ wi t hi n=s c e nar i o t r e e mode l . Nodes ) s c e nar i o t r e e mode l . St ageVar i abl es=Set ( \ s c e nar i o t r e e mode l . St ages ) s c e nar i o t r e e mode l . St ageCost Var i abl e=Param( \ s c e nar i o t r e e mode l . St ages ) A key observation concerning this model is that all set and parameter values are strings, i.e., names given by the user to various objects. Of particular interest with respect to generation of a stochastic program extensive form is the denitions of the set StageVariables. For the well-known Birge and Louveaux farmer example (see [7]), the corresponding data are specied as follows: s e t St ageVar i abl es [ Fi r s t St age ] := DevotedAcreage [ ] ; s e t St ageVar i abl es [ SecondStage ] := Quanti tySubQuotaSol d [ ] Quanti tySuperQuotaSol d [ ] Quanti tyPurchased [ ] ; This data indicates that all indices of the variable DevotedAcreage are rst-stage deci- sion variables, for which non-anticipativity constraints must be constructed. In terms of implementation, the three key mechanisms required are (1) identication of the relevant Var objects on each Pyomo scenario instance, (2) construction of a master variable corresponding to each such object, and (3) construction of the non-anticipativity con- straints enforcing equality between the master variable and each of the corresponding scenario instance variables. The following code fragment illustrates the implementation of these three steps; we have intentionally omitted various error-checking mechanisms (vital in the real PySP implementation), in addition to the code translating the scenario tree information above into a traversable, object-oriented structure: e f i ns t a nc e = ConcreteModel ( ) # t he s t ag e ob j e c t i s a component of an obj ect or i e nt e d # encodi ng of t he s cenar i o t r e e s t r uc t ur e . t he code bel ow i s # execut ed f or each st age , al t hough we onl y show a s i ng l e # s t ag e f or compactness . for s t a g e r e f e r e nc e v a r i a bl e in s t age . v a r i a bl e s : # f or a l l but t he l a s t s t ag e . . . for t r ee node in s t age . t r e e node s [ : 1 ] : 37 s t age var i abl e name = s t a g e r e f e r e nc e v a r i a bl e . name s t a g e v a r i a bl e i ndi c e s = \ t r ee node . v a r i a b l e i n d i c e s [ s t age var i abl e name ] e f var i abl e name = . . . # some uni que name e f v a r i a bl e i nde x = ge t at t r ( r e f e r e nc e i ns t anc e , \ s t age var i abl e name ) . i ndex e f v a r i a b l e = Var ( e f va r i a bl e i nde x , name=e f var i abl e name ) s e t a t t r ( e f i ns t anc e , ef var i abl e name , e f v a r i a b l e ) for i ndex in s t a g e v a r i a bl e i ndi c e s : for s c e na r i o i ns t a nc e in t r ee node . s c e na r i o s : s c e na r i o v a r i a bl e = ge t at t r ( s c e nar i o i ns t anc e , \ s t age var i abl e name ) def makeEqual i tyRul e ( e f va r i a bl e , s c e na r i o var i abl e , \ i ndex ) : def equal i t yRul e ( model ) : return ( 0 . 0 , \ e f v a r i a b l e [ i ndex ] s c e na r i o v a r i a bl e [ i ndex ] , \ 0. 0) return equal i t yRul e e f c ons t r ai nt name = . . . # some uni que name e f c o ns t r a i nt =Cons t r ai nt ( name=ef cons t r ai nt name , \ r ul e=makeEqual i tyRul e ( e f va r i a bl e , \ s c e nar i o var i abl e , i ndex ) ) s e t a t t r ( e f i ns t anc e , ef cons t r ai nt name , \ e f c o ns t r a i nt ) The most salient observation regarding this code is its compactness, which is facil- itated by a combination of Pyomo and Python functionality. In particular, the Python function getattr function is used to extract attributes of objects by name, at run- time something that is not possible in statically typed languages such as C++. This capability, known as introspection, is used in the above fragment to access the Var ob- jects on each Pyomo scenario instance in a completely generic and model-independent manner. Other features worth noting include (1) the ability to clone Var objects using index sets (as occurs when the ef variables are constructed), (2) the use of the Python function setattr to add components to the extensive form model instance, and (3) the use of in-line function denitions to construct the expressions associated with the non-anticipativity constraints. The example above serves to illustrate the use of Pyomo/Coopr in the role of algorithm development, as opposed to just simple scripting. In particular, generic algo- rithms can be constructed with relatively little code. For example, the full construction routine for the extensive form in PySP is approximately 300 lines (including all white- space and error checking). Similarly, the core code for a complex decomposition solver 38 strategy (Progressive Hedging) is approximately 1000 lines. While we are not advocat- ing the development of core numerical routines in Pyomo/Coopr (e.g., mixed-integer solvers), our eorts to date indicate that such an approach is eective when develop- ing high-level computational strategies that in turn leverage fast numerical routines through, for example, the use of solver plugins. 9 Getting Started with Pyomo The installation of Pyomo is complicated by the fact that Coopr is comprised of an ensemble of Python packages, as well as third-party numerical software libraries. Coopr provides an installation script, coopr install, that leverages Pythons online package index [42] to install Coopr-related Python packages. For example, in Linux and MacOS the command: c o o p r i n s t a l l coopr will create a coopr directory that contains a virtual Python installation. All Coopr scripts will be installed in the coopr/bin directory, and these scripts are congured to automatically import the Coopr and PyUtilib libraries, in addition to other necessary Python packages. Similarly, in MS Windows the command: python c o o p r i n s t a l l coopr will create a coopr directory containing a virtual Python installation. The Coopr scripts will be installed in the coopr/bin directory, including *.cmd scripts that are recognized as DOS commands. On all platforms, the only additional user requirement is that the PATH environment variable be updated to include coopr/bin (Linux and MacOS) or coopr\bin (MS Windows). Windows installer executables are also available. By default, the coopr install script installs the latest ocial release of all Coopr- related Python packages. The coopr install script also includes options for installing trunk versions (via the --trunk option) or the latest stable versions (via the --stable option) of the Coopr and PyUtilib Python packages. This facilitates collaborations with non-developers to x bugs and try out new Coopr capabilities. More information regarding Coopr is available on the Coopr wiki page, available at: https://software.sandia.gov/trac/coopr/. This Trac site includes detailed in- stallation instructions, and provides a mechanism for users to submit tickets for feature requests and bugs. Coopr is a COIN-OR software package [10]; further information can be obtained by e-mailing the associated COIN-OR or Google groups mailing lists. 10 Discussion and Conclusions Pyomo was developed and is actively supported for real-world optimization applications at Sandia National Laboratories [27]. Our experience to date with Pyomo and Coopr has validated our initial assessment that Python is an eective language for supporting the development and deployment of solutions to optimization applications. Although it is clear that custom AMLs can support a more concise and mathematically intuitive syntax, Pythons clean syntax and programming model make it a natural choice for optimization tools like Pyomo. 39 It is noteworthy that the use of Python for developing Pyomo has proven quite strategic. First, the development of Pyomo did not require implementation of a parser and the eort associated with cross-platform deployment, both of which are neces- sary for the development of an AML. Second, Pyomo users have been able to rely on Pythons extensive documentation to rapidly get up-to-speed without relying on developers to provide detailed language documentation. Although general use of Py- omo requires documentation of Pyomo-specic features, this is a much smaller burden than the language documentation required for optimization AMLs. Finally, we have demonstrated that Pyomo can eectively leverage Pythons rich set of standard and third-party libraries to support advanced computing capabilities like distributed exe- cution of optimizers. This clearly distinguishes Pyomo from custom AMLs, and it frees Pyomo and Coopr developers to focus on innovative optimization capabilities. Pyomo and Coopr were publicly released as an open source project in 2008. Future development will focus on several key design issues: Optimized Expression Trees - Our scaling experiments suggest that Pyomos run- time performance can be signicantly improved by using a dierent representation for expression trees. The representation of expression trees could be reworked to avoid frequent object construction, either through a low-level representation or a Python extension library. Extending the Range of Solver Plugins - We plan to expand the suite of pre-written solver plugins to include other commercial solvers, e.g., XpressMP. Additionally, we plan to leverage optimizers that are available in other Python optimization packages, which are particularly interesting when solving nonlinear formulations. Direct Optimizer Interfaces - Currently, most Coopr solvers are invoked via system calls. However, direct library interfaces to optimizers are also possible. For exam- ple, both CPLEX and GUROBI ship with Python library bindings. Direct solver interfaces promise to yield improved performance, due to the lack of a need for le manipulation and the overhead associated with spawning processes. Remote Solver Execution - Coopr currently supports solver managers for remote solver execution using Pyro. A preliminary interface to NEOS [16] has been devel- oped, but this solver manager currently only supports CBC. We plan to extend this interface, and to develop interfaces for Optimization Services [20], as well as cloud computing solvers based on, for example, Amazons EC2 compute cloud. Acknowledgements We thank Jon Berry, Robert Carr, Carl Laird, Cindy Phillips, and John Siirola for their critical feedback on the design of Pyomo, and David Gay for developing the Coopr inter- faces to AMPL NL and SOL les. We would like to thank Gabe Hackebeil for providing the hybrid optimization scripting example. Additionally, we thank the anonymous ref- erees and associate editor for their suggestions, which served to signicantly improve the paper. Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energys National Nuclear Security Administration under Contract DE-AC04-94-AL85000. 40 References 1. ACRO. ACRO optimization framework. http://software.sandia.gov/acro, 2009. 2. AIMMS. AIMMS home page. http://www.aimms.com, 2008. 3. AMPL. AMPL home page. http://www.ampl.com/, 2008. 4. Prasanth Anbalagan and Mladen Vouk. On reliability analysis of open source software - FEDORA. In 19th International Symposium on Software Reliability Engineering, 2008. 5. APLEpy. APLEpy: An open source algebraic programming language extension for Python. http://aplepy.sourceforge.net/, 2005. 6. Dimitris Bertsimas and John N. Tsitsiklis. Introduction to Linear Optimization. Athena Scientic / Dynamic Ideas, 1997. 7. J.R. Birge and F. Louveaux. Introduction to Stochastic Programming. Springer, 1997. 8. Bonmin. The Bonmin wiki page. https://projects.coin-or.org/Bonmin, 2011. 9. BSD. Open Source Initiative (OSI) - the BSD license. http://www.opensource. org/licenses/bsd-license.php, 2009. 10. COINOR. COIN-OR home page. http://www.coin-or.org, 2009. 11. Forrester Consulting. Open source softwares expanding role in the enter- prise. http://www.unisys.com/eprise/main/admin/corporate/doc/Forrester research-open source buying behaviors.pdf, 2007. 12. COOPR. Coopr: A common optimization python repository. http://software. sandia.gov/coopr, 2009. 13. CUTEr. Cuter: A constrained and unconstrained testing environment, revisited. http://www.hsl.rl.ac.uk/cuter-www/index.html, 2011. 14. CVXOPT. CVXOPT home page. http://abel.ee.ucla.edu/cvxopt, 2008. 15. dimitrov. Nedialki dimitrov, naval postgraduate school. Personal Communication, 2011. 16. Elizabeth D. Dolan, Robert Fourer, Jean-Pierre Goux, Todd S. Munson, and Jason Sarich. Kestrel: An interface from optimization modeling systems to the NEOS server. INFORMS Journal on Computing, 20(4):525538, 2008. 17. FLOPC++. FLOPC++ home page. https://projects.coin-or.org/FlopC++, 2008. 18. Robert Fourer, David M. Gay, and Brian W. Kernighan. AMPL: A mathematical programming language. Management Science, 36:519554, 1990. 19. Robert Fourer, David M. Gay, and Brian W. Kernighan. AMPL: A Modeling Lan- guage for Mathematical Programming, 2nd Ed. Brooks/ColeThomson Learning, Pacic Grove, CA, 2003. 20. Robert Fourer, Jun Ma, and Kipp Martin. Optimization services: A framework for distributed optimization. Mathematical Programming, 2008. (submitted). 21. GAMS. GAMS home page. http://www.gams.com, 2008. 22. Arthur M. Georion. An introduction to structured modeling. Management Sci- ence, 33(5):547588, 1987. 23. GLPK. GLPK: GNU linear programming toolkit. http://www.gnu.org/ software/glpk/, 2009. 24. GPL. GNU general public license. http://www.gnu.org/licenses/gpl.html, 2009. 41 25. Gabriel Hackebeil and Carl Laird. Global optimization for estimation of on/o seasonality in infectious disease spread using pyomo. https://software.sandia. gov/trac/coopr/attachment/wiki/Pyomo/global opt.pptx, 2010. 26. William E. Hart. Python Optimization Modeling Objects (Pyomo). In J. W. Chinneck, B. Kristjansson, and M. J. Saltzman, editors, Operations Research and Cyber-Infrastructure, 2009. doi: 10.1007/978-0-387-88843-9 1. 27. William E. Hart, Cynthia A. Phillips, Jonathan Berry, Erik G. Boman, et al. US Environmental Protection Agency uses operations research to reduce contamina- tion risks in drinking water. INFORMS Interfaces, 39:5768, 2009. 28. Emmanuel Hebrard, Eoin OMahony, and Barry OSullivan. Constraint Program- ming and Combinatorial Optimisation in Numberjack. In A. Lodi, M. Milano, and P. Toth, editors, Proceedings of CPAIOR 2010, Springer LNCS 6140, 2010. 29. Ipopt. The Ipopt wiki page. https://projects.coin-or.org/Ipopt, 2011. 30. Eric Jones, Travis Oliphant, Pearu Peterson, et al. SciPy: Open source scientic tools for Python, 2009. URL http://www.scipy.org/. 31. Josef Kallrath. Modeling Languages in Mathematical Optimization. Kluwer Aca- demic Publishers, 2004. 32. Suleyman Karabuk and F. Hank Grant. A common medium for programming operations-research models. IEEE Software, pages 3947, 2007. 33. Roy E. Marsten. The design of the XMP linear programming library. ACM Trans- actions on Mathematical Software, 7(4):481497, 1981. 34. Travis E. Oliphant. Python for scientic computing. Computing in Science and Engineering, pages 1020, May/June 2007. 35. OpenOpt. OpenOpt home page. http://scipy.org/scipy/scikits/wiki/ OpenOpt, 2008. 36. OptimJ. Ateji home page. http://www.ateji.com, 2008. 37. ortools. Google OR tools - operations research tools developed at Google. http: //code.google.com/p/or-tools, 2011. 38. Lutz Prechelt. An empirical comparison of seven programming languages. Com- puter, 33(10):2329, 2000. ISSN 0018-9162. doi: http://dx.doi.org/10.1109/2. 876288. 39. Psyco. Psyco. http://psyco.sourceforge.net/, 2008. 40. PuLP. PuLP: A Python linear programming modeler. http://130.216.209.237/ engsci392/pulp/FrontPage, 2008. 41. PyMathProg. PyMathProg home page. http://pymprog.sourceforge.net/, 2009. 42. PyPI. Python package index. http://pypi.python.org/pypi, 2009. 43. PYRO. PYRO: Python remote objects. http://pyro.sourceforge.net, 2009. 44. Python. Python programming language ocial website. http://python.org, 2009. 45. PythonVSJava. Python & Java: A side-by-side comparison. http://www.ferg. org/projects/python java side-by-side.html, 2008. 46. PyUtilib. PyUtilib optimization framework. http://software.sandia.gov/ pyutilib, 2009. 47. Marcel Roelofs and Johannes Bisschop. AIMMS 3.9 The Users Guide. lulu.com, 2009. 48. Gigi Sayfan. Building your own plugin framework. Dr. Dobbs Journal, Nov 2007. 49. Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczynski. Lectures on Stochastic Programming: Modeling and Theory. Society for Industrial and Applied 42 Mathematics, 2009. 50. William Stein. Sage: Open Source Mathematical Software (Version 2.10.2). The Sage Group, 2008. http://www.sagemath.org. 51. C. Szyperski. Component Software: Beyond Object-Oriented Programming. ACM Press, 1998. 52. TOMLAB. TOMLAB optimization environment. http://www.tomopt.com/ tomlab, 2008. 53. Laurence Tratt. Dynamically typed languages. Advances in Computers, 77:149 184, July 2009. 54. Jean-Paul Watson, David L. Woodru, and William E. Hart. Pysp: Modeling and solving stochastic programs in python. https://software.sandia.gov/trac/ coopr/attachment/wiki/PySP/pysp jnl.pdf, 2010. 55. YAML. The ocial YAML web site. http://yaml.org/, 2009. 56. Ying Zhou and Joseph Davis. Open source software reliability model: An empirical approach. ACM SIGSOFT Software Engineering Notes, 30:16, 2005.