Hoong Lau

Papers by Hoong Lau

Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning

arXiv (Cornell University), Nov 1, 2022

The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving combinatorial optimization problems (COPs). Nonetheless, the performance of ALNS relies on the proper configuration of its selection and acceptance parameters, which is known to be a complex and resource-intensive task. To address this, we introduce a Deep Reinforcement Learning (DRL) based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. The proposed method aims to learn, based on the state of the search, to configure ALNS for the next iteration to yield more effective solutions for the given optimization problem. We evaluate the proposed method on an orienteering problem with stochastic weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches that were the winning methods of the competition, achieving this with significantly fewer training observations. Furthermore, we demonstrate several good properties of the proposed DR-ALNS method: it is easily adapted to solve different routing problems, its learned policies perform consistently well across various instance sizes, and these policies can be directly applied to different problem variants.
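
To make the "selection and acceptance parameters" mentioned above concrete, the following is a minimal sketch of a plain (vanilla) ALNS loop on a toy travelling-salesman instance. It is not the DR-ALNS method of the paper: the toy problem, the two destroy operators, the greedy repair, and every name and default value (`alns`, `reaction`, `cooling`, `k`, and so on) are illustrative assumptions. The comments mark the three decisions that, according to the abstract, a learned policy would take over in DR-ALNS.

```python
import math
import random


def tour_length(tour, pts):
    """Length of the closed tour visiting pts in the given order."""
    return sum(math.dist(pts[tour[i]], pts[tour[(i + 1) % len(tour)]])
               for i in range(len(tour)))


def random_removal(tour, k, rng):
    """Destroy operator: remove k cities chosen uniformly at random."""
    removed = rng.sample(tour, k)
    return [c for c in tour if c not in removed], removed


def segment_removal(tour, k, rng):
    """Destroy operator: remove k consecutive cities."""
    start = rng.randrange(len(tour))
    removed = [tour[(start + i) % len(tour)] for i in range(k)]
    return [c for c in tour if c not in removed], removed


def greedy_insert(partial, removed, pts):
    """Repair operator: reinsert each city at its cheapest position."""
    tour = list(partial)
    for city in removed:
        pos = min(range(len(tour) + 1),
                  key=lambda p: tour_length(tour[:p] + [city] + tour[p:], pts))
        tour.insert(pos, city)
    return tour


def alns(pts, iters=200, k=2, temp=1.0, cooling=0.98, reaction=0.2, seed=0):
    """Vanilla ALNS with roulette-wheel selection and annealing acceptance."""
    rng = random.Random(seed)
    destroy_ops = [random_removal, segment_removal]
    weights = [1.0] * len(destroy_ops)
    current = list(range(len(pts)))
    best = list(current)
    for _ in range(iters):
        # (1) Operator selection: here a weight-proportional roulette wheel.
        idx = rng.choices(range(len(destroy_ops)), weights=weights)[0]
        # (2) Operator parameter: here a fixed destroy size k.
        partial, removed = destroy_ops[idx](current, k, rng)
        candidate = greedy_insert(partial, removed, pts)
        delta = tour_length(candidate, pts) - tour_length(current, pts)
        score = 0.0
        # (3) Acceptance criterion: here simulated annealing with cooling.
        if delta < 0 or rng.random() < math.exp(-delta / temp):
            current, score = candidate, 1.0
        if tour_length(current, pts) < tour_length(best, pts):
            best, score = list(current), 2.0
        # Adaptive weight update; the floor keeps every operator selectable.
        weights[idx] = max(0.05, (1 - reaction) * weights[idx] + reaction * score)
        temp *= cooling
    return best
```

Calling `alns(points)` on a list of `(x, y)` tuples returns a permutation of city indices that is never longer than the initial identity tour, because the incumbent is only replaced by strictly shorter tours.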

DIRECT: A Scalable Approach for Route Guidance in Selfish Orienteering Problems

Adaptive Agents and Multi-Agents Systems, May 4, 2015

We address the problem of crowd congestion at venues like theme parks, museums and world expos by providing route guidance to multiple selfish users (with budget constraints) moving through the venue simultaneously. To represent these settings, we introduce the Selfish Orienteering Problem (SeOP), which combines two well-studied problems from the literature, namely the Orienteering Problem (OP) and Selfish Routing (SR). OP is a single-agent routing problem where the goal is to minimize latency (or maximize reward) in traversing a subset of nodes while respecting budget constraints. SR is a game between selfish agents looking for minimum-latency routes from source to destination along edges of a network available to all agents. Thus, SeOP is a multi-agent planning problem where agents have selfish interests and individual budget constraints. As with Selfish Routing, we employ Nash equilibrium as the solution concept in solving SeOP. A direct mathematical program formulation to find a Nash equilibrium in SeOP cannot scale because the number of constraints is quadratic in the number of paths, which itself is an exponential quantity. To address scalability issues, we make two key contributions. First, we provide a compact non-pairwise formulation with a linear number of constraints in the number of paths to enforce the equilibrium condition. Second, we introduce DIRECT, an incremental and iterative master-slave decomposition approach to compute an approximate equilibrium solution. Similar to existing flow-based approaches, DIRECT is scale invariant in the number of agents. We also provide a theoretical discussion of our approximation quality and present extensive empirical results on synthetic and real-world graphs demonstrating the scalability of combining DIRECT with our non-pairwise formulation.

A Proactive Sampling Approach to Project Scheduling under Uncertainty

Proceedings of the ... AAAI Conference on Artificial Intelligence, Mar 5, 2016

Uncertainty in activity durations is a key characteristic of many real-world scheduling problems in manufacturing, logistics and project management. RCPSP/max with durational uncertainty is a general model that can be used to represent durational uncertainty in a wide variety of scheduling problems where there exist resource constraints. However, computing schedules or execution strategies for RCPSP/max with durational uncertainty is NP-hard, and hence we focus on providing approximation methods in this paper. We provide a principled approximation approach based on Sample Average Approximation (SAA) to compute proactive schedules for RCPSP/max with durational uncertainty. We further contribute an extension to SAA that improves scalability significantly without sacrificing solution quality. Not only is our approach able to compute schedules at runtimes comparable to those of existing approaches, it also provides lower α-quantile makespan (also referred to as α-robust makespan) values than the best known approach on benchmark problems from the literature.

Dynamic Area Coverage using Faulty Multi-Agent Swarms

We consider the problem of resource allocation and scheduling where information and decisions are decentralized, and our goal is to propose a market mechanism that allows resources from a central resource pool to be allocated to distributed decision makers (agents) that seek to optimize their respective scheduling goals. We propose a generic combinatorial auction mechanism that allows agents to competitively bid for the resources needed in a multi-period setting, regardless of the respective scheduling problem faced by the agent, and show how agents can design optimal bidding strategies to respond to price adjustment strategies from the auctioneer. We apply our approach to handle real-time large-scale dynamic resource coordination in a mega-scale container terminal.

PRESS: PeRsonalized Event Scheduling recommender System (Demonstration)

Adaptive Agents and Multi-Agents Systems, May 9, 2016

This paper presents a personalized event scheduling recommender system, PRESS, for a large conference setting with multiple parallel tracks. PRESS is a mobile application that gathers personalized information from a user and recommends talks/demos to attend. The input from a user includes a list of keyword preferences and (optionally) preferred talks. We use the MALLET topic model package to analyze the set of conference papers and classify them based on automatically identified topics. We propose an algorithm to generate a list of recommended papers based on the user keywords and the MALLET topics. An optimization model is then applied to obtain a feasible schedule. The recommended set is matched against the papers selected by users, which we obtained from a survey conducted at AAMAS-15 in Istanbul, Turkey. We show that PRESS is able to provide reasonable accuracy, precision and recall rates. PRESS will be deployed live during AAMAS-16 in Singapore.

Robust Partial Order Schedules for RCPSP/max with Durational Uncertainty

Proceedings of the International Conference on Automated Planning and Scheduling, Mar 30, 2016

In this work, we consider RCPSP/max with durational uncertainty. We focus on computing robust Partial Order Schedules (POS, for short) which can be executed with risk-controlled feasibility and optimality, i.e., there is a stochastic a posteriori quality guarantee that the derived POS can be executed with all constraints honored and completion before the robust makespan. To address this problem, we propose BACCHUS: a solution method based on Benders Accelerated Cut Creation for Handling Uncertainty in Scheduling. In our proposed approach, we first give an MILP formulation for the deterministic RCPSP/max and partition the model into a POS generation process and start time schedule determination. Then we develop a Benders algorithm and propose a cut generation scheme designed for effective convergence to optimality for RCPSP/max. To account for durational uncertainty, we extend the deterministic model by additionally considering duration scenarios. In the extended MILP, the risks of constraint violation and failure to meet the robust makespan are counted during POS exploration. We then approximate the uncertainty problem by computing a risk-value-related percentile of activity durations from the uncertainty distributions. Finally, we apply a Pareto cut generation scheme and propose heuristics for infeasibility cuts to accelerate the algorithm. Experimental results demonstrate that BACCHUS efficiently and effectively generates robust solutions for scheduling under uncertainty.

Towards Finding Robust Execution Strategies for RCPSP/max with Durational Uncertainty

Proceedings of the International Conference on Automated Planning and Scheduling, May 25, 2021

Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max) h... more Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max) have been studied extensively in the literature. However, the more realistic RCPSP/max problems-ones where durations of activities are not known with certainty-have received scant interest and hence are the main focus of the paper. Towards addressing the significant computational complexity involved in tackling RCPSP/max with durational uncertainty, we employ a local search mechanism to generate robust schedules. In this regard, we make two key contributions: (a) Introducing and studying the key properties of a new decision rule to specify start times of activities with respect to dynamic realizations of the duration uncertainty; and (b) Deriving the fitness function that is used to guide the local search towards robust schedules. Experimental results show that the performance of local search is improved with the new fitness evaluation over the best known existing approach.

Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings

Efficient sequential matching of supply and demand is a problem of interest in many online-to-offline services, for instance, Uber, Lyft and Grab for matching taxis to customers, or Ubereats, Deliveroo, FoodPanda, etc. for matching restaurants to customers. In these online-to-offline service problems, individuals who are responsible for supply (e.g., taxi drivers, delivery bikes or delivery van drivers) earn more by being at the "right" place at the "right" time. We are interested in developing approaches that learn to guide individuals to be in the "right" place at the "right" time (to maximize revenue) in the presence of other similar "learning" individuals and with only local aggregated observation of other agents' states (e.g., only the number of other taxis in the same zone as the current agent). Existing approaches in Multi-Agent Reinforcement Learning (MARL) are either not scalable (e.g., to about 40000 taxis/cars for a city like Singapore) or rely on assumptions of a common objective, action coordination or centralized learning that are not viable. A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents. We model these problems using the Anonymous MARL (AyMARL) model. To ensure scalability and individual learning, we focus on improving the performance of independent reinforcement learning methods, specifically Deep Q-Networks (DQN) and Advantage Actor Critic (A2C), for AyMARL. The key contribution of this paper is in employing the principle of maximum entropy to provide a general framework of independent learning that is both empirically effective (even with only local aggregated information of agent population distribution) and theoretically justified.
Finally, our approaches provide a significant improvement over existing approaches with respect to joint and individual revenue on a generic simulator for online-to-offline services and a real-world taxi problem. More importantly, this is achieved while having the least variance in revenues earned by the learning individuals, an indicator of fairness.

Learning and Exploiting Shaped Reward Models for Large Scale Multiagent RL

Proceedings of the International Conference on Automated Planning and Scheduling

Many real-world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, which determines an agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute (a) an approach to learn a differentiable reward model for both continuous and discrete action settings by exploiting the collective nature of interactions among agents, a feature commonly present in large-scale multiagent applications; (b) a shaped reward model analytically derived from the learned reward model to address the key challenge of credit assignment; (c) a model-based multiagent RL approach that integrates shaped rewards into well known RL algorithms such as...

A Hybrid Framework Using a QUBO Solver For Permutation-Based Combinatorial Optimization

arXiv (Cornell University), Sep 27, 2020

Hardware limitations make it challenging to apply quadratic unconstrained binary optimization (QUBO) / Ising model solvers directly to large combinatorial optimization problem instances. In this paper, we propose a hybrid framework to solve large-scale permutation-based combinatorial problems efficiently and effectively using a QUBO solver. To achieve this, several issues need to be addressed. First, we convert a constrained optimization model into an unconstrained model by introducing a penalty coefficient. Second, to ensure effective search of good quality solutions, a smooth energy landscape is needed; we propose a data scaling approach that reduces the amplitudes of the input without compromising optimality. Third, parameter tuning is needed, and in this paper, we illustrate that for certain problems, it suffices to perform random sampling on the penalty coefficients to achieve good performance. Fourth, we address the challenges of using a QUBO solver that may only work on small-scale problems due to hardware limitations and also due to the general performance deterioration of a QUBO solver as the size grows; we introduce a decomposition approach that calls the QUBO solver repeatedly on small sub-problems. Finally, to handle the possible infeasibility of the QUBO solution, we introduce a polynomial-time projection algorithm. We show our performance on the Traveling Salesman Problem and Flow Shop Scheduling Problem. We empirically compare the performance of our approach with solving the optimization problems directly. We also compare the performance with an exact solver.
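
As a rough illustration of the first step named in the abstract above (turning a constrained model into an unconstrained one through a penalty coefficient), the sketch below encodes a toy problem, "pick exactly one item at minimum cost", as a QUBO. It is not the paper's formulation: the toy problem and the helper names `build_qubo`, `energy` and `brute_force` are assumptions made for illustration, and the brute-force enumeration merely stands in for a QUBO solver. The constraint sum(x) = 1 is added as penalty * (sum(x) - 1)^2, which for binary x expands to -penalty on each diagonal entry, +2 * penalty on each pair, and a constant offset of +penalty.

```python
from itertools import product


def build_qubo(costs, penalty):
    """Upper-triangular QUBO for: minimize costs.x subject to sum(x) == 1.

    Returns (matrix, offset) such that energy(matrix, offset, x) equals
    costs.x + penalty * (sum(x) - 1) ** 2 for every binary vector x.
    """
    n = len(costs)
    q = [[0.0] * n for _ in range(n)]
    for i in range(n):
        q[i][i] = costs[i] - penalty      # uses x_i * x_i == x_i
        for j in range(i + 1, n):
            q[i][j] = 2.0 * penalty       # cross terms of the squared sum
    return q, penalty


def energy(q, offset, x):
    """Evaluate offset + x^T Q x over the upper triangle."""
    n = len(x)
    return offset + sum(q[i][j] * x[i] * x[j]
                        for i in range(n) for j in range(i, n))


def brute_force(q, offset):
    """Stand-in for a QUBO solver: enumerate all binary assignments."""
    n = len(q)
    return min(product((0, 1), repeat=n), key=lambda x: energy(q, offset, x))
```

With `costs = [3, 1, 2]`, a penalty of 10 yields the feasible optimum `(0, 1, 0)`, whereas a penalty of 0.5 is too small and the unconstrained minimum becomes the infeasible all-zero vector. This is the kind of sensitivity that makes the choice of penalty coefficient matter.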

GRAND-VISION: An Intelligent System for Optimized Deployment Scheduling of Law Enforcement Agents

Proceedings of the International Conference on Automated Planning and Scheduling

Law enforcement agencies in dense urban environments, faced with a wide range of incidents to handle and limited manpower, are turning to data-driven AI to inform their policing strategy. In this paper we present a patrol scheduling system called GRAND-VISION: Ground Response Allocation and Deployment - Visualization, Simulation, and Optimization. The system employs deep learning to generate incident sets that are used to train a patrol schedule that can accommodate varying manpower, break times, manual pre-allocations, and a variety of spatio-temporal demand features. The complexity of the scenario results in a system with real world applicability, which we demonstrate through simulation on historical data obtained from a large urban law enforcement agency.

Multi-Agent Task Assignment for Mobile Crowdsourcing under Trajectory Uncertainties

Adaptive Agents and Multi-Agents Systems, May 4, 2015

In this work, we investigate the problem of mobile crowdsourcing, where workers are financially motivated to perform location-based tasks physically. Unlike current industry practice that relies on workers to manually browse and filter tasks to perform, we intend to automatically make task recommendations based on workers' historical trajectories and desired time budgets. However, predicting workers' trajectories is inevitably faced with uncertainties, as no one will take exactly the same route every day; yet such uncertainties are oftentimes abstracted away in the known literature. In this work, we depart from the deterministic modeling and study the stochastic task recommendation problem where each worker is associated with several predicted routine routes with probabilities. We formulate this problem as a stochastic integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structure of the formulation and apply the Lagrangian relaxation technique to scale up the solution approach. Experiments have been performed over the instances generated using the real Singapore transportation network. The results show that we can find significantly better solutions than the deterministic formulation.

Mechanisms for arranging ride sharing and fare splitting for last-mile travel demands

Adaptive Agents and Multi-Agents Systems, May 5, 2014

A great challenge for city planners is to provide efficient and effective connection service to travelers using the public transportation system. This is commonly known as the last-mile problem and is critical in promoting the utilization of the public transportation system. In this paper, we address the last-mile problem by considering a dynamic and demand-responsive mechanism for arranging ride sharing on a non-dedicated commercial fleet (such as taxis or passenger vans). Our approach has the benefits of being dynamic and flexible, with a low setup cost. A critical issue in such a ride-sharing service is how riders should be grouped and serviced, and how fares should be split. We propose two auction designs which are used to solicit each individual rider's willing payment rate and compensation rate (for extra travel, if any). We demonstrate that these two auctions are budget balanced, individually rational, and incentive compatible. A series of experimental studies based on both synthetic and real-world datasets are designed to demonstrate the pros and cons of our two proposed auction mechanisms in various settings.

Designing a Portfolio of Parameter Configurations for Online Algorithm Selection

Algorithm portfolios seek to determine an effective set of algorithms that can be used within an algorithm selection framework to solve problems. A limited number of these portfolio studies focus on generating different versions of a target algorithm using different parameter configurations. In this paper, we employ a Design of Experiments (DOE) approach to determine a promising range of values for each parameter of an algorithm. These ranges are further processed to determine a portfolio of parameter configurations, which would be used within two online Algorithm Selection approaches for solving different instances of a given combinatorial optimization problem effectively. We apply our approach on a Simulated Annealing-Tabu Search (SA-TS) hybrid algorithm for solving the Quadratic Assignment Problem (QAP) as well as an Iterated Local Search (ILS) on the Travelling Salesman Problem (TSP). We also generate a portfolio of parameter configurations using best-of-breed parameter tuning a...

Scalable Urban Mobile Crowdsourcing

ACM Transactions on Intelligent Systems and Technology, 2018

In this article, we investigate effective ways of utilizing crowdworkers in providing various urban services. The task recommendation platform that we design can match tasks to crowdworkers based on workers’ historical trajectories and time budget limits, thus making recommendations personal and efficient. One major challenge we manage to address is the handling of crowdworker’s trajectory uncertainties. In this article, we explicitly allow multiple routine routes to be probabilistically associated with each worker. We formulate this problem as an integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structures of the formulation and apply the Lagrangian relaxation technique to scale up computation. Numerical experiments have been performed over the instances generated using the realistic public transit dataset in Singapore. The results show that we can find significantly better solutions than the determ...

Proceedings of the 14th Annual International Conference on Electronic Commerce - ICEC '12

Interacting Knapsack Problem in Designing Resource Bundles

Robust execution strategies for project scheduling with unreliable resources and stochastic durations

Journal of Scheduling, Mar 12, 2015

The Resource Constrained Project Scheduling Problem with minimum and maximum time lags (RCPSP/max) is a general model for resource scheduling in many real-world problems (such as manufacturing and construction engineering). We consider RCPSP/max problems where the durations of activities are stochastic and resources can have unforeseen breakdowns. Given a level of allowable risk, α, our mechanisms aim to compute the minimum robust makespan execution strategy. A robust makespan for an execution strategy is any makespan value that has a risk less than α. The risk for a makespan value M, given an execution strategy, is the probability that a schedule instantiated from the execution strategy will not finish before M given the uncertainty over durations and resources. We make three key contributions: (a) Firstly, we provide an analytical evaluation of resource breakdowns and repairs on executions of activities; (b) We then incorporate such information into a local search framework and generate execution strategies that can absorb resource and durational uncertainties; and (c) Finally, to improve the robustness of the resulting strategies, we propose a Resource Breakdown Aware Chaining procedure with three different metrics. This chaining procedure computes resource allocations by predicting the effect of breakdowns on the robustness of generated strategies. Experiments show the effectiveness of our proposed methods in providing more robust execution strategies under uncertainty.
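
The definitions of risk and robust makespan in the abstract above can be illustrated with a small empirical estimate. The sketch below assumes that a set of makespans has already been obtained by simulating one execution strategy under sampled durations and breakdowns; the sampling itself, the function names `empirical_risk` and `min_robust_makespan`, and the reading of "will not finish before M" as "makespan strictly greater than M" are assumptions, not details taken from the paper.

```python
def empirical_risk(makespans, deadline):
    """Estimated probability that the schedule does not finish by deadline."""
    late = sum(1 for m in makespans if m > deadline)
    return late / len(makespans)


def min_robust_makespan(makespans, alpha):
    """Smallest sampled makespan value whose empirical risk is below alpha."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must lie in (0, 1]")
    for candidate in sorted(makespans):
        if empirical_risk(makespans, candidate) < alpha:
            return candidate
    # Not reached: the largest sample always has risk 0, which is below alpha.
    raise AssertionError("unreachable")
```

For the ten sampled makespans `[10, 12, 11, 15, 13, 14, 12, 11, 10, 20]` and `alpha = 0.25`, the value 13 has an empirical risk of 0.3 and is therefore not robust, while 14 has a risk of 0.2 and is the minimum robust makespan under these assumptions.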

Fine-Tuning Algorithm Parameters Using the Design of Experiments Approach

Lecture Notes in Computer Science, 2011

Collaborative Urban Logistics – Synchronizing the Last Mile a Singapore Research Perspective

Procedia - Social and Behavioral Sciences, Mar 1, 2014

The synchronized last mile logistics concept seeks to address, through coordinated collaboration, several challenges that hinder reliability, cost efficiency, effective resource planning, scheduling and utilization; and increasingly, sustainability objectives. Subsequently, the meeting of service-level and contractual commitments is competitively impacted by any loss of efficiency. These challenges, against a backdrop of Singapore, can essentially be addressed in selected industry sectors through a better understanding of logistics structures; innovative supply chain designs and coordination of services, operations and processes coupled with concerted policies and supply chain strategies.

Online Control of Adaptive Large Neighborhood Search using Deep Reinforcement Learning

arXiv (Cornell University), Nov 1, 2022

The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving... more The Adaptive Large Neighborhood Search (ALNS) algorithm has shown considerable success in solving combinatorial optimization problems (COPs). Nonetheless, the performance of ALNS relies on the proper configuration of its selection and acceptance parameters, which is known to be a complex and resource-intensive task. To address this, we introduce a Deep Reinforcement Learning (DRL) based approach called DR-ALNS that selects operators, adjusts parameters, and controls the acceptance criterion throughout the search. The proposed method aims to learn, based on the state of the search, to configure ALNS for the next iteration to yield more effective solutions for the given optimization problem. We evaluate the proposed method on an orienteering problem with stochastic weights and time windows, as presented in an IJCAI competition. The results show that our approach outperforms vanilla ALNS, ALNS tuned with Bayesian optimization, and two state-of-the-art DRL approaches that were the winning methods of the competition, achieving this with significantly fewer training observations. Furthermore, we demonstrate several good properties of the proposed DR-ALNS method: it is easily adapted to solve different routing problems, its learned policies perform consistently well across various instance sizes, and these policies can be directly applied to different problem variants.

DIRECT: A Scalable Approach for Route Guidance in Selfish Orienteering Problems

Adaptive Agents and Multi-Agents Systems, May 4, 2015

We address the problem of crowd congestion at venues like theme parks, museums and world expos by... more We address the problem of crowd congestion at venues like theme parks, museums and world expos by providing route guidance to multiple selfish users (with budget constraints) moving through the venue simultaneously. To represent these settings, we introduce the Selfish Orienteering Problem (SeOP) that combines two well studied problems from literature, namely Orienteering Problem (OP) and Selfish Routing (SR). OP is a single agent routing problem where the goal is to minimize latency (or maximize reward) in traversing a subset of nodes while respecting budget constraints. SR is a game between selfish agents looking for minimum latency routes from source to destination along edges of a network available to all agents. Thus, SeOP is a multi-agent planning problem where agents have selfish interests and individual budget constraints. As with Selfish Routing, we employ Nash Equilibrium as the solution concept in solving SeOP. A direct mathematical program formulation to find a Nash equilibrium in SeOP cannot scale because the number of constraints is quadratic in the number of paths, which itself is an exponential quantity. To address scalability issues, we make two key contributions. First, we provide a compact non-pairwise formulation with linear number of constraints in the number of paths to enforce the equilibrium condition. Second, we introduce DIRECT, an incremental and iterative master-slave decomposition approach to compute an approximate equilibrium solution. Similar to existing flow based approaches, DIRECT is scale invariant in the number of agents. We also provide a theoretical discussion of our approximation quality and present extensive empirical results on synthetic and real-world graphs demonstrating the scalability of combining DIRECT with our non-pairwise formulation.

A Proactive Sampling Approach to Project Scheduling under Uncertainty

Proceedings of the ... AAAI Conference on Artificial Intelligence, Mar 5, 2016

Uncertainty in activity durations is a key characteristic of many real world scheduling problems ... more Uncertainty in activity durations is a key characteristic of many real world scheduling problems in manufacturing, logistics and project management. RCPSP/max with durational uncertainty is a general model that can be used to represent durational uncertainty in a wide variety of scheduling problems where there exist resource constraints. However, computing schedules or execution strategies for RCPSP/max with durational uncertainty is NP-hard and hence we focus on providing approximation methods in this paper. We provide a principled approximation approach based on Sample Average Approximation (SAA) to compute proactive schedules for RCPSP/max with durational uncertainty. We further contribute an extension to SAA for improving scalability significantly without sacrificing on solution quality. Not only is our approach able to compute schedules at comparable runtimes as existing approaches, it also provides lower α-quantile makespan (also referred to as α-robust makespan) values than the best known approach on benchmark problems from the literature.

Dynamic Area Coverage using Faulty Multi-Agent Swarms

We consider the problem of resource allocation and scheduling where information and decisions are... more We consider the problem of resource allocation and scheduling where information and decisions are decentralized, and our goal is to propose a market mechanism that allows resources from a central resource pool to be allocated to distributed decision makers (agents) that seek to optimize their respective scheduling goals. We propose a generic combinatorial auction mechanism that allows agents to competitively bid for the resources needed in a multi-period setting, regardless of the respective scheduling problem faced by the agent, and show how agents can design optimal bidding strategies to respond to price adjustment strategies from the auctioneer. We apply our approach to handle real-time large-scale dynamic resource coordination in a mega-scale container terminal.

PRESS: PeRsonalized Event Scheduling recommender System (Demonstration)

Adaptive Agents and Multi-Agents Systems, May 9, 2016

This paper presents a personalized event scheduling recommender system, PRESS, for a large confer... more This paper presents a personalized event scheduling recommender system, PRESS, for a large conference setting with multiple parallel tracks. PRESS is a mobile application that gathers personalized information from a user and recommends talks/demos to be attend. The input from a user include a list of keyword preferences and (optionally) preferred talks. We use the MALLET topic model package to analyze the set of conference papers and classify them based on automatically identified topics. We propose an algorithm to generate a list of recommended papers based on the user keywords and the MALLET topics. An optimization model is then applied to obtain a feasible schedule. The recommended set is matched against the selected papers by the user which we obtained from a survey conducted at AAMAS-15 in Istanbul, Turkey. We show that PRESS is able to provide reasonable accuracy, precision and recall rates. PRESS will be deployed live during AAMAS-16 in Singapore.

Robust Partial Order Schedules for RCPSP/max with Durational Uncertainty

Proceedings of the International Conference on Automated Planning and Scheduling, Mar 30, 2016

In this work, we consider RCPSP/max with durational uncertainty. We focus on computing robust Partial Order Schedules (POS) which can be executed with risk-controlled feasibility and optimality, i.e., there is a stochastic a posteriori quality guarantee that the derived POS can be executed with all constraints honored and completion before the robust makespan. To address this problem, we propose BACCHUS: a solution method based on Benders Accelerated Cut Creation for Handling Uncertainty in Scheduling. In our proposed approach, we first give an MILP formulation for the deterministic RCPSP/max and partition the model into a POS generation process and a start time schedule determination. Then we develop a Benders algorithm and propose a cut generation scheme designed for effective convergence to optimality for RCPSP/max. To account for durational uncertainty, we extend the deterministic model by additionally considering duration scenarios. In the extended MILP, the risks of constraint violation and failure to meet the robust makespan are counted during POS exploration. We then approximate the uncertainty problem by computing a risk-value-related percentile of activity durations from the uncertainty distributions. Finally, we apply a Pareto cut generation scheme and propose heuristics for infeasibility cuts to accelerate the algorithm. Experimental results demonstrate that BACCHUS efficiently and effectively generates robust solutions for scheduling under uncertainty.

Towards Finding Robust Execution Strategies for RCPSP/max with Durational Uncertainty

Proceedings of the International Conference on Automated Planning and Scheduling, May 25, 2021

Resource Constrained Project Scheduling Problems with minimum and maximum time lags (RCPSP/max) have been studied extensively in the literature. However, the more realistic RCPSP/max problems (ones where durations of activities are not known with certainty) have received scant interest and hence are the main focus of the paper. Towards addressing the significant computational complexity involved in tackling RCPSP/max with durational uncertainty, we employ a local search mechanism to generate robust schedules. In this regard, we make two key contributions: (a) introducing and studying the key properties of a new decision rule to specify start times of activities with respect to dynamic realizations of the duration uncertainty; and (b) deriving the fitness function that is used to guide the local search towards robust schedules. Experimental results show that the performance of local search is improved with the new fitness evaluation over the best known existing approach.

Entropy Controlled Non-Stationarity for Improving Performance of Independent Learners in Anonymous MARL Settings

Efficient sequential matching of supply and demand is a problem of interest in many online to offline services: for instance, Uber, Lyft, and Grab for matching taxis to customers, or Ubereats, Deliveroo, and FoodPanda for matching restaurants to customers. In these online to offline service problems, individuals who are responsible for supply (e.g., taxi drivers, delivery bikes or delivery van drivers) earn more by being at the "right" place at the "right" time. We are interested in developing approaches that learn to guide individuals to be in the "right" place at the "right" time (to maximize revenue) in the presence of other similar "learning" individuals and only local aggregated observation of other agents' states (e.g., only the number of other taxis in the same zone as the current agent). Existing approaches in Multi-Agent Reinforcement Learning (MARL) either are not scalable (e.g., to about 40000 taxis/cars for a city like Singapore) or rely on assumptions of a common objective, action coordination, or centralized learning that are not viable. A key characteristic of the domains of interest is that the interactions between individuals are anonymous, i.e., the outcome of an interaction (competing for demand) is dependent only on the number and not on the identity of the agents. We model these problems using the Anonymous MARL (AyMARL) model. To ensure scalability and individual learning, we focus on improving the performance of independent reinforcement learning methods, specifically Deep Q-Networks (DQN) and Advantage Actor Critic (A2C), for AyMARL. The key contribution of this paper is in employing the principle of maximum entropy to provide a general framework of independent learning that is both empirically effective (even with only local aggregated information of agent population distribution) and theoretically justified. Finally, our approaches provide a significant improvement with respect to joint and individual revenue on a generic simulator for online to offline services and a real world taxi problem over existing approaches. More importantly, this is achieved while having the least variance in revenues earned by the learning individuals, an indicator of fairness.

Learning and Exploiting Shaped Reward Models for Large Scale Multiagent RL

Proceedings of the International Conference on Automated Planning and Scheduling

Many real world systems involve interaction among a large number of agents to achieve a common goal, for example, air traffic control. Several model-free RL algorithms have been proposed for such settings. A key limitation is that the empirical reward signal in the model-free case is not very effective in addressing the multiagent credit assignment problem, which determines an agent's contribution to the team's success. This results in lower solution quality and high sample complexity. To address this, we contribute (a) an approach to learn a differentiable reward model for both continuous and discrete action settings by exploiting the collective nature of interactions among agents, a feature commonly present in large scale multiagent applications; (b) a shaped reward model analytically derived from the learned reward model to address the key challenge of credit assignment; (c) a model-based multiagent RL approach that integrates shaped rewards into well known RL algorithms such as...

A Hybrid Framework Using a QUBO Solver For Permutation-Based Combinatorial Optimization

arXiv (Cornell University), Sep 27, 2020

Hardware limitations make it challenging to apply quadratic unconstrained binary optimization (QUBO) / Ising model solvers directly to large combinatorial optimization problem instances. In this paper, we propose a hybrid framework to solve large-scale permutation-based combinatorial problems efficiently and effectively using a QUBO solver. To achieve this, several issues need to be addressed. First, we convert a constrained optimization model into an unconstrained model by introducing a penalty coefficient. Second, to ensure effective search of good quality solutions, a smooth energy landscape is needed; we propose a data scaling approach that reduces the amplitudes of the input without compromising optimality. Third, parameter tuning is needed, and in this paper, we illustrate that for certain problems, it suffices to perform random sampling on the penalty coefficients to achieve good performance. Fourth, we address the challenges of using a QUBO solver that may only work on small scale problems due to hardware limitations and also due to general performance deterioration of a QUBO solver as the size grows; we introduce a decomposition approach that calls the QUBO solver repeatedly on small sub-problems. Finally, to handle the possible infeasibility of the QUBO solution, we introduce a polynomial-time projection algorithm. We show our performance on the Traveling Salesman Problem and Flow Shop Scheduling Problem. We empirically compare the performance of our approach with solving the optimization problems directly. We also compare the performance with an exact solver.
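The first step in the abstract above, turning a constrained permutation model into an unconstrained one via a penalty coefficient, can be illustrated with a minimal sketch. It is not the paper's formulation: the cost matrix, the function names, and the brute-force search standing in for a QUBO solver are all assumptions made for illustration.

```python
from itertools import permutations, product

# Hypothetical cost matrix: cost[i][p] is the cost of placing item i at position p.
cost = [[4, 2, 8],
        [4, 3, 7],
        [3, 1, 6]]
n = len(cost)


def penalized_energy(x, penalty):
    """Objective plus quadratic penalties for the one-hot row and column constraints."""
    objective = sum(cost[i][p] * x[i][p] for i in range(n) for p in range(n))
    row_violation = sum((sum(x[i]) - 1) ** 2 for i in range(n))
    col_violation = sum((sum(x[i][p] for i in range(n)) - 1) ** 2 for p in range(n))
    return objective + penalty * (row_violation + col_violation)


def solve_unconstrained(penalty):
    """Brute-force stand-in for a QUBO solver: enumerate all 2^(n*n) binary matrices."""
    best_energy, best_x = None, None
    for bits in product((0, 1), repeat=n * n):
        x = [list(bits[i * n:(i + 1) * n]) for i in range(n)]
        energy = penalized_energy(x, penalty)
        if best_energy is None or energy < best_energy:
            best_energy, best_x = energy, x
    return best_energy, best_x


def is_permutation(x):
    """True if every row and every column of x contains exactly one 1."""
    rows_ok = all(sum(row) == 1 for row in x)
    cols_ok = all(sum(x[i][p] for i in range(n)) == 1 for p in range(n))
    return rows_ok and cols_ok


def best_permutation_cost():
    """Reference optimum of the constrained problem, by enumerating permutations."""
    return min(sum(cost[i][perm[i]] for i in range(n)) for perm in permutations(range(n)))


if __name__ == "__main__":
    for penalty in (0, 20):
        energy, x = solve_unconstrained(penalty)
        print(penalty, energy, is_permutation(x))
```

With a penalty of 0 the unconstrained minimum is the all-zeros matrix, which is infeasible; with a sufficiently large penalty (20 in this toy instance) the unconstrained minimum is a valid permutation whose energy equals the constrained optimum. Choosing that coefficient is the tuning question the abstract raises.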

GRAND-VISION: An Intelligent System for Optimized Deployment Scheduling of Law Enforcement Agents

Proceedings of the International Conference on Automated Planning and Scheduling

Law enforcement agencies in dense urban environments, faced with a wide range of incidents to handle and limited manpower, are turning to data-driven AI to inform their policing strategy. In this paper we present a patrol scheduling system called GRAND-VISION: Ground Response Allocation and Deployment - Visualization, Simulation, and Optimization. The system employs deep learning to generate incident sets that are used to train a patrol schedule that can accommodate varying manpower, break times, manual pre-allocations, and a variety of spatio-temporal demand features. The complexity of the scenario results in a system with real world applicability, which we demonstrate through simulation on historical data obtained from a large urban law enforcement agency.

Multi-Agent Task Assignment for Mobile Crowdsourcing under Trajectory Uncertainties

Adaptive Agents and Multi-Agents Systems, May 4, 2015

In this work, we investigate the problem of mobile crowdsourcing, where workers are financially motivated to perform location-based tasks physically. Unlike current industry practice that relies on workers to manually browse and filter tasks to perform, we intend to automatically make task recommendations based on workers' historical trajectories and desired time budgets. However, predicting workers' trajectories is inevitably faced with uncertainties, as no one will take exactly the same route every day; yet such uncertainties are oftentimes abstracted away in the known literature. In this work, we depart from the deterministic modeling and study the stochastic task recommendation problem where each worker is associated with several predicted routine routes with probabilities. We formulate this problem as a stochastic integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structure of the formulation and apply the Lagrangian relaxation technique to scale up the solution approach. Experiments have been performed over the instances generated using the real Singapore transportation network. The results show that we can find significantly better solutions than the deterministic formulation.

Mechanisms for arranging ride sharing and fare splitting for last-mile travel demands

Adaptive Agents and Multi-Agents Systems, May 5, 2014

A great challenge for city planners is to provide efficient and effective connection service to travelers using public transportation systems. This is commonly known as the last-mile problem and is critical in promoting the utilization of public transportation systems. In this paper, we address the last-mile problem by considering a dynamic and demand-responsive mechanism for arranging ride sharing on a non-dedicated commercial fleet (such as taxis or passenger vans). Our approach has the benefits of being dynamic, flexible, and with low setup cost. A critical issue in such ride-sharing service is how riders should be grouped and serviced, and how fares should be split. We propose two auction designs which are used to solicit individual rider's willing payment rate and compensation rate (for extra travel, if any). We demonstrate that these two auctions are budget balanced, individually rational, and incentive compatible. A series of experimental studies based on both synthetic and real-world datasets are designed to demonstrate the pros and cons of our two proposed auction mechanisms in various settings.

Designing a Portfolio of Parameter Configurations for Online Algorithm Selection

Algorithm portfolios seek to determine an effective set of algorithms that can be used within an algorithm selection framework to solve problems. A limited number of these portfolio studies focus on generating different versions of a target algorithm using different parameter configurations. In this paper, we employ a Design of Experiments (DOE) approach to determine a promising range of values for each parameter of an algorithm. These ranges are further processed to determine a portfolio of parameter configurations, which would be used within two online Algorithm Selection approaches for solving different instances of a given combinatorial optimization problem effectively. We apply our approach on a Simulated Annealing-Tabu Search (SA-TS) hybrid algorithm for solving the Quadratic Assignment Problem (QAP) as well as an Iterated Local Search (ILS) on the Travelling Salesman Problem (TSP). We also generate a portfolio of parameter configurations using best-of-breed parameter tuning a...

Scalable Urban Mobile Crowdsourcing

ACM Transactions on Intelligent Systems and Technology, 2018

In this article, we investigate effective ways of utilizing crowdworkers in providing various urban services. The task recommendation platform that we design can match tasks to crowdworkers based on workers’ historical trajectories and time budget limits, thus making recommendations personal and efficient. One major challenge we manage to address is the handling of crowdworker’s trajectory uncertainties. In this article, we explicitly allow multiple routine routes to be probabilistically associated with each worker. We formulate this problem as an integer linear program whose goal is to maximize the expected total utility achieved by all workers. We further exploit the separable structures of the formulation and apply the Lagrangian relaxation technique to scale up computation. Numerical experiments have been performed over the instances generated using the realistic public transit dataset in Singapore. The results show that we can find significantly better solutions than the determ...

Proceedings of the 14th Annual International Conference on Electronic Commerce - ICEC '12

Interacting Knapsack Problem in Designing Resource Bundles

Robust execution strategies for project scheduling with unreliable resources and stochastic durations

Journal of Scheduling, Mar 12, 2015

The Resource Constrained Project Scheduling Problem with minimum and maximum time lags (RCPSP/max) is a general model for resource scheduling in many real-world problems (such as manufacturing and construction engineering). We consider RCPSP/max problems where the durations of activities are stochastic and resources can have unforeseen breakdowns. Given a level of allowable risk, α, our mechanisms aim to compute the minimum robust makespan execution strategy. A robust makespan for an execution strategy is any makespan value that has a risk less than α. The risk for a makespan value M, given an execution strategy, is the probability that a schedule instantiated from the execution strategy will not finish before M given the uncertainty over durations and resources. We make three key contributions: (a) first, we provide an analytical evaluation of resource breakdowns and repairs on executions of activities; (b) we then incorporate such information into a local search framework and generate execution strategies that can absorb resource and durational uncertainties; and (c) finally, to improve the robustness of the resulting strategies, we propose a Resource Breakdown Aware Chaining procedure with three different metrics. This chaining procedure computes resource allocations by predicting the effect of breakdowns on the robustness of generated strategies. Experiments show the effectiveness of our proposed methods in providing more robust execution strategies under uncertainty.
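The risk and robust-makespan definitions in the abstract above can be made concrete with a small empirical sketch. It assumes a list of sampled makespans is already available (for example, from simulating one execution strategy); the sample values and function names are hypothetical, and the sample-based estimate is an illustration rather than the paper's analytical evaluation.

```python
def risk(makespans, M):
    """Empirical risk of M: fraction of sampled schedules that finish after M."""
    return sum(1 for m in makespans if m > M) / len(makespans)


def robust_makespan(makespans, alpha):
    """Smallest sampled makespan value whose empirical risk is strictly less than alpha.

    Assumes alpha > 0, so the largest sample (risk 0) always qualifies.
    """
    for candidate in sorted(makespans):
        if risk(makespans, candidate) < alpha:
            return candidate
    raise ValueError("alpha must be positive")


if __name__ == "__main__":
    # Hypothetical makespans from ten simulated executions of one strategy.
    samples = [10, 12, 11, 15, 13, 14, 20, 16, 18, 17]
    print(risk(samples, 15))
    print(robust_makespan(samples, 0.2))
```

Under this reading, comparing two execution strategies at a given α amounts to comparing their minimum robust makespans, each estimated from its own set of samples.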

Fine-Tuning Algorithm Parameters Using the Design of Experiments Approach

Lecture Notes in Computer Science, 2011

Collaborative Urban Logistics – Synchronizing the Last Mile a Singapore Research Perspective

Procedia - Social and Behavioral Sciences, Mar 1, 2014

The synchronized last mile logistics concept seeks to address, through coordinated collaboration, several challenges that hinder reliability, cost efficiency, effective resource planning, scheduling and utilization; and increasingly, sustainability objectives. Subsequently, the meeting of service level and contractual commitments is competitively impacted by any loss of efficiency. These challenges, against a backdrop of Singapore, can essentially be addressed in selected industry sectors through a better understanding of logistics structures; innovative supply chain designs and coordination of services, operations and processes coupled with concerted policies and supply chain strategies.