Aws2 1336

Multi-objective optimization models for the renewal planning of multiple asset classes

Thomas Ying-Jeh Chen1,* [email protected];
Eric Wang2;
Nicole Pasch3;
Amin Ganjidoost4

1 Senior Data Scientist, Xylem Inc, 8920 MD-108, Columbia, Maryland, 21045, USA.
2 Product Manager, Xylem Inc, 870 Market St., San Francisco, California, 94102, USA.
3 Client Solutions Manager, Xylem Inc, 11850 Sears Street, Ste A, Livonia, Michigan, 48150, USA.
4 Drinking Water Decision Science – Manager, Xylem Inc, 5055 Satellite Dr #7, Mississauga, Ontario, L4W 5K7, Canada.

* Corresponding author: Thomas Ying-Jeh Chen, Xylem Inc, 8920 MD-108, Columbia, MD, 21045, USA.
Email: [email protected]

Abstract
Managing the aging infrastructure of water distribution systems presents a challenge for many utilities. With various asset types competing for limited dollars, designing an effective asset management program is a resource allocation problem. Mobilization of equipment and crew is a significant cost (typically 2%–10%) within any capital improvement program. Therefore, selecting projects that target multiple asset classes together can reduce mobilization and help utilities stretch their budgets further. This research presents a process for modeling the joint renewal planning of multiple asset classes. The problem is framed as a dual-objective optimization, where the selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study from a Midwest utility is presented, and empirical data suggest that the dual-objective approach effectively reduces duplicate interventions in the same regions. Equity considerations are also examined, where constraints are added to enforce system-wide project selection. Results show that the sensitivity of the objective to equity depends on the underlying spatial distribution of the target asset itself: an uneven spread of the target asset leads to a greater negative impact on model performance.

KEYWORDS
asset management
drinking water systems
optimization
risk analysis

Article Impact Statement
A two-step method is presented for the replacement planning of water system assets located at individual households. The method reduces mobilization cost by addressing multiple asset types together.

1. INTRODUCTION
With various operational and regulatory objectives competing for the same dollars within a municipal budget, designing an asset management program is a resource allocation problem. For many utilities, having multiple asset types (mains, valves, meters, and service lines) that each need timely inspection and repair places a heavy burden on their limited budgets and on the affordability of water service in their communities. In addition, public health regulations imposed by state and local agencies may require additional capital works to be performed, further stretching the staff and funding of the municipality. As a result, a cost-efficient and effective asset management program is critical for guiding utility operators to maximize their return on time and investment by targeting the most vulnerable assets first.
Risk assessments are useful in designing these programs because they provide a systematic process for quantifying vulnerability and ranking individual assets to guide the prioritization (American Water Works Association, 2010). For example, previous works have implemented analyses where every distribution main is ranked based on the future likelihood and consequence of failure, and asset management programs are designed to address the highest-risk assets first (Chen, Riley, et al., 2020; Chen, Washington, et al., 2020). See other examples in Ganjidoost et al. (2022), Vladeanu and Matthews (2019), Fontanazza et al. (2015), and Puleo et al. (2014) focusing on distribution mains, valves, and water meters.
While this is a useful starting point, a key limitation of these approaches is that they consider only a single asset class in the analysis, under the assumption that risk can be optimally reduced if capital is allocated to the highest-ranked assets. As discussed earlier, a water distribution system contains many different infrastructure classes, and an asset management program that accounts for the different categories is more effective. For example, identifying regions with high-risk water mains alone is good, but identifying regions with high-risk mains, service lines, and valves together can help the utility achieve greater economies of scale. The largest saving realized in this approach is the reduction in truck roll, or mobilization cost. Truck roll is broadly defined as the dispatch of crew and equipment into the field to perform any capital improvement work. It accounts for a significant portion of the cost of any replacement project, with estimates of $18,000 per pipe replacement project requiring road excavation (Chen et al., 2021) and approximately $500 for smaller projects at the household level (e.g., meter replacements). Depending on the crew and equipment needed, mobilization costs take up anywhere between 2% and 10% of the project's overall budget, a significant margin for utilities with stretched budgets. Designing programs that align the replacements of multiple infrastructure classes will reduce the cost of truck roll and help utilities do more with their limited budgets.
The objective of this research is to present a methodology for selecting household-level projects that optimizes the joint replacement of two infrastructure classes: degraded water meters and lead service lines. The problem is framed as a dual-objective integer program: (1) maximize the number of lead service lines replaced, and (2) maximize the number of degraded meters replaced. Individual homes are aggregated into larger areas and scored based on the number of lead pipes and degraded meters contained in each. A single project is delineated based on these aggregated spaces, and the optimization aims to find a selection of project areas that jointly maximizes the replacement of these assets. An exact formulation of this model (one that guarantees the identification of all optimal solutions) is presented, and a solution approach is demonstrated to find the Pareto-efficient frontier (Berardi et al., 2009; Dridi et al., 2009). Each point on this frontier represents an optimal solution based on different weightings of the two objectives. Utilities can choose between these solutions based on overall organizational and regulatory priorities.
Two main contributions of this research are summarized as follows:
• The paper presents a methodology for optimizing the joint replacement planning of two infrastructure classes that are located at the premises of individual households. Single structures can be dispersed geographically and have the potential to quickly drive up mobilization costs. Based on our review of the literature, no previous work has demonstrated the application of optimization modeling and geospatial methods for the joint renewal planning of assets at the individual household resolution. While this research considers water meters and lead service lines as the case study, the methods presented can be generalized to plan for the asset management of any infrastructure found inside the home.
• An exact mathematical model, one that is guaranteed to find all optimal solutions, is presented and framed as a dual-objective optimization problem. The output for this class of problems is a set of Pareto-efficient solutions representing different tradeoffs between the two objectives. We find no previous work that presents an exact model for joint (multi-asset) project selection in water systems. Providing multiple solutions to decision-makers is also an improvement because it can guide more effective planning based on different organizational and regulatory priorities.
Together, these two contributions advance the state of the art in modeling to assist in better asset management planning. It is important to note that the case study of synthesizing these two activities is meant for conceptual purposes only. The goal here is to provide an example of two infrastructure projects that both take place at individual structures, and the case study demonstrates the application of dual-objective modeling to select the best project portfolio. In practice, considerations beyond just mobilization cost must be included when selecting which infrastructure projects should be aligned. Infrastructure projects requiring similar crew size, construction activities (need for specialized equipment), and level of disruption (excavation and restoration) are much better suited for alignment. Meter and service line replacements differ in their required level of effort, but they are selected here because data are available at the single-household resolution, which best supports the methods explored in this research.
2. LITERATURE REVIEW
Many studies are available on the risk assessment of water systems focusing on a single asset class. Water distribution mains have been a particular focus for many researchers (see reviews by Kleiner and Rajani (2001), Konstantinou and Stoianov (2020), and St. Clair and Sinha (2012)), where statistical models were developed to estimate risk for individual pipeline segments so that future projects could be planned by prioritizing the highest-risk assets first. The underlying models rely on historic records of asset failure to train and validate the predictions of which assets will fail in the future. For sewer mains, literature reviews by Tscheikner-Gratl et al. (2019) and Ana and Bauwens (2007) summarize the state of the art in risk modeling. Sewer main risk assessments primarily (1) quantify the structural condition of the main, (2) estimate the degree of overflow during wet weather events, and (3) characterize the fiscal and environmental impacts of overflows and leaks. Similar works have also addressed water meter changeout programs (see Fontanazza et al. (2015), Yazdandoost and Izadi (2018), Yee (1999), and Mohanakrishnan et al. (2019)), where quantitative models to characterize the reliability of water meters are presented. The meters most likely to fail or inaccurately record hydraulics can be addressed first in a replacement program to maximize overall reliability.
Taking a broader view of infrastructure renewal planning, previous works have also discussed the benefits of bundling multiple infrastructure projects together in the same location. The major benefit of selecting projects in this manner is to cut down on mobilization costs and realize greater economies of scale (S. Kerwin & Adey, 2020a). In the context of water systems, Kleiner et al. (2010) and Nafi and Kleiner (2010) present a methodology for selecting pipe replacement locations that are incentivized to align with future road-work locations. An objective function for total cost is presented, which includes mobilization, raw materials, crew, and expected repair of future breaks, and the optimization aims to minimize total cost. Mobilization costs are greatly reduced for streets with future road works already planned, such that selections that align with adjacent works are incentivized. Similar research by S. Kerwin and Adey (2020b), Muñuzuri et al. (2020), and Tscheikner-Gratl et al. (2016) demonstrates approaches for bundling water and sewer main replacement selections together, where a risk index that encompasses the status of both asset classes is used to guide decision making. A case study by Carey and Lueke (2013) presents a framework for selecting infrastructure projects based on the combined criticality of the underlying roads, water mains, and sewer mains. A major limitation of these approaches is the need to assign weightings between the asset classes and the difficulty in normalizing failure outcomes (e.g., the failure cost of a road collapse is much larger than that of a pipe burst). Multi-criteria decision-making methods such as AHP and ELECTRE have been demonstrated to reflect these weightings from an operator's perspective (Tscheikner-Gratl et al., 2017), but they are often subject to human bias and difficult to scale. Despite the challenges posed by the application of these planning models, they can still be used in conjunction with the dual-objective model developed in this research to identify areas with greater project alignment. The model developed in this research focuses on infrastructure at the same spatial resolution (i.e., homes), but it can be combined with other planning models to consider assets at different spatial scales (e.g., streets, neighborhoods).
Optimization modeling is a useful tool for ingesting asset-level or location-level risk indices and formulating a program that maximizes risk reduction. The output from these models often specifies which assets to select, when to address them, and even the type of action to take (replace or repair) (L. Chen et al., 2019). Some examples that formulate and solve exact integer models include the following: inspection routing (Chen, Riley, et al., 2020; Chen, Washington, et al., 2020), replacement project selection (T.Y.J. Chen et al., 2021), sewer rehabilitation (de Monsabert et al., 1999), and sensor placement (Berry et al., 2005). Because these models are linear integer programs, the globally optimal solution can be identified. To add further complexity, overall pressure and flow conditions need to be considered when taking parts of a distribution system offline during projects. This can introduce non-linear constraints that affect computational tractability, as seen in examples by Pecci et al. (2015) and Naoum-Sawaya et al. (2015), where non-linear relaxation techniques are demonstrated to converge on local optima. Multi-objective formulations for water main replacement planning are presented in Dandy and Engelhardt (2006), Kim et al. (2004), and Osman et al. (2017); the model formulations in these works aim to maximize risk reduction along with other considerations such as cost, hydraulic reliability, and traffic impacts.
The work presented in this paper aims to extend the state of the art by formulating a multi-objective optimization model for the replacement of multiple water infrastructure assets at the location of individual households. From our review of the literature, we find no work that jointly addresses capital replacement planning at this spatial resolution and the economies of scale gained when bundling multiple replacements at the same location. Another area where previous work can be integrated is in generating the inputs to the proposed dual-asset replacement model. Because the proposed method relies on inputs to determine the reward for selecting a given asset, it is particularly suitable for using the existing state of the art as input. In this paper, we use the age of the meters and a predictive model for lead service line locations to generate inputs to the project selection model. However, in practice, more sophisticated methods that exist in the literature can be used to provide better-quality estimates of infrastructure conditions.

3. METHODOLOGY
This section outlines the two-step process for (1) aggregating adjacent households into larger areas more suitable for capital projects, and (2) identifying the collection of areas that maximizes the joint renewal of two different water infrastructure assets. This research considers the replacement planning of lead service lines and degraded water meters since both are located at the premises of individual homes. The method can be generalized to consider any two asset classes, but other practical planning factors would need to be considered (e.g., project cost and time). Figure 1 summarizes the two-step process.

FIGURE 1. Project identification and optimization workflow.

3.1. Problem specification
The project area selection problem is similar to the knapsack problem (Martello & Toth, 1990), a well-studied problem in the field of combinatorial optimization. The goal is to select the combination of project areas that will be targeted for replacement (lead service lines and water meters) in the upcoming capital renewal program. The objective is to identify the selection of project areas that maximizes the number of lead service line and degraded meter replacements. A project area is simply an aggregation of addresses that are near each other, typically in a contiguous manner. The budget and resource limitations of the municipality are reflected by placing a ceiling on the total number of homes that can be included in the replacement program. This assumes that the cost of addressing each home is uniform; it is possible to include more complex cost models, but that is left for future work.
3.2. Spatial aggregation of households
Each residential structure served by a utility can be delineated by its land parcel, and adjacent parcels can be aggregated into larger geographic areas for project planning purposes. This is done by loading the shapefiles of individual land parcels into the ArcGIS spatial software, using a spatial snap function to join adjacent land parcels together if needed, then merging all adjacent land parcels into a larger area. See Figure 2 for an example output of this spatial analysis. This approach is selected because it is simple to implement and effectively groups individual homes into larger areas for project selection. The spatial process is also consistent, in that most aggregated street blocks contain a similar number of homes (approximately 30 parcels per street block). In practice, the methodology to bundle individual homes can be generalized to any approach the utility sees as most appropriate, for example, aggregating homes along both sides of the same street. It is possible that other aggregation methods may be more realistic, but the method described here meets the needs of this research.

FIGURE 2. Spatial aggregation of land parcels to street block neighborhoods.

For further simplicity, only residential structures are considered in this research since they comprise most of the buildings served by a municipality and are the primary target for lead service line removal and meter changeouts. Each land parcel is counted as a single residential structure and assumed to contain one meter and one service line (Hajiseyedjavadi et al., 2022). Each aggregated street block has an associated (1) count of the total number of homes, (2) count of homes with lead service lines, and (3) count of homes with degraded meters.
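As an illustration of the aggregation step, the following minimal Python sketch groups parcels into blocks and tallies the three counts for each block. It is not the ArcGIS workflow used in this research: it assumes that parcel adjacency has already been extracted as a list of touching pairs, and the names `aggregate_parcels`, `parcels`, and `adjacency`, as well as the toy data, are hypothetical.

```python
from collections import defaultdict


def aggregate_parcels(parcels, adjacency):
    """Group adjacent parcels into blocks and tally per-block counts.

    parcels:   dict parcel_id -> {"lead": bool, "degraded_meter": bool}
    adjacency: iterable of (parcel_id, parcel_id) pairs that touch
               (assumed to be precomputed, e.g., by a GIS tool)
    """
    parent = {pid: pid for pid in parcels}

    def find(pid):
        # Follow parent links to the block representative (path halving).
        while parent[pid] != pid:
            parent[pid] = parent[parent[pid]]
            pid = parent[pid]
        return pid

    # Merge every pair of touching parcels into the same block.
    for a, b in adjacency:
        parent[find(a)] = find(b)

    blocks = defaultdict(lambda: {"homes": 0, "lead": 0, "degraded_meters": 0})
    for pid, attrs in parcels.items():
        block = blocks[find(pid)]
        block["homes"] += 1
        block["lead"] += int(attrs["lead"])
        block["degraded_meters"] += int(attrs["degraded_meter"])
    return list(blocks.values())


# Hypothetical toy data: p1, p2, p3 form one block; p4 stands alone.
parcels = {
    "p1": {"lead": True, "degraded_meter": False},
    "p2": {"lead": False, "degraded_meter": True},
    "p3": {"lead": True, "degraded_meter": True},
    "p4": {"lead": False, "degraded_meter": False},
}
adjacency = [("p1", "p2"), ("p2", "p3")]
blocks = aggregate_parcels(parcels, adjacency)
print(blocks)
```

A union-find structure is used here only because it merges chains of adjacent parcels in a single pass; any grouping that yields contiguous blocks would serve the same purpose.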
3.3. Neighborhood selection optimization
The spatial aggregation groups individual addresses located near each other into larger areas that better delineate potential project areas. The next step is to select the collection of project areas that will be included in the asset management program. Due to limited capital and labor resources, utilities need to prioritize neighborhoods to maximize return on investment and best meet regulatory and operational objectives. Budget limitations at the utility are reflected through a limit on the total number of households that can be included. The reward for selecting a project area is defined as (1) the expected number of lead service lines contained in the boundary (verified plus unverified) and (2) the number of degraded meters contained in the boundary based on age. The cost of selecting a project area is defined as the total number of structures contained in the boundary. The exact integer program for jointly optimizing the replacement of lead service lines and degraded meters is defined in the following subsections.
3.3.1. Decision variables
We first define the following notation for the optimization model.
• Let $i \in I$ be the index of candidate project areas.
• Let $X_i = 1$ if candidate project area $i$ is selected to be included in the capital replacement plan, 0 otherwise.
• Let $L_i$ = the expected number of homes with lead or galvanized service lines within candidate project area $i$. This is taken as the number of homes where the pipe material is verified to be lead (known from historic inspection) plus the sum of the individual likelihoods of containing lead for unverified homes (derived from a statistical model).
• Let $M_i$ = the expected number of degraded meters contained within project area $i$. This is taken as the sum of the probability of meter failure across all homes contained within the project area; the probabilities are derived from a statistical model.
• Let $H_i$ = the total number of residential structures within project area $i$.
• Let $T^U$ and $T^L$ be the upper and lower limits on the total residential structures that can be addressed within the planning cycle. This reflects the utility's allocated budget for the capital program.
For budget planning purposes, a municipality may need to submit a 3–10-year replacement plan to the city manager for approval. The limits on the total number of homes targeted in the capital renewal program ($T^U$ and $T^L$) should reflect the available equipment and labor at the municipality. For this research, we assume that a utility can target 1%–3% of the total homes served within the distribution area each year.
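The inputs defined above can be assembled from household-level records in a few lines. The sketch below is illustrative only: the field names (`lead_verified`, `lead_prob`, `meter_fail_prob`), the helper names, and the 5-year horizon are assumptions made for this example, and the probabilities are made-up values rather than outputs of the statistical models used in this research.

```python
def block_inputs(homes):
    """Return (L_i, M_i, H_i) for one candidate project area.

    homes: list of dicts with hypothetical keys
      'lead_verified'   True / False if inspected, None if unverified
      'lead_prob'       model probability of lead (used only if unverified)
      'meter_fail_prob' model probability that the meter is degraded
    """
    expected_lead = 0.0
    for home in homes:
        if home["lead_verified"] is None:
            expected_lead += home["lead_prob"]   # unverified: use likelihood
        elif home["lead_verified"]:
            expected_lead += 1.0                 # verified lead counts fully
    expected_meters = sum(home["meter_fail_prob"] for home in homes)
    return expected_lead, expected_meters, len(homes)


def home_limits(total_homes, years, low_rate=0.01, high_rate=0.03):
    """Lower and upper limits on homes addressed over the planning cycle,
    assuming 1%-3% of homes can be targeted each year."""
    return total_homes * low_rate * years, total_homes * high_rate * years


# Hypothetical project area with three homes.
homes = [
    {"lead_verified": True, "lead_prob": None, "meter_fail_prob": 0.5},
    {"lead_verified": False, "lead_prob": None, "meter_fail_prob": 0.25},
    {"lead_verified": None, "lead_prob": 0.25, "meter_fail_prob": 0.25},
]
L_i, M_i, H_i = block_inputs(homes)

# 29,559 homes is the case study total; the 5-year horizon is an assumption.
t_low, t_high = home_limits(total_homes=29559, years=5)
print(L_i, M_i, H_i, t_low, t_high)
```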
3.3.2. Problem formulation
The dual-objective optimization model for selecting project areas that jointly maximizes the sum of both lead service lines and water meters is specified in model (1) as follows:
$$\max \sum_{i \in I} L_i X_i \tag{1a}$$
$$\max \sum_{i \in I} M_i X_i \tag{1b}$$
Subject to:
$$T^L \le \sum_{i \in I} H_i X_i \le T^U \tag{1c}$$
$$X_i \in \{0, 1\}, \quad \forall i \in I \tag{1d}$$
The first objective function (1a) maximizes the total count of lead service lines replaced in the selected project areas. The second objective function (1b) maximizes the total count of degraded meters addressed. Constraint (1c) specifies that the total number of households in the selected areas is between the lower and upper limits. Constraint (1d) specifies the binary domain of the decision variable. Note that equations (1a)–(1d) form the basis of the project selection problem and closely resemble the knapsack problem (Martello & Toth, 1990). Solving this model can serve as a good starting point for planning purposes; however, there are many practical and political limitations not accounted for. We address two common examples here and demonstrate how these considerations can be included in the model: (1) requiring a minimum number of projects planned for each neighborhood or political boundary, and (2) the desire to select project areas that are as spatially compact as possible.
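To make the structure of (1a)–(1d) concrete, the sketch below solves each objective on its own for a four-area toy instance by exhaustive enumeration. This is a stand-in chosen for readability, not the solver used in this research, and it is only practical for a handful of project areas. The function name and all data values are hypothetical.

```python
from itertools import product


def solve_single_objective(reward, homes, t_low, t_high):
    """Maximize sum(reward[i] * x[i]) subject to
    t_low <= sum(homes[i] * x[i]) <= t_high and x[i] in {0, 1},
    by enumerating every 0/1 selection vector."""
    best_value, best_x = None, None
    for x in product((0, 1), repeat=len(reward)):
        total_homes = sum(h * xi for h, xi in zip(homes, x))
        if not (t_low <= total_homes <= t_high):
            continue  # violates the home-count limits (1c)
        value = sum(r * xi for r, xi in zip(reward, x))
        if best_value is None or value > best_value:
            best_value, best_x = value, x
    return best_value, best_x


# Hypothetical rewards and home counts for four candidate project areas.
L = [12, 3, 8, 5]     # expected lead service lines per area
M = [2, 9, 4, 7]      # expected degraded meters per area
H = [30, 28, 31, 29]  # homes per area

best_lead = solve_single_objective(L, H, t_low=50, t_high=90)
best_meter = solve_single_objective(M, H, t_low=50, t_high=90)
print(best_lead)   # (25, (1, 0, 1, 1))
print(best_meter)  # (20, (0, 1, 1, 1))
```

In this toy instance the two objectives favor different selections, which is the situation the dual-objective treatment is designed to handle.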
When allocating resources for lead service line replacements, there is often political pressure to ensure system-wide coverage during the capital program (Madrigal, 2019). However, it is well documented that at-risk populations for lead exposure are not evenly distributed across a city, often being concentrated in specific neighborhoods (Abernethy et al., 2016; Chojnacki et al., 2017). As a result, a more equitable program will allocate more resources towards neighborhoods with at-risk individuals, while still satisfying political pressures by ensuring projects are distributed across all neighborhoods. The following constraints enforce a minimum count of homes selected in every neighborhood. For generalizability, we use the terms “neighborhood,” “ward,” and “boundary” interchangeably; in practice, any geographic delineation of the distribution area can be used.
• Let $j \in J$ be the index of all neighborhood boundaries within the distribution area.
• Let $T_j$ = the minimum number of residential structures within neighborhood boundary $j$ required to be included in the replacement plan.
• Let $D_j$ = the set of candidate project areas $i$ that are contained within boundary $j$.
Note that the neighborhood-specific limits on houses selected need to correspond with the overall limits across the entire distribution area: $T^L \le \sum_{j \in J} T_j \le T^U$. The following constraint (1e) can be included in model (1) to enforce a minimum selection threshold per neighborhood. Equation (1e) specifies that the total number of homes selected within each neighborhood is at least the minimum required amount.
$$\sum_{i \in D_j} H_i X_i \ge T_j, \quad \forall j \in J \tag{1e}$$
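Constraint (1e) can be checked for any candidate selection with a short helper. The sketch below is a hypothetical illustration: the function name, the two neighborhoods, and the thresholds are invented for the example, with `members` playing the role of the sets $D_j$ and `minimums` the role of $T_j$.

```python
def meets_neighborhood_minimums(x, homes, members, minimums):
    """Check constraint (1e) for a 0/1 selection vector x.

    homes:    homes[i] is the number of homes in project area i
    members:  dict neighborhood -> list of project-area indices inside it
    minimums: dict neighborhood -> minimum homes that must be selected
    """
    return all(
        sum(homes[i] * x[i] for i in members[j]) >= minimums[j]
        for j in minimums
    )


# Hypothetical data: four project areas split across two neighborhoods.
H = [30, 28, 31, 29]
members = {"north": [0, 1], "south": [2, 3]}
minimums = {"north": 25, "south": 25}

spread_out = (1, 0, 1, 1)    # selects homes in both neighborhoods
concentrated = (0, 0, 1, 1)  # selects homes in the south only

print(meets_neighborhood_minimums(spread_out, H, members, minimums))
print(meets_neighborhood_minimums(concentrated, H, members, minimums))
```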
Within each neighborhood, selecting projects that are as spatially compact as possible is desirable because it (1) further reduces mobilization, since less driving is required to address all the individual homes, and (2) simplifies routing of the crew to address all the selected projects, since they are close together. To account for compactness, we specify a maximum distance that cannot be exceeded between any two selected project areas within a given neighborhood. To include considerations of compactness in model (1), we first define a few additional variables.
• Let $\varepsilon_j$ be the set of all possible pairs of project areas $(i, k)$ within boundary $j$.
• Let $Y_{ik} = 1$ if both project areas $i$ and $k$ are selected for replacement, 0 otherwise.
• Let $D_{ik}$ = the distance between project areas $i$ and $k$, defined as the Euclidean distance between the two centroids.
• Let $B_j$ = the maximum distance allowable between any selected pair of projects within boundary $j$.
The following constraints added to model (1) adjust the model to consider the degree of spatial spread of the selected project areas.
$$D_{ik} Y_{ik} \le B_j, \quad \forall (i, k) \in \varepsilon_j,\ j \in J \tag{1f}$$
$$Y_{ik} \le X_i, \quad \forall (i, k) \in \varepsilon_j,\ j \in J \tag{1g}$$
$$Y_{ik} \le X_k, \quad \forall (i, k) \in \varepsilon_j,\ j \in J \tag{1h}$$
$$Y_{ik} \ge X_i + X_k - 1, \quad \forall (i, k) \in \varepsilon_j,\ j \in J \tag{1i}$$
$$Y_{ik} \in \{0, 1\}, \quad \forall (i, k) \in \varepsilon_j,\ j \in J \tag{1j}$$
Equation (1f) enforces that the centroid distance between any selected pair of projects must not exceed the boundary-specific limit $B_j$. Equations (1g)–(1i) enforce the relationship between the indicator variables: $Y_{ik}$ equals 1 if and only if both $X_i$ and $X_k$ are 1. Equation (1j) specifies the binary domain of the indicator variable.
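The linking constraints can be verified by brute force. The sketch below, written for a single pair of project areas (indexed here as i and k), enumerates all binary combinations to confirm that the three inequalities leave exactly one admissible value for the pair indicator, namely the logical AND of the two selection variables. It then checks the distance limit directly for a set of selected areas. The function names and centroid coordinates are hypothetical.

```python
import math
from itertools import combinations, product


def y_is_consistent(x_i, x_k, y):
    """True when (x_i, x_k, y) satisfies constraints (1g), (1h), and (1i)."""
    return y <= x_i and y <= x_k and y >= x_i + x_k - 1


# For every binary (x_i, x_k), list the values of y the constraints allow.
forced_y = {
    (x_i, x_k): [y for y in (0, 1) if y_is_consistent(x_i, x_k, y)]
    for x_i, x_k in product((0, 1), repeat=2)
}
print(forced_y)


def is_compact(selected, centroids, max_distance):
    """Check (1f): no two selected areas lie farther apart than max_distance."""
    return all(
        math.hypot(centroids[i][0] - centroids[k][0],
                   centroids[i][1] - centroids[k][1]) <= max_distance
        for i, k in combinations(selected, 2)
    )


# Hypothetical centroids (in arbitrary distance units) within one boundary.
centroids = {0: (0.0, 0.0), 1: (3.0, 4.0), 2: (10.0, 0.0)}
print(is_compact([0, 1], centroids, max_distance=5.0))
print(is_compact([0, 1, 2], centroids, max_distance=5.0))
```

Because the pair indicator is fully determined by the two selection variables, the compactness constraints add no new degrees of freedom to the model, only additional rows and binary columns.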
3.3.3. Solution method
Model (1) is a dual-objective model, meaning the optimal solution is not a single unique selection of neighborhoods, but rather a set of solutions that are Pareto-efficient (or non-dominated). A Pareto-efficient set of solutions represents the optimal combinations of outcomes, where any improvement to objective (1a) comes at the expense of (1b), and vice versa. Pareto optimality enables all tradeoffs among optimal combinations of the two objectives to be considered (Muncie et al., 2013).
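The notion of non-dominance can be stated compactly in code. The following sketch filters a list of (lead, meter) outcome pairs down to the Pareto-efficient ones when both objectives are maximized. The function name and the outcome values are hypothetical and serve only to illustrate the definition.

```python
def pareto_front(points):
    """Return the non-dominated (f1, f2) pairs, both objectives maximized.

    A point p is dominated if some other point q is at least as good in
    both objectives and differs from p (so it is better in at least one).
    """
    front = []
    for p in points:
        dominated = any(
            q[0] >= p[0] and q[1] >= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(set(front))


# Hypothetical outcomes (lead lines replaced, meters replaced).
outcomes = [(25, 13), (23, 15), (20, 18), (16, 20), (17, 9)]
front = pareto_front(outcomes)
print(front)  # [(16, 20), (20, 18), (23, 15), (25, 13)]
```

Here (17, 9) is dropped because (20, 18) is better in both objectives; every remaining point trades lead replacements against meter replacements.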
To solve model (1) and identify the Pareto-efficient frontier, this research uses the epsilon-constraint approach since the closed-form specification is available. We refer the reader to Haimes et al. (1971) for full details on the epsilon-constraint method, as well as Figure 3. To summarize, it involves first solving the model as a single-objective problem twice, once considering only (1a) and once considering only (1b). These two solutions initialize the Pareto-efficient set by defining its boundaries. The algorithm then iterates through the solution space between the two boundary points to identify all other Pareto-efficient solutions that may exist. This is done by converting one of the objective functions into a constraint, making the model a single-objective problem, and re-solving the optimization at different threshold values ε of the converted objective function. This process is repeated for every incremental value of ε, with each newly detected non-dominated solution being appended to the Pareto-efficient set.
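The procedure can be sketched on a toy instance as follows. This is a simplified illustration under stated assumptions rather than the implementation used in this research: the rewards are taken to be integers so that raising ε by 1 after each solve cannot skip a solution, the single-objective subproblem is solved by enumeration instead of a mathematical solver, and ties in the primary objective are broken in favor of the converted objective so that each returned point is non-dominated. All names and data are hypothetical.

```python
from itertools import product


def solve(primary, secondary, homes, t_low, t_high, eps):
    """Maximize the primary objective subject to secondary >= eps and the
    home-count limits. Ties are broken by the larger secondary value."""
    best = None
    for x in product((0, 1), repeat=len(homes)):
        total_homes = sum(h * xi for h, xi in zip(homes, x))
        f1 = sum(a * xi for a, xi in zip(primary, x))
        f2 = sum(b * xi for b, xi in zip(secondary, x))
        if t_low <= total_homes <= t_high and f2 >= eps:
            if best is None or (f1, f2) > best[0]:
                best = ((f1, f2), x)
    return best


def epsilon_constraint(L, M, H, t_low, t_high):
    """Trace the Pareto-efficient frontier of the toy model by repeatedly
    tightening the threshold on the meter objective."""
    frontier = []
    eps = 0
    while True:
        result = solve(L, M, H, t_low, t_high, eps)
        if result is None:
            break  # no feasible selection reaches this meter threshold
        (f1, f2), x = result
        frontier.append((f1, f2, x))
        eps = f2 + 1  # demand strictly more meters on the next pass
    return frontier


L = [12, 3, 8, 5]     # expected lead service lines per area (integers here)
M = [2, 9, 4, 7]      # expected degraded meters per area (integers here)
H = [30, 28, 31, 29]  # homes per area

frontier = epsilon_constraint(L, M, H, t_low=50, t_high=90)
for lead, meters, x in frontier:
    print(lead, meters, x)
```

With non-integer rewards, such as the expected counts used in this research, the increment applied to ε would need to be chosen to match the desired resolution of the frontier.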

FIGURE 3. Schematic for epsilon-constraint algorithm for dual-objective maximization problems.

Since model (1) is linear and all decision variables are binary, each iteration of the epsilon-constraint method can be solved directly using the branch and bound algorithm (Lawler & Wood, 1966) available in most commercial and open-source solvers. The spatial data of the case study were preprocessed using ESRI's ArcGIS software; all data processing, model formulation, and the epsilon-constraint algorithm were implemented with Python 3.7 and the package PuLP; the mathematical solver CPLEX 12.10.0 was used to identify the optimal solutions.

4. CASE STUDY
The two-step methodology to aggregate individual households and optimize the selection of projects is demonstrated in a real municipality. In this section, we describe the case study dataset and the methods used to generate the necessary inputs for the project optimization: (1) lead service line estimates, (2) failed meter estimates, and (3) location of larger neighborhoods.
4.1. Distribution system data
We partnered with the local utility in Dearborn (Michigan) to obtain spatial databases of the city's water distribution system and parcel tax assessment information. The water meter spatial layer and tax parcels are used to identify the set of active residential users. We first use tax assessment data from the year 2021 to filter out all buildings with a non-residential zoning classification (e.g., commercial, industrial, federal). Next, we spatially relate each meter to a land parcel based on its location; then, using the customer status in the meters shapefile, we filter out locations where the meter is inactive. There are a total of 29,559 residential parcels served by the Dearborn water system, and we assume that each parcel contains one meter and one building. There are a total of 2074 unique candidate project areas in the City of Dearborn after aggregating adjacent land parcels to street blocks. Figure 4 shows a map of all the residential land parcels served by the distribution system, along with a map of aggregated street blocks colored by the number of parcels contained within each boundary.

FIGURE 4. Dearborn residential land parcel and street block locations.

4.2. Lead service line probability
We consider the service line assets running from the distribution main to the household in this study. The portion of pipe between the water main and the stop box, curb stop, or shutoff valve is publicly owned by the City of Dearborn, and the rest of the pipeline running to the meter inside the home is owned by the homeowner. While there are two portions of pipe making up the service line connection, we consider the prevalence of lead as a binary response: does any part of the pipeline contain lead, or not. A “lead” response in the data is encoded by the utility as “any portion lead,” whereas a “non-lead” response is encoded as “neither portion lead.” We take this approach of considering the privately and publicly owned portions together because the incidence of lead on either part of the pipe necessitates a full replacement based on the latest regulation (US Environmental Protection Agency, 2019). Therefore, the count of lead service lines across the entire system can be taken as the count of meter boxes connected to lead pipes.
The material information in the service line inventory is only partially complete for the City of Dearborn. Of the 29,559 active residential land parcels under consideration, only 11,692 (39.6%) have material information that is verified based on historical inspection of replacement works. For this research, we use the data from the verified portion of service lines to train a machine learning model that predicts which unverified locations are most likely to contain lead. Past research (Chojnacki et al., 2017) has shown that the XGBoost algorithm (T. Chen & Guestrin, 2016) is a strong predictor of lead service line locations, so we use it in this case study. For modelling, we obtained parcel tax assessment records from the City of Dearborn and combined them with attribute information embedded in the service line shapefile. The tax assessment dataset identifies each parcel of land under the city’s jurisdiction and includes land value, building value, building age, and other relevant information on building construction. Table 1 summarizes the input data used for training the XGBoost algorithm.
TABLE 1. Machine learning variables for lead service line prediction.
Variable name        Description
Lead response        Binary response variable: “Positive” if the meter box is connected to a lead pipe, “Negative” otherwise.
Diameter             Size of the service line pipe connected to the meter box, reported in inches.
Install year         Install year of the service line.
Parcel age           Built year of the residential structure connected to the meter box.
Parcel floor area    Total floor area of the residential structure connected to the meter box, measured in square feet.
Parcel total area    Total square footage of the parcel in which the meter box is located.
Parcel land value    Assessed value of the land in the parcel, based on 2021 tax data.
Parcel total value   Total value of the parcel, including the land and the structure, based on 2021 tax data.

The trained XGBoost model is a classification model that predicts, for each unverified meter box, the likelihood that a lead service line is connected there. To estimate the total number of lead service lines that can be removed by selecting a given project area, we add the number of verified lead locations in the area to the sum of the predicted lead probabilities of its unverified locations. The distribution of estimated lead service lines per street block is shown in Figure 5.
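As a minimal sketch of this estimate (not the authors' implementation), the expected count per street block can be computed by adding 1 for each verified lead location and the predicted probability for each unverified location. The record layout and the names `expected_lead_per_block`, `block_id`, and `status` are assumptions introduced here for illustration, and the probabilities are made-up values rather than XGBoost output.

```python
from collections import defaultdict


def expected_lead_per_block(records):
    """Sum expected lead service lines per street block.

    Each record is (block_id, status, probability), where status is
    "lead" or "non-lead" for verified locations and None for unverified
    ones. This layout is hypothetical, chosen only for the example.
    """
    totals = defaultdict(float)
    for block_id, status, probability in records:
        if status == "lead":
            totals[block_id] += 1.0          # verified lead counts as one line
        elif status is None:
            totals[block_id] += probability  # unverified adds its probability
        else:
            totals[block_id] += 0.0          # verified non-lead adds nothing
    return dict(totals)


# Hypothetical records for two street blocks.
records = [
    ("B1", "lead", None),
    ("B1", None, 0.75),
    ("B1", "non-lead", None),
    ("B2", None, 0.25),
    ("B2", None, 0.5),
]
expected = expected_lead_per_block(records)
print(expected)
```

Treating the predicted probabilities as expected counts keeps the objective linear in the block selection, which is convenient for an integer programming formulation.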

FIGURE 5. Expected lead service line capture per street block.

4.3. Probability of meter failure
To characterize the condition of the water meters, an age-based likelihood-of-failure model is implemented. A more sophisticated model of water meter risk is beyond the scope of this research; our goal here is a method to estimate the number of meter failures prevented by selecting a replacement project area. A failure likelihood model is convenient because it is scaled between 0 and 1, and higher likelihoods translate directly to higher risk. The probability model we use is presented in the case study by Lund (1988), which uses an exponential distribution to characterize failure likelihood. The exponential model estimates the probability of failure of an asset, P(t), where t denotes the asset age measured in years. The model is given in Equation (2):
$$P(t) = 1 - e^{-0.01t} \tag{2}$$
Equation (2) specifies that older meters are more likely to fail. The in-service date of individual meters is available as an embedded attribute in the water meter shapefile, and we use it to compute the age of each active residential water meter in years (t). To estimate the total number of failed meters that can be avoided by selecting a given project area, we sum the failure probabilities of all the individual meters contained in the area. The distribution of estimated failed meters per street block is shown in Figure 6.
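The calculation can be sketched in a few lines. The function names and the example ages below are hypothetical; only the exponential form and the 0.01 rate come from Equation (2).

```python
import math


def failure_probability(age_years, rate=0.01):
    """Equation (2): P(t) = 1 - exp(-rate * t), with t in years."""
    return 1.0 - math.exp(-rate * age_years)


def expected_failures(meter_ages):
    """Expected number of failed meters in an area: the sum of P(t)."""
    return sum(failure_probability(t) for t in meter_ages)


# Hypothetical meter ages (years) within one street block.
ages = [0, 10, 25, 50]
probabilities = [failure_probability(t) for t in ages]
block_total = expected_failures(ages)
print([round(p, 3) for p in probabilities], round(block_total, 3))
```

A new meter has a failure probability of zero under this model, and a 50-year-old meter has a probability of roughly 0.39.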

FIGURE 6. Expected meter failure count per street block.

4.4. Geographic neighborhoods
Census tracts are used to delineate the larger neighborhoods that contain the street blocks. They were selected because they are contiguous areas that each encompass roughly the same population (1200–8000 people) and are large enough to divide the service area of Dearborn into a small number of discrete regions. The spatial database of census tracts is publicly available from the US Census Bureau (US Census Bureau, 2019). Figure 7 shows the boundary of each census tract overlaid with the street blocks. There are 24 unique census tracts, each containing an average of 86 street blocks.

FIGURE 7. Census tract (neighborhood) locations for the City of Dearborn.

5. RESULTS AND DISCUSSION
In this section, we present the project area selection results for the City of Dearborn case study. To demonstrate how optimization models can incorporate varying degrees of practical planning constraints, we consider three versions of model (1).
(1) Baseline: This model consists of only Equations (1a)–(1d). The only constraint is the budget on the number of homes that can be selected as part of the capital program. No geographic restrictions are specified.
(2) Geographic Minimums: This model adds Equation (1e) to the baseline. It specifies a minimum number of homes that must be selected for replacement in each census tract. In effect, this avoids spatial concentration of projects and ensures that selections are spread across the entire distribution area.
(3) Geographic Minimums and Compactness: This is the full model described by Equations (1a)–(1j). Beyond enforcing a minimum number of homes per census tract, an additional constraint requires that the selected blocks within each tract all be within a certain distance of each other. In effect, this requires the model to identify a cluster of street blocks to address within each census tract, which reduces mobilization.
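The two geographic requirements can be illustrated with a feasibility check on a candidate selection. This is a sketch under stated assumptions, not the constraint formulation of Equations (1e)–(1j): the function `check_selection`, the block fields, and the use of straight-line distance between block centroids (in feet) are all hypothetical choices made for the example.

```python
import math
from collections import defaultdict
from itertools import combinations


def check_selection(selected, min_homes_per_tract, max_separation_ft, tracts):
    """Return (meets_minimum, is_compact) for a list of selected blocks.

    Each block is a dict with keys "tract", "homes", "x", and "y", where
    x and y are centroid coordinates in feet (an assumed layout).
    """
    homes = defaultdict(int)
    by_tract = defaultdict(list)
    for block in selected:
        homes[block["tract"]] += block["homes"]
        by_tract[block["tract"]].append(block)

    # Geographic minimum: every tract reaches the required number of homes.
    meets_minimum = all(homes[t] >= min_homes_per_tract for t in tracts)

    # Compactness: every pair of selected blocks in a tract is close enough.
    is_compact = all(
        math.dist((a["x"], a["y"]), (b["x"], b["y"])) <= max_separation_ft
        for blocks in by_tract.values()
        for a, b in combinations(blocks, 2)
    )
    return meets_minimum, is_compact


# Hypothetical selection: two blocks in tract T1 (500 ft apart), one in T2.
selection = [
    {"tract": "T1", "homes": 20, "x": 0.0, "y": 0.0},
    {"tract": "T1", "homes": 15, "x": 300.0, "y": 400.0},
    {"tract": "T2", "homes": 10, "x": 5000.0, "y": 0.0},
]
result = check_selection(selection, 30, 1000.0, ["T1", "T2"])
print(result)
```

In this example tract T2 falls short of the 30-home minimum, so the selection would be infeasible under the second and third model versions even though it is compact.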
The epsilon-constraint method was implemented to identify the Pareto-efficient frontier for each model, and the results were compared. The three models progress in complexity, with more constraints added each time, so the overall performance worsens with each step (fewer total assets removed). By contrasting the solutions identified by each model, we can also quantify the tradeoff of including each consideration. This is done by measuring the degree to which the objectives worsen (how many fewer lead service lines are removed, how many fewer failed meters are replaced) in exchange for a more practical and/or lower-cost deployment of resources (selected areas cover the whole site, all areas are compact).
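The epsilon-constraint idea can be illustrated on a toy instance. The sketch below is not the integer programming model (1): it swaps in brute-force enumeration over a handful of hypothetical street blocks, which is only practical for very small inputs, and it assumes integer objective values so that a step of 1 does not skip frontier points. One objective (lead removal) is maximized while the other (meter removal) is held above a floor that is raised after each solve.

```python
from itertools import combinations


def best_selection(blocks, budget, min_meters):
    """Maximize lead removal subject to a parcel budget and a meter floor.

    Each block is (parcels, lead, meters). Ties in lead are broken in
    favor of more meters so the returned point is Pareto-efficient.
    """
    best = None
    for size in range(len(blocks) + 1):
        for subset in combinations(blocks, size):
            parcels = sum(b[0] for b in subset)
            lead = sum(b[1] for b in subset)
            meters = sum(b[2] for b in subset)
            if parcels <= budget and meters >= min_meters:
                if best is None or (lead, meters) > best:
                    best = (lead, meters)
    return best


def epsilon_constraint_frontier(blocks, budget, step=1):
    """Trace the frontier by raising the meter floor after each solve."""
    frontier = []
    epsilon = 0
    while True:
        point = best_selection(blocks, budget, epsilon)
        if point is None:
            break  # no feasible selection meets the current floor
        frontier.append(point)
        epsilon = point[1] + step  # require strictly more meters next time
    return frontier


# Hypothetical blocks: (parcels, expected lead lines, expected failed meters).
toy_blocks = [(3, 5, 1), (2, 1, 4), (2, 3, 2), (1, 0, 2)]
frontier = epsilon_constraint_frontier(toy_blocks, budget=5)
print(frontier)
```

Each point on the returned list is a (lead, meters) pair; moving along the list trades lead removal for meter removal, which is the pattern the frontiers in this section exhibit.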
For simplicity, we consider only the threshold where a maximum of 3% of parcels can be selected. This roughly corresponds to a 3-year planning horizon for the city, assuming a 1% annual replacement rate. In practice, this threshold can be adjusted to the proportion of homes the utility plans to include in the capital program. The 3% limit corresponds to roughly 886 parcels in total across the selected street blocks; it is assumed that every home in a chosen street block will be included in the capital program. Similarly, we specify a minimum of 30 homes to be selected per census tract for the geographic minimums constraint (1e) and an upper limit of 1000 feet of separation between street blocks to enforce compactness in constraints (1f)–(1j). Again, these thresholds were selected as feasible and realistic values for a capital program and are used to highlight the use of multi-objective models for capital planning. In practice, these model parameters can be adjusted to reflect local conditions.
Figure 8 shows the identified Pareto-efficient frontier for each of the solved models; the solution located at the midpoint of each frontier is bolded. The X-axis shows the optimal value of objective 1 (lead service line removal) and the Y-axis shows the optimal value of objective 2 (failed meter removal). The baseline model with no geographic constraints clearly outperforms the other two models, as seen in the large gap between its Pareto frontier and the others. There are 36 identified solutions on the Pareto-efficient frontier of the “baseline” model, 25 solutions for the “geographic minimums” model, and 13 solutions for the full “geographic minimums and compactness” model. The solution set of each model is discussed individually below before a comparison across models.

FIGURE 8. Pareto-efficient frontier: objective value comparison with different constraints (midpoint solution bolded).

We first focus on the “baseline” model. The boundary points along the Pareto-efficient frontier represent the outcome when only one objective is considered in the optimization. The left-most point is the solution when only failed meter removal is considered (Equation 1b) and the right-most point is the solution when only lead service lines are considered (Equation 1a). Optimizing for lead service lines alone produces a selection of street blocks that removes 823 lead pipes and 528 failed meters, whereas optimizing for meters alone removes 687 lead pipes and 571 failed meters. To quantify the difference in the selected street blocks, we can treat the two optimal solutions as sets of street blocks and use the Jaccard similarity index. The Jaccard index measures the degree of overlap between two sets on a scale from 0 to 1, defined as the proportion of overlapping items relative to the total count of unique items (see Equation (3)). Typically, Jaccard index values above 0.6 represent similar sets, and values below 0.4 represent dissimilar sets. The two boundary solutions have a Jaccard similarity of 0.39, illustrating the large difference in selected areas when only one objective is considered.
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|} = \frac{\text{number of overlapping items across the two sets}}{\text{total number of items across the two sets}} \tag{3}$$
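For concreteness, the Jaccard index of two selections can be computed directly from sets of street block identifiers. The function name and the block identifiers below are hypothetical, and returning 1.0 for two empty sets is a convention chosen here, not something specified in the text.

```python
def jaccard_index(selection_a, selection_b):
    """Jaccard similarity: |A intersect B| / |A union B|."""
    a, b = set(selection_a), set(selection_b)
    union = a | b
    if not union:
        return 1.0  # convention for two empty selections (an assumption)
    return len(a & b) / len(union)


# Hypothetical selections sharing two of four distinct blocks.
lead_optimized = {"B1", "B2", "B3"}
meter_optimized = {"B2", "B3", "B4"}
similarity = jaccard_index(lead_optimized, meter_optimized)
print(similarity)
```

Here two blocks overlap out of four unique blocks, giving a similarity of 0.5.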
Based on the two boundary points, the decrease in lead pipe capture (136, 16.5% lower) is much larger than the decrease in failed meters (43, 7.5% lower). This implies that improving objective 2 comes at a larger expense to objective 1, as objective 1 is more sensitive to performance decreases. The reason for the large tradeoff of objective 1 (lead pipes) relative to objective 2 (meters) is intuitive after comparing the spatial distributions of these assets in Figures 5 and 6. Lead service lines are spatially concentrated in just a few neighborhoods of the distribution area, whereas faulty meters are more evenly scattered throughout the system. As a result, any deviation away from the regions where lead pipes are found will greatly decrease model performance, whereas many different geographic configurations of street blocks can produce similar meter removal. The mid-point of the Pareto frontier is a proxy for the outcome when the two objectives are evenly weighted; in this scenario the selection captures 768 lead pipes and 562 bad meters. Here the performance tradeoffs between the two objectives are much smaller: only 55 (6.7%) fewer lead pipes are selected compared to the highest value possible, and 8 (1.4%) fewer bad meters. In terms of the Jaccard similarity index, the mid-point solution has a similarity of 0.571 to the solution where only lead pipes are optimized and 0.703 to the solution where only meters are optimized.
To provide practical context, because we assume one meter and one service line per household, any difference in asset removal counts approximates the number of additional home visits the utility crew must make. For example, if one solution captures 100 fewer lead service lines, the city must allocate an additional 100 removals at other homes to make up the difference. Therefore, the mid-point along the Pareto frontier represents a more cost-efficient program, since it greatly reduces the truck rolls needed to remove a high number of both lead pipes and meters at the same time. If only the single-objective outcomes were considered, a utility would have to visit up to 136 additional homes to achieve optimal removal of lead pipes, versus just 55 additional homes if the mid-point outcome were used. Similarly, a utility would need 43 additional deployments to achieve optimal removal of bad meters, versus just 8 with the mid-point outcome.
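The arithmetic behind these comparisons is simple enough to check directly. The sketch below uses the baseline counts quoted above; the helper `tradeoff` is a hypothetical name, and interpreting the difference between the boundary shortfall and the mid-point shortfall as avoided home visits follows the one-asset-per-household assumption.

```python
def tradeoff(best, other):
    """Return the shortfall relative to the best value, as (count, percent)."""
    drop = best - other
    return drop, round(100.0 * drop / best, 1)


# Baseline model, counts quoted in the text.
lead_boundary = tradeoff(823, 687)   # lead-optimized vs meter-optimized
meter_boundary = tradeoff(571, 528)  # meter-optimized vs lead-optimized
lead_midpoint = tradeoff(823, 768)   # lead-optimized vs mid-point

# Additional lead-related home visits avoided by using the mid-point.
lead_visits_avoided = lead_boundary[0] - lead_midpoint[0]
print(lead_boundary, meter_boundary, lead_midpoint, lead_visits_avoided)
```

The same subtraction applied to the constrained models reproduces, for example, the 134 lead deployments saved under the compactness scenario (158 minus 24).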
Focusing next on the “geographic minimums” model, when lead service line removal is optimized alone, the optimal selection of street blocks captures 660 lead pipes and 507 bad meters. In contrast, when only faulty meter removal is optimized, the selected street blocks remove 469 lead pipes and 548 meters. The two solutions have a Jaccard similarity of 0.194, meaning that less than one-fifth of the solution street blocks overlap.
As in the “baseline” model, the removal of lead service lines is much more sensitive to drops in performance than the removal of faulty meters. The tradeoff between the two boundary points is 191 (28.9%) fewer lead pipes in exchange for 41 (7.6%) additional meters, and vice versa. The mid-point solution along the Pareto-efficient frontier results in a selection of blocks that removes 603 lead pipes (57, or 8.6%, fewer than the optimum) and 535 failed meters (13, or 2.3%, fewer than the optimum). The Jaccard similarity of this selection is 0.355 to the solution optimizing only lead pipes and 0.476 to the solution optimizing only meters. This represents a significant improvement, since the mid-point overlaps by more than a third with the lead-optimized solution and by almost a half with the meter-optimized one. The reason for the difference in sensitivity is also the same: lead pipes are highly concentrated in just a small handful of census tracts. Reweighting the model to focus more on bad meters, which are much more evenly dispersed across the map, leads to large tradeoffs in the number of removed lead pipes. Translating the objective values to crew mobilizations, the mid-point solution can potentially save the city 131 home mobilizations for the removal of lead pipes and 28 deployments for the removal of meters.
Finally, turning to the “geographic minimums and compactness” model, the boundary points along the Pareto frontier indicate that a lead-optimized solution will select street blocks that remove 627 lead pipes and 501 meters. The meter-optimized solution, in contrast, will select street blocks that remove 469 lead pipes and 532 meters. This amounts to a decrease of 158 (25.2%) in lead capture and 31 (5.8%) in failed meter removal when comparing the two single-objective solutions. The Jaccard similarity index between them is just 0.176. Turning to the mid-point solution, the block selections here can remove 603 lead pipes and 519 bad meters. This again represents a significantly smaller reduction than between the two boundary points: only 24 (3.8%) fewer than the lead-optimized count and 13 (2.7%) fewer than the meter-optimized count. It is interesting to note that, because blocks with high counts of lead service lines and bad meters are already geographically clustered, the compactness constraint does not change the solution performance by a large margin. Translating the changes in objective values to potential crew mobilizations, the mid-point solution can save the city up to 134 home deployments for lead pipes and 18 deployments for meters.
Figure 9 shows the selected street blocks under the different scenarios considered; the selection representing the mid-point of the Pareto-efficient frontier is visualized for each. Table 2 compares the performance of these three scenarios.

FIGURE 9. Street block selections: solution comparison with different constraints.

TABLE 2. Street block selections: performance comparison with different constraints.
Scenario                              Number of parcels selected   Number of street blocks selected   Objective 1, lead service line removal   Objective 2, failed meter removal
Baseline                              886                          166                                768                                      562
Geographic minimums                   886                          111                                603                                      535
Geographic minimums and compactness   879                          78                                 603                                      518

One interesting observation is that the outcome in each scenario is at or just under the maximum number of allowable parcels (886); however, the number of street blocks decreases as more constraints are added. The “baseline” model has no geographic constraints, and the solution under consideration selects 166 street blocks for inclusion in the capital program, whereas the “geographic minimums and compactness” scenario selects less than half as many, at just 78 street blocks. The performance across both objectives also decreases as more constraints are introduced. The “baseline” scenario far outperforms the other two scenarios in lead removal, with an increase of almost 27%, and marginally improves on failed meter removal, with an increase of 5%. The reason for these patterns was discussed earlier in this section: with lead service lines primarily concentrated in just a few areas of the city, imposing constraints on system-wide project selection greatly reduces performance.
Taking the trends in street block count and objective performance together, the empirical results suggest a tradeoff between the removal count of target assets (and, in turn, the effectiveness of a capital program) and cost. The “baseline” model is the most precise, selecting many smaller street blocks and removing many target assets, but the mobilization and truck roll costs of this program are much higher because the small blocks are spread out geographically. In contrast, the “geographic minimums and compactness” constraints reduce the mobilization cost because all selected projects within the same census tract are closely located. These types of projects can be completed most efficiently, with fewer truck rolls, but at the expense of removing fewer bad assets. These findings highlight the power of optimization modelling to evaluate different scenarios in the context of program planning, as well as the ability of the proposed framework to bundle household replacement projects in a cost-efficient manner.
The findings of this section are summarized as follows:
• The optimal solution at the midpoint of the Pareto-efficient frontier represents an effective balance between the two objectives and does not result in substantial differences in the selected project areas relative to the single-objective cases. Using the solutions along the middle of the Pareto-efficient frontier can thus significantly reduce the number of home deployments needed to achieve optimal removal of both lead service line and degraded meter assets.
• The resulting number of lead service lines captured is much more sensitive to tradeoffs when balanced against the need for removing bad meters. This is because lead service lines are mostly clustered together in just a few areas, whereas faulty meters are much more evenly spread out across the system.
• Including the geographic minimum constraint greatly reduces the performance of both objectives. This is because there are census tracts with low counts of lead pipes and bad meters, but the model is still required to allocate a minimum selection there.
• The models are less sensitive to the compactness constraint. This is because street blocks with high counts of lead pipes and bad meters are already close to each other within a given neighborhood, so the optimization naturally selects proximate areas without needing a constraint.
• The empirical results suggest a tradeoff between program quality and cost. The baseline model removes the most assets of interest but selects a high number of small street blocks for the program, which can increase mobilization costs. In contrast, the model with geographic and compactness constraints selects less than half the street blocks, but the removal of target assets is also lower.
Possible extensions of the research presented in this paper include the application of more sophisticated methods for estimating the risk levels of individual asset classes, for example, higher-accuracy models for locating degraded meters and the incorporation of demographic information to better identify populations at high risk of lead exposure. The optimization model can also be extended to reflect other planning considerations, for example, construction moratoriums in certain neighborhoods, or cost functions that differentiate homes by age and size rather than assuming uniform cost. Additional equity considerations can also be incorporated into the modelling to ensure that utility resources adequately target the demographics most in need. As more constraints and variables are introduced to the integer programming formulation, the tradeoffs between computational tractability and model complexity need to be examined.

6. CONCLUSION
Managing the aging infrastructure of water distribution systems with limited funding means that many operators need to maximize the return on any capital investment. Designing an effective asset management program is, as a result, a resource allocation problem. The objective is to identify areas where vulnerable assets are located and target the deployment of capital dollars to address the problematic areas. The main issue for many municipalities is that many classes of infrastructure (water mains, valves, meters, service lines) each need renewal, and it is cost-ineffective to plan replacement programs that target only one class. This is because truck roll (or mobilization), defined as the deployment of crew and equipment to a project area, is a significant cost in any capital improvement program and is duplicated when asset classes are addressed separately. Therefore, selecting capital improvement projects that target multiple asset classes together can reduce mobilization costs and help utilities stretch their limited budgets further.
To our knowledge, no previous work has demonstrated the application of optimization modelling and geospatial methods to the infrastructure project planning of assets at the individual household resolution. The problem is framed as a dual-objective integer programming model, where the selection of project areas aims to maximize lead service line removal and water meter changeout together. A case study for the joint renewal planning of service lines and water meters is presented. These two infrastructure classes were selected because of data availability. To best reduce mobilization cost in practice, projects with similar outage times and required crews and equipment are best suited for joint work (e.g., meter replacements and service line material inspections, or service line and water main renewals). To provide additional efficiency and delineate individual project areas, we group adjacent land parcels into larger street blocks. Selecting street blocks is more efficient than selecting individual homes because it allows more geographically compact replacement projects to be performed.
Because a multi-objective optimization model is specified, the optimal solution to the problem is represented by a Pareto-efficient frontier. Each point along the frontier represents a unique selection of street blocks to be included in a potential asset management program, and the different solutions represent different weightings of the two objectives. Empirical data suggest that the multi-objective approach can identify effective project selections that also significantly reduce duplicate visits to the same home. Furthermore, the sensitivity of a given objective to tradeoffs is highly dependent on the spatial distribution of the target assets. In our case study, lead service lines are spatially concentrated in only a limited number of areas and, as a result, are more sensitive to performance decreases when balanced against the competing objective of removing bad meters. The spatial concentration of lead pipes also means that any constraints enforcing broad spatial selection of projects, potentially due to political concerns, will greatly reduce the removal performance.
Altogether, our research demonstrates the potential of dual-objective modelling to guide the replacement planning of household-level water distribution assets and to generate more cost-efficient programs that protect critical water infrastructure. In practice, the methods explored here need to be incorporated within broader infrastructure decision frameworks that account for local regulations and the needs of all asset types. Water utilities own various infrastructure types (e.g., distribution mains, water towers, pumps, valves) that each have their own replacement and maintenance needs. The research here focuses specifically on the alignment of household-level projects, but the efficacy of capital programs can be greatly improved when assets of different resolutions are considered together.
For example, additional reductions in project mobilization cost can be achieved if service line replacements (household level) are aligned with adjacent distribution main renewals (street level), since both require excavation. Larger planning frameworks are needed to accurately weigh the needs of different infrastructure types. Once the best portfolio of assets to target has been determined, it can be incorporated into downstream decision models that select areas with maximal alignment to reduce cost. Beyond the physical condition of the infrastructure itself, local regulations that may influence the geographic location of projects also need to be included in the decision framework. Common examples include project moratoriums to avoid frequent disruption of the same neighborhood, incentivized alignment with other departments’ project areas (e.g., water department main replacements combined with transportation department resurfacing of the same road), and renewal targets specified by state agencies (e.g., a 5% replacement rate of lead service lines per year).
In the case of water meter and lead service line replacements, the methods developed here help prioritize neighborhoods for the joint renewal of these assets. However, state regulatory bodies may specify a minimum lead pipe replacement rate because of the urgent public health risk, and local regulation may prevent the excavation of underground pipes within 5 years of a previous project. These factors combined may skew the selected infrastructure projects to prioritize lead pipes over meters and may also influence when certain neighborhoods can be addressed. To the best of our knowledge, broader planning frameworks like this typically rely on expert judgement, but they are an active area of research in the infrastructure planning domain. The exploration of methods to determine the best mix of asset types to target, along with similar optimization methods to best align projects spanning different spatial resolutions, can provide value to utility decision-makers and presents a meaningful direction for future work.

Acknowledgements
We would like to thank the City of Dearborn, MI, for agreeing to the use and showcasing of its system data in this research. We would also like to thank Mr. Eric Roggow, CMMS Program Manager for the Department of Public Works at the City of Dearborn, for his efforts in compiling the datasets needed for this research. The models and risk results presented in this research are hypothetical, are based on data provided by the city, and carry uncertainty; they do not necessarily reflect the true condition of the distribution system assets.

Conflict of Interest
This work was funded by Xylem, Inc., which is developing products related to the research described in this paper. The independence of this work is reviewed and approved in accordance with Xylem Inc.’s policy on objectivity in research. The opinions and views expressed are those of the researchers and do not necessarily reflect those of the sponsors.

Data Availability Statement
The data that support the findings of this study are available from the City of Dearborn, MI. Restrictions apply to the availability of these data, which were used under license for this study. Data are available from the authors with the permission of the City of Dearborn, MI.
737
738 REFERENCESReferences
Abernethy, J., Anderson, C., Rauh, A., Schwartz, E., Stroud, J., Tan, X., & Webb, J. (2016). Flint water crisis: Data-driven risk assessment via residential water testing. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1407–1416.
Ana, E. V., & Bauwens, W. (2007). Sewer network asset management decision-support tools: A review. International Symposium on New Directions in Urban Water Management, September, 1–8. http://www2.gtz.de/Dokumente/oe44/ecosan/en-sewer-network-decision-making-tool-2007.pdf
Berardi, L., Giustolisi, O., Savic, D. A., & Kapelan, Z. (2009). An effective multi-objective approach to prioritisation of sewer pipe inspection. Water Science and Technology, 60(4), 841–850. https://doi.org/10.2166/wst.2009.432
Berry, J. W., Fleischer, L., Hart, W. E., Phillips, C. A., & Watson, J.-P. (2005). Sensor placement in municipal water networks. Journal of Water Resources Planning and Management, 131(3), 237–243. https://doi.org/10.1061/(ASCE)0733-9496(2005)131:3(237)
Carey, B. D., & Lueke, J. S. (2013). Optimized holistic municipal right-of-way capital improvement planning. Canadian Journal of Civil Engineering, 40(12), 1244–1251. https://doi.org/10.1139/cjce-2012-0183
Chen, T., & Guestrin, C. (2016). XGBoost: A scalable tree boosting system. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 785–794.
Chen, T. Y., Beekman, J. A., Guikema, S. D., & Shashaani, S. (2019). Statistical modeling in absence of system specific data: Exploratory empirical analysis for prediction of water main breaks. Journal of Infrastructure Systems, 25(2). https://doi.org/10.1061/(ASCE)IS.1943-555X.0000482
Chen, T. Y., Man, C., & Daly, C. M. (2021). Optimizing cluster selections for the replacement planning of water distribution systems. AWWA Water Science, 3(4). https://doi.org/10.1002/aws2.1230
Chen, T. Y., Riley, C. T., Van Hentenryck, P., & Guikema, S. D. (2020). Optimizing inspection routes in pipeline networks. Reliability Engineering and System Safety, 195, 106700. https://doi.org/10.1016/j.ress.2019.106700
Chen, T. Y., Vladeanu, G., Yazdekhasti, S., & Daly, C. M. (2022). Performance evaluation of pipe break machine learning models using datasets from multiple utilities. Journal of Infrastructure Systems, 28(2). https://doi.org/10.1061/(asce)is.1943-555x.0000683
Chen, T. Y., Washington, V. N., Aven, T., & Guikema, S. D. (2020). Review and evaluation of the J100-10 risk and resilience management standard for water and wastewater systems. Risk Analysis, 40(3), 608–623. https://doi.org/10.1111/risa.13421
Chojnacki, A., Dai, C., Farahi, A., Shi, G., Webb, J., Zhang, D. T., Abernethy, J., & Schwartz, E. (2017). A data science approach to understanding residential water contamination in Flint. Proceedings of the 23rd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 1407–1416. https://doi.org/10.1145/3097983.3098078
Dandy, G. C., & Engelhardt, M. O. (2006). Multi-objective trade-offs between cost and reliability in the replacement of water mains. Journal of Water Resources Planning and Management, 132(2), 79–88. https://doi.org/10.1061/(ASCE)0733-9496(2006)132:2(79)
de Monsabert, S., Ong, C., & Thornton, P. (1999). An integer program for optimizing sanitary sewer rehabilitation over a planning horizon. Water Environment Research, 71(7), 1292–1297. https://doi.org/10.2175/106143096x122429
Dridi, L., Mailhot, A., Parizeau, M., & Villeneuve, J. P. (2009). Multiobjective approach for pipe replacement based on Bayesian inference of break model parameters. Journal of Water Resources Planning and Management, 135(5), 344–354. https://doi.org/10.1061/(ASCE)0733-9496(2009)135:5(344)
Fontanazza, C. M., Notaro, V., Puleo, V., & Freni, G. (2015). The apparent losses due to metering errors: A proactive approach to predict losses and schedule maintenance. Urban Water Journal, 12(3), 229–239. https://doi.org/10.1080/1573062X.2014.882363
Ganjidoost, A., Vladeanu, G., & Daly, C. M. (2022). Leveraging risk and data analytics for sustainable management of buried water infrastructure. AWWA Water Science, 4(2). https://doi.org/10.1002/aws2.1283
Haimes, Y. Y., Lasdon, L. S., & Wismer, D. A. (1971). On a bicriterion formulation of the problems of integrated identification and system optimization. IEEE Transactions on Systems, Man and Cybernetics, SMC-1(3), 296–297. https://ieeexplore-ieee-org.afit.idm.oclc.org/stamp/stamp.jsp?tp=&arnumber=4308298
Hajiseyedjavadi, S., Karimi, H. A., & Blackhurst, M. (2022). Predicting lead water service lateral locations: Geospatial data science in support of municipal programming. Socio-Economic Planning Sciences, 82, 101277. https://doi.org/10.1016/j.seps.2022.101277
Kerwin, S., & Adey, B. T. (2020a). Pipes or pumps? The use of cost-benefit analysis in investment decision-making for public water infrastructure. Life-Cycle Civil Engineering: Innovation, Theory and Practice - Proceedings of the 7th International Symposium on Life-Cycle Civil Engineering, IALCCE 2020, 1143–1150. https://doi.org/10.1201/9780429343292-151
Kerwin, S., & Adey, B. T. (2020b). Optimal intervention planning: A bottom-up approach to renewing aging water infrastructure. Journal of Water Resources Planning and Management, 146(7). https://doi.org/10.1061/(asce)wr.1943-5452.0001217
Kim, J., Baek, C., Jo, D., Kim, E., & Park, M. (2004). Optimal planning model for rehabilitation of water networks. Water Science and Technology, 4(3), 133–148.
Kleiner, Y., Nafi, A., & Rajani, B. (2010). Planning renewal of water mains while considering deterioration, economies of scale and adjacent infrastructure. Water Science and Technology: Water Supply, 10(6), 897–906. https://doi.org/10.2166/ws.2010.571
Kleiner, Y., & Rajani, B. (2001). Comprehensive review of structure deterioration of water mains: Statistical models. Urban Water, 3(3), 151–164.
Konstantinou, C., & Stoianov, I. (2020). A comparative study of statistical and machine learning methods to infer causes of pipe breaks in water supply networks. Urban Water Journal, 17(6), 534–548. https://doi.org/10.1080/1573062X.2020.1800758
Lawler, E. L., & Wood, D. E. (1966). Branch-and-bound methods: A survey. Operations Research, 14(4), 699–719. https://doi.org/10.1098/ROYAL/
Lund, J. R. (1988). Metering utility services: Evaluation and maintenance. Water Resources Research, 24(6), 802–816. https://doi.org/10.1029/WR024i006p00802
Madrigal, A. C. (2019). How a feel-good AI story went wrong in Flint. The Atlantic, 1–14. https://www.theatlantic.com/technology/archive/2019/01/how-machine-learning-found-flints-lead-pipes/578692/?utm_medium=offsite&utm_source=google&utm_campaign=newsstand-technology
Martello, S., & Toth, P. (1990). Knapsack problems: Algorithms and computer implementations. Wiley. https://doi.org/10.1007/springerreference_5701
Mohanakrishnan, J., Boyle, C., & Poff, J. G. (2019). Detecting and resolving apparent loss with data science. Journal - American Water Works Association, 111(2), 13–17. https://doi.org/10.1002/awwa.1230
Muncie, H. L., Sobal, J., & DeForge, B. (2013). Search methodologies: Introductory tutorials for optimization and decision support techniques. In Journal of Family Practice (Second, Vol. 28, Issue 1). Springer. https://doi.org/10.1515/9780823274161-004
Muñuzuri, J., Ramos, C., Vázquez, A., & Onieva, L. (2020). Use of discrete choice to calibrate a combined distribution and sewer pipe replacement model. Urban Water Journal, 17(2), 100–108. https://doi.org/10.1080/1573062X.2020.1748205
Nafi, A., & Kleiner, Y. (2010). Scheduling renewal of water pipes while considering adjacency of infrastructure works and economies of scale. Journal of Water Resources Planning and Management, 136(5), 519–530. https://doi.org/10.1061/(ASCE)WR.1943-5452.0000062
Naoum-Sawaya, J., Ghaddar, B., Arandia, E., & Eck, B. (2015). Simulation-optimization approaches for water pump scheduling and pipe replacement problems. European Journal of Operational Research, 246, 293–306. https://doi.org/10.1016/j.ejor.2015.04.028
Osman, H., Ammar, M., & El-Said, M. (2017). Optimal scheduling of water network repair crews considering multiple objectives. Journal of Civil Engineering and Management, 23(1), 28–36. https://doi.org/10.3846/13923730.2014.948911
Pecci, F., Abraham, E., & Stoianov, I. (2015). Mathematical programming methods for pressure management in water distribution systems. Procedia Engineering, 119(1), 937–946. https://doi.org/10.1016/j.proeng.2015.08.974
Puleo, V., Fontanazza, C. M., Notaro, V., De Marchis, M., La Loggia, G., & Freni, G. (2014). Definition of water meter substitution plans based on a composite indicator. Procedia Engineering, 70, 1369–1377. https://doi.org/10.1016/j.proeng.2014.02.151
St. Clair, A. M., & Sinha, S. (2012). State-of-the-technology review on water pipe condition, deterioration and failure rate prediction models! Urban Water Journal, 9(2), 85–112. https://doi.org/10.1080/1573062X.2011.644566
The American Water Works Association (AWWA). (2010). J100-10 risk and resilience management of water and wastewater systems. Denver, CO.
Tscheikner-Gratl, F., Caradot, N., Cherqui, F., Leitão, J. P., Ahmadi, M., Langeveld, J. G., Le Gat, Y., Scholten, L., Roghani, B., Rodríguez, J. P., Lepot, M., Stegeman, B., Heinrichsen, A., Kropp, I., Kerres, K., Almeida, M. do C., Bach, P. M., Moy de Vitry, M., Sá Marques, A., … Clemens, F. (2019). Sewer asset management–state of the art and research needs. Urban Water Journal, 16(9), 662–675. https://doi.org/10.1080/1573062X.2020.1713382
Tscheikner-Gratl, F., Egger, P., Rauch, W., & Kleidorfer, M. (2017). Comparison of multi-criteria decision support methods for integrated rehabilitation prioritization. Water, 9(2). https://doi.org/10.3390/w9020068
Tscheikner-Gratl, F., Sitzenfrei, R., Rauch, W., & Kleidorfer, M. (2016). Integrated rehabilitation planning of urban infrastructure systems using a street section priority model. Urban Water Journal, 13(1), 28–40. https://doi.org/10.1080/1573062X.2015.1057174
US Census Bureau. (2019). TIGER/Line with selected demographic and economic data. Washington, DC.
US Environmental Protection Agency (EPA). (2019). Revised lead and copper rule. Washington, DC.
Vladeanu, G. J., & Matthews, J. C. (2019). Consequence-of-failure model for risk-based asset management of wastewater pipes using AHP. Journal of Pipeline Systems Engineering and Practice, 10(2), 1–12. https://doi.org/10.1061/(asce)ps.1949-1204.0000370
Yazdandoost, F., & Izadi, A. (2018). An asset management approach to optimize water meter replacement. In Environmental Modelling and Software (Vol. 104, pp. 270–281). https://doi.org/10.1016/j.envsoft.2018.03.015
Yee, M. D. (1999). Economic analysis for replacing residential meters. Journal - American Water Works Association, 91(7), 72–77. https://doi.org/10.1002/j.1551-8833.1999.tb08666.x
