Zhu 2022

Applied Energy 305 (2022) 117800


Artificial neural network enabled accurate geometrical design and optimisation of thermoelectric generator
Yuxiao Zhu 1, Daniel W. Newbrook 1, Peng Dai, C.H. Kees de Groot, Ruomeng Huang *
School of Electronics and Computer Science, University of Southampton, Southampton, United Kingdom


• Forward modelling of thermoelectric generator via artificial neural network.
• Consideration of geometrical parameters, operating conditions and contact
• Prediction of both power density and efficiency with high accuracy over 98%
• Superior cost-effectiveness in geometrical design and optimisation.


Keywords: Thermoelectric generator; Optimisation; Artificial neural network; Genetic algorithm

The ever-increasing demand for renewable energy and zero carbon dioxide emission has been the driving force for the development of thermoelectric generators with better power generation performance. Alongside the effort to discover thermoelectric materials with higher figure-of-merit, the geometrical and structural optimisation of thermoelectric generators is also essential for maximized power generation and efficiency. This work demonstrates for the first time the application of artificial neural network, a deep learning technique, in forward modelling the maximum power generation and efficiency of a thermoelectric generator and its application in the generator design and optimisation. After training using a dataset containing 5000 3-D finite element method based simulations, the artificial neural networks with 5 layers and 400 neurons per layer demonstrate extremely high prediction accuracy of over 98% and are able to operate under both constant temperature difference and heat flux conditions while taking into account the contact electrical resistance, surface heat transfer and other thermoelectric effects. Coupled with genetic algorithm, the trained artificial neural networks can optimise the leg height, leg width, fill factor and interconnect height of the thermoelectric generator for different operating and contact resistance conditions. With almost identical optimised values obtained, our neural networks can realise geometrical optimisation within 40 s for each operating condition, which is on average over 1,000 times faster than the optimisation performed by finite element method. The up-front computational time for the neural network can be recovered when more than 2 optimisations are needed. The successful application of this data-driven approach in this work clearly represents a new and cost-effective avenue for conducting system level design and optimisation of thermoelectric generators and other energy harvesting technologies.

Nomenclature
TEG      Thermoelectric Generator
ZT       Dimensionless figure of merit
Qin/A    Heat flux density (mW/cm2)
Th       Hot-side temperature (K)
Tc       Cold-side temperature (K)
RC       Electrical contact resistance (Ω)
ρC       Contact resistivity (Ω⋅m2)
FF       Fill factor
σ        Electrical Conductivity (S/m)
S        Seebeck coefficient (µV/K)
GA       Genetic Algorithm
ANN      Artificial Neural Network
HTE      Height of the TEG leg (mm)
HIC      Height of the interconnect (mm)
Wn       Width of the n-type leg (mm)
Wp       Width of the p-type leg (mm)
A        Surface area (cm2)
PDmax    Power density (mW/cm2)
η        Efficiency (%)
k        Thermal Conductivity (W/m⋅K)

1. Introduction

The global energy consumption has doubled over the past three decades with over 80% of the consumed energy supplied by conventional combustion energy (e.g. coal, natural gas, oil) [1,2]. Producing a secure, sustainable and efficient energy supply that meets the demands of an increasing global population and reducing the environmental impact of CO2 emissions are widely acknowledged to be among the most important societal challenges for the present generation [3]. Staying on the path to net-zero emission requires immediate and massive deployment of all clean and efficient energy technologies. As depicted in the net-zero emission by 2050 scenario (NZE), renewable energy conversion technologies play a central part in emission reduction across all sectors and account for 90% of all electricity generation [2]. One significant energy source that can be recycled is the waste heat generated during the conventional fossil-fuel based power generation process. In fact, 60% of the fossil energy is wasted in the form of heat. Recovery of just 1% of the wasted energy would provide over 200 TWh of electricity annually (market value ca. $20 billion), and bring very significant associated benefits via reduction in CO2 emission [4].
Thermoelectric generators (TEGs), which are capable of harvesting waste heat and converting this thermal energy into electricity, have the potential to contribute significantly to the energy supply by reducing the inefficiency of current methods and reducing the dependency on fossil fuels. Based on the Seebeck effect, TEGs are formed by connecting n-type semiconductor materials and p-type semiconductor materials electrically in series and thermally in parallel across a temperature gradient to allow current to flow between the two legs [5,6]. Compared to other energy harvesting technologies, TEGs offer a simple configuration, maintenance-free solid-state operation, and high reliability over lifetimes that often significantly exceed those of the devices they power [7]. Despite its great potential, the relatively low energy conversion efficiency has limited the usage of TEG to applications such as electricity generation in extreme environments, waste heat recovery from automobile and industrial sites, and off-grid power supply [8]. To overcome this limitation, material and geometrical design optimisation have been researched extensively as the two main approaches to improve the TE efficiency. Developing materials with better thermoelectric properties (evaluated by the figure-of-merit, ZT) has been the main driving force in the TE society over the past decade [9,10]. Several material engineering strategies such as carrier concentration optimisation, nanostructuring, and band engineering have been proposed and materialised in significantly improved ZT values [11,12]. Materials including SnSe [13,14], PbTe-SrTe [15], and mosaic crystals [16,17] have all been reported to have ZT larger than 2, showing encouraging prospects for the large scale application of TEGs. With high performing TE materials being developed, the vital task shifts to the adequate translation of such high material properties into the actual performance of the TEG [18]. Despite the exceptionally high efficiency of ca. 12% reported on a bismuth telluride/skutterudite segmented module [19], the research on this aspect is still lagging. It is rather rare for any TEG to demonstrate high efficiency, even when superior TE materials are integrated [12]. The main reason is the fact that output power of a TEG relies not only on the performance of the TE materials, but also critically on the TEG design including its geometrical configuration, contact resistance and its coupling with heat source/sink as well as environmental working conditions, which demands a comprehensive and holistic consideration in TEG design and optimisation [18,20].
Considering such complexity in TEG design, dedicated optimisation methods are preferred over the conventional analytical approach to perform optimisation. A simplified conjugate-gradient method (SCGM) was proposed by Liu et al. to realize the parametric optimisation for both TEG power and efficiency [21]. The widely used Taguchi method was also adopted by Chen et al. in TEG system to find the optimum conditions for maximizing the performance [22]. He et al. introduced a Hill-climbing algorithm to achieve a maximum power output [23]. More recently, genetic algorithm (GA), a subset of evolutionary computation in artificial intelligence (AI), has also been extensively explored for the application in TEG design. GA is a derivative-free optimisation method which is an appealing option for solving optimisation problems. It uses stochastic and direct-search methods to find good approximate solutions to complex problems with little to no prior knowledge of the optimisation problem. Ge et al. employed a non-dominated sorting genetic algorithm (NSGA-II) to identify the best geometric ratio for a segmented TEG [24]. Chen et al. applied the multi-objective genetic algorithm (MOGA) to determine the optimum leg length and area of thermoelectric elements based on a constant volume [25]. A similar algorithm was adopted by the same group to maximise the power of a segmented skutterudite TEG under different temperatures [26].
However, the performance of any of these optimisation methods is critically dependent on the coupled TEG model to accurately and efficiently identify the power output of TEGs. This is particularly challenging considering the non-linear thermoelectric effects and the intricate inter-dependence of each design parameter [27,28]. In general, TEG models can be established through both theoretical and numerical approaches. Table 1 provides a list of reported works based on these two approaches. For example, an early theoretical model proposed by Min et al. [29] investigated the effect of thermoelement length on the module’s coefficient of performance. Gou et al. [30] developed a theoretical system model for a low-temperature waste heat thermoelectric generator setup. Newbrook et al. [31] built a simplified theoretical model for performance optimisation of a thin film based TEG. Although these theoretical models enable quick estimation of the TEG performance, the accuracy is limited by their grossly simplified assumptions
and the difficulty of incorporating related thermoelectric effects (e.g. Thomson effect) [32]. Numerical model based simulations also prevail due to their superiority in solving differential equations and ease of use [23]. Suter et al. [33] implemented a heat transfer model coupling one-dimensional (1-D) conduction through the thermoelement legs to study a thermoelectric stack. A similar 1-D model was also adopted by Shen et al. [32] to analyse the TEG performance with temperature-dependence of TE materials considered. Zhu et al. also used a similar model to investigate and optimise the performance of a segmented TEG [34]. However, most of these self-programmed models are limited to 1-D and certain TEG structures. Three-dimensional (3-D) modelling techniques are available in commercial software (e.g. COMSOL and ANSYS) which enable simultaneous incorporation of all thermoelectric effects and can provide high prediction accuracy for TEG optimisation [28]. For example, a 3-D ANSYS TEG model was coupled with the MOGA in both works reported by Chen et al. [25,26], demonstrating very good agreement with experimental results. Meng et al. [35] built a TEG model in COMSOL as the direct problem solver to facilitate the multi-objective optimisation of a thermoelectric energy conversion-utilization system. Ge et al. applied a 3-D COMSOL model in their evolutionary algorithm based optimisation of a segmented TEG [24]. The simplified conjugate-gradient method proposed by Liu et al. [21] was also coupled with a COMSOL based TEG model. By allowing simultaneous coupling of nearly all related TE effects, these models have superior reliability in calculating TEG power performance. Nevertheless, such high accuracy for 3-D models comes at a cost of high computational demand. For example, tens of thousands of 3-D simulations are normally required in GA to optimise the design of TEG [25]. Moreover, this optimisation is limited to one operating condition (e.g. temperature difference or heat flux). When optimisations under various conditions are required to match different applications, the computational demand can be prohibitive for its wide adoption for TEG optimisation applications. A modelling method that combines both high prediction accuracy and fast speed is therefore key for TEG design and optimisation.

Table 1
Literature review of TEG forward modelling methods.
Forward modelling method          TEG structure        Ref
Theoretical model                 Bulk                 [29]
Theoretical model                 Bulk                 [30]
Theoretical model                 Thin film            [31]
Numerical model (1-D)             Bulk                 [32]
Numerical model (1-D)             Bulk                 [23]
Numerical model (1-D)             Bulk (stack)         [33]
Numerical model (1-D)             Bulk (segmented)     [34]
Numerical model (3-D, ANSYS)      Bulk                 [28]
Numerical model (3-D, ANSYS)      Bulk                 [25]
Numerical model (3-D, ANSYS)      Bulk (segmented)     [26]
Numerical model (3-D, COMSOL)     Bulk (two-stage)     [35]
Numerical model (3-D, COMSOL)     Bulk (segmented)     [24]
Numerical model (3-D, COMSOL)     Bulk (two-stage)     [21]

Deep learning is a subset of machine learning technology with most of its models based on artificial neural networks (ANNs). It has received great attention world-wide for its efficiency in analysing a vast number of datasets and its revolutionary impact on the fields of computer vision [36,37] and speech recognition [38]. Recently, deep learning has been proposed to replace the conventional intuition based design process in nano-photonics [39,40], providing accurate and efficient design of optical storages [41], metasurfaces [42,43], and nanostructured colour filters [44,45]. In the energy sector, deep learning has been extensively used to model the energy consumption to forecast the energy demand [46,47] and electricity consumption [48]. It has also found application in solid-state systems to discover and predict the performance of new materials due to its outstanding capability of finding optimal solutions from enormous data with much lower demand on computational resources [49,50]. Several pioneering works have also been reported on using machine learning to facilitate research on thermoelectric materials [51,52]. The idea of this data-driven approach is to predict the results based on approximation without explicitly solving the problem. This is particularly useful to model systems that involve a large number of parameters with complicated relations where analytical approaches are not readily available. Before the ANNs can perform the intended forward modelling, a training process needs to take place in which a dataset is required. This dataset, which normally involves a large number of input and output relations, can be generated by either numerical simulation or experimental results. However, this is a one-time investment and no significant computation will be needed once the network is properly trained. Despite such advantages, the application of deep learning in the forward modelling of TEG has never been reported.
This work reports the first ever deep learning based forward modelling of TEG using fully-connected ANNs that demonstrates both high accuracy and efficiency. The novel neural network can be used to predict TEG performance under different operating conditions (i.e. constant temperature difference and constant heat flux) without the need for prior knowledge of the thermoelectric device. After training using a dataset generated from a 3-D TEG model based on COMSOL simulations that take into account the non-linear thermoelectric effects, temperature-dependent thermoelectric material properties, electrical contact resistance, and heat transfer with the ambient environment, the ANN is able to learn the complex underlying relations in the dataset and perform predictions in an accurate and fast manner. In addition, the application of such ANN in TEG design optimisation is also presented for the first time by coupling it with genetic algorithm. When multiple optimisations under different operating conditions are required, our ANN-enabled optimisation demonstrates superior cost-effectiveness compared with the conventional 3-D modelling enabled optimisation, suggesting a significant saving of computational time and energy.

2. Method

2.1. Physical model and boundary conditions

Fig. 1a illustrates the TEG model investigated in this work, which contains a pair of n-type and p-type semiconductors. The thermoelectric materials used in this work are Bi2Te2.7Se0.3 for the n-type leg and Bi0.5Sb1.5Te3 for the p-type leg. The interconnect and capping materials are copper and quartz glass, respectively. The detailed thermoelectric properties such as Seebeck coefficient (S), electrical and thermal conductivities (σ and k) and ZT values of both materials are adopted from past studies [53,54] and presented in Fig. 1b – 1e. The correlations of the material properties as a function of temperature are tabulated in Table S1 in the Supplementary Information.
The thermal boundary conditions are set to a constant cold-side temperature (Tc) of 300 K, and a convective heat flux on all open internal surfaces with a heat transfer coefficient of 1 mW/(cm2⋅K) and external temperature of 293.15 K to include surface heat convection to air [55]. For electrical boundary conditions, the TEG model is connected to an external load to form a circuit. The inlet and outlet of the metal substrate serve as a terminal (variable V) and the ground (V = 0 V) for the model.
The operating conditions are of paramount importance for TEG performance. Generally, TEGs can be operated under the condition of either constant temperature difference or constant heat flux. Both conditions are investigated separately in this work by applying a constant hot-side temperature (Th) and constant heat flux density (Qin/A). Electrical contact resistance (RC), a crucial factor for TEG [56], was included in the model by introducing a contact resistivity (ρC) between the thermoelectric material and the interconnect interfaces. The effect of varying geometrical parameters including the filling factor (FF), height of the TEG leg (HTE), height of the interconnect (HIC), and the widths of the n-type and p-type legs (Wn and Wp) on the TEG performance is investigated.
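As a rough illustration of how two of the quantities introduced above relate to the geometry, the following sketch converts a contact resistivity into a contact resistance and a fill factor into a total device area. It rests on assumptions that this section does not state explicitly: the legs are taken to have square cross-sections (Wn × Wn and Wp × Wp), FF is taken as the ratio of the total leg cross-section to the total device area, and RC is taken as ρC divided by the contact area. The function names are hypothetical.

```python
def contact_resistance_ohm(rho_c_ohm_m2: float, contact_area_m2: float) -> float:
    """Contact resistance of one interface, assuming R_C = rho_C / A_contact."""
    return rho_c_ohm_m2 / contact_area_m2


def total_area_mm2(w_n_mm: float, w_p_mm: float, fill_factor: float) -> float:
    """Total device area, assuming square legs and FF = leg area / total area."""
    leg_area_mm2 = w_n_mm ** 2 + w_p_mm ** 2
    return leg_area_mm2 / fill_factor


# Example: a 2.5 mm wide square leg with rho_C = 1e-8 ohm*m^2.
leg_area_m2 = (2.5e-3) ** 2
r_c = contact_resistance_ohm(1e-8, leg_area_m2)   # roughly 1.6 milliohm

# Example: two 2.5 mm legs at FF = 0.5 occupy twice their own cross-section.
area = total_area_mm2(2.5, 2.5, 0.5)
```

Under these assumptions, keeping Wn and Wp fixed while increasing FF shrinks the total area, which is consistent with the way FF is varied in the model.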

Fig. 1. (a) Schematic of the single-pair thermoelectric generator modelled in this study. Temperature dependent (b) electrical conductivity, (c) Seebeck coefficient, (d) thermal conductivity and (e) ZT of the n-type and p-type semiconductors used for the thermoelectric generator. Data generated from [53,54].

2.2. ANN dataset generation

Simulations based on the thermoelectric module in the COMSOL Multiphysics® software were used in this work to generate the datasets for neural network training. This commercial simulation tool was chosen because of its high prediction accuracy and versatility in simulating different physical TEG models (e.g. segmented, asymmetrical). The simulated device is a single thermocouple shown in Fig. 1a. The details of the governing equations for this TEG model are provided in the Supplementary Information. Two datasets concerning the different operating conditions are generated separately. For each dataset, 5,000 random values were generated uniformly across the range for each parameter. The resolution of each parameter value is listed in Table 2. The distribution of each parameter is presented in Fig. S1 and Fig. S2 in the Supplementary Information.
The obtained 5,000 parameter sets were subsequently simulated in COMSOL to obtain TEG power performance. Two performance factors, maximum output power density (PDmax) and efficiency (η), were extracted from the simulation. For each parameter set, the electrical terminal was connected directly to a load resistance, which was swept from 1/100 to 100 times the internal resistance. The maximum output power was then extracted from a parabolic fit of the output power against the output current as shown in Fig. S3. The efficiency was calculated as the percentage ratio between the maximum output power and the input heat.
The impact of mesh sizes on the simulation accuracy was evaluated by simulating the same parameter set with different meshes as shown in Fig. S4. The results showed that the maximum output powers obtained from the “Finer” (6,824 elements) and “Extremely Finer” (60,236 elements) configurations are almost identical, with a 0.09% difference. The Finer mesh configuration was therefore employed to simulate all parameter sets to minimize computational time while maintaining the accuracy. The ANN dataset generation was completed after the simulated TEG power performance values were included. The distributions of the power performance in the two datasets can be found in Fig. S1 and Fig. S2.
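The maximum-power extraction described in this section can be sketched as follows. This is a minimal illustration, not the authors' script: the open-circuit voltage, internal resistance and input heat are hypothetical values, and the TEG is idealised as a voltage source with a fixed internal resistance, whereas in the paper each point of the sweep comes from a COMSOL simulation. The load is swept from 1/100 to 100 times the internal resistance, a parabola is fitted to power against current, and the vertex of the parabola gives the maximum power.

```python
import numpy as np


def max_power_from_sweep(current_a, power_w):
    """Fit P(I) = a*I^2 + b*I + c and return (current at peak, peak power)."""
    a, b, c = np.polyfit(current_a, power_w, 2)
    i_peak = -b / (2.0 * a)
    p_max = c - b ** 2 / (4.0 * a)
    return i_peak, p_max


def efficiency_percent(p_max_w, q_in_w):
    """Efficiency as the percentage ratio of maximum power to input heat."""
    return 100.0 * p_max_w / q_in_w


# Hypothetical device: ideal source with open-circuit voltage and internal resistance.
v_oc = 0.04      # V (assumed)
r_int = 0.01     # ohm (assumed)
q_in = 1.0       # W of input heat (assumed)

# Load swept from 1/100 to 100 times the internal resistance.
r_load = r_int * np.logspace(-2, 2, 41)
current = v_oc / (r_int + r_load)
power = current ** 2 * r_load

i_peak, p_max = max_power_from_sweep(current, power)
eta = efficiency_percent(p_max, q_in)
```

For this idealised source the power is exactly parabolic in the current, so the fit recovers the matched-load result P = Voc²/(4·Rint); with simulated data the fit instead smooths over the discrete sweep points.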

Table 2
Ranges and resolutions of the geometrical parameters and operating conditions used in this work.
Geometrical parameter              Value range             Resolution
Height of the TE leg, HTE          0.5 – 5 mm              0.1 mm
Height of the interconnect, HIC    0.5 – 3 mm              0.1 mm
Filling factor, FF                 0.05 – 0.95             0.01
Width of the n-type TE leg, Wn     0.5 – 5 mm              0.1 mm
Width of the p-type TE leg, Wp     0.5 – 5 mm              0.1 mm
Working condition                  Value range             Resolution
Contact resistivity, ρC            10^-9 – 10^-7 Ω⋅m2      10^-9 Ω⋅m2
Input heat flux density, Qin/A     100 – 500 mW/cm2        1 mW/cm2
Hot-side temperature, Th           300 – 500 K             1 K

Fig. 2. Architecture of the forward modelling neural network for predicting power performance of the TEG model. The input layer contains geometrical parameters (FF, HTE, HIC, Wn, Wp) and operating conditions (ρC, Qin/A or Th). The output layer contains power performance values (PDmax and η).
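The architecture in Fig. 2 can be sketched in PyTorch, the library the paper reports using. The input and output sizes follow the caption (five geometrical parameters plus two operating conditions in, PDmax and η out), and the 5 layers of 400 neurons follow the configuration reported in the abstract, interpreted here as five hidden layers. The ReLU activation, the Adam optimiser, the learning rate and the random placeholder data are assumptions for illustration only; the actual training details are in the Supplementary Information.

```python
import torch
from torch import nn


def build_forward_model(n_inputs=7, n_outputs=2, n_hidden_layers=5, n_neurons=400):
    """Fully connected network: inputs -> hidden layers -> (PDmax, eta)."""
    layers = []
    width_in = n_inputs
    for _ in range(n_hidden_layers):
        layers += [nn.Linear(width_in, n_neurons), nn.ReLU()]  # ReLU is assumed
        width_in = n_neurons
    layers.append(nn.Linear(width_in, n_outputs))
    return nn.Sequential(*layers)


torch.manual_seed(0)
model = build_forward_model()
loss_fn = nn.MSELoss()                                       # MSE loss, as in Section 2.3
optimiser = torch.optim.Adam(model.parameters(), lr=1e-3)    # assumed optimiser

# Placeholder batch standing in for normalised simulation data.
x_batch = torch.rand(64, 7)
y_batch = torch.rand(64, 2)

# One back-propagation step.
model.train()
optimiser.zero_grad()
loss = loss_fn(model(x_batch), y_batch)
loss.backward()
optimiser.step()
```

In practice the inputs span very different scales (for example ρC near 10^-8 and Th near 400), so some form of input normalisation would normally precede training; the paper does not describe this step in the present section.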

2.3. ANN configuration and training

The configuration of the forward modelling network adopted in this work is shown in Fig. 2. The network was constructed by fully connecting the input layer of geometrical parameters (FF, HTE, HIC, Wn, Wp) and operating conditions (ρC, Qin/A or Th) with the output layer of power performance (PDmax and η) through several hidden layers. Prior to training, a loss function must be established to allow back propagation. The loss function was defined as the mean squared error (MSE) between the predicted power performance and the true power performance (i.e. results from COMSOL simulation). In the training process, the datasets were divided into three sub-datasets for training (4,000), validation (500) and testing (500) purposes. The training data were fed to the ANN to optimise the network by updating the weights and bias of each neuron through back propagation; validation data were used to examine the network, serving as a check of the training and an indicator for any overfitting or under-fitting behavior during the training process; test data were completely new data to the network and were used to test the prediction accuracy of the network after training. All neural network algorithms were developed via the Python platform using the PyTorch module. Detailed information for the training process can be found in the Supplementary Information.

2.4. Genetic algorithm

Genetic algorithm (GA) was adopted in this work for geometrical parameter optimisation. Prior to the optimisation, the operating conditions (ρC, Qin/A or Th) can be freely selected to best match practical scenarios. The optimisation flow chart is illustrated in Fig. 3. The optimisation is a refined iterative process in which an elite percentage of the geometrical parameter sets (FF, HTE, HIC, Wn, Wp) is retained through each iteration, allowing the samples to genetically evolve until the best option has been identified. A population size (i.e. number of candidate designs) of 100 was defined. Within each generation, 100 designs were first evaluated by the ANN or COMSOL to obtain 100 power performance values. These values were compared with each other and the designs with the highest power performance were selected into the next generation. Another 100 candidate designs were subsequently generated based on the best solution obtained in the previous generation with certain mutations and crossovers. In this way, the process evolves gradually toward better solutions. Details about GA in this work can be found in the Supplementary Information.

3. Results and discussion

3.1. ANN performance under constant temperature difference (Th)

We will first evaluate the prediction performance of the ANN for a TEG operating under constant temperature difference. The selection of the hyperparameters (i.e. number of hidden layers and neurons per layer) is crucial to the performance of the network. A systematic study was therefore conducted to investigate the impact of the hyperparameters for this ANN. Fig. 4a presents the validation loss curves over epochs for neural networks with the number of neurons per layer ranging from 20 to 400 and a fixed layer number of 5 (a detailed investigation of layer numbers can be found in the Supplementary Information). The decrease of the loss curves against epochs indicates the reduction of MSEs through back propagation. No overfitting was observed in any of the training processes. After 2000 epochs, all MSEs are stabilized, signaling the completion of the training process. It can be observed that an increase of the neuron numbers from 20 to 400 leads to a lower MSE, suggesting the increasing complexity could be beneficial to the ANN performance. This needs to be confirmed by predicting the parameter sets in the test dataset which the networks have never seen before. Here we define the relative error between the ANN predicted and true power performance (i.e. performance obtained from COMSOL simulation) to compare the performance of the ANN. The relative error can be calculated as:
\[ \text{Relative error} = \frac{\left| P_{\mathrm{true}} - P_{\mathrm{predicted}} \right|}{P_{\mathrm{true}}} \tag{5} \]

where Ptrue is the power performance obtained from COMSOL simulation, which includes both PDmax and η, and Ppredicted is the power performance obtained from the ANN. The distribution and average of the relative error as a function of the number of neurons per layer are plotted in Fig. 4b. It is evident that the average relative error in the test dataset decreases significantly from 0.044 to 0.019 as the number of neurons per layer increases from 20 to 400. The prediction accuracy of the ANNs can then be calculated by
\[ \text{Prediction accuracy} = (1 - \text{relative error}) \times 100\% \tag{6} \]
A relative error of 0.019 therefore indicates a very high prediction accuracy of over 98%. The distribution of the relative error also suggests that a majority of the errors are within 3%, indicating an extremely high prediction accuracy of the network. Therefore, the network of 5 layers and 400 neurons per layer was adopted for this operating condition.
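Eqs. (5) and (6) translate directly into a few lines of Python. The sketch below is illustrative only; the function names and the sample values are hypothetical, and the true values are assumed to be positive, as is the case for power density and efficiency.

```python
def relative_error(p_true, p_predicted):
    """Eq. (5): |P_true - P_predicted| / P_true."""
    return abs(p_true - p_predicted) / p_true


def mean_relative_error(true_values, predicted_values):
    """Average of Eq. (5) over a test set."""
    errors = [relative_error(t, p) for t, p in zip(true_values, predicted_values)]
    return sum(errors) / len(errors)


def prediction_accuracy(rel_error):
    """Eq. (6): (1 - relative error) x 100%."""
    return (1.0 - rel_error) * 100.0


# The average relative error of 0.019 reported for the 400-neuron network.
accuracy = prediction_accuracy(0.019)   # about 98.1 (%)

# Hypothetical test points (true vs. predicted power density, mW/cm2).
mre = mean_relative_error([100.0, 50.0], [98.0, 51.0])
```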
Fig. 5 plots the comparison between the true (simulated) PDmax and efficiency values in the test dataset and the ones predicted by the ANN. It can be clearly observed that the high prediction accuracy of our ANN prevails over three orders of magnitude, producing a high coefficient of determination (R^2) of over 0.999 for both PDmax and efficiency. This outstanding prediction accuracy over a large range is particularly useful for its application in TEG optimisation.
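For reference, the coefficient of determination quoted above can be computed as in the sketch below. The paper does not spell out its formula, so the standard definition R^2 = 1 − SS_res/SS_tot is assumed here, and the sample values are hypothetical.

```python
def r_squared(true_values, predicted_values):
    """Coefficient of determination, assuming R^2 = 1 - SS_res / SS_tot."""
    mean_true = sum(true_values) / len(true_values)
    ss_res = sum((t - p) ** 2 for t, p in zip(true_values, predicted_values))
    ss_tot = sum((t - mean_true) ** 2 for t in true_values)
    return 1.0 - ss_res / ss_tot


# Hypothetical values: a perfect prediction and one with a single miss.
r2_perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
r2_one_miss = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
```

Because the test data span about three orders of magnitude, R^2 is dominated by the largest values; the relative error of Eq. (5) is the complementary check for the small ones.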
Once the forward TEG model is established, it can be used to
Fig. 3. Optimisation flow chart of the genetic algorithm process. Neural net­
investigate the impact of different parameters on the performance of
works and COMSOL simulation have been used seperately for fitness function
TEG. As an example, the dashed lines in Fig. 6 presents the ANN

Fig. 4. The neural network training for forward modelling TEG power performance under the operation condition of constant temperature difference. The (a)
validation loss curves and (b) the histogram of the probability and average relative errors of the ANNs with different neurons per layer.

Fig. 5. Scatter plot of the ANN predicted and the true (simulated) (a) PDmax and (b) efficiency η under the operating condition of constant temperature difference.

Fig. 6. (a) PDmax and (b) efficiency η obtained from ANN (dashed lines) and COMSOL simulation (dots) as a function of HTE and FF. The operating condition chosen is
Th of 400 K and ρC of 10-8 Ω⋅m2. The HIC , Wn and Wp values were fixed at 1.5 mm, 2.5 mm and 2.5 mm, respectively.

predicted PDmax and η values as a function of HTE and FF while the HIC , optimised HTE for each FF. On the other hand, the efficiency η undergoes
Wn and Wp values were fixed at 1.5 mm, 2.5 mm and 2.5 mm, respec­ different trends with varying HTE and FF as shown in Fig. 6b. Although a
tively. Under the operating condition of Th = 400K and ρC = large FF is still advantageous, its benefit reduces at larger FF values due
10− 8 Ω∙m2 , larger FF is favourable for achieving larger PDmax as shown to the concurrently increased Qin . Unlike PDmax , higher η can be achieved
in Fig. 6a. In our model, changing of FF is achieved by varying the total with larger HTE values when FF is larger than 0.1. COMSOL simulations
area of the model to ensure the Wn amd Wp remain unchanged. A large were also conducted for the same parameter sets (dots in Fig. 6) and
FF implies a small TEG area which leads to reduced interconnect elec­ achieved high consistency with the results generated by ANN as shown
trical resistance and increased PDmax . The dependence of HTE is more in Fig. 6. This suggests that our ANN can be used to investigate TEG
complicated. Small HTE limits the power performance with small tem­ power dependence on different parameters with high accuracy.
perature gradient over the TE legs, while large HTE deteriorates the Once the network is trained and verified, it can be used for design
power by increasing the electrical resistance. This results in an optimisation. Two separate optimisations have been conducted to

Fig. 7. (a) Convergence curve of genetic algorithm for (a) PDmax and (b) η under an operating condition of Th is 400 K and ρC is 10-8 Ω⋅m2. Fig. 12. Comparison of
PDmax and efficiency η obtained from ANN (black line) and COMSOL simulation (blue dots) as a function of (c, d) HTE and (e, f) FF. The optimised values are listed in
the inset table and labelled by the red dots. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

maximize power and efficiency respectively. In both cases, the operating mm and evaluating the performance of the TEG by ANN and simulation
condition of Th = 400K and ρC = 10− 8 Ω∙m2 , were chosen as an as shown in Fig. 7c. On the other hand, maximum efficiency η requires
example. Fig. 7a and Fig. 7b plot the GA convergence curves for power the optimised HTE to reach the upper limit (5 mm) of the pre-set range
and efficiency optimisations, respectively. Both processes converge well (Fig. 7d). The discrepancy of the optimised HTE values can be explained
after ca. 100 generations. A maximum power density of 70 mW/cm2 was by the reducing Qin as HTE increases, leading to higher η but smaller
identified (Fig. 7a) while the maximum efficiency was found to be 3.2% PDmax . In both cases, results obtained from ANN (black lines) are highly
(Fig. 7b). It is clear the designs (shown in the inset) reaching those two consistent with that from simulation (blue dots). The optimisation of FF
optimised values are significantly different. was also investigated. ANN coupled GA has found the largest FF (0.95) in
To verify the effectiveness of our GA optimisation process, parameter the pre-set range for best PDmax . Increasing FF results in larger PDmax
sweeps of both HTE and FF were conducted as shown in Fig. 7c to 7f. (shown in Fig. 7e), which is largely due to the reduction of the TEG
Under a constant Th of 400 K, the optimised HTE is 1.3 mm to achieve the electrical resistance. Similarly, a large FF is also required for high effi­
largest PDmax . This is confirmed by sweeping its value from 0.5 mm to 5 ciency η (shown in Fig. 7f) as the PDmax increment from larger FF

Fig. 8. Optimisation of (a) PDmax and (b) efficiency η by GA coupled with ANN (blue dots) and COMSOL simulation (red dots) as a function of Th; (c) required
optimisation time of both methods for different Th conditions; optimisation of (d) PDmax and (e) efficiency η by GA coupled with ANN and COMSOL as a function of
ρC; (f) required optimisation time of both methods for different ρC conditions. (For interpretation of the references to colour in this figure legend, the reader is
referred to the web version of this article.)

outweighs the increment of Qin. Again, all results generated by ANN agree well with
those from the simulation, further confirming the accuracy of our ANN.
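The optimise-then-sweep check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `surrogate` is a hypothetical closed-form stand-in for the trained ANN (chosen so that its optimum sits at HTE = 1.3 mm and the largest FF, mimicking the behaviour reported for PDmax), and the GA operators, population size and lower FF bound are assumptions, since this section does not specify them.

```python
import random

# Search ranges: HTE 0.5-5 mm and FF up to 0.95 follow the text;
# the lower FF bound of 0.05 is an assumption for illustration.
BOUNDS = {"h_te": (0.5, 5.0), "ff": (0.05, 0.95)}


def surrogate(h_te, ff):
    """Hypothetical stand-in for the trained ANN (not the paper's model).

    Peaks at h_te = 1.3 and increases monotonically with ff.
    """
    return ff * h_te / (h_te ** 2 + 1.3 ** 2)


def run_ga(fitness, bounds, pop_size=40, generations=100, seed=0):
    """Minimal GA: elitist selection, blend crossover, Gaussian mutation."""
    rng = random.Random(seed)
    keys = list(bounds)
    pop = [[rng.uniform(*bounds[k]) for k in keys] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=lambda ind: fitness(*ind), reverse=True)
        elite = pop[: pop_size // 4]  # best quarter survives unchanged
        children = []
        while len(elite) + len(children) < pop_size:
            a, b = rng.sample(elite, 2)
            child = []
            for k, x, y in zip(keys, a, b):
                lo, hi = bounds[k]
                w = rng.random()
                gene = w * x + (1 - w) * y + rng.gauss(0.0, 0.02 * (hi - lo))
                child.append(max(lo, min(hi, gene)))  # clip to the pre-set range
            children.append(child)
        pop = elite + children
    best = max(pop, key=lambda ind: fitness(*ind))
    return dict(zip(keys, best)), fitness(*best)


best, best_fit = run_ga(surrogate, BOUNDS)

# Verification by parameter sweep: vary HTE from 0.5 to 5.0 mm at the optimised FF.
sweep = [0.5 + 0.1 * i for i in range(46)]
sweep_best_h = max(sweep, key=lambda h: surrogate(h, best["ff"]))
print(best, best_fit, sweep_best_h)
```

If the GA has converged, the swept optimum `sweep_best_h` should lie close to `best["h_te"]`, which is the same consistency check as the one made with Fig. 7c to 7f.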
Fig. 10. Scatter plot of the ANN predicted and the true (simulated) maximum
power density under the operating condition of constant heat flux.

The key advantage of the deep learning aided approach is its extremely high design
efficiency. Here we provide a direct comparison between the developed ANN and the
COMSOL simulation by coupling both approaches with GA to execute the same optimisation
tasks. Fig. 8a and 8b present the optimised PDmax and efficiency η for different Th
conditions. Both optimised values increase with larger Th. It is evident that all
optimised values from the two approaches are almost identical with similar geometrical
parameters obtained (listed in Table S3). However, the average time for COMSOL
simulation coupled optimisation was 57,600 s (ca. 16 hrs) while it only took an average
of 40 s for ANN to complete one optimisation. Although ANN requires a one-time
investment for dataset generation (125,106 s, ca. 35 hrs) and network training (248 s),
it is a much more cost-effective way if optimisations under multiple Th conditions are
required. Fig. 8c plots the time required for both methods to perform multiple
optimisations. It is clear that the amount of time saved by using ANN easily recovers
the up-front computational time for the network when more than 2 optimisations are
needed. Similarly, in Fig. 8d and 8e, optimisation against different ρC results in good
agreement between the ANN and COMSOL simulation coupled optimisations. However, the
former approach only requires an average of 35 s while the latter demands 40,000 s
(ca. 11 hrs) for each optimisation. Significant time saving can be achieved if more
than 3 optimisations are required as shown in Fig. 8f. In both cases, an improvement in
computational efficiency of over 1,000 times was obtained. This superior design
efficiency offered by ANN represents not only a significant saving of computational
time but also of computational energy.
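The break-even points quoted above follow directly from the reported timings, as the short calculation below illustrates. It assumes that the same one-time cost (dataset generation plus training) applies to both the Th and the ρC comparisons, which the text implies but does not state explicitly; the function name is illustrative.

```python
import math


def break_even_runs(upfront_s, ann_run_s, sim_run_s):
    """Smallest number of optimisations n for which
    upfront_s + n * ann_run_s < n * sim_run_s."""
    saving_per_run = sim_run_s - ann_run_s
    if saving_per_run <= 0:
        raise ValueError("surrogate must be faster per run than the simulation")
    return math.floor(upfront_s / saving_per_run) + 1


# One-time cost reported in the text: dataset generation + network training.
UPFRONT_S = 125_106 + 248

th_case = break_even_runs(UPFRONT_S, ann_run_s=40, sim_run_s=57_600)
rho_case = break_even_runs(UPFRONT_S, ann_run_s=35, sim_run_s=40_000)

# Per-optimisation speed-up, excluding the one-time cost.
th_speedup = 57_600 / 40
rho_speedup = 40_000 / 35

print(th_case, rho_case)        # 3 4
print(th_speedup, rho_speedup)  # 1440.0 and roughly 1142.9
```

The results of 3 and 4 runs match the statements that the up-front cost is recovered when more than 2 (Th) and more than 3 (ρC) optimisations are needed, and both per-run ratios exceed 1,000.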

3.2. ANN performance under constant heat flux

We now turn our focus to the operating condition of constant heat flux. As the
efficiency can be directly converted from power output, only PDmax will be presented
and discussed in this section. The hyperparameter optimisation was also conducted by
varying the number of neurons per layer. As shown in Fig. 9a, the stabilized validation
loss is smallest for the network with the most neurons per layer (400). Similar to the
previous condition, the relative error was found to decrease from 0.0424 to 0.0177 with
neurons per layer increasing from 20 to 400 as shown in Fig. 9b. This 5-layer and 400
neurons per layer network with a prediction accuracy over 98% was adopted for this
condition.
Fig. 10 compares the maximum power output PDmax from the ANN with the true (COMSOL
simulated) values in the test dataset. Again, high consistency can be observed between
the true values and the ANN predicted values with a high coefficient of determination
value (R²) of 0.99943, showing great prediction accuracy over the entire power range.

Fig. 11. PDmax obtained from ANN (dashed lines) and COMSOL simulation
(dots) as a function of HTE and FF. The operating condition chosen is Qin/A of
300 mW/cm² and ρC of 10⁻⁸ Ω⋅m². The HIC, Wn and Wp values were fixed at 1.5
mm, 2.5 mm and 2.5 mm, respectively.

The analytical study under the constant heat flux condition was also conducted using
this network to investigate the impact of HTE and FF. As an example, the operating
condition was chosen to be Qin/A = 300 mW/cm² and ρC = 10⁻⁸ Ω∙m² while the HIC, Wn and
Wp values were fixed at 1.5 mm, 2.5 mm and 2.5 mm, respectively. Fig. 11 presents the
PDmax as a function of HTE and FF. It is clear that PDmax increases with increasing leg
length due to the larger temperature gradient created, implying an increase in η as
well. However, the rate of increment decreases at higher

Fig. 9. The neural network training for forward modelling TEG power performance. The (a) validation loss curves and (b) the histogram of the probability and
average relative errors of the ANNs with different neurons per layer.

HTE values due to the adverse impact of larger electrical resistance. On the other
hand, smaller FF is preferred to achieve high power performance. A smaller FF implies a
larger TEG area which leads to a larger temperature difference and PDmax. In all cases,
the simulation results (dots in Fig. 11) show high consistency with the results
generated by ANN (dashed lines in Fig. 11).
After establishing the prediction accuracy of our network, we will now evaluate the
application of the ANN in TEG optimisation by coupling it with GA. A similar operating
condition of Qin/A = 300 mW/cm² and ρC = 10⁻⁸ Ω∙m² was chosen as an example for
optimisation. Fig. 12a shows the convergence curve of GA for PDmax which converges
after 50 generations. The optimised geometrical parameters are listed in the inset
table. Sweeps of HTE and FF were subsequently performed to verify the optimised values.
Fig. 12b displays the sweep of HTE. A shorter leg could lead to a beneficially smaller
electrical resistance but also an adversely decreased temperature difference under this
operating condition. It can be observed that the optimised value of 4.81 mm has been
correctly identified by our GA. The sweep of FF is plotted in Fig. 12c. A large FF
implies a smaller TEG area which leads to a smaller temperature difference and maximum
power output. On the other hand, a very small FF could induce large interconnect
resistance that also deteriorates the power. An optimised FF of 0.11 was identified and
was also verified by sweeping using both ANN and simulation. In addition, COMSOL
simulations were also conducted using the same parameter sets. The simulated results
(dots) match the predicted results from ANN (line) closely, further confirming the high
accuracy of our network.
Efficiency comparison between the developed ANN and the conventional simulation was
also conducted under the constant heat flux condition. Fig. 13a and 13b present the
optimised PDmax under different Qin/A and ρC values. As expected, larger Qin/A and
smaller ρC can produce larger optimised PDmax. Highly consistent optimised values were
obtained from the ANN and COMSOL simulation approaches (detailed list of optimised
parameters can be found in Supplementary Information). The average optimisation time
for ANN coupled GA is 40 s while it is 60,000 s (ca. 16 hrs) for COMSOL coupled GA,
representing a reduction in computational time and resources of over 1,000 times. This
indicates significant time-savings when more than 2 optimisations are required (shown
in Fig. 13c).
The successful implementation of ANN for TEG power performance prediction in this work
suggests that ANN is a powerful tool to assist future investigation and design of
thermoelectric related devices. One main advantage of this technology is that no prior
knowledge of the device is needed, as the ANN will “learn” from the dataset. The
quality of the dataset is therefore key to the ANN performance. In this work, 3-D
COMSOL simulation was used to generate the dataset as it takes into account most of the
non-linear thermoelectric effects which are normally ignored in other modelling methods
due to complexity. This provided simulation results that were close to real devices.
Once trained using a dataset with a limited number of parameter-performance relations,
this ANN can “learn” the “knowledge” and generate unlimited performance predictions
with high accuracy. This is beneficial for analysing the relations between each
parameter and the TEG performance. Although this work used a conventional TEG model as
a demonstration, the ANN can be applied to investigate more complicated TEG structures
(e.g. segmented, asymmetrical and multi-stage) as well as hybrid devices such as
solar-TEG where parameter-performance relations are not available. It is also worth
pointing out the limitation of our network. Even though our ANN has proved to be
cost-effective in design when multiple optimisations are required, the up-front
investment in computational resource is still high (i.e. ca. 35 hrs in this work). This
is mainly due to the time needed to generate the dataset using COMSOL. It can be
observed from Fig. S1h-i and Fig. S2h-i that the outputs of the dataset (i.e. TEG power
performance) are not uniformly distributed like the inputs. In particular, the number
of outputs at the high and low ends is much smaller than in the middle due to the
non-linear relation between the inputs and outputs. A relatively large dataset (i.e.
5,000 in this work) is necessary to ensure high prediction accuracy over the entire TEG
performance range (shown in Fig. 5 and Fig. 10). Further improvements in both the
network design and training process are required to reduce the need for such a large
dataset.

4. Conclusions

The application of the artificial neural network, a deep learning technique, in forward
modelling of the power performance of a thermoelectric generator has been demonstrated
for the first time. After training using a dataset from 3-D COMSOL simulations, the
neural networks demonstrate extremely high prediction accuracy over 98% and are able to
operate under both constant temperature difference and heat flux conditions while
taking into account the electrical contact resistance, surface heat transfer and other
thermoelectric effects. They can be used to replace the conventional theoretical and
numerical modelling methods to predict and analyse the thermoelectric generator
performance without the need for prior knowledge. Analytical studies using the
developed networks have been successfully conducted to investigate the impact of
different parameters on the power performance, and the results have shown high
consistency with those generated from COMSOL simulation. This method is also shown to
be extremely efficient and cost-effective in TEG design optimisation when coupled with
a genetic algorithm. With almost identical optimised values obtained, our neural
networks demonstrate superior optimisation efficiencies that are on average over 1,000
times better than the COMSOL simulation coupled optimisation. The successful
application of artificial neural networks reported here clearly points towards the
capability of the deep learning approach to be applied in modelling and optimisation of
thermoelectric generators with different structures as well as energy harvesting
technologies for other renewable energy sources such as solar and wind.

Fig. 12. (a) Convergence curve of genetic algorithm for maximum power output under an operating condition of Qin/A = 300 mW/cm² and ρC = 10⁻⁸ Ω⋅m².
Comparison of PDmax obtained from ANN (black line) and COMSOL simulation (blue dots) as a function of (b) HTE and (c) FF. The optimised values are listed in the
inset table and labelled by the red dots. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

Fig. 13. PDmax optimised by GA coupled with ANN (blue dots) and COMSOL simulation (red dots) as a function of (a) Qin/A and (b) ρC; (c) required optimisation time
of both methods for different number of Qin/A conditions. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version
of this article.)

