Co-optimized machine-learned manifold models for large eddy simulation of turbulent combustion

Many modeling approaches in large eddy simulation (LES) of turbulent combustion employ a projection of the thermochemical state onto a low-dimensional manifold within state space to reduce the number of transported variables and hence computational cost. Flamelet-generated manifolds (FGM) is an example of a well-established, physics-based approach, but increasingly, principal component analysis (PCA) is being used as a data-driven method for generating manifold models. For both approaches, the nonlinear relationship between the location on the predefined manifold and the outputs of interest, such as reaction rates, can be tabulated or encoded in a neural network. This work proposes a new approach for manifold modeling that extends these existing approaches. A modified neural network structure simultaneously encodes the definition of the manifold variables, the nonlinear mapping, and the subfilter closure for LES. This allows all three of these aspects of the model to be co-optimized, generating a model from any source of combustion thermochemical state data. The manifold-parameterizing variables are constrained to be linear combinations of species, as in FGM and PCA-based models, to aid in interpretability and implementation. For LES, subfilter variances of the manifold variables are also included as inputs. Two types of a priori analysis are performed to evaluate the new approach. In the first, the model is trained on data from one-dimensional premixed flames. In this case, the approach recovers the behavior of flamelet-based manifold approaches, and in fact slightly improves performance by identifying an optimized progress variable. The approach is also applied to data from direct numerical simulations of spherical ignition kernels in isotropic turbulence. For any specified manifold dimensionality, the new approach provides substantially lower prediction errors than a PCA-based model developed from the same data set.
Additionally, the LES formulation of the new approach can provide accurate predictions for filtered reaction rates across a variety of filter widths. © 2022 The Authors. Published by Elsevier Inc. on behalf of The Combustion Institute. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Simulation of turbulent combustion systems is challenging and computationally demanding because of the complex chemical mechanisms that describe the conversion of fuel and oxidizer to products, as well as the wide range of length scales of the important chemical and turbulent flow processes. One of the major objectives of turbulent combustion research is the development of simulation approaches that overcome these two major challenges to provide computationally affordable but still sufficiently accurate predictions in comparison to direct numerical simulations (DNSs), which resolve all scales and solve for all species in the chemical mechanism. For liquid fuels, the number of species, N_S, in the mechanism may be O(100) or more, even after reduction. Though temperature, pressure, and N_S − 1 species mass fractions are required to fully specify the local thermochemical state in a combustion system, the realized states in most systems are confined to a low-dimensional manifold within this state space [1]. A common strategy to reduce the cost of turbulent combustion simulations is to solve transport equations for variables that parameterize the manifold, rather than solve for all the individual chemical species. These manifold-based combustion models are often coupled with large eddy simulation (LES), in which spatially filtered equations are solved that resolve large, energy-containing turbulent structures but model the smaller scales [2]. There are three key aspects of manifold-based modeling:

1. Defining the manifold, specifically the variables, ξ_i, that parameterize it. These are often linear combinations of species mass fractions [3]. Additionally, variables that can be computed using algebraic equations or other quantities with known transport equations may also be used to parameterize the model.

2. Determining the mapping, F, between the manifold and the relevant quantities of interest. The quantities of interest, φ_n = F(ξ_i), may include the full set of thermochemical state variables and nonlinear functions of the thermochemical state, such as transport properties and reaction rates, that may be required in a simulation.

3. Providing closure for spatially filtered output quantities in LES. Filtered quantities are denoted using a tilde (~). Because the function F is nonlinear, the filter of the mapping is not the mapping of the filtered variables: φ̃_n ≠ F(ξ̃_i). Therefore, a model is required to account for the effect of the unresolved variation of ξ_i on φ̃_n, sometimes referred to as a "turbulence-chemistry interaction" model. It should be noted that turbulence does not necessarily impact the functional form that encodes the thermochemistry (F), but does lead to unresolved variation in ξ_i and therefore in the filtered outputs. The dimension reduction of ξ_i can facilitate the development of these closure models, an additional benefit of manifold-based modeling.
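The non-commutation of filtering with a nonlinear mapping, which motivates the third aspect, can be seen with a small numerical sketch. The profile and the mapping below are purely illustrative toys (a tanh ramp and an Arrhenius-like bump), not any model from this work:

```python
import numpy as np

x = np.linspace(0.0, 1.0, 1000)
xi = 0.5 * (1.0 + np.tanh((x - 0.5) / 0.02))   # sharp profile of a manifold variable

def F(xi):
    # Toy nonlinear mapping with Arrhenius-like exponential growth; illustrative only
    return xi * (1.0 - xi) * np.exp(4.0 * xi)

def box_filter(f, width):
    kernel = np.ones(width) / width
    return np.convolve(f, kernel, mode="same")

width = 200                                     # filter much wider than the profile
filtered_of_F = box_filter(F(xi), width)        # filter of the mapping
F_of_filtered = F(box_filter(xi, width))        # mapping of the filtered variable

err = np.max(np.abs(filtered_of_F - F_of_filtered))
print(err)
```

Wherever the subfilter variation of ξ is large, the two quantities differ substantially, which is exactly the gap a turbulence-chemistry interaction model must close.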
Traditionally, physical analysis has been used to inform simplifying assumptions for particular combustion systems that accomplish these three steps. As an example, the broad class of flamelet-based models can be derived based on the core assumption that transport in flames is primarily aligned in the direction of the gradient of a single state-space variable. This variable may be the mixture fraction, Z, for nonpremixed combustion, or the reaction progress variable, C, for premixed combustion. Both of these variables can be defined as linear combinations of species mass fractions, but with different weights that lead to their respective physical interpretations. The flamelet assumption implies that thermochemical states in the turbulent flame can be determined from solutions of transformed one-dimensional equations in terms of the relevant variable (as in the steady flamelet [4] and flamelet/progress variable (FPV) [5] models) or solutions of one-dimensional component problems in physical space (as in flamelet-generated manifolds (FGM) [6]). In either case, the result is a model based on a low-dimensional manifold parameterized by the relevant physical variable(s), with the nonlinear mapping to the outputs of interest coming from tabulated solutions to the one-dimensional equations. Finally, known statistical distributions can be used to model subfilter variation of the relevant variable for closure through the presumed probability density function (PDF) approach [7], or closure can be provided using filtered one-dimensional solutions [8,9]. Other methods, such as analysis of timescales in thermochemical state space, can also be used to develop manifold-based modeling approaches [10,11]. All physically derived manifold-based models are based on calculations of simplified component problems and are thus constrained to systems for which the physical assumptions that led to these calculations apply. They may be extended to relax some of these assumptions and account for additional effects, such as multi-stream mixing [12,13], multi-modal combustion [14,15], and heat losses [16], but each additional effect results in an increase in the manifold dimensionality.
In complex multi-physics systems, it becomes difficult to accomplish all three aspects outlined above based on physical analysis alone. Therefore, the increasing availability of high-quality experimental speciation measurements and DNS data sets from realistic configurations has motivated research into the use of data-driven techniques to generate models that augment or replace physically derived models. Artificial neural networks (ANNs) have been of particular interest, because sufficiently complex networks can approximate any function [17]. Thus far, these efforts have largely focused on tackling the aspects of manifold-based modeling in isolation, rather than holistically, but they have shown significant promise.
Two data-driven approaches that have been applied with success to help with the first aspect are principal component analysis (PCA) and optimal estimator analysis. In PCA, a linear analysis is used to identify a new set of basis vectors, called principal components (PCs), for the state space, corresponding to directions in the state space along which a given set of data has the greatest variance. When the set of PCs is truncated, the truncated set provides the optimal linear reconstruction of the full state. This method has been applied to thermochemical state data from both DNS [3] and experiments [18], demonstrating how a manifold model based on PCs and the linear reconstruction can be implemented. Challenging modeling issues like accounting for differential diffusion [19] or unmeasured quantities in experimental data [20] have been addressed. Further progress has mostly focused on applying nonlinear mapping techniques with PCA, discussed more below. Some fully nonlinear alternatives to PCA have also been proposed [21,22], but the nonlinear manifold variable definitions make model implementation more challenging. In contrast to PCA, the objective of optimal estimator analysis is not to generate new manifold variables, but rather to assess the suitability of a proposed set of manifold variables for modeling a given data set. The optimal estimator is the conditional mean of the data set on the proposed variables, which are deemed suitable when the variance about this conditional mean is minimized. It has been shown that ANNs can be used as a surrogate for the optimal estimator because of their universal function approximation property [23]. Falkenstein et al. successfully used this approach to test several possible combinations of manifold variables for modeling forced ignition in a turbulent environment [24].
One of the earliest uses of machine learning in turbulent combustion modeling was the application of neural networks to provide the nonlinear mapping in flamelet-based models [25]. Rather than tabulating the results of component problems, these results are used to train a neural network that approximates the function mapping from the manifold to the outputs of interest. The primary advantage of this approach is that the amount of memory required to store the neural network is greatly reduced relative to the amount required to store the tables in traditional methods; it is nearly constant with manifold dimensionality as opposed to scaling exponentially [26]. With the trend toward higher manifold dimensionality in flamelet-based models to account for more complex physics, as well as the trend toward heterogeneous architectures in scientific computing, the past few years have seen an explosion of interest in using ANNs to replace flamelet tabulation [27-32]. A similar approach has been applied to replace the linear reconstruction process for PCA, and has become the preferred approach for PCA-based models [33-35]: first, a truncated set of PCs is generated using PCA on a set of thermochemical state data, and subsequently a nonlinear regression (e.g., a neural network) is fit to predict the desired outputs based on these components. A variety of nonlinear regression techniques have been employed for this purpose, including Gaussian process regression (GPR) [36,37] and multivariate adaptive regression splines (MARS) [38-40], but ANNs [20,41-44] are increasingly popular for similar reasons as in flamelet modeling.
Finally, several approaches have been proposed that apply machine learning to aid in subfilter closure for manifold models. These include the application of neural networks to replace analytical models for the parameters involved in closure for premixed flamelet-based models, such as the progress variable dissipation rate [45] and the flame surface density [46]. Alternatively, for presumed PDF-based closure, neural networks have been shown to provide better approximations to the true PDF than commonly used models in certain configurations [47]. The work of Yellapantula et al. takes this one step further, using a neural network to directly predict filtered output variables based on the filtered mixture fraction, progress variable, and their subfilter variances and dissipation rates [48].
The present work proposes a novel approach to integrating machine learning holistically across all aspects of manifold-based modeling, termed co-optimized machine-learned manifolds (CMLM). This method builds on past approaches that couple flamelet- or PCA-based models to neural networks, as well as the subfilter closure from Yellapantula et al. [48]. By using a novel, autoencoder-based neural network structure specifically designed for manifold-based combustion modeling, all three aspects are integrated into a single network. This allows the optimization algorithm used to train this network to simultaneously optimize all three, leading to better overall performance than when the aspects are considered separately. For example, although PCs are the optimal basis for linear reconstruction of the full state, this is not necessarily true when coupled with nonlinear regression; components with an arbitrarily small contribution to the overall variance may be essential for regression [49]. The CMLM network is designed to yield manifold variables that are sparse linear combinations of species, allowing for interpretability and relatively simple transport equations that are analogous to those used for flamelet- and PCA-based models. It will be shown that, depending on the training data used, the CMLM approach can be interpreted either as a method for optimizing the definition of the progress variable for flamelet models that use neural networks, or as an alternative to PCA that accounts for nonlinearity and comes with built-in subfilter closure. It can also be interpreted as an extension of optimal estimator analysis that automatically selects inputs from a set of candidate variables.
The full details of the CMLM approach and the a priori tests that demonstrate its potential advantages are presented as follows. In Section 2, additional details are provided on flamelet- and PCA-based models; then, building off these models, the new neural network structure and training procedures are introduced. Two types of data to be used for the a priori tests are described in Section 3: simple one-dimensional premixed flame calculations (used to show that the model can reproduce the physics of flamelet-based models) and a DNS of an ignition kernel (for comparison against PCA-based models derived purely from data). Then, in Section 4, the models learned by the new approach from these data sets are assessed, first considering the fully resolved data and then moving to the prediction of filtered data.

Modeling approach
The primary focus of this work is the introduction of the CMLM approach, but it builds on existing physics-based and data-driven modeling approaches that use neural networks, and accordingly it is evaluated relative to these approaches. Therefore, in this section, the relevant flamelet- and PCA-based modeling approaches are described first, focusing on how they accomplish the three aspects of manifold modeling. Then, the novel approach is introduced with network structures that can co-optimize across either the first two or all three steps. Finally, the software framework that is used to implement all of the modeling approaches is described.

Flamelet-type models
The flamelet-generated manifolds approach [6,50] is one of the most commonly applied manifold-based modeling approaches and is used as a physics-based baseline for comparison in this work, although it would be similar to compare to any of the other conceptually related physics-based manifold models. FGM is based on the assumption that the thermochemical states realized in a turbulent flame are equivalent to those realized in a specified one-dimensional flame configuration with similar boundary conditions. Therefore, precomputed solutions to the one-dimensional component problem can be used to generate a reduced-order thermochemical state manifold that is then applied to model thermochemistry in the turbulent flame. In this work, FGM models based on planar premixed flames and reactants-reactants counterflow premixed flames are considered, the latter allowing the models to incorporate the effect of strain on the flame structure.
Definition of manifold variables: For FGM, the primary variable used to parameterize the thermochemical state manifold is the state-space variable that corresponds to the direction of variation in the one-dimensional component problem. In the case of the premixed models considered here, this is the reaction progress variable, defined generally as a weighted sum of combustion product species mass fractions, C = Σ_j w_j Y_j, where the weights w_j can be tuned for each configuration to optimize model performance [51-53]. In this work, a definition based on product species is used [54]. Additional manifold-parameterizing variables come from variation in the boundary conditions of the one-dimensional component problem. For example, to model stratified premixed flames, the equivalence ratio of the unburned mixture may be varied, and this can be captured by including the mixture fraction in the parameterization of the manifold. For simplicity, the mixture fraction is defined here based on the mass fraction of nitrogen [55]. Variations in the unburned temperature or pressure can be accounted for similarly, with density ρ and internal energy e being added to the manifold parameterization. Finally, for the model based on counterflow flames, it is necessary to include a variable that conveys the effect of strain rate, for which purpose the mass fractions of CO [50], OH [56], or H [57] have been proposed (the progress variable dissipation rate could also be used). Therefore, the full range of manifold variables considered is ξ_i = (C, Z, ρ, e, Y_CO), or subsets thereof, as appropriate.
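The linear-combination form of these variables is straightforward to evaluate. The sketch below uses a hypothetical species ordering, weights, and state purely for illustration; it does not reproduce the actual definitions of refs. [54,55]:

```python
import numpy as np

# Hypothetical species ordering and weight vectors (illustrative only)
species = ["CH4", "O2", "N2", "CO2", "H2O", "CO", "H2"]
w_C = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])  # progress variable: sum of products
w_Z = np.array([0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0])  # mixture fraction from Y_N2 (up to rescaling)

# Example mass fraction vector at one point in a flame (made-up values)
Y = np.array([0.02, 0.10, 0.72, 0.08, 0.06, 0.015, 0.005])

C = w_C @ Y   # progress variable as a weighted sum of species
Z = w_Z @ Y   # nitrogen-based mixture-fraction surrogate
print(C, Z)
```

The same dot-product structure is what PCA and, later, the CMLM manifold-defining layer also use, just with different weight vectors.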
Nonlinear mapping: The data used to generate the nonlinear mapping for FGM come from the solutions of the one-dimensional component problem calculations. For the present work, the approach of solving physical-space component problems and then reparameterizing using the manifold variables is used. The balance equation that is solved for the species mass fractions is

ρ u dY_k/dx = −dj_k/dx + ω̇_k,

where ρ is the density; j_k is the diffusive flux, computed using the mixture-averaged formulation; and ω̇_k is the species net mass production rate. The velocity u is equal to the premixed flame speed s_L for planar flames and is computed based on the similarity solution approach described by Key et al. [58] for strained flames. The equations are solved for a variety of equivalence ratios, pressures, unburnt temperatures, and, for the counterflow configuration, inlet mass flow rates, as described in Section 3.1. The desired model outputs φ_n are obtained from these solutions, and rather than generating the nonlinear mapping φ_n = F(ξ_i) by tabulating the solutions and interpolating from the table, the data are used to train an ANN that approximates F (i.e., the ANN is used as a nonlinear multivariate regression technique to approximate F). The nomenclature FGM+ANN is used subsequently to make this implementation decision clear, although it does not have any impact on the theoretical interpretation of the model. The same configuration for the ANN is used for the flamelet-based, PCA-based, and CMLM models to allow direct comparison, and is described in Section 2.4.
Filtered closure: One approach for closure of φ̃_n is to convolute F(ξ_i) with a presumed subfilter PDF, where the presumed form is often a rescaled marginal beta distribution for C (with mean C̃ and variance C_var, the filtered value of C² minus the square of C̃) and conditional delta distributions of the form δ(ξ_i − ξ̃_i) for the other parameters. However, with this approach, there is significant uncertainty in the applicability of the beta distribution for premixed flames [59]. To eliminate this as a source of uncertainty for the comparisons between models in this work, the multi-filtered flamelet-generated manifolds (M-FFGM) approach [8] is used instead of the presumed PDF approach. To capture subfilter variation in the turbulent flame, the one-dimensional physical-space solutions are filtered with a Gaussian kernel proportional to exp(−6x²/Δ²), where Δ is the characteristic filter width; for tabulation, the filtered data are then parameterized by the subfilter variance C_var. This can be interpreted as using a presumed PDF shape that is consistent with the combination of filter kernel and one-dimensional profile. As for the FGM approach with no subfilter closure, an ANN is trained to approximate G and is used in place of traditional tabulation for the a priori comparisons.
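The filtering step can be sketched in a few lines. The snippet below assumes a toy tanh profile and Gaussian-bump reaction rate in place of actual flame solutions, and uses one common LES form of the Gaussian kernel; all parameters are illustrative:

```python
import numpy as np

x = np.linspace(-5e-3, 5e-3, 2001)          # 1 cm one-dimensional domain [m]
C = 0.5 * (1.0 + np.tanh(x / 4e-4))         # progress-variable-like profile
omega = np.exp(-((C - 0.7) / 0.15) ** 2)    # toy reaction-rate profile peaked near C = 0.7

def gaussian_filter(f, x, delta):
    """Convolve f(x) with a Gaussian kernel of characteristic width delta."""
    dx = x[1] - x[0]
    half = int(3 * delta / dx)
    xk = np.arange(-half, half + 1) * dx
    kernel = np.exp(-6.0 * xk**2 / delta**2)
    kernel /= kernel.sum()                  # normalize the discrete kernel
    return np.convolve(f, kernel, mode="same")

delta = 1e-3                                # filter width larger than the flame thickness
C_f = gaussian_filter(C, x, delta)          # filtered progress variable
C_var = gaussian_filter(C**2, x, delta) - C_f**2   # subfilter variance of C
omega_f = gaussian_filter(omega, x, delta)  # filtered reaction rate

# (C_f, C_var) -> omega_f samples would then serve as tabulation/training data
print(C_var.max())
```

Repeating this over a range of filter widths yields the multi-filtered data that the M-FFGM tabulation (or the ANN that replaces it) is built from.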

PCA-based models
The objective of PCA-based models is to identify low-dimensional structure in an existing set of thermochemical state data and use it to construct a model that can be applied for further simulations at conditions similar to those of the simulation or experiment used to generate the data. The approach used here follows numerous previous works [3,18-20,33-44], but the key steps are detailed to highlight their relationship with the other modeling approaches examined in this work. The data used for PCA are taken from the calculations described in Section 3 and arranged into an m × n matrix S, where the m rows correspond to Y_k vectors at different points in space and time, and the n = N_S columns correspond to the different species, each having been standardized to zero mean and unit variance. A similarly structured matrix is also constructed to contain the desired output data.
Definition of Manifold Variables: PCA is a well-established technique for identifying truncated bases that can adequately represent data sets in a wide range of contexts. In the combustion modeling context, the PCs can be used as manifold-parameterizing variables. PCs are obtained by taking the singular value decomposition (SVD) of the S matrix, S = U Σ V^T, where U is an m × m unitary matrix, Σ is an m × n diagonal matrix of singular values, and V is an n × n unitary matrix. The columns of V then define principal directions in state space, with the corresponding singular values in the Σ matrix being proportional to the variance in the data set along each principal direction, ordered from greatest to least. A truncated V̂ matrix with N*_ξ < n columns, obtained by eliminating the final columns of V, provides the basis for a lower-dimensional linear manifold. The product Ξ = S V̂ defines the values of the PCs. The approximate full state can be recovered from the reduced representation, S ≈ Ξ V̂^T, with V̂ providing the optimal N*_ξ-dimensional basis for this type of linear reconstruction. In this work, the S matrix contains only species data, so the PCs are linear combinations of species mass fractions that parameterize the manifold, and in that regard are similar to the mixture fraction and progress variable. Where appropriate, additional state variables such as e and ρ are used as manifold-parameterizing variables in addition to the species-based PCs. This decision allows for more direct comparison between the different approaches being evaluated.
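The SVD, truncation, and linear reconstruction steps above can be sketched directly in NumPy. The data matrix below is a synthetic low-rank stand-in, not thermochemical states:

```python
import numpy as np

# Synthetic stand-in for the standardized species-data matrix S (m x n)
rng = np.random.default_rng(0)
m, n, k_true = 500, 8, 2
S_raw = rng.normal(size=(m, k_true)) @ rng.normal(size=(k_true, n))
S_raw += 0.01 * rng.normal(size=(m, n))          # small noise off the rank-2 manifold

# Standardize each column (species) to zero mean and unit variance
S = (S_raw - S_raw.mean(axis=0)) / S_raw.std(axis=0)

# SVD: S = U Sigma V^T; rows of Vt (columns of V) are principal directions
U, sigma, Vt = np.linalg.svd(S, full_matrices=False)
N_xi = 2
V_hat = Vt[:N_xi].T          # truncated basis, n x N_xi

Xi = S @ V_hat               # PC values (manifold variables), m x N_xi
S_rec = Xi @ V_hat.T         # optimal linear reconstruction from N_xi PCs

rel_err = np.linalg.norm(S - S_rec) / np.linalg.norm(S)
print(rel_err)
```

Because the synthetic data are near rank 2, two PCs reconstruct S almost exactly; for real thermochemical data the reconstruction error at fixed N*_ξ is what the nonlinear mapping described next helps reduce.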
Nonlinear Mapping: Early PCA-based models used the inversion relationship S ≈ Ξ V̂^T to compute the full state from the PCs, then computed the output variables from the full state. However, it has proved more efficient to use nonlinear regression of the output data against the PCs to determine the functional form φ_n = F(ξ_i) [35]. Because of the nonlinearity inherent in combustion thermochemical state data, this reduces the number of components that need to be retained and hence the dimensionality of the manifold. In this work, ANN-based regression is used in conjunction with manifold variables identified using PCA, and proceeds in exactly the same manner as the use of ANNs to approximate F in the case of FGM+ANN models. This combination is subsequently referred to as the PCA+ANN approach.
Filtered Closure: There are several examples of successful a posteriori calculations utilizing PCA, but it should be noted that providing closure for filtered quantities in PCA-based models is a significant challenge. The suitable PDF forms for use with the presumed PDF approach are not known for the PCs, so these works have used alternate, more complex strategies, such as developing machine learning models for the subfilter PDF [43], carrying out one-dimensional turbulence (ODT) calculations [40,42], or simply using fine grids that minimize the impact of subfilter models [36]. For this reason, the present work primarily focuses on comparisons to PCA in the context of fully resolved data. However, a strategy in which filtered principal component values and their subfilter variances are computed and used to regress filtered output quantities is briefly considered for comparison to the other modeling approaches.

CMLM approach
The development of the CMLM approach, which builds on ideas developed for PCA+ANN and FGM+ANN models, is the core novelty of this work. The motivation is that, by simultaneously optimizing across all parts of manifold-based combustion modeling, the overall model accuracy can be improved. To do this, the structure of the ANN used for nonlinear mapping is expanded to include the definition of the manifold variables through linear combinations of species and a method of accounting for subfilter variation of the manifold variables. Importantly, it has an overall form that is equivalent to a flamelet-based model and very similar to a PCA-based model, meaning that many desirable characteristics in terms of implementation and physical interpretability are retained from those models. For clarity, the approach used for integrating the first two aspects to discover an optimal combination of manifold variables and nonlinear mapping is described first. Then the extension required to model filtered data for LES is introduced.

Manifold discovery from resolved data
Typically, the FGM+ANN and PCA+ANN models involve a simple fully connected ANN that maps from the manifold variables to the outputs of interest. However, conceptually, one could prepend this network with a simple, single-layer network that encodes the computation of the manifold variables from species mass fractions, rather than viewing the computation of the manifold variables and the nonlinear mapping completely separately. The result of doing this is the network structure depicted in Fig. 1, which forms the basis of the CMLM model.
The inputs to the network (first layer, shown using shades of blue in the figure) include the species mass fractions and, optionally, any additional variables ψ_m that may be used to parameterize the manifold (e.g., thermodynamic variables, such as e, ρ, or other parameters, such as the strain rate).

Fig. 1. Structure of the neural network that combines the definition of manifold variables (shades of orange) from species mass fractions (blue), other inputs (dark blue), and the nonlinear mapping F to the model outputs (green) into a unified manifold discovery process. The structure of the nonlinear mapping, with two species-based manifold variables and two hidden layers (gray), is used throughout most of this work, but can be adjusted as needed.

The values at nodes in subsequent layers of the network are obtained by taking a weighted sum of the values in the connected nodes from the previous layer and passing the result through an activation function:

x_i^(N+1) = σ( Σ_j W_ij^(N) x_j^(N) + b_i^(N) ),

where x_i^(N) indicates the value at the i-th node in the N-th layer, σ is the activation function, W^(N) are the parameters of the network corresponding to the weight of each connection between layers N and N + 1, and b_i^(N) are bias parameters. The second layer (shades of orange) has a small number of nodes and is designed such that the values at the nodes correspond to the manifold variables, ξ_i, which include both linear combinations of species (light orange) as well as any additional parameters (dark orange). The notation ξ*_i is used to distinguish only the manifold parameters that are linear combinations of species when necessary. For this layer, the activation function is the identity function, no biases are used, and the network connectivity is structured such that one set of nodes contains single connections to the ψ_m inputs and a second set contains dense connections to the species inputs. The total number of nodes in this manifold-defining layer determines the manifold dimensionality, N_ξ = N*_ξ + N_ψ, where N*_ξ is the number of linear combinations of species. Subsequent layers in the network have a standard ANN architecture, being fully connected with nonlinear activation functions. The number of nodes in the output layer (shown in green) corresponds to the total number of variables φ_n to be predicted. This may include all or a subset of the species mass fraction inputs (this gives the network an autoencoder-like structure), as well as additional thermochemical variables that are needed for closure in simulations or are desired as outputs, such as reaction rates, transport properties, and temperature.
For the FGM+ANN and PCA+ANN models, the weights in the first layer (W^1) would be fixed by the definitions of the physical manifold variables and PCs, respectively, implying W^1 = V̂ in the latter case. Indeed, the manifold variables are precomputed, so only the nonlinear mapping portion of the network, which includes the second layer and beyond, is typically included when these models are presented. The proposed CMLM approach extends beyond the existing models by explicitly including the linear manifold-definition layer during the training process and allowing the weights across all nodes in the network to be optimized during network training, including the definition of the manifold. The loss function L that is minimized during training is

L = ||Φ_p − Φ_t||₂² + α ||W^1||₁,

where the first term is the mean squared error between the values of the predicted output variables (Φ_p) and their values in a set of training thermochemical data (Φ_t), the second term is added to induce sparsity in the manifold-definition layer, the parameter α determines the relative importance of the terms, and the norms are interpreted as vector norms of the flattened matrices.
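A minimal NumPy sketch of this structure and loss follows, with random weights in place of a trained model; layer sizes, α, and the data are illustrative, and the optional ψ_m inputs are omitted for brevity:

```python
import numpy as np

rng = np.random.default_rng(1)
n_species, n_xi, n_hidden, n_out = 9, 2, 16, 4

# Manifold-defining layer W1: linear, bias-free; co-optimized during training
W1 = rng.normal(size=(n_species, n_xi))
# Nonlinear mapping F: a standard two-hidden-layer fully connected MLP
W2, b2 = rng.normal(size=(n_xi, n_hidden)), np.zeros(n_hidden)
W3, b3 = rng.normal(size=(n_hidden, n_hidden)), np.zeros(n_hidden)
W4, b4 = rng.normal(size=(n_hidden, n_out)), np.zeros(n_out)

def forward(Y):
    xi = Y @ W1                 # manifold variables: linear combinations of species
    h = np.tanh(xi @ W2 + b2)   # hidden layers with nonlinear activation
    h = np.tanh(h @ W3 + b3)
    return h @ W4 + b4          # predicted outputs Phi_p

def loss(Y, phi_t, alpha=1e-3):
    mse = np.mean((forward(Y) - phi_t) ** 2)   # prediction-error term
    return mse + alpha * np.sum(np.abs(W1))    # L1 sparsity penalty on W1

Y = rng.random(size=(32, n_species))       # stand-in species mass fraction batch
phi_t = rng.random(size=(32, n_out))       # stand-in training outputs
print(loss(Y, phi_t))
```

In practice a deep-learning framework would optimize W1 through W4 jointly by gradient descent on this loss; the point of the sketch is that W1 (the manifold definition) sits inside the same computational graph as the mapping it feeds.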
Essentially, the optimization algorithm tunes the weights in the first layer that correspond to the linear combinations that define the manifold while simultaneously adjusting weights in the nonlinear mapping portion to minimize output error. As a result, this can be viewed as an unsupervised machine learning process for dimension reduction of the thermochemical state space. The optimized manifold variables are sparse linear combinations of species for which transport equations can be easily defined, addressing a limitation of previous attempts to use neural networks to identify low-dimensional structure for combustion simulations [21]. Therefore, once the training process is complete, the optimized manifold variables can be transported in a combustion simulation, and the nonlinear mapping portion of the network can be extracted and used to evaluate the output quantities as necessary for the simulation. Though the present work is limited to an a priori analysis and does not involve implementation in a reacting flow solver, the equivalence of the nonlinear mapping network structure with those of the FGM+ANN and PCA+ANN models means that the implementation would be identical.
The CMLM approach can be used to identify low-dimensional manifolds in any set of combustion thermochemical state data, but the interpretation of the model that is generated varies based on the training data used. If the model is trained on data from simple configurations, like one-dimensional flames, then its application is theoretically equivalent to FGM or similar models based on the same flames, but with manifold variables optimized to minimize error for the ANN predictions relative to traditional parameterizations (using, e.g., mixture fraction and progress variable). However, like PCA-based models, CMLM can be used to generate models from more complex data sets where physics-based manifold parameterizations are unknown, giving it a broader range of applicability. Thus, CMLM can be considered an alternative to PCA that is designed for use with nonlinear mappings. The relationship with PCA is made explicit by using PCA to define the initial guess for the manifold-defining W^1 weights in the network. Using this initial guess, which defines an optimal manifold for a linear reconstruction, was found to aid in convergence of the optimization algorithm to a definition that minimizes error for a nonlinear reconstruction. In both cases, the key element that differentiates the proposed approach from the models at the base of these interpretations is that, rather than separately defining the manifold and training a neural network for the nonlinear mapping, these steps are performed as a single co-optimization process. Regardless of the interpretation, the necessary and sufficient assumption required to apply the model is that, to an adequate approximation, the thermochemical states in the simulation lie on the same low-dimensional manifold as those in the training configuration.

Model for filtered thermochemical data in LES
For the FGM+ANN approach described in Section 2.1, the final form of the model for filtered output thermochemical variables is an ANN that computes them based on the filtered manifold variables and (a subset of) their subfilter variances, approximating the function φ̃_n = G(ξ̃_i, ξ*_{i,var}). Additionally, Yellapantula et al. demonstrated that an ANN of equivalent form with ξ_i = (Z, C) could be fit to filtered data from a DNS of a stratified turbulent premixed flame to develop a model for filtered reaction rates in that flame [48]. Lapenna et al. showed a similar result for thermodiffusively unstable flames, using an optimal estimator analysis to choose manifold parameters and replacing the variance with the filter width [60]. Therefore, the CMLM approach uses this form for subfilter closure as well. The network structure from Fig. 1 is extended as depicted in Fig. 2 in order to optimize for definitions of ξ*_i that minimize the error of G(ξ̃_i, ξ*_{i,var}) for predictions of φ̃_n in the spatially filtered thermochemical data used for training. In essence, optimization of the neural network identifies manifold parameterizations that are tuned not only to provide low-error predictions of the outputs, but also to allow effective subfilter closure based on the variances of the manifold parameterizing variables. Conceptually, the result is equivalent to basing subfilter closure on filtered one-dimensional flames, as in the M-FFGM or filtered tabulated chemistry for large eddy simulation (F-TACLES) approaches [60], but it can be applied to more complex data sets, including those requiring additional parameters beyond those typically used for one-dimensional flames.
The CMLM network structure for predicting filtered quantities shown in Fig. 2 contains the structure from Fig. 1, with the difference that all input and output quantities are now filtered. The network is trained using filtered data to minimize the loss function (Eq. (4)) for filtered quantities. Because ξ*_i are defined as linear combinations of the species mass fractions, the values in the manifold-defining layer (orange nodes) are now equivalent to the filtered manifold variables ξ̃*_i.
The extended network structure also includes a section to compute the subfilter variances of the species-based manifold parameters (ξ*_{i,var}), which are then passed to the nonlinear mapping portion of the network. This computation requires the matrix of filtered species products, (Y_j Y_k)~, as an additional input for training, which is needed to compute the matrix of filtered manifold variable products: (ξ*_p ξ*_q)~ = W_1,pj W_1,qk (Y_j Y_k)~. In principle, this allows for computation of the full subfilter covariance matrix, ξ*_{pq,cov} = (ξ*_p ξ*_q)~ − ξ̃*_p ξ̃*_q. The present work uses only the variances, which are located on the diagonal of the covariance matrix and can be extracted using δ_ipq, where δ_ipq takes a value of unity if all indices are equal and zero otherwise. Thus, for manifold variable i, the corresponding variance can be computed as ξ*_{i,var} = (ξ*_i ξ*_i)~ − (ξ̃*_i)² (repeated indices do not imply summation in this equation). The process of computing the variances can be encoded into the neural network by applying the same linear manifold definition operation to the filtered species product matrix twice in succession, resulting in a layer of nodes that corresponds to the matrix of filtered manifold variable products. The nodes that correspond to the diagonal of this matrix, as well as the nodes that correspond to ξ̃*_i, connect forward to a set of nodes that represent ξ*_{i,var}, colored red in Fig. 2. The values at these nodes are computed using Eq. (6) rather than Eq. (3). These nodes then connect forward to the nonlinear mapping portion of the network, allowing the variance information to be used in the prediction of the filtered outputs. An important aspect of the implementation of this network structure is that the same trainable weights must be used for the linear operator W_1 everywhere it appears in order for the computed variances to be consistent with the computed manifold variables.
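The variance computation just described can be sketched with NumPy einsum operations. This is an illustrative sketch, not the authors' implementation; the random stand-in data and the small perturbation added to the filtered species product matrix (so that the variances are nonzero) are assumptions.

```python
import numpy as np

# Hypothetical sketch: computing subfilter variances of the manifold
# variables from filtered species data, reusing the same manifold-defining
# matrix W_1 that produces the manifold variables themselves.
rng = np.random.default_rng(0)
n_samples, n_species, n_manifold = 5, 8, 2

W1 = rng.standard_normal((n_manifold, n_species))   # manifold definition W_1
Y_filt = rng.random((n_samples, n_species))         # filtered mass fractions
# Stand-in for the filtered species product matrix (Y_j Y_k)~; the small
# diagonal term mimics subfilter fluctuations so variances are nonzero.
YY_filt = np.einsum('nj,nk->njk', Y_filt, Y_filt) + 0.01 * np.eye(n_species)

# Filtered manifold variables: xi~*_i = W_1,ij Y~_j
xi_filt = Y_filt @ W1.T

# Filtered manifold variable products: (xi*_p xi*_q)~ = W_1,pj W_1,qk (Y_j Y_k)~
xixi_filt = np.einsum('pj,njk,qk->npq', W1, YY_filt, W1)

# Variances are the diagonal of the subfilter covariance matrix:
# xi*_{i,var} = (xi*_i xi*_i)~ - (xi~*_i)^2   (no summation over i)
xi_var = np.einsum('nii->ni', xixi_filt) - xi_filt**2
print(xi_var.shape)  # (5, 2)
```

Note that `W1` appears in both the computation of `xi_filt` and of `xixi_filt`, mirroring the weight-sharing requirement stated above.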
While Fig. 2 depicts using the full set of variances, the possibility of using only a partial set, with N_var < N*_ξ of the ξ*_{i,var}, is also evaluated (by analogy to the parameterization in the M-FFGM approach, which uses C_var but not Z_var or the covariance). The assumption that variances need to be included for only a single manifold variable for accurate subfilter closure, or "single flamelet closure," is commonly employed in a range of physically derived manifold models [5,13]. Importantly, the CMLM approach always seeks the definition of the manifold variables that minimizes the prediction error for the filtered outputs, which may vary based on the number of variances (or covariances) included to aid in subfilter closure. While the present work focuses on demonstrating how all three aspects can be combined into a single network structure, an alternative closure strategy would be to couple the resolved CMLM approach (Section 2.3.1) with independent strategies for subfilter closure, for example machine-learned subfilter PDFs [47]. Decoupling the closure from the thermochemical state representation eliminates the ability to optimize the manifold variables to improve subfilter closure, but would potentially allow the subfilter closure to be more general rather than limited to the conditions of the filtered training data.

Implementation of models
The modeling approaches considered and how they handle each aspect of manifold-based modeling are summarized in Table 1, with italics indicating the method selected in this work where multiple choices are possible. Training an ANN is a key part of each of these models, and all are implemented in a set of Python scripts that utilize well-established machine learning libraries for this purpose. The PyTorch framework [61] was used to define the network structure, and its implementation of the Adam algorithm [62] was used to optimize the network parameters, or, more colloquially, to train the networks. Additionally, for the PCA+ANN model, the Scikit-learn library [63] was used to implement the PCA part of the model.

Fig. 2. Structure of the neural network that combines all three aspects of manifold-based modeling, including subfilter closure. Note that all network inputs (shades of blue) and outputs (green) are filtered quantities, and as a result, so are the manifold variables (shades of orange). The red nodes correspond to the diagonal of the subfilter covariance matrix of ξ_i, although the structure can be adapted to include any number of entries of this matrix. See the Supplementary Material for an alternative depiction of this network.

Several tests were run to study the effect of the hyperparameters associated with the network structure and training process, as described in detail in the Supplementary Material. These tests resulted in the selection of a plateau scheduling algorithm for varying the learning rate and batch size training parameters. The testing also led to the selection of a nonlinear mapping ANN structure consisting of two hidden layers with 100 nodes, which was used throughout this work, and a value of 0.01 for the regularization parameter α. The activation function used at all nonlinear nodes was the leaky rectified linear unit (ReLU) function [64]. A batch normalization operation was also included at each hidden layer to aid in convergence of the training process. Importantly, the structure and training procedure of the nonlinear mapping ANN were kept constant between the FGM+ANN, PCA+ANN, and CMLM approaches. In all cases, networks were trained with 15 different random parameter initializations. Results from the instance with the lowest final loss function value are presented unless otherwise noted.
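The training ingredients listed above (two hidden layers of 100 nodes, leaky ReLU activations, batch normalization, Adam, and plateau-based learning-rate scheduling) can be sketched in PyTorch as follows. The layer widths match the text, but the learning rate, scheduler parameters, and synthetic data are assumptions for illustration only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Nonlinear mapping ANN: two hidden layers of 100 nodes with batch
# normalization and leaky ReLU activations, as selected in the text.
mapping = nn.Sequential(
    nn.Linear(4, 100), nn.BatchNorm1d(100), nn.LeakyReLU(),
    nn.Linear(100, 100), nn.BatchNorm1d(100), nn.LeakyReLU(),
    nn.Linear(100, 12),
)

optimizer = torch.optim.Adam(mapping.parameters(), lr=1e-3)
# Plateau scheduling: reduce the learning rate when the loss stops improving.
scheduler = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, factor=0.5, patience=10)

x, y = torch.randn(64, 4), torch.randn(64, 12)   # synthetic stand-in data
for epoch in range(5):                           # illustrative short loop
    optimizer.zero_grad()
    loss = nn.functional.mse_loss(mapping(x), y)
    loss.backward()
    optimizer.step()
    scheduler.step(loss.item())                  # scheduler monitors the loss
print(loss.item() >= 0.0)  # True
```

In practice this loop would run over mini-batches of the training data, with the 15 random initializations handled by repeating the whole procedure with different seeds.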

Description of a priori validation data sets
Although the CMLM approach allows for incorporation of combustion thermochemical state data from any source and combustion regime, the present analysis focuses on two data sources involving premixed combustion for demonstration purposes. The first set is composed of one-dimensional strained or unstrained premixed flames with varied unburned conditions, which is the same data used to define manifolds in FGM. This allows for assessment of the data-based approach in a configuration where there is a known manifold parameterization to compare against. The second data source is DNS of expanding spherical premixed ignition kernels in isotropic turbulence [65], for a C11H22 surrogate for Jet A fuel with chemical kinetics from Wang et al. [66]. These simulations provide a suitable test to challenge the CMLM approach because they have a relatively simple configuration, but an optimal estimator study for a similar case indicated considerable uncertainty about which manifold parameterization is optimal for predicting heat release [24]. The same methods are used to preprocess, downsample, and split the data into training and validation segments for both types of data.

One-dimensional premixed flame calculations
To show the applicability of the CMLM approach to different modeling problems, separate data sets consisting of planar unstrained premixed flames and strained premixed flames from a reactants-reactants counterflow configuration are considered. Both data sets are generated by solving Eq. (1) using the Cantera software package [67] and the skeletal version of the mechanism with mixture-averaged diffusivities. The unstrained calculations are for methane-air flames with DRM19 chemistry [68], equivalence ratios (φ) between 0.5 and 1.6, an initial temperature of 300 K, and a pressure of 1 atm, resulting in a data set with an intrinsically two-dimensional manifold. A total of 89 one-dimensional calculations with ∼250 points each are included, resulting in a total data set size of 22,219 samples.
The conditions for the strained premixed flames mimic the conditions for the ignition kernel DNS to allow for cross-comparison of the models. The 38-species skeletal form of the mechanism is applied and generates profiles that match those used to initialize the DNS field. Calculations include varied initial temperatures (T_0) from 270 to 360 K; pressures (p) of 0.9, 1.0, and 1.1 atm; and strain rates (a) ranging from the negligible-strain limit to extinction. The main strained data set considered throughout this work is based on stoichiometric flames, but a similar data set with φ = 0.7 is considered in Section 4.1.3. This results in data sets with ∼1000 calculations and ∼300,000 total samples. By this construction, these data lie on a four-dimensional manifold, which can be based on both combinations of species and other thermodynamic state variables, allowing the full capabilities of the CMLM approach to be tested. Notably, at four dimensions, the cost penalty for traditional tabulation becomes significant [69], making application of neural networks more appealing. The variations in initial temperature and pressure also make the data set suitable for comparison to the DNS data, which were obtained with a compressible solver and exhibit variation in these quantities.

Ignition Kernel DNS
Three cases from the simulations of expanding ignition kernels in isotropic turbulence performed by Krisman et al. [65] provide data for model training and validation, and the reader is directed to this reference for complete details on the simulations. Briefly, all cases involve initially spherical kernels that are initialized using a one-dimensional premixed flame profile and evolved using chemistry that has been further reduced based on the quasi-steady-state assumption. The kernels expand in a field of decaying isotropic turbulence with an initial turbulent Reynolds number of 2,400. The three cases considered correspond to different combinations of equivalence ratio and initial kernel radius normalized by laminar flame thickness (r_k/δ_L): φ = 1.0, r_k/δ_L = 4.5 (Case B); φ = 0.7, r_k/δ_L = 6.4 (Case C); and φ = 1.0, r_k/δ_L = 6.4 (Case D). The ignition kernels are quenched by the turbulence in both Case B and Case C, at around τ = 1 and τ = 2, respectively, where τ is time normalized by the eddy turnover time. In contrast, the Case D ignition kernel grows successfully, with the integrated heat release increasing throughout the simulation, which concluded at τ = 2.5. Case D is the main case used for model training and evaluation, whereas data from Cases B and C are used to assess transferability of the model between conditions. Each simulation was performed on a grid with 832 points spaced by a uniform Δ_DNS = 24 μm in each direction, and up to 50 time snapshots were saved, resulting in up to 2.9 × 10^10 available samples for each case.

Data processing
Prior to model training and evaluation, the data are filtered (for training models that include LES closure), downsampled, and rescaled. The data are filtered with a Gaussian filter kernel (Eq. (2), or the three-dimensional generalization thereof), as described for providing filtered closure using the M-FFGM method. For the DNS data, the filter widths (Δ) used were 0, 1, 2, 3, 5, 7, 10, 14, 20, and 30 times the grid spacing (Δ_DNS), and for the strained one-dimensional flame data, the same filter widths were used, as well as Δ/Δ_DNS = 50, 70, and 100.
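A one-dimensional Gaussian filtering step of this kind can be sketched as follows. Note the mapping σ = Δ/√12 (matching the second moment of a box filter of width Δ) is an assumption for illustration, not necessarily the convention used in Eq. (2), and the flame profile is synthetic.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Hypothetical sketch: apply a Gaussian filter of width Delta to a 1D
# temperature profile. sigma = Delta / sqrt(12) is an assumed convention.
dx = 24e-6                    # grid spacing (m), matching the DNS Delta_DNS
delta_over_dx = 10            # filter width in units of the grid spacing
sigma = delta_over_dx * dx / np.sqrt(12.0)

x = np.arange(0.0, 0.01, dx)
# Synthetic flame-like profile: smooth rise from 300 K to ~2200 K
T = 300.0 + 1900.0 / (1.0 + np.exp(-(x - 0.005) / 2e-4))
T_filt = gaussian_filter1d(T, sigma / dx)   # sigma passed in grid units
```

The same call, applied at each Δ in the list above (and `scipy.ndimage.gaussian_filter` for the three-dimensional DNS fields), would generate the filtered training data.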
Given the large amount of available training data, downsampling of the data sets was required for efficient training of the ANNs, with the exception of the one-dimensional unstrained flame data. For that case, each one-dimensional calculation was randomly assigned to either the training or the validation set, and all data points were retained. For the other data sets, a k-means stratified sampling (KSS) approach, which uses the k-means clustering algorithm to partition the data and then randomly selects samples from each partition, is applied. This sampling approach has been variously used for analyzing combustion data [70], training subfilter models for combustion systems [45,48], and, most recently, training PCA-based models [71]. In particular, the algorithm presented by Yellapantula et al. [45] is applied to generate training data with 10 clusters and 10,000 samples from each cluster, for a total of 100,000 samples. For the filtered DNS data, the algorithm is applied independently at each filter width; then, the data from all filter widths are pooled, and 200,000 samples are randomly selected. Validation data sets for each case are generated similarly by randomly selecting from the same clusters and contain one third as many points (75%-25% training-validation split).
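The KSS idea (partition with k-means, then draw equally from each partition so sparsely populated regions of state space remain represented) can be sketched as follows. This is an illustrative sketch with synthetic data, not the algorithm of Yellapantula et al. [45]; the cluster and sample counts here are scaled down.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical sketch of k-means stratified sampling (KSS).
rng = np.random.default_rng(0)
data = rng.standard_normal((5000, 3))     # stand-in for state-space samples
n_clusters, n_per_cluster = 10, 100

# Step 1: partition the data into clusters.
labels = KMeans(n_clusters=n_clusters, n_init=10,
                random_state=0).fit_predict(data)

# Step 2: draw the same number of random samples from every cluster.
selected = np.concatenate([
    rng.choice(np.flatnonzero(labels == c),
               size=n_per_cluster,
               replace=True)              # replace=True guards tiny clusters
    for c in range(n_clusters)
])
sample = data[selected]
print(sample.shape)  # (1000, 3)
```

Uniform random sampling would instead be dominated by the most populous regions (e.g., unburnt and fully burnt states), which is the failure mode stratified sampling avoids.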
As is common in the development of PCA-based models, only a subset of species are considered as inputs [41], which was found here to lead to more robust convergence to interpretable manifold definitions for CMLM. The same inputs are considered for both PCA and CMLM to allow for direct comparisons between the manifold definitions: the fuel mass fraction (…) and Y_OH. Internal energy and density are included as additional inputs (ψ_k) for the strained one-dimensional flames and DNS cases. The φ_m to be predicted include the same species mass fractions, Y_CH2O, Y_HO2, T, the local equivalence ratio, and the source terms for the input species. All input and output variables are standardized to zero mean and unit variance prior to training the models [72], and these same parameters are used to rescale unseen data when the model is validated.
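The standardization step can be sketched with scikit-learn, which the paper already uses for the PCA part of the model; the key point is that the scaling parameters come from the training data only and are then reused on unseen data.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Sketch: scale inputs/outputs to zero mean and unit variance using
# training-set statistics, then reuse those statistics for validation data.
rng = np.random.default_rng(0)
train = rng.random((1000, 5)) * 100.0     # stand-in for training variables
valid = rng.random((300, 5)) * 100.0      # stand-in for unseen data

scaler = StandardScaler().fit(train)      # statistics from training data only
train_s = scaler.transform(train)
valid_s = scaler.transform(valid)         # same mean/std applied to unseen data
```

Predictions in standardized space are mapped back to physical units with `scaler.inverse_transform`.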

Results and discussion
Several combinations of the data sets described in Section 3 are used to evaluate specific aspects of the CMLM model in relation to PCA- and flamelet-based models. First, the analysis is limited to the raw, fully resolved data sets, with a focus on the model's ability to identify suitable manifold definitions using the network structure depicted in Fig. 1. The approach is applied individually to both one-dimensional flame data and DNS data, and the ability to combine information from the two types of data to improve transferability is subsequently assessed. After demonstrating this capability, the models are applied to filtered data sets to demonstrate co-optimization of all three aspects of manifold-based modeling, including subfilter closure, using the network structure depicted in Fig. 2. The various demonstration cases with different combinations of training and validation data that will be described in the rest of this section are summarized in Table 2.

Manifold discovery for resolved thermochemical data

4.1.1. One-dimensional premixed flame calculations
As an initial proof of concept for the CMLM approach, the data from the one-dimensional unstrained premixed flames with varied equivalence ratio are analyzed first. By construction, these data can be represented using a two-dimensional manifold parameterized by mixture fraction and progress variable, which forms the basis for the FGM model. Therefore, the definitions of the coordinates ξ_i for two-dimensional manifolds obtained using PCA and CMLM can be directly compared to the FGM parameterization (ξ_1 = C and ξ_2 = Z) for assessment, as shown in Fig. 3. This figure also shows the flame data plotted in terms of ξ_i for each model. For these comparisons, ξ_i are computed based on the normalized values of species. Therefore, both the sign and magnitude of the numerical values are essentially arbitrary; no particular physical significance can be ascribed to comparisons of the observed numerical values between cases.

Table 2
Training and validation data combinations for assessing the CMLM model. (Columns: Section; Training Data Set(s); Validation Data Set(s); Filtered.)
The manifold-space plot for FGM demonstrates that, as intended, the chosen definitions of the mixture fraction and progress variable lead to the output quantities, such as temperature, being single-valued functions on the manifold. In contrast, the first two PCs do not provide such a unique mapping. The second PC (ξ_2) does appear to convey similar information as the mixture fraction, as it generally takes higher values for flames with higher equivalence ratios. However, although the first PC (ξ_1) tends to increase across each flame, it does not do so monotonically, resulting in the flame solutions crossing over each other when projected onto the manifold, with the high-temperature burnt gas regions of some flames overlapping partially burnt regions with lower temperatures in other flames. This lack of uniqueness makes prediction error unavoidable for a manifold model based on these coordinates, although it could be resolved by including additional PCs (beyond the true dimensionality of two). This is a limitation of PCA that the CMLM approach seeks to improve upon: because PCs are generated in a way that optimizes linear data reconstruction, they are not guaranteed to provide the optimal parameterization when there is nonlinear structure in the data, which may require additional transported variables.
Because there is some variability in the solution converged to during the network training process, arising from stochasticity in the network initialization and training, three characteristic results are presented, all of which have similar levels of training and validation error. The sparsity of the learned W_ij matrices allows for ready interpretation of the manifold coordinates, which in all cases depend on the mass fractions of N2, O2, and CO2. Reaction progress monotonically increases with the first coordinate, and the second coordinate is equivalent to either mixture fraction or a combination of mixture fraction and reaction progress (essentially, this rotates the distribution of flame data on the manifold without changing its information content). Thus, the CMLM approach has "learned" from scratch something equivalent to FGM, with a progress variable defined based on CO2 production and O2 consumption. For each case, across nearly the entire manifold space, the trajectories from the one-dimensional flames do not cross, and temperature is a unique function of the manifold parameters. These observations indicate that the definitions obtained can be used as the basis of a model. Particularly in the result labeled CMLM-3, there is a small region at high reaction progress in rich flames where some trajectories do overlap, but the output quantities do not vary significantly in this region, so accurate predictions are still possible. This effect is minimized in the result labeled CMLM-1, which is used in the subsequent comparisons.
The accuracies of the neural network models that map from the manifold to the outputs of interest for each approach are compared in Fig. 4. The predictions for temperature are representative of those for major species outputs, whereas the predictions for ω̇_CO2 are representative of radical and source term outputs. Because the outputs are essentially unique functions of the manifolds for FGM and CMLM, both the training and validation data can be predicted nearly perfectly (r² near unity), limited only by the function approximation ability of the chosen network structure. The lack of noise in the data and the minimal difference between training and validation indicate that the high accuracy is not due to overfitting. Accuracy is better for temperature, which is a relatively simple function on the manifold, than for source terms, which are more strongly nonlinear. Furthermore, accuracy is better for CMLM, because it in effect optimizes the progress variable to simplify the mapping to the outputs as much as possible (e.g., temperature varies more smoothly on the CMLM manifolds in Fig. 3 than for FGM). Indeed, as shown during the hyperparameter optimization described in the Supplementary Material, although increasing network complexity can improve accuracy for both approaches, for any given network complexity, CMLM is more accurate. Finally, significant error is incurred for PCA due to the non-uniqueness issue, and improvement by increasing network complexity is not possible (although improvement can be obtained by increasing the manifold dimensionality).
Application of the models to the strained premixed flame data set with φ = 1 provides an additional opportunity for comparison to FGM. The variation in initial temperature and pressure of the one-dimensional flames, as well as strain rate, results in a four-dimensional manifold. For all models, e and ρ are used as inputs to account for the varied initial conditions, and two variables defined as linear combinations of species account for strain and reaction progress (N_ψ = N*_ξ = 2). In a similar manner as for the unstrained case, the manifolds for the three approaches are compared in Fig. 5, and the predictions are compared in Fig. 6. The same general trends can be observed.
CMLM reproduces FGM almost exactly, except that the coordinate corresponding to the progress variable relates to O2 consumption rather than generation of products. The identification of Y_CO as the second parameter supports its use over the alternatives that have been proposed for encoding the effect of strain in FGM. Again, for both of these approaches, the identified parameters allow for a unique mapping to model outputs, and as a result, the trained neural networks have minimal error. The performance of PCA is significantly improved relative to the unstrained case, but there is still some overlap of flame solutions on the manifold that results in imperfect prediction of temperature near the fully burned state.

Ignition Kernel DNS
Having demonstrated the ability of CMLM to reproduce and even improve upon physics-based FGM models based on one-dimensional premixed flames, the ability to develop manifold-based models where a suitable definition is not known a priori is now assessed using the Case D ignition kernel DNS data. The approach is benchmarked against PCA, which represents the state-of-the-art tool for this purpose. First, the appropriate manifold dimensionality must be determined, so each approach is trained for a range of N_ξ. The training and validation error incurred as a function of the number of species-based manifold variables is plotted in Fig. 7 (all include ψ_m = [ρ, e] in the manifold definition as well). For CMLM, there is a substantial benefit to including two species-based manifold variables, but diminishing returns beyond that, indicating that the data are reasonably well approximated on a manifold with four total dimensions. Consistent with the observations from the one-dimensional flame data, more dimensions are required to achieve the same levels of accuracy for PCA+ANN. However, because only eight input species are considered, the two types of manifold must encode equivalent information as this number of dimensions is reached. Therefore, although CMLM does not improve the maximum possible accuracy, it does advance the Pareto front, allowing better accuracy for simpler models, by approximately a factor of 2 for manifolds with 2 to 4 dimensions. Another important observation from Fig. 7 is that, despite the DNS data being effectively "noisy" (not lying exactly on the manifolds), training and validation error remain very close, indicating that the models are not being overfit.
The manifolds with two species-based dimensions are visualized in Fig. 8. It is worth noting that, in this case, the non-uniqueness of the thermochemical variables in the two-dimensional space shown is not necessarily an issue, as it can be accounted for with the two additional inputs (e and ρ). As for the one-dimensional flame data, the first dimension identified is progress-variable-like for both models, and temperature tends to increase along this axis (although OH is a radical, it is well correlated, and nearly monotonically increasing, with temperature in the data). The identification of Y_N2, which is nominally constant in the system and varies only due to the effects of differential diffusion, as a significant contributor to the second manifold variable for both PCA and CMLM indicates that differential diffusion may be significant and that accounting for it is necessary for accurate thermochemical modeling. Indeed, the local equivalence ratio is correlated with the second dimension in both cases.
To further evaluate these models, again focusing on N*_ξ = 2, they are applied to a slice of data from an unseen time instance in the Case D simulation. Qualitatively, the physical-space predictions from both models shown in Fig. 9 agree quite well with the DNS, although for PCA+ANN, subtle differences can be discerned for reaction rates. Therefore, parity plots are also shown in Fig. 9 to quantitatively compare the models. For CMLM, there is significantly less scatter about the parity line and a higher r² value, indicating significantly better prediction quality. (Here and subsequently, the "overall" error metric is computed from the standardized data for all output variables, such that each contributes equally regardless of scale in the raw data.) These results demonstrate that the co-optimized manifold definition improves upon the representation of the data provided by PCs, even when the data are not constrained to lie exactly on a particular manifold, as was the case in the previous subsection.
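The "overall" error metric described in this passage can be sketched as follows: each output variable is standardized before computing the RMSE, so no variable dominates because of its raw scale. The synthetic true/predicted arrays here are illustrative assumptions.

```python
import numpy as np

# Sketch: an "overall" RMSE computed from standardized output variables,
# so each variable contributes equally regardless of its raw scale.
rng = np.random.default_rng(0)
scales = np.array([2000.0, 1.0, 1.0, 1e3, 1e-3, 10.0])  # disparate units
true = rng.random((500, 6)) * scales
pred = true + rng.normal(scale=0.01, size=true.shape) * true.std(0)

mu, sd = true.mean(0), true.std(0)
overall_rmse = np.sqrt(np.mean(((true - mu) / sd - (pred - mu) / sd) ** 2))
```

Without the standardization, the 2000 K-scale variable would swamp the 10^-3-scale one in a raw RMSE, which is exactly what this metric is designed to avoid.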

Transfer across multi-fidelity data
The theoretical basis of the FGM approach is that the local thermochemistry within a turbulent flame is similar to that in representative one-dimensional flames. For the ignition kernel configuration, the representative flames are the strained premixed flames that were previously analyzed. Therefore, it is reasonable to assess how the data from these flames might be able to replace or augment the DNS data for training the CMLM approach.
The simplest possibility is for the model trained on the one-dimensional flames to be applied for predictions in the DNS case, which is conceptually equivalent to applying an FGM model to simulate the ignition kernel. However, as shown in the first column of Fig. 10, when projected onto the manifold, some of the Case D DNS data lie outside of the region covered by the one-dimensional flame data, even though both have the same thermochemical boundary conditions (T_0, p, φ), which indicates that there are thermochemical states in the DNS that are not represented. Indeed, although the temperature can be predicted well using the model from the one-dimensional flames, more sensitive quantities, like reaction source terms, are not well predicted. While a posteriori evaluation would be required to determine the true effect, this a priori difference suggests that inaccuracies are possible when applying the model in this way (though the observed differences do not preclude that some quantities and statistics may still be predicted accurately). Given the near equivalence of this model to the relevant FGM model (see Fig. 5), any such limitation would also be present for an FGM-based simulation of the ignition kernels.
The CMLM approach gives the flexibility for data from multiple sources to be incorporated when generating the model, provided the data sources use consistent representations of the kinetics and chemical properties, because the manifold does not need to be defined in advance, as is the case for FGM. This capability is demonstrated by attempting predictions for Case D while training the CMLM model on a combination of DNS data at a second equivalence ratio (Case C) and one-dimensional flame data at both the equivalence ratio of interest (φ = 1) and the second equivalence ratio (φ = 0.7). This larger training data set still does not include training data that matches all Case D states, but the quality of predictions for ω̇_H2O (shown in the last column of Fig. 10) is significantly improved relative to the model trained solely on one-dimensional flame data. A third species-based manifold variable is incorporated to allow this model to account for variation in reaction progress, strain, and equivalence ratio, but the inclusion of the third input does not account for the observed improvement, as no improvement is seen when the model is trained on just the one-dimensional data for both equivalence ratios (second column of Fig. 10). Instead, the explanation for the improvement is that the one-dimensional flame data with φ = 1.0 contain information on the relevant thermochemical boundary conditions, and the combination of DNS and one-dimensional flame data at φ = 0.7 contains information on the relationship between the two types of data. Therefore, the model trained on the combination identifies a manifold definition that allows for improved predictions for the DNS at the conditions of interest.
The improved predictive capability demonstrated when using multi-fidelity data provides a proof of concept for the use of CMLM as data-augmented FGM. For example, if a parameter sweep over conditions for a particular flame configuration is carried out using FGM-based simulations, and a higher-fidelity simulation is carried out to investigate a single configuration in detail, data from the latter could be incorporated in addition to one-dimensional flame data to train a CMLM model. This could potentially improve the fidelity of the parameter-sweep simulations relative to using FGM. A posteriori testing of this approach is needed to confirm the benefits but is beyond the scope of the present work.

Closure performance for filtered thermochemical data

4.2.1. One-dimensional premixed flame calculations
To validate the closure approach for filtered thermochemical quantities depicted in Fig. 2, it is first applied to the filtered one-dimensional strained flame data. Here, only flames with T_0 = 300 K and p = 1 atm are included for simplicity. Therefore, the raw flame data lie on a two-dimensional manifold (progress variable and a strain-indicating parameter). For the filtered data, only a single additional variance is required, as each filtered flame still has a single reference strain rate. This is demonstrated by the FGM+ANN model performance, shown in Fig. 11. A network that takes the progress variable and Y_CO as inputs without any variances incurs significant error when predicting filtered data. Adding the subfilter variance of the progress variable as an input significantly reduces error, but further adding the subfilter variance of Y_CO has no benefit. The remaining error is due to inaccuracy of the network in approximating the functional form, rather than dependence on additional unincluded parameters. The CMLM approach recovers similar behavior, although the error incurred when neglecting subfilter variance is smaller because the definitions of the manifold variables are tuned to minimize this error when the model is trained. However, when variances are included, the result is essentially the same as FGM. As for the fully resolved data, the first important coordinate is a progress variable (in this case Y_CO2), and the second is Y_CO, which distinguishes flames with differing levels of strain.
The ability of a PCA-based model that includes the subfilter variances of the PCs to successfully model filtered thermochemical data is also assessed for comparison. However, a two-dimensional manifold cannot do so, even when incorporating two subfilter variances. The reason for this is twofold: (1) even the resolved data do not uniquely lie on the manifold defined by the PCs, as shown in Section 4.1.1, and (2) the PC variances do not provide the optimal inputs for representing the effect of filtering. Notably, adding a second variance improves predictions, but, as seen for the other methods, a single variance can be sufficient to account for the one-dimensional filtering if the manifold is suitably defined.
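For reference, a PCA-based manifold definition of the kind used as the baseline can be sketched as follows; the standardization and SVD conventions here are assumptions, not necessarily the authors' exact preprocessing:

```python
import numpy as np

def pca_manifold(Y, n_components):
    """Define manifold variables as leading principal components of the
    standardized species mass fractions. Returns the species weight matrix W
    (n_species x n_components) and the projected data (the PCs)."""
    Ys = (Y - Y.mean(axis=0)) / Y.std(axis=0)
    _, _, Vt = np.linalg.svd(Ys, full_matrices=False)  # rows of Vt: PC directions
    W = Vt[:n_components].T
    return W, Ys @ W
```

Unlike CMLM, the weights W here are fixed by the data covariance alone, with no knowledge of the downstream mapping or of filtering, which is the limitation being assessed in this comparison.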
These trends are confirmed by examining the predictions of the three approaches for unseen data at various filter widths, as shown in Fig. 12. For this comparison, manifolds with two dimensions and a single variance are compared, to match a standard FGM model. Both the FGM+ANN and CMLM approaches accurately predict both T and ω̇_CO2 across all filter widths. In comparison, the PCA+ANN approach overpredicts T and underpredicts ω̇_CO2 in the flame zone. Although the closure strategy considered here is simple compared to methods that have been applied successfully in a posteriori tests of PCA-based models, these results may still inform some modeling decisions for those strategies. For example, a presumed PDF-based closure, if intended to be applied to similar data across various filter widths and strain rates, would require a PDF form that allows two PC variances to be independently defined.

4.2.2. Ignition kernel DNS
The final test considered for the CMLM approach is model generation from the filtered ignition kernel DNS data. Based on the results from the filtered one-dimensional flame data, the comparison to PCA is not undertaken for this test. Neither the dimensionality of the manifold nor the number of subfilter variances that need to be included in the model is known in advance. Therefore, models are trained with up to eight dimensions and all possible numbers of variances for each number of dimensions. Initially focusing on data from just Case D, Fig. 13 shows the dependence of modeling error on model complexity, which is determined by the number of transport equations that must be solved and used as inputs during LES (i.e., N_var + N*_ξ). The accuracy-complexity Pareto front is largely formed by models with a single variance. Therefore, for this homogeneously premixed ignition kernel, it is necessary to include a single variance to account for subfilter variation, but beyond that, more benefit is obtained by increasing the manifold dimensionality to potentially account for more complex physics (as opposed to improving the representation of subfilter variation by adding more variances). This is also consistent with the broad application of "single-flamelet closure" strategies in physics-based models. An implication is that, for cases similar to these ignition kernels, it may be more beneficial to target physics-based modeling efforts at incorporating more physical phenomena in the generation of the manifold rather than at developing presumed PDF models that further decompose the PDF to allow more variances to be independently specified.

Fig. 12. Comparison of true (lines) and predicted (symbols) temperatures and net CO2 production rates in physical space for a single strained flame at various filter widths that were not included in the training data, for the three modeling approaches. The compared models each use a two-dimensional manifold and a single variance.

Fig. 13. Validation RMSE as a function of the total number of input dimensions, including both species-based manifold variables and their variances, for the CMLM approach applied to the filtered Case D ignition kernel with between one and seven manifold variables and between zero and the maximum number of variances.
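The accuracy-complexity Pareto front described here can be extracted from a set of trained models with a simple filter. In this sketch, `models` pairs each model's input count (N_var + N*_ξ) with its validation RMSE; the numbers in the test are illustrative, not values from the paper:

```python
def pareto_front(models):
    """Return the (complexity, rmse) pairs for which no other model is at
    least as simple (fewer or equal inputs) and strictly more accurate."""
    front = []
    for complexity, rmse in sorted(models):
        if not front or rmse < front[-1][1]:
            front.append((complexity, rmse))
    return front
```

Sorting by complexity first (and RMSE second, for ties) means each model only needs to be compared against the best error seen so far.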
As for the unfiltered data, the benefits of increasing complexity are diminishing, and the largest gain is seen when going from one to two linear combinations of species. Thus, the model with a two-dimensional manifold and a single variance is selected for further analysis. The manifold learned for this case is visualized in Fig. 14 and is compared to alternative models trained on the filtered data with either zero or two variances. With either one or two variances, the models compare quite closely to the manifold learned on the unfiltered data (Fig. 8). Slight differences indicate the effect of adding the objective of subfilter closure to the optimization process, with the definition of the manifold variables being tuned slightly to allow their variances to encode the effect of filtering. With no variances, the effect of filtering must be encoded into the definition of the manifold itself, resulting in a substantially different definition. Notably, the variable for which the subfilter variance is included is predominantly defined based on Y_OH, a feature that is maintained across N*_ξ when just a single variance is included. This makes sense given the previously noted correlation between Y_OH and temperature in the data, meaning that this variable can be interpreted as a progress variable.
The CMLM model performance can be evaluated qualitatively based on comparisons of T and ω̇_H2O on slices through the ignition kernel, shown in Fig. 15 for a smaller filter width (Δ/Δ_DNS = 4) and a larger filter width (Δ/Δ_DNS = 17), neither of which was included in the training data. At these filter widths, the predictions for both quantities are visually indistinguishable from the filtered DNS, indicating that the level of accuracy afforded by a manifold with two linear combinations of species and a single variance is likely sufficient in this case. The comparison in Fig. 15 is for a time instance that was withheld from the Case D training data, but to better assess the performance of the model on novel data, it is also applied to a snapshot from Case B, which has the same equivalence ratio and Reynolds number as Case D but a smaller initial kernel that results in failed ignition. The objective of this comparison is to determine whether transferability between cases with similar thermochemical conditions is possible without combining multi-fidelity data as in Section 4.1.3. The predictions for Case B are shown in Fig. 16. Again, there is essentially no visible difference between the predictions and the DNS, demonstrating the ability of the model to transfer between cases with similar conditions. These qualitative observations are verified quantitatively, first by examining the dependence of the prediction error on filter width when applying the model to both Case D and Case B filtered data (Fig. 17). In addition, predictions for a model trained and evaluated on Case B data provide a baseline for assessing the transferability. The first key observation is that the error magnitudes are quite low. The overall RMSEs are on the order of 0.02, indicating that typical errors are on the order of 2% of the standard deviation of each quantity. In dimensional terms, typical errors for T and ω̇_H2O are both orders of magnitude smaller than the peak values observed for these quantities. RMSEs are low across all filter widths, but tend to decrease as the filter width increases for the source terms. As shown in Figs. 15 and 16, the peak values of the source terms decrease as filter widths increase, explaining the lower RMSEs. This effect is less pronounced for temperature. Finally, it is observed that the model trained on Case D performs almost as well on the Case B data as the model trained on that data set, indicating good transferability between the cases.
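The overall error metric quoted here can be sketched as an RMSE over outputs standardized by their (training-set) standard deviations, so that an overall value of 0.02 corresponds to typical errors of roughly 2% of a standard deviation. The function name and shapes are illustrative:

```python
import numpy as np

def overall_rmse(y_true, y_pred, y_std):
    """RMSE over outputs standardized by per-variable standard deviations,
    so each output contributes equally regardless of its raw scale."""
    z = (y_pred - y_true) / y_std
    return float(np.sqrt(np.mean(z**2)))
```

Standardizing before averaging is what allows a single "overall" number to summarize errors in quantities with vastly different magnitudes, such as temperature and species production rates.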
A similar comparison is made to validate the predictions of the model as the ignition kernels evolve over time (Fig. 18). For the Case D data, the overall error is roughly constant, other than a slight initial increase as the laminar flame profile initialization adapts to the influence of the turbulent flow field. In contrast, the overall error for Case B decreases over time, which is driven by a decrease in the prediction error for the source terms that occurs as the kernel is quenched and chemical reactions cease.
Overall, the results shown in Figs. 15 to 18 demonstrate not only that the CMLM approach can identify a model that includes a manifold parameterization, nonlinear mapping, and subfilter closure from the DNS data, but also that this model has some ability to transfer between cases with differing conditions and final outcomes. A physical justification for this ability to transfer is that both cases are based on data for the same thermochemical conditions (φ, T_0, p) and Reynolds number, meaning that the data should lie on similar low-dimensional manifolds, although the distribution of points between quenched and reacting states is very different for the two cases. The presence of locally extinguished pockets in the Case D data provides training data at the same thermochemical states that are prevalent in the globally quenched Case B data.

Discussion of computational cost
The relative computational cost of the modeling approaches compared in this work is an important consideration. There are three separate areas where computational cost must be considered: data acquisition, model training, and model evaluation during simulation. Because the nonlinear mapping portion of the network that is used for model evaluation is the same for CMLM, PCA+ANN, and FGM+ANN, the cost of model evaluation is identical for all approaches for a fixed manifold dimensionality. However, as shown previously, the CMLM approach may require fewer dimensions for the same accuracy as the alternatives, which would reduce the cost of simulations by reducing the number of transported quantities. The training cost is also quite similar for the different approaches, although the additional trainable parameters in the manifold definition layer for CMLM do lead to a slight increase in cost. But the networks evaluated in this work train in minutes on a single processor, so training cost is not a major concern.
Comparison of the cost of generating the training data is more complex. While FGM+ANN models are always based on one-dimensional flame calculations, the data-driven models may utilize data from a variety of sources, including both one-dimensional flame calculations and DNS. If these models are trained based only on one-dimensional flames, as in Sections 4.1.1 and 4.2.1, then the data acquisition costs for all approaches are identical, and typically insignificant relative to the cost of LES. When data-driven approaches utilize data from other sources, such as DNS, the additional cost incurred to obtain those data must be considered. In general, when using DNS data, the objective of data-driven models is to leverage existing data sets where they are available, as the cost of running DNS for the particular purpose of model training would be prohibitive. When using existing data sets, there may be DNS data available for a configuration of interest at a single set of conditions, while it is desired to perform LES over a range of conditions or perturbations of the geometry. For example, the a priori evaluation of CMLM trained for Case D on the Case B data indicates an ability to model the effect of varied kernel size for the ignition kernel problem. The multi-fidelity approach described in Section 4.1.3 provides a path to do this for more significant changes in conditions. Finally, an area for future investigation is model training based on intermediate-fidelity calculations, for example two-dimensional simulations, which would not have prohibitive data acquisition costs and may more fully represent the relevant thermochemical states than one-dimensional flames.

Conclusions
The CMLM approach is a data-driven method of generating manifold-based models for turbulent combustion that combines dimension reduction, calculation of reaction rates and other thermochemical quantities, and, optionally, subfilter closure for LES into the structure of a single neural network. This allows the three aspects to be co-optimized, building upon already successful manifold-based approaches such as PCA and many physics-based manifold models, which generally consider these aspects separately. However, the same general form is maintained, with the manifold being defined based on linear combinations of species, allowing equivalent equations for the manifold-parameterizing variables to be used. When subfilter closure is considered, the mapping from filtered manifold variables and their subfilter variances to filtered output quantities is learned directly, eliminating the need to develop additional closure models, such as presumed PDFs. Another key aspect of the approach is the tunable sparsification of the linear combinations, which aids interpretability.
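A minimal forward pass with this overall structure might look as follows. The layer sizes, activation, and random weights are purely illustrative; the trained model, its sparsification, and its architecture details are those described in the paper, not this sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
n_species, n_xi, n_hidden, n_out = 9, 2, 16, 3

# Manifold-definition layer: xi = Y @ W, linear combinations of species
# (trained jointly with the mapping below, unlike fixed FGM/PCA weights).
W_manifold = rng.normal(size=(n_species, n_xi))

# Nonlinear mapping from (filtered xi, subfilter variances of xi) to
# filtered outputs such as temperature and reaction rates.
W1 = rng.normal(size=(2 * n_xi, n_hidden)) / np.sqrt(2 * n_xi)
b1 = np.zeros(n_hidden)
W2 = rng.normal(size=(n_hidden, n_out)) / np.sqrt(n_hidden)
b2 = np.zeros(n_out)

def cmlm_forward(Y_filtered, xi_var):
    xi = Y_filtered @ W_manifold                       # manifold variables
    h = np.tanh(np.concatenate([xi, xi_var], axis=-1) @ W1 + b1)
    return h @ W2 + b2                                 # filtered outputs
```

Because the manifold-definition layer is linear in the species, the learned ξ variables admit standard transport equations in an LES solver, just as FGM progress variables or principal components do.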
CMLM can be applied to generate models from any source of thermochemical state data, and has been demonstrated in this work using a priori tests based on data from one-dimensional premixed flames and DNS studies of turbulent premixed ignition kernels. Using the one-dimensional flame data, CMLM generates a model nearly identical to FGM, whereas PCA fails to generate a suitable model when limited to the known true manifold dimensionality for these data sets. In fact, CMLM even slightly improves accuracy relative to FGM by optimizing the definition of the progress variable, and automatically identifies Y_CO as a variable to encode the effect of strain. In this limited context, CMLM can be interpreted as a method of optimizing the progress variable and other manifold variables for FGM or similar physics-based methods. A greater capability of CMLM is the generation of data-driven models from complex data without a known physics-based manifold model. In this context, the co-optimization of the manifold definition and the nonlinear mapping advances the complexity-accuracy Pareto front relative to the PCA baseline, so CMLM may be able to reduce the cost or further improve the accuracy of an approach that has already shown promise across many combustion modeling studies. The simple subfilter closure strategy was also demonstrated to yield accurate predictions across filter widths. The accuracy of the representation of resolved data shown in Section 4.1 also indicates that CMLM accurately encodes the thermochemistry, so it could be coupled to alternative subfilter closure models if desired.
Although CMLM, like all manifold-based approaches, is intrinsically limited to the combustion regime and the general thermochemical conditions of the data used to generate the manifold, good transferability was shown between cases where differing initial kernel sizes at the same Reynolds number and thermochemical conditions led to different global outcomes (sustained vs. quenched combustion) and therefore to data sets that emphasize different regions of thermochemical state space. Finally, it was demonstrated that "data-augmented" models may be generated by combining DNS data with one-dimensional flames. This strategy improves predictions for thermochemical conditions outside the scope of the available DNS data, provided that relevant one-dimensional flames can be computed. This work also has several implications for physics-based modeling, including supporting the use of Y_CO as a strain-encoding parameter; demonstrating that, in these premixed cases, including only a single subfilter variance is sufficient; and pointing toward the importance of differential diffusion in the cases studied. The test cases presented in this work, which include predictions for both fully resolved and filtered data, demonstrate the potential benefits of the approach relative to existing methods; the next step for this line of research is confirmation through a posteriori testing.

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Fig. 3. Weights defining the manifold variables (bars) and one-dimensional flame data plotted in manifold space (scatter) for the three modeling approaches applied to the one-dimensional unstrained flame data, with three representative solutions from the CMLM approach. The three CMLM solutions correspond to different initial weights for training the nonlinear mapping portion of the network, but are otherwise identical. Note that the weights are based on the standardized values of the species mass fractions to better indicate the contribution of each species to variation of the ξ_i, and the weights have been normalized so that the largest weight for each manifold dimension has unit magnitude. In the manifold-space plots, training data are colored according to true temperature, and validation data are indicated with black points. An animated figure showing how the CMLM manifold definition evolves throughout the training process is available in the Supplementary Material.

Fig. 4. Comparisons of true and predicted values for temperature and net CO2 production rate in the one-dimensional unstrained flame training data (dark points) and validation data (light points) for the three modeling approaches. The dashed black lines represent parity between the true and predicted values.

Fig. 5. Weights defining the manifold variables (scaled as in Fig. 3) and one-dimensional strained flame training data, plotted in manifold space and colored by the true temperature, for the three modeling approaches. Only data for T_0 = 300 K and p = 1 atm are plotted.

Fig. 6. Comparisons of true and predicted values for temperature and net CO2 production rate in the one-dimensional strained flame training data (dark points) and validation data (light points) for the three modeling approaches. The dashed black lines represent parity between the true and predicted values.

Fig. 7. Training and validation root mean squared error (RMSE) (solid light and dashed dark lines, respectively) as a function of manifold dimensionality for the PCA+ANN and CMLM modeling approaches applied to Case D of the ignition kernel DNS. Here and subsequently, the "overall" error metric is computed from the standardized data for all output variables, such that each contributes equally regardless of scale in the raw data.

Fig. 8. Weights defining the manifold variables (top, scaled as in Fig. 3) and training data from the Case D ignition kernel plotted in manifold space, colored by the true temperature (middle) and local equivalence ratio (bottom).

Fig. 9. Application of the trained models to a slice of data from the Case D ignition kernel at a time instance that was withheld from training, with comparisons between the modeled and true (DNS) values of temperature (top row) and net H2O production rate (bottom row), as well as physical-space comparisons.

Fig. 10. Application of the CMLM approach to different combinations of data sets, including two equivalence ratios (φ = 1, indicated by blue points, and φ = 0.7, indicated by orange points) for both DNS (dark shades) and one-dimensional flame data (light shades). In all cases, the Case D DNS data with φ = 1 are used for validation, but different combinations are used for training, as indicated in the columns. The manifold-space distribution of the data is depicted in the top row. The middle and bottom rows show comparisons of predicted values against true values for temperature and net H2O production rate, respectively.

Fig. 11. Training and validation root mean squared error (RMSE) (solid light and dashed dark lines, respectively) as a function of the number of included variances for the three modeling approaches, all with two-dimensional manifolds, applied to filtered one-dimensional strained flames.

Fig. 14. Weights defining the manifold variables (scaled as in Fig. 3) and training data from the filtered Case D ignition kernel, plotted in manifold space and colored by the true temperature. Results are shown for models with two species-based manifold parameters and zero, one, or two subfilter variances included, as indicated by the column labels.

Fig. 15. Physical-space comparisons of true and predicted values for temperature (top row) and net H2O production rate (bottom row) for a time instance of the filtered Case D ignition kernel that was withheld from training, at two different applied filter widths.

Fig. 16. Physical-space comparisons of true and predicted values for temperature (top row) and net H2O production rate (bottom row) for a single time instance of the Case B ignition kernel at two different applied filter widths, based on models trained on the filtered Case D ignition kernel.

Fig. 17. RMSE for the CMLM approach applied to filtered ignition kernel data with various combinations of training and validation (test) data sets, as a function of filter width.

Fig. 18. RMSE for the CMLM approach applied to filtered ignition kernel data with various combinations of training and validation (test) data sets, as a function of normalized time τ = t/t_eddy during the evolution of the ignition kernels.

Table 1
Comparison of Manifold-based Modeling Approaches.