Incorporating Physical Constraints#
Despite the powerful capabilities of diffusion- and flow-based networks for generative modeling that we discussed in the previous sections, there is no direct feedback loop between the network, the observation, and the sample at training time. This means there is no direct mechanism to include physics-based constraints such as priors from PDEs. As a consequence, it's very difficult to produce highly accurate samples based on learning alone: for scientific applications, we often want to ensure that the errors can be driven below any chosen threshold.
In this chapter, we will outline strategies to remedy this shortcoming. Building on the content of previous chapters, the central goal of both methods presented below is to bring differentiable simulations back into the training and inference loop. The previous chapters have shown that they're very capable tools, so the main question is how best to employ them in the context of diffusion modeling.
Note
Below we’ll focus on the inverse problem setting from Introduction to Probabilistic Learning. I.e., we have a system
Guiding Diffusion Models#
Having access to a physical model with a differentiable simulation
**Training with physics priors:** The hope of incorporating physics-based signals in the form of gradients at training time would be to improve the state of
**Inference with physics priors:** For scientific applications, classic simulations typically yield control knobs that allow for choosing a level of accuracy. E.g., iterative solvers for linear systems provide iteration counts and residual thresholds, and if a solution is not accurate enough, a user can simply reduce the residual threshold to obtain a more accurate output. In contrast, neural networks typically come without such controls, and even the iteration count of denoising or velocity integration (for flow matching) is bounded in terms of final accuracy: more steps typically reduce noise, and correspondingly the error, but the accuracy will plateau at a level given by the capabilities of the trained model. This is exactly where the gradients of a physics solver show promise: they provide an external process that can guide and improve the output of a diffusion model. As we'll show below, this makes it possible to push the levels of accuracy beyond those of pure learning, and can yield inverse problem solvers that outperform traditional solvers.
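To make this concrete, here is a minimal sketch of such gradient-based guidance at inference time, written in PyTorch. The denoiser API `model.denoise_step`, the simulator `sim`, and the observation `y_obs` are placeholder assumptions for illustration, not the exact interface used later:

```python
import torch

def guided_sampling(model, sim, y_obs, steps=50, guidance_scale=0.1):
    """Denoise from pure noise, adding a physics-gradient correction per step.

    `model` is a trained denoiser, `sim` a differentiable simulator, and
    `y_obs` the observation; all names here are placeholder assumptions.
    """
    x = torch.randn_like(y_obs)           # start from Gaussian noise
    for i in reversed(range(steps)):
        t = torch.tensor(i / steps)
        x = model.denoise_step(x, t)      # regular learned update (assumed API)
        # physics guidance: gradient of the data-fit residual w.r.t. x
        x = x.detach().requires_grad_(True)
        residual = ((sim(x) - y_obs) ** 2).sum()
        grad = torch.autograd.grad(residual, x)[0]
        x = (x - guidance_scale * grad).detach()
    return x
```

The key point is the external correction step: the simulator gradient provides a feedback signal that the purely learned update lacks, and `guidance_scale` acts as a control knob for trading compute against accuracy.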
Recall that for denoising, we train a noise estimator
While this approach manages to include

Physics-Guided Flow Matching#
To reintroduce control signals from simulators into the flow matching algorithm, we'll follow [HT23]. The goal is to augment an existing pretrained flow-based network, as outlined in Introduction to Probabilistic Learning, with a flexible control signal, aggregating the learned flow and the control signals into a controlled flow. This is the task of a second neural network, the control network, which ensures that the posterior distribution is not negatively affected by the signals from the simulator. This second network is small compared to the pretrained flow network, and freezing the weights of the pretrained network works very well; thus, the refinement for control requires only a fairly small number of additional parameters and modest computing resources.
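As a rough sketch of this two-network setup (the names and the additive aggregation below are assumptions for illustration; [HT23] describes the actual architecture), the controlled flow could be assembled as follows:

```python
import torch
import torch.nn as nn

class ControlledFlow(nn.Module):
    """Sketch: a frozen pretrained flow plus a small trainable control network."""

    def __init__(self, pretrained_flow, control_net):
        super().__init__()
        self.flow = pretrained_flow
        for p in self.flow.parameters():      # freeze the pretrained network
            p.requires_grad_(False)
        self.control = control_net            # small, trainable

    def forward(self, x, t, control_signal):
        v = self.flow(x, t)                               # learned velocity
        correction = self.control(x, t, control_signal)   # control output
        return v + correction                             # controlled flow
```

Freezing the pretrained weights means that only the small control network is optimized in the second training phase.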
Fig. 28 An overview of the control framework. We will consider a pretrained flow network
The control signals can be based on gradients and a cost function if the simulator is differentiable, but they can also be learned directly from the simulator output. Below, we'll show that the performance gains due to simulator feedback are substantial and cannot be achieved by training on larger datasets alone. Specifically, we'll show that flow matching with simulator feedback is competitive with MCMC baselines in terms of accuracy for a problem from gravitational lensing, while significantly outperforming them in terms of inference time. This makes it a very attractive tool for practical applications.
**Controlled flow**
Then, in a second training phase, a control network
First, the control network is much smaller in size than the regular flow network, making up ca.
**1-step prediction** The conditional flow matching networks
This issue is alleviated by extrapolating
and then performing subsequent operations for control and guidance on
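In standard flow matching notation, with the noise sample $x_0$ at $t=0$ and the data sample $x_1$ at $t=1$ (the conventions here are an assumption and may differ slightly from the symbols above), this extrapolation takes the form

$$\hat{x}_1 = x_t + (1 - t)\, v_\theta(x_t, t) ,$$

i.e., the current learned velocity is assumed to stay constant for the remainder of the trajectory.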
Note that this 1-step prediction is also conceptually related to diffusion sampling with likelihood guidance. For inference in diffusion models, where sampling is based on the conditional score
The first expression can be estimated using a pretrained diffusion network, whereas the latter is usually intractable, but can be approximated using
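Written out in commonly used notation (which may differ slightly from the symbols above), Bayes' rule splits the conditional score into

$$\nabla_{x_t} \log p(x_t | y) = \nabla_{x_t} \log p(x_t) + \nabla_{x_t} \log p(y | x_t) ,$$

where the intractable likelihood term is typically approximated by evaluating it at the 1-step estimate, i.e., $p(y|x_t) \approx p(y|\hat{x}_1)$.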
Physics-based Controls#
Now we focus on the content of the control signal
Fig. 29 Types of control signals. (a) From a differentiable simulator, and (b) from a learned encoder.#
**Gradient-based control signal** In the first case, we make use of a differentiable simulator
Given an observation
As this information is passed to a network, the network can freely make use of the current distance to the target (the value of
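A minimal sketch of computing such a gradient-based control signal, assuming a differentiable PyTorch simulator `sim`, an observation `y_obs`, and a simple L2 cost (all of these are assumptions for illustration):

```python
import torch

def gradient_control_signal(sim, x, y_obs):
    """Compute the cost and its gradient w.r.t. the current sample x.

    Both values can be passed to the control network, which is then free
    to use the distance to the target as well as a direction of improvement.
    """
    x = x.detach().requires_grad_(True)
    cost = ((sim(x) - y_obs) ** 2).mean()   # L2 data-fit cost (assumed)
    grad, = torch.autograd.grad(cost, x)
    return cost.detach(), grad.detach()
```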
**Learning-based control signal** When the simulator is non-differentiable, the second variant of using a learned estimator comes in handy.
To combine the simulator output with the observation
The gradient backpropagation is stopped at the output of the simulator
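A corresponding sketch for the learned variant; `encoder` is a hypothetical trainable network, and the `detach()` call implements the stopped backpropagation at the simulator output mentioned above:

```python
import torch

def learned_control_signal(sim, encoder, x, y_obs):
    """Encode simulator output and observation into a control signal."""
    sim_out = sim(x).detach()   # stop gradients at the simulator output
    # the encoder learns to compare simulated and observed data
    return encoder(torch.cat([sim_out, y_obs], dim=1))
```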

Additional Considerations#
**Stochastic simulators** Many Bayesian inference problems have a stochastic simulator. For simplicity, we assume that all stochasticity within such a simulator can be controlled via a variable
when calling the simulator, we draw a random realization of
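Conceptually, this amounts to making the randomness an explicit argument of the simulator, as in the following toy sketch (the variable names, noise distribution, and the stand-in physics are assumptions):

```python
import torch

def stochastic_sim(x, z=None):
    """Toy simulator whose stochasticity is exposed via an explicit variable z.

    Fixing z makes repeated calls reproducible, while drawing a fresh z per
    call yields a new random realization.
    """
    if z is None:
        z = torch.randn_like(x)   # fresh random realization for this call
    return x + 0.1 * z            # stand-in for the actual physics
```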
**Time-dependence**
If the estimate
In practice,
**Theoretical correctness**
In the formulation above, the approximation
An Example from Astrophysics#
To demonstrate how this guidance from a physics solver affects the accuracy of samples and the posterior, we show an example from strong gravitational lensing: a challenging inverse problem in astrophysics that requires precise posteriors for accurate modeling of observations. In galaxy-scale strong lenses, light from a source galaxy is deflected by the gravitational potential of a galaxy between the source and the observer, causing multiple images of the source to be seen. Traditional computational approaches require several minutes to many hours or days to model a single lens system, so there is a strong incentive to reduce the computational cost of inference with learning-based methods. This experiment shows that flow matching with control signals and feedback from a simulator gives posterior distributions for lens modeling that are competitive with the posteriors obtained by MCMC-based methods, while being much faster at inference.
Fig. 30 Results from flow matching for reconstructing gravitational lenses. Left: flow matching with a differentiable simulator (bottom) clearly outperforms pure flow matching (top). Right: comparisons against classic baselines. The FM+simulator variant is more accurate while being faster.#
The image above shows an example reconstruction and the residual errors. While flow matching and the physics-based variant are both very accurate (it's hard to make out differences visually), the FM version is merely on par with classic inverse solvers. The version with the simulator, however, provides a substantial boost in accuracy that is very difficult to achieve even for classic solvers. The quantitative results are shown in the table on the right: the best classic baseline is AIES with an average
At the same time, the modeling-time numbers in the right column show that the FM variant clearly outperforms the classic solvers. While the simulator increases inference time compared to the neural network alone (10s to 19s), the classic baselines require more than
A summary of the physics-based flow matching is given by the following bullet points:
✅ Pro:
Improved accuracy over purely learned diffusion models
Gives control over residual accuracy
Reduced runtime compared to traditional inverse solvers
❌ Con:
Requires a differentiable physical process
Increased computational cost

Score Matching with Differentiable Physics#
So far we have treated the diffusion time of denoising and flow matching as a process that is purely virtual and orthogonal to the time of the physical process to be represented by the forward and inverse problems. This is the most generic viewpoint, and works nicely, as demonstrated above. However, it’s interesting to think about the alternative: merging the two processes, i.e., treating the diffusion process as an inherent component of the physics system.
Fig. 31 The physics process (heat diffusion as an example, left) perturbs and “destroys” the initial state. At inference time (right, Buoyancy flow as an example), the solver is used to compute inverse steps and produce solutions by combining steps along the score and the gradient of the solver.#
The following sections will explain such a combined approach, following the paper “Solving Inverse Physics Problems with Score Matching” [HVT23], for which code is available in this repository.
This approach solves inverse physics problems by leveraging the ideas of score matching. The system’s current state is moved backward in time step by step by combining an approximate inverse physics simulator and a learned correction function. A central insight of this work is that training the learned correction with a single-step loss is equivalent to a score matching objective, while recursively predicting longer parts of the trajectory during training relates to maximum likelihood training of a corresponding probability flow. The resulting inverse solver exhibits good accuracy and temporal stability. In line with diffusion modeling and in contrast to classic learned solvers, it allows for sampling the posterior of the solutions. The method will be called SMDP (for Score Matching with Differentiable Physics) in the following.
Training and Inference with SMDP#
For training, SMDP fits a neural ODE, the probability flow, to the set of perturbed training trajectories. The probability flow comprises an approximate reverse physics simulator
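As an illustration of the single-step objective mentioned above, here is a sketch with placeholder names: `reverse_physics` stands for one approximate inverse-simulator step, `s_theta` for the learned correction, and the exact combination and scaling of the terms, which follow the paper, are simplified:

```python
import torch

def single_step_loss(reverse_physics, s_theta, x_prev, x_next, t, dt):
    """Single-step training objective (sketch).

    One approximate backward step from x_next, plus the learned correction,
    should reproduce the earlier state x_prev of the recorded trajectory.
    The paper shows this objective is equivalent to score matching.
    """
    x_pred = reverse_physics(x_next, dt) + dt * s_theta(x_next, t)
    return ((x_pred - x_prev) ** 2).mean()
```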
Fig. 32 Overview of the score matching training process while incorporating a physics solver
A differentiable solver or a learned surrogate model is employed for
In this equation, the term
is integrated via the Euler-Maruyama method to obtain a solution for the inverse problem.
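A corresponding sketch of this reverse-time integration, again with the placeholder names from above and simplified coefficients:

```python
import torch

def smdp_inference(x_T, reverse_physics, s_theta, n_steps, T=1.0, g=1.0):
    """Euler-Maruyama integration backward from t=T to t=0 (sketch).

    Each step combines one approximate inverse-physics step, the learned
    score-based correction, and injected Gaussian noise. Setting g=0
    removes the stochastic term and yields a deterministic update.
    """
    dt = T / n_steps
    x = x_T
    for i in reversed(range(n_steps)):
        t = (i + 1) * dt
        x = reverse_physics(x, dt) + dt * g**2 * s_theta(x, t)
        x = x + g * (dt**0.5) * torch.randn_like(x)  # noise term of the SDE
    return x
```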
Setting
Fig. 33 An overview of SMDP at inference time.#
SMDP in Action#
This section shows experiments for the stochastic heat equation.
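In a generic form (the diffusivity $\alpha$ and noise amplitude $g$ below are placeholder symbols, not necessarily the paper's coefficients), the stochastic heat equation reads

$$\frac{\partial u}{\partial t} = \alpha \, \nabla^2 u + g \, \xi ,$$

where $\xi$ denotes a noise term that perturbs the smoothing dynamics of the heat equation.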
In this experiment, the forward solver cannot be used to infer
A small ResNet-like architecture with encoder and decoder parts is used to represent the score function
Fig. 34 While the ODE trajectories provide smooth solutions with the lowest reconstruction MSE, the SDE solutions synthesize high-frequency content, significantly improving the spectral error.#
The ``
SMDP and the baselines are evaluated by considering the reconstruction MSE on a test set of
This highlights the role of noise as a source of entropy in the inference process for diffusion models, such as the SDE in SMDP, which is essential for synthesizing small-scale structures. Note that there is a natural tradeoff between the two metrics: the ODE and SDE inference each perform best on one of them while using an identical set of weights. This heat diffusion example highlights the advantages and properties of treating the physical process as part of the diffusion process. This, of course, extends to other physics; e.g., the SMDP repository additionally shows a case with an inverse Navier-Stokes solve.
Summary of Physics-based Diffusion Models#
Overall, the sections above have explained two methods to incorporate physics-based constraints and models in the form of PDEs into diffusion modeling. Interestingly, the inclusion is largely in line with Introduction to Differentiable Physics, i.e., gradients of the physics solver are a central quantity, and concepts like unrolling play an important role. On the other hand, the probabilistic modeling introduces additional complexity on the training and inference sides. It provides powerful tools and access to distributions of solutions (we haven't even touched on follow-up applications such as uncertainty quantification above), but this comes at a cost.
As a rule of thumb 👍, diffusion modeling should only be used if the solution is a distribution that is not well represented by the mean of the solutions. If the mean is acceptable, “regular” neural networks offer substantial advantages in terms of reduced complexity for training and inference.
However, if the solutions form a distribution 🌦️, diffusion models are powerful tools for working with complex and varied solutions. Given their capabilities, deep learning with diffusion models arguably introduces surprisingly little additional complexity. E.g., training flow matching models is remarkably robust, can be built on top of deterministic training, and introduces only a mild computational overhead.
To show how the combination of physics solvers and diffusion models turns out in terms of an implementation, the next section shows source code for an SMDP use case.