Scientific Modeling Principles and Differential Equations

ML for Science - Lecture 4

Understanding why differential equations are the language of physics

Today's Goal

Understand the principles behind scientific modeling:

  • Why do differential equations appear everywhere in physics?
  • What are the inductive biases of scientific modeling?
  • How do simple laws combine into complex equations?
Key insight: These principles will help you understand scientific machine learning—incorporating physical constraints into ML models.

Recap: From Empirical Laws to Models

Lecture 2

Fitting data to known functional forms

  • Hooke's law: $F = kx$
  • Ohm's law: $V = IR$
  • Mapping one variable to another

Today

How do we derive those functional forms?

  • From first principles
  • Conservation laws
  • Functions of space and time

Why Simple Models?

A central theme in both ML and science: generalization

  • Fast inference: Quick predictions are useful
  • Generalize beyond training: Not just a lookup table
  • Interpretable: Understand what the model is doing
The tension: Simple enough to generalize, complex enough to capture reality

Inductive Bias #1: Space and Time

Space and time are special variables in physics.

We're interested in functions that map positions in space and time to measured quantities:

$u: (x, y, z, t) \mapsto \text{measured quantity}$

This is different from empirical laws like $V = IR$, which relate variables at a single instant.

Coordinate Systems and Units

  • Position: Requires a reference point (coordinate system)
  • Time: Requires a clock
  • Units: Meters, seconds, kilograms...
Units are often ignored in ML (we just normalize). But in physics, units encode physical meaning and reveal which effects dominate.

The Only Constant is Change

"Nothing stays the same in space and time. Everything changes."

Yet we search for laws that remain constant.

And these laws describe how things change.

Quantifying Change: From Data

Given measurements $u_i$ at times $t_i$:

$\frac{\Delta u}{\Delta t} = \frac{u_i - u_{i-1}}{t_i - t_{i-1}}$
As $\Delta t \to 0$, slope $\to$ derivative
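The slope formula above can be tried directly on sampled data. The following is a minimal sketch, assuming hypothetical measurements of $u(t) = t^2$ (not from the lecture); the helper name `finite_difference_slopes` is illustrative.

```python
def finite_difference_slopes(t, u):
    """Slopes (u_i - u_{i-1}) / (t_i - t_{i-1}) for i = 1 .. len(t) - 1."""
    return [(u[i] - u[i - 1]) / (t[i] - t[i - 1]) for i in range(1, len(t))]

# Hypothetical data: u(t) = t**2, whose exact derivative at t = 1 is 2.
for dt in (0.1, 0.01, 0.001):
    t = [1.0, 1.0 + dt]
    u = [ti ** 2 for ti in t]
    slope = finite_difference_slopes(t, u)[0]
    print(f"dt={dt}: slope={slope:.4f}, gap to derivative={abs(slope - 2.0):.4f}")
```

For this choice of $u$, the gap between the measured slope and the derivative shrinks in proportion to $\Delta t$.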

From Discrete to Continuous

From Data (Discrete)

$\displaystyle \frac{u_i - u_{i-1}}{t_i - t_{i-1}}$

Finite differences between measurements

Idealized (Continuous)

$\displaystyle \frac{\partial u}{\partial t} \; = \; \lim_{\Delta t \to 0} \frac{u(t+\Delta t) - u(t)}{\Delta t}$

Instantaneous rate of change

Key insight: We go discrete → continuous for mathematical convenience, then back to discrete for computation.

Continuous vs. Discrete

| Sciences | Computer Science |
|---|---|
| Continuous systems | Discrete systems |
| Smooth functions | States and transitions |
| Derivatives | Graphs |

These are connected: continuous emerges from taking limits of discrete. Both are idealizations.

Galileo's Discovery

Dropping objects from the Tower of Pisa (allegedly):

  • Position $x(t)$ increases...
  • Velocity $v = \frac{dx}{dt}$ also increases...
  • But acceleration $a = \frac{d^2x}{dt^2}$ is constant!
$g \approx 9.8 \text{ m/s}^2$

The same for all objects, everywhere on Earth (approximately).

Newton's Second Law

$F = ma = m\frac{d^2 x}{dt^2}$

For free fall: $F = mg$, so:

$\frac{d^2 x}{dt^2} = g$

This is a differential equation—an equation involving derivatives.

Solving the Equation

Integrate once:

$\frac{dx}{dt} = gt + v_0$

(velocity)

Integrate again:

$x = \frac{1}{2}gt^2 + v_0 t + x_0$

(position)

A parabola! From a simple differential equation, we derived the shape of a projectile's trajectory.
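One way to check the closed form is to step the differential equation forward in small increments and compare. This is a rough sketch, not a recommended integrator; the function names and the step count are assumptions made for illustration.

```python
G = 9.8  # m/s^2, the approximate value quoted above

def position_closed_form(t, v0=0.0, x0=0.0, g=G):
    """x(t) = (1/2) g t^2 + v0 t + x0, with x measured in the direction of fall."""
    return 0.5 * g * t ** 2 + v0 * t + x0

def position_by_stepping(t_end, v0=0.0, x0=0.0, g=G, n_steps=10_000):
    """Integrate dx/dt = v, dv/dt = g with many small forward steps."""
    dt = t_end / n_steps
    x, v = x0, v0
    for _ in range(n_steps):
        x += v * dt  # position changes at the current velocity
        v += g * dt  # velocity changes at the constant acceleration g
    return x

exact = position_closed_form(2.0)
approx = position_by_stepping(2.0)
print(f"closed form: {exact:.4f} m, stepped: {approx:.4f} m")
```

The two values agree to within a small discretization error that shrinks as `n_steps` grows.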

Adding Complexity

With air resistance (drag proportional to velocity), keeping $x$ measured downward as in the free-fall equation:

$m\frac{d^2x}{dt^2} = mg - b\frac{dx}{dt}$
  • First derivative appears (drag term)
  • Solution involves exponentials
  • Pendulums, springs → sine waves

This is what you learn in dynamics courses: finding differential equations from forces.
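A short worked step shows where the exponentials come from. Writing $v = \frac{dx}{dt}$, the drag equation is first order in $v$. Setting $\frac{dv}{dt} = 0$ gives a steady (terminal) velocity $v_T$ of magnitude:

$|v_T| = \frac{mg}{b}$

The deviation $w = v - v_T$ then obeys:

$m\frac{dw}{dt} = -bw \quad \Rightarrow \quad w(t) = w(0)\, e^{-bt/m}$

So the velocity relaxes exponentially toward terminal velocity with time constant $m/b$. The symbols $v_T$ and $w$ are introduced here only for this illustration.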

Inductive Bias #2: Conservation Laws

Conservation laws are everywhere in physics.

The fundamental idea is simple:

Flux in $-$ Flux out $=$ Accumulation

Think of people in a room: if 10 enter and 3 leave, the number increases by 7.

Conservation: Interactive Demo

[Interactive demo: a region holding 5 units, with flux in of 2/s and flux out of 1/s]

The Math: 1D Conservation

Consider a quantity with density $\rho(x,t)$ in a region $[x, x+\Delta x]$:

  • Accumulation: $\frac{\partial \rho}{\partial t} \Delta x$
  • Flux in at $x$: $F(x,t)$
  • Flux out at $x+\Delta x$: $F(x+\Delta x, t)$

Conservation requires:

$F(x,t) - F(x+\Delta x,t) = \frac{\partial \rho}{\partial t} \Delta x$

The Continuity Equation

Divide by $\Delta x$ and take the limit $\Delta x \to 0$:

$-\frac{\partial F}{\partial x} = \frac{\partial \rho}{\partial t}$

Or equivalently:

$\frac{\partial \rho}{\partial t} + \frac{\partial F}{\partial x} = 0$

This is the conservation law form or continuity equation.
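The same bookkeeping carries over to a grid of cells: each cell gains what crosses its left face and loses what crosses its right face. The sketch below is a minimal illustration under assumed values (cell counts, speeds, and the name `conservative_step` are all hypothetical), not a production scheme.

```python
def conservative_step(rho, face_flux, dx, dt):
    """Advance cell averages by one step of d(rho)/dt = -dF/dx.

    rho has n cell values; face_flux has n + 1 values, where face_flux[i]
    enters cell i from the left and face_flux[i + 1] leaves it on the right.
    """
    return [
        rho[i] + (dt / dx) * (face_flux[i] - face_flux[i + 1])
        for i in range(len(rho))
    ]

# One cell as a "room": 5 inside, 2/s entering, 1/s leaving, over 1 s.
room = conservative_step([5.0], [2.0, 1.0], dx=1.0, dt=1.0)

# Several cells with closed ends; interior faces carry F = rho * v (upwind, v > 0).
v, dx, dt = 1.0, 0.1, 0.05
rho = [0.0, 1.0, 2.0, 1.0, 0.0]
faces = [0.0] + [v * r for r in rho[:-1]] + [0.0]
rho_next = conservative_step(rho, faces, dx, dt)
print(room, sum(rho) * dx, sum(rho_next) * dx)
```

Because every interior flux leaves one cell and enters its neighbor, the total is unchanged when nothing crosses the outer boundaries.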

Examples of Conservation

| Conservation of... | Flux $F$ |
|---|---|
| Mass | $F = \rho v$ (density $\times$ velocity) |
| Traffic (cars) | $F = \rho v(\rho)$ (speed depends on congestion) |
| Momentum | $F = \rho v^2 + p$ (includes pressure) |
| Energy | Heat flux, work done |

Higher Dimensions

In 3D with velocity field $\mathbf{v} = (u, v, w)$:

$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$

where the divergence is:

$\nabla \cdot (\rho \mathbf{v}) = \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z}$

This is why we study calculus: to manipulate derivatives and combine them in powerful ways.

Inductive Bias #3: Linearity

Linear systems allow superposition.

A function $L$ is linear if:

$L(aX + bY) = aL(X) + bL(Y)$

Why Linearity Matters

If a system is linear, we can:

  1. Break the problem into simpler parts
  2. Solve each part separately
  3. Add the solutions to get the whole
Instead of understanding the whole (complicated), understand the parts (simple) and combine them.
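The superposition property can be tested numerically on any operator. The sketch below compares a discrete second difference (linear) with pointwise squaring (nonlinear); the sample vectors and helper names are arbitrary choices for illustration.

```python
def second_difference(u):
    """Discrete analogue of d^2/dx^2 on interior points (unit grid spacing)."""
    return [u[i - 1] - 2 * u[i] + u[i + 1] for i in range(1, len(u) - 1)]

def square(u):
    """A simple nonlinear operation applied pointwise."""
    return [x * x for x in u]

def obeys_superposition(op, X, Y, a, b, tol=1e-12):
    """Check op(aX + bY) == a op(X) + b op(Y) for one pair of inputs."""
    combined = op([a * x + b * y for x, y in zip(X, Y)])
    separate = [a * p + b * q for p, q in zip(op(X), op(Y))]
    return all(abs(c - s) <= tol for c, s in zip(combined, separate))

X = [0.0, 1.0, 4.0, 9.0, 16.0]
Y = [1.0, 0.0, 2.0, 0.0, 1.0]
print(obeys_superposition(second_difference, X, Y, 2.0, -3.0))
print(obeys_superposition(square, X, Y, 2.0, -3.0))
```

The derivative-like operator passes the check and the squaring operator fails it. A single passing pair does not prove linearity, but a single failing pair is enough to rule it out.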

The Reality: Nonlinearity

Most real systems are nonlinear.

But whenever possible, we try to:

  • Linearize: Approximate near an operating point
  • Find linear subsystems: Isolate what we can
  • Perturb about equilibria: Study small deviations
Nonlinearity creates the hard problems: turbulence, chaos, no closed-form solutions.
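A familiar instance of linearizing near an operating point is the small-angle approximation $\sin\theta \approx \theta$ used for pendulums. The sketch below is an illustrative check of how quickly that approximation degrades; the function name is an assumption.

```python
import math

def relative_linearization_error(theta):
    """Relative error of replacing sin(theta) by theta (theta in radians, nonzero)."""
    return abs(theta - math.sin(theta)) / abs(math.sin(theta))

for theta in (0.1, 0.5, 1.0):
    err = relative_linearization_error(theta)
    print(f"theta={theta} rad: relative error {err:.4f}")
```

The error is a fraction of a percent near the equilibrium and grows to well over ten percent at one radian, which is why linearized models are trusted only for small deviations.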

Case Study: Navier-Stokes Equations

The equations that describe fluid motion.

They combine:

  • Newton's law of viscosity (empirical law): $\tau = \mu \frac{\partial u}{\partial y}$
  • Conservation of momentum (Newton's 2nd law for fluids): $\frac{D(\rho \mathbf{v})}{Dt} = \text{forces}$

The Equations

For an incompressible Newtonian fluid:

$\rho\left(\underbrace{\frac{\partial \mathbf{u}}{\partial t}}_{\text{local}} + \underbrace{\mathbf{u} \cdot \nabla \mathbf{u}}_{\text{convective}}\right) = \underbrace{-\nabla p}_{\text{pressure}} + \underbrace{\mu \nabla^2 \mathbf{u}}_{\text{viscous}} + \mathbf{f}$

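Solve for: velocity field $\mathbf{u}(x,y,z,t)$ and pressure $p(x,y,z,t)$, closed by the incompressibility constraint $\nabla \cdot \mathbf{u} = 0$
Given: density $\rho$, viscosity $\mu$, body forces $\mathbf{f}$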

Interpreting the Terms

| Term | Physical Meaning |
|---|---|
| $\frac{\partial \mathbf{u}}{\partial t}$ | Local acceleration (change at fixed point) |
| $\mathbf{u} \cdot \nabla \mathbf{u}$ | Convective acceleration (nonlinear!) |
| $-\nabla p$ | Pressure forces (compression) |
| $\mu \nabla^2 \mathbf{u}$ | Viscous forces (shearing between layers) |

The Lagrangian Derivative

The combination:

$\frac{D\mathbf{u}}{Dt} = \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u}$

is the material derivative, also called the Lagrangian derivative.

  • $\frac{\partial \mathbf{u}}{\partial t}$: change at a fixed point
  • $\frac{D\mathbf{u}}{Dt}$: change following a moving particle
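
A one-dimensional example makes the distinction concrete. Assume a hypothetical steady flow $u(x) = ax$ with constant $a > 0$ (not from the lecture):

$\frac{\partial u}{\partial t} = 0, \qquad \frac{Du}{Dt} = u\frac{\partial u}{\partial x} = a^2 x$

The velocity at any fixed point never changes, yet a particle at $x > 0$ accelerates because it is carried into faster-moving fluid.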

The Nonlinearity Problem

The term $\mathbf{u} \cdot \nabla \mathbf{u}$ is nonlinear in $\mathbf{u}$.

This single term:

  • Makes analytical solutions nearly impossible
  • Leaves existence and smoothness of 3D solutions unproven (Millennium Prize!)
  • Gives rise to turbulence and chaos

Yet we build planes and predict weather by solving these equations numerically.

Why Study Navier-Stokes?

It's an archetypal equation because:

  • Appears everywhere (weather, aerodynamics, blood flow)
  • Contains all PDE challenges (nonlinearity, coupled fields)
  • Full of open questions (basic math properties unproven)
Companies like DeepMind invest heavily in making fluid simulation faster.

Building Block PDEs

Before tackling Navier-Stokes, understand simpler canonical PDEs.

  • Advection: transport
  • Diffusion: spreading
  • Wave: oscillation

The Advection Equation

$\frac{\partial u}{\partial t} + c\frac{\partial u}{\partial x} = 0$

Solution: $u(x,t) = u_0(x - ct)$

The initial profile translates at speed $c$.

Physical: Quantity transported without changing shape (e.g., a dye streak carried by a uniform current)
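
The translating solution can be reproduced on a grid with a first-order upwind update. This is a minimal sketch under assumed settings (periodic domain, $c > 0$, a Gaussian initial profile, and a time step chosen so the Courant number $c\,\Delta t/\Delta x$ equals 1); `upwind_step` is an illustrative name.

```python
import math

def upwind_step(u, c, dx, dt):
    """One first-order upwind step for u_t + c u_x = 0, c > 0, periodic in x."""
    nu = c * dt / dx  # Courant number; this scheme needs 0 <= nu <= 1
    # u[i - 1] wraps to the last cell when i == 0, giving periodic boundaries.
    return [u[i] - nu * (u[i] - u[i - 1]) for i in range(len(u))]

n, dx, c = 100, 0.01, 1.0
dt = dx / c  # Courant number 1: each step moves the profile by one cell
u0 = [math.exp(-((i * dx - 0.3) / 0.05) ** 2) for i in range(n)]

u = u0
n_steps = 25
for _ in range(n_steps):
    u = upwind_step(u, c, dx, dt)

# Compare against the exact solution u0(x - c t), i.e. a shift by n_steps cells.
shifted = [u0[(i - n_steps) % n] for i in range(n)]
max_err = max(abs(a - b) for a, b in zip(u, shifted))
print(f"peak moved from cell {u0.index(max(u0))} to cell {u.index(max(u))}")
```

At Courant number 1 this scheme reproduces the shift up to rounding. With a smaller time step it still transports the profile but also smears it, which is a numerical artifact rather than physics.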

The Diffusion Equation

$\frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2}$

Solution: Gaussian spreads as $\sigma(t) = \sqrt{\sigma_0^2 + 2Dt}$

Key insight: Second derivative = curvature
• Local min → $u$ increases
• Local max → $u$ decreases
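
The curvature rule can be seen in a single explicit update step. The sketch below uses a standard forward-time, centered-space discretization on a periodic grid; the grid values and the name `diffusion_step` are assumptions for illustration.

```python
def diffusion_step(u, D, dx, dt):
    """One explicit step of u_t = D u_xx on a periodic grid."""
    r = D * dt / dx ** 2  # this explicit scheme is stable only for r <= 0.5
    n = len(u)
    return [
        u[i] + r * (u[(i - 1) % n] - 2 * u[i] + u[(i + 1) % n])
        for i in range(n)
    ]

spike = [0.0, 0.0, 1.0, 0.0, 0.0]
after = diffusion_step(spike, D=1.0, dx=1.0, dt=0.25)
print(after, sum(after))
```

The peak (a local max, negative curvature) drops, its neighbors (positive curvature) rise, and the total stays the same.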

The Wave Equation

$\frac{\partial^2 u}{\partial t^2} = v^2 \frac{\partial^2 u}{\partial x^2}$

Physical meaning: Oscillatory disturbances propagate at speed $v$.

  • Vibrating strings (guitar)
  • Sound waves in air
  • Electromagnetic waves

Note: Wave equation splits initial pulse into two counter-propagating waves (d'Alembert's solution)
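
For an initial pulse $u_0(x)$ released from rest (zero initial velocity is an assumption of this example), d'Alembert's solution reads:

$u(x,t) = \frac{1}{2}\left[u_0(x - vt) + u_0(x + vt)\right]$

Each half is a profile translating without change of shape, one to the right and one to the left, so each separately satisfies an advection equation with speed $+v$ or $-v$.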

Advection-Diffusion: Combined Effects

$\frac{\partial u}{\partial t} + c\frac{\partial u}{\partial x} = D\frac{\partial^2 u}{\partial x^2}$

Solution: Translates AND spreads

Example: Pollutant in a river
• Carried downstream (advection)
• Disperses over time (diffusion)

Parameters: c = 1.0, D = 0.5
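
For a Gaussian initial profile on an unbounded domain, the combined behavior has a closed form: the center moves at speed $c$ while the width grows as $\sigma(t) = \sqrt{\sigma_0^2 + 2Dt}$. The sketch below evaluates it with the parameters listed above; the initial center `x0`, initial width `sigma0`, and the function name are assumptions.

```python
import math

def gaussian_advection_diffusion(x, t, c=1.0, D=0.5, x0=0.0, sigma0=1.0):
    """Solution of u_t + c u_x = D u_xx for a unit-height Gaussian at t = 0."""
    sigma = math.sqrt(sigma0 ** 2 + 2 * D * t)  # width grows by diffusion
    center = x0 + c * t                         # center moves by advection
    # The height drops by sigma0 / sigma so that the area under u is conserved.
    return (sigma0 / sigma) * math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

for t in (0.0, 1.0, 3.0):
    peak_x = 0.0 + 1.0 * t
    print(f"t={t}: peak at x={peak_x}, height={gaussian_advection_diffusion(peak_x, t):.3f}")
```

With these values the profile has doubled in width and halved in height by $t = 3$, while its center has moved three units downstream.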


Comparing the Three

[Animation: side-by-side comparison of advection (translates), diffusion (spreads), and advection-diffusion (both)]

Reaction-Diffusion

$\frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2} + R(u)$

Physical meaning: Diffusion + local reactions/growth

  • Chemical reactions spreading through a medium
  • Population dynamics with spatial spread
  • Turing patterns: Spots and stripes in biology

The nonlinear reaction term $R(u)$ creates rich pattern-forming behavior.

Case Study: Electronic Circuits

Another source of differential equations from simple laws.

| Component | Law | Type |
|---|---|---|
| Resistor | $V = IR$ | Algebraic |
| Capacitor | $I = C\frac{dV}{dt}$ | Differential |
| Inductor | $V = L\frac{dI}{dt}$ | Differential |

RC and RLC Circuits

RC Circuit (first-order ODE):

$RC\frac{dV}{dt} + V = V_{\text{in}}$

Solution: exponential decay/growth
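
As a concrete case, suppose the capacitor starts uncharged and a constant $V_{\text{in}}$ is switched on at $t = 0$ (an assumed scenario). The solution is then $V(t) = V_{\text{in}}(1 - e^{-t/RC})$. The sketch below evaluates it and checks numerically that it satisfies the ODE; component values and function names are hypothetical.

```python
import math

def rc_step_response(t, R, C, v_in):
    """Capacitor voltage for V(0) = 0 and a constant input switched on at t = 0."""
    return v_in * (1.0 - math.exp(-t / (R * C)))

def ode_residual(t, R, C, v_in, h):
    """RC dV/dt + V - V_in, with dV/dt from a central difference of half-width h."""
    dv_dt = (rc_step_response(t + h, R, C, v_in)
             - rc_step_response(t - h, R, C, v_in)) / (2 * h)
    return R * C * dv_dt + rc_step_response(t, R, C, v_in) - v_in

R, C, V_IN = 1e3, 1e-6, 5.0  # hypothetical values: 1 kOhm, 1 uF, 5 V
tau = R * C                  # time constant, 1 ms for these values
fraction_at_tau = rc_step_response(tau, R, C, V_IN) / V_IN
residual = ode_residual(tau, R, C, V_IN, h=1e-7)
print(f"V(tau)/V_in = {fraction_at_tau:.3f}, ODE residual = {residual:.2e}")
```

After one time constant $RC$ the voltage has covered about 63% of the distance to $V_{\text{in}}$, and the residual is zero up to finite-difference error.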

RLC Circuit (second-order ODE):

$LC\frac{d^2V}{dt^2} + RC\frac{dV}{dt} + V = V_{\text{in}}$

Solutions: oscillations, damping, resonance
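
Which of these behaviors appears depends on the roots of the characteristic equation $LCs^2 + RCs + 1 = 0$. A common way to summarize this is through the natural frequency $\omega_0 = 1/\sqrt{LC}$ and damping ratio $\zeta = \frac{R}{2}\sqrt{C/L}$. The sketch below classifies the regime; the function name and tolerance are assumptions.

```python
import math

def rlc_damping(R, L, C, tol=1e-12):
    """Return (omega0, zeta, regime) for LC V'' + RC V' + V = V_in."""
    omega0 = 1.0 / math.sqrt(L * C)    # undamped natural frequency, rad/s
    zeta = 0.5 * R * math.sqrt(C / L)  # damping ratio (dimensionless)
    if abs(zeta - 1.0) <= tol:
        regime = "critically damped"
    elif zeta < 1.0:
        regime = "underdamped"         # complex roots: decaying oscillation
    else:
        regime = "overdamped"          # two real roots: no oscillation
    return omega0, zeta, regime

for R in (0.2, 2.0, 20.0):             # hypothetical resistances with L = C = 1
    print(R, rlc_damping(R, L=1.0, C=1.0))
```

Small resistance gives ringing, large resistance gives a sluggish return, and the boundary between them is $\zeta = 1$.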

Application: Synthesizers

Combinations of R, L, C components can shape signals.

  • Design circuits that produce specific waveforms
  • Approximate sounds (violin, piano, etc.)
  • Inverse problem: given a sound, what circuit produces it?
Modern audio processing and voice transfer use similar principles—often modeled as differential equations.

Don't Forget: Units Matter

Physical quantities have units.

ML approach:

  • Normalize everything
  • Mean 0, variance 1
  • Units are lost

Science approach:

  • Units encode meaning
  • Dimensional consistency
  • Scale reveals dominance
Dimensional analysis: By checking units, you can catch errors, derive relationships, and identify which terms dominate at different scales.
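
Unit checking can be mechanized by tracking exponents of the base dimensions. The sketch below is a toy illustration (real projects would typically use a units library); the tuple encoding and helper names are assumptions.

```python
# Dimensions as exponent tuples (mass, length, time); velocity is (0, 1, -1).
MASS, LENGTH, TIME = (1, 0, 0), (0, 1, 0), (0, 0, 1)
DIMENSIONLESS = (0, 0, 0)

def mul(a, b):
    """Dimensions of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def div(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

velocity = div(LENGTH, TIME)
force = mul(MASS, div(velocity, TIME))  # F = m a

# Diffusion equation u_t = D u_xx: for any units of u, D must carry L^2 / T.
u_dims = (1, -3, 0)                     # hypothetical: u is a mass density
diffusivity = div(mul(LENGTH, LENGTH), TIME)
lhs = div(u_dims, TIME)
rhs = div(mul(diffusivity, u_dims), mul(LENGTH, LENGTH))

# Ratio of advection to diffusion, c * L / D, must be a pure number.
advection_to_diffusion = div(mul(velocity, LENGTH), diffusivity)
print(force, lhs == rhs, advection_to_diffusion == DIMENSIONLESS)
```

The last quantity, $cL/D$, is commonly called the Péclet number: when it is large, advection dominates; when it is small, diffusion dominates.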

Solving Differential Equations

Once we have a differential equation, what do we do with it?

  • Analytical: closed-form solutions (linear, simple BCs)
  • Numerical: discretize and compute (nonlinear, complex)

The Detour Through Continuity

We started discrete: $\frac{\Delta u}{\Delta t}$

Went to continuous: $\frac{\partial u}{\partial t}$

Now go back to discrete for computation.

Why the detour?
Continuous formulation eliminates dependence on grid size, reveals mathematical structure, and allows powerful manipulations.

Differential Equations as Generative Models

Given initial conditions, predict the future:

  1. Know the state at time $t$
  2. Compute the change: $\frac{du}{dt} = f(u, t)$
  3. Update: $u(t + \Delta t) = u(t) + \frac{du}{dt} \cdot \Delta t$
  4. Repeat
This is autoregressive—the output at each step becomes the input for the next. Like language models generating text token by token.
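
The four steps above can be sketched as a short loop (this update rule is usually called the forward Euler method). The example equation $\frac{du}{dt} = -u$, the step size, and the name `euler_rollout` are assumptions chosen for illustration.

```python
import math

def euler_rollout(f, u0, t0, dt, n_steps):
    """Repeatedly apply u <- u + f(u, t) * dt, returning the whole trajectory."""
    u, t = u0, t0
    trajectory = [u0]
    for _ in range(n_steps):
        u = u + f(u, t) * dt  # steps 2 and 3: compute the change, then update
        t = t + dt
        trajectory.append(u)  # step 4: the new state feeds the next iteration
    return trajectory

# Hypothetical test problem: du/dt = -u with u(0) = 1, exact solution exp(-t).
traj = euler_rollout(lambda u, t: -u, u0=1.0, t0=0.0, dt=0.001, n_steps=1000)
print(f"Euler u(1) = {traj[-1]:.5f}, exact = {math.exp(-1.0):.5f}")
```

As with autoregressive text generation, any error made at one step is carried into every later step, which is one reason step size and stability matter.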

All Models Make Assumptions

Every model simplifies reality. Common assumptions:

| Assumption | Reality |
|---|---|
| Newtonian fluid | Non-Newtonian (toothpaste, blood) |
| Linear response | Nonlinear at large amplitudes |
| Continuous medium | Discrete molecules |
| Deterministic | Stochastic/noisy |

When Models Fail

"If a model doesn't work, ask: What assumptions did I make?"

Often we're not aware of our assumptions until we examine the building blocks.

Identifying violated assumptions is key to improving models.

"Assumption-Free" Models?

Deep learning promises to learn directly from data.

But there's no such thing as truly assumption-free:

  • Architecture choices are assumptions
  • Training data is an assumption
  • Optimization is an assumption
The inductive biases we discussed (space-time, conservation, linearity) help models generalize. Scientific ML encodes these explicitly.

Key Principles

  1. Space and time are special: $u(x,t)$ is the natural language of physics
  2. Change is fundamental: Derivatives quantify local change
  3. Conservation laws: Flux in - flux out = accumulation
  4. Linearity enables superposition: Solve parts, combine for whole
  5. Simple laws → complex equations: Viscosity + momentum = Navier-Stokes
  6. Nonlinearity is the challenge: Turbulence, chaos, no closed forms
  7. Units matter: They encode meaning and reveal scales
  8. All models make assumptions: Know yours

Looking Ahead

  • Next lecture: numerical methods, discretization, stability
  • Later: nonlinear dynamics, chaos, strange attractors
  • Then: deep learning for PDEs, PINNs, SINDy, data-driven discovery

Building toward scientific machine learning

Questions?

See you next time!