Scientific Modeling Principles and Differential Equations

ML for Science - Lecture 4

Understanding why differential equations are the language of physics

Today's Goal

Understand the principles behind scientific modeling:

Why do differential equations appear everywhere in physics?
What are the inductive biases of scientific modeling?
How do simple laws combine into complex equations?

        Key insight: These principles will help you understand scientific machine learning—incorporating physical constraints into ML models.
    

Recap: From Empirical Laws to Models

Lecture 2

Fitting data to known functional forms

Hooke's law: $F = kx$
Ohm's law: $V = IR$
Mapping one variable to another

Today

How do we derive those functional forms?

From first principles
Conservation laws
Functions of space and time

Why Simple Models?

A central theme in both ML and science: generalization

Fast inference: Quick predictions are useful
Generalize beyond training: Not just a lookup table
Interpretable: Understand what the model is doing

The tension: Simple enough to generalize, complex enough to capture reality

Inductive Bias #1: Space and Time

Space and time are special variables in physics.

We're interested in functions that map positions in space and time to measured quantities:

$u: (x, y, z, t) \mapsto \text{measured quantity}$

This is different from empirical laws like $V = IR$, which relate variables at a single instant.

Coordinate Systems and Units

Position: Requires a reference point (coordinate system)
Time: Requires a clock
Units: Meters, seconds, kilograms...

            Units are often ignored in ML (we just normalize). But in physics, units encode physical meaning and reveal which effects dominate.
        

The Only Constant is Change

"Nothing stays the same in space and time. Everything changes."

Yet we search for laws that remain constant.

And these laws describe how things change.

Quantifying Change: From Data

Given measurements $u_i$ at times $t_i$:

$\frac{\Delta u}{\Delta t} = \frac{u_i - u_{i-1}}{t_i - t_{i-1}}$

As $\Delta t \to 0$, slope $\to$ derivative
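As a minimal sketch of the finite-difference slope above, the snippet below uses made-up measurements (samples of $\sin t$ at uneven times, chosen only so the true derivative $\cos t$ is known) and compares each slope with the derivative at the interval midpoint.

```python
import numpy as np

# Hypothetical measurements: u(t) = sin(t) sampled at unevenly spaced times.
t = np.array([0.0, 0.1, 0.25, 0.4, 0.5])
u = np.sin(t)

# Slope between consecutive measurements: (u_i - u_{i-1}) / (t_i - t_{i-1}).
slopes = np.diff(u) / np.diff(t)

# Each slope approximates the derivative somewhere inside its interval;
# compare against the exact derivative cos(t) at the interval midpoints.
t_mid = 0.5 * (t[1:] + t[:-1])
max_error = np.max(np.abs(slopes - np.cos(t_mid)))
print(slopes, max_error)
```

Shrinking the spacing between samples shrinks `max_error`, which is the numerical face of "slope $\to$ derivative".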

From Discrete to Continuous

From Data (Discrete)

$\displaystyle \frac{u_i - u_{i-1}}{t_i - t_{i-1}}$

Finite differences between measurements

Idealized (Continuous)

$\displaystyle \frac{\partial u}{\partial t} \; = \; \lim_{\Delta t \to 0} \frac{u(t+\Delta t) - u(t)}{\Delta t}$

Instantaneous rate of change

            Key insight: We go discrete → continuous for mathematical convenience, then back to discrete for computation.
        

Continuous vs. Discrete

Sciences	Computer Science
Continuous systems	Discrete systems
Smooth functions	States and transitions
Derivatives	Graphs

These are connected: continuous emerges from taking limits of discrete. Both are idealizations.

Galileo's Discovery

Dropping objects from the Tower of Pisa (allegedly):

Position $x(t)$ increases...
Velocity $v = \frac{dx}{dt}$ also increases...
But acceleration $a = \frac{d^2x}{dt^2}$ is constant!

$g \approx 9.8 \text{ m/s}^2$

The same for all objects, everywhere on Earth (approximately).

Newton's Second Law

$F = ma = m\frac{d^2 x}{dt^2}$

For free fall: $F = mg$, so:

$\frac{d^2 x}{dt^2} = g$

This is a differential equation—an equation involving derivatives.

Solving the Equation

Integrate once:

$\frac{dx}{dt} = gt + v_0$

(velocity)

Integrate again:

$x = \frac{1}{2}gt^2 + v_0 t + x_0$

(position)

A parabola in time! From a simple differential equation, we derived how a falling object's position depends on time.

Adding Complexity

With air resistance (drag proportional to velocity, with $x$ measured downward as in the free-fall equation above):

$m\frac{d^2x}{dt^2} = mg - b\frac{dx}{dt}$

First derivative appears (drag term)
Solution involves exponentials
Pendulums, springs → sine waves

This is what you learn in dynamics courses: finding differential equations from forces.
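To see the "solution involves exponentials" claim concretely, here is a rough sketch that integrates the drag equation for the velocity $v = dx/dt$ and compares it with the closed form $v(t) = \frac{mg}{b}\left(1 - e^{-bt/m}\right)$ for a start from rest. It assumes downward is the positive direction, so gravity enters with a plus sign and drag opposes the motion; the values of $g$, $m$, and $b$ are illustrative, not from the lecture.

```python
import numpy as np

g, m, b = 9.8, 1.0, 0.5      # illustrative values in SI units (assumed)
dt, n_steps = 1e-3, 20000    # integrate up to t = 20 s

v = 0.0                      # start from rest; downward taken as positive
for _ in range(n_steps):
    v += dt * (g - (b / m) * v)   # forward Euler step of dv/dt = g - (b/m) v

t_end = dt * n_steps
v_exact = (m * g / b) * (1.0 - np.exp(-b * t_end / m))   # exponential approach
v_terminal = m * g / b                                    # limit as t -> infinity
print(v, v_exact, v_terminal)
```

The velocity saturates at the terminal value $mg/b$, where drag balances gravity.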

Inductive Bias #2: Conservation Laws

Conservation laws are everywhere in physics.

The fundamental idea is simple:

Flux in $-$ Flux out $=$ Accumulation

Think of people in a room: if 10 enter and 3 leave, the number increases by 7.

Conservation: Interactive Demo

[Interactive demo: a container with flux in of 2/s, flux out of 1/s, and 5 items currently inside, so the count rises by 1 per second.]

The Math: 1D Conservation

Consider a quantity with density $\rho(x,t)$ in a region $[x, x+\Delta x]$:

Accumulation: $\frac{\partial \rho}{\partial t} \Delta x$
Flux in at $x$: $F(x,t)$
Flux out at $x+\Delta x$: $F(x+\Delta x, t)$

Conservation requires:

$F(x,t) - F(x+\Delta x,t) = \frac{\partial \rho}{\partial t} \Delta x$

The Continuity Equation

Divide by $\Delta x$ and take the limit $\Delta x \to 0$:

$-\frac{\partial F}{\partial x} = \frac{\partial \rho}{\partial t}$

Or equivalently:

$\frac{\partial \rho}{\partial t} + \frac{\partial F}{\partial x} = 0$

This is the conservation law form or continuity equation.
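The bookkeeping "flux in minus flux out equals accumulation" can be applied cell by cell on a grid. The sketch below is one possible discretization, not the lecture's: it assumes a periodic domain, the simple flux $F = c\rho$ with $c > 0$, and illustrative grid sizes. The point is that the total amount is unchanged up to rounding, because whatever leaves one cell enters its neighbor.

```python
import numpy as np

# Assumed setup: periodic 1D domain, flux F = c * rho (simple transport).
n, dx, dt, c = 100, 0.01, 0.005, 1.0
x = (np.arange(n) + 0.5) * dx
rho = np.exp(-((x - 0.5) / 0.1) ** 2)     # initial density bump

total_before = rho.sum() * dx

for _ in range(200):
    F_out = c * rho                       # flux leaving each cell through its right face
    F_in = np.roll(F_out, 1)              # flux entering through the left face
    rho = rho + dt / dx * (F_in - F_out)  # accumulation = flux in - flux out

total_after = rho.sum() * dx
print(total_before, total_after)
```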

Examples of Conservation

Conservation of...	Flux $F$
Mass	$F = \rho v$ (density $\times$ velocity)
Traffic (cars)	$F = \rho v(\rho)$ (speed depends on congestion)
Momentum	$F = \rho v^2 + p$ (includes pressure)
Energy	Heat flux, work done

Higher Dimensions

In 3D with velocity field $\mathbf{v} = (u, v, w)$:

$\frac{\partial \rho}{\partial t} + \nabla \cdot (\rho \mathbf{v}) = 0$

where the divergence is:

$\nabla \cdot (\rho \mathbf{v}) = \frac{\partial(\rho u)}{\partial x} + \frac{\partial(\rho v)}{\partial y} + \frac{\partial(\rho w)}{\partial z}$

This is why we study calculus: to manipulate derivatives and combine them in powerful ways.

Inductive Bias #3: Linearity

Linear systems allow superposition.

A function $L$ is linear if:

$L(aX + bY) = aL(X) + bL(Y)$

Why Linearity Matters

If a system is linear, we can:

Break the problem into simpler parts
Solve each part separately
Add the solutions to get the whole

Instead of understanding the whole (complicated), understand the parts (simple) and combine them.
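A quick numerical check of the superposition property, as a sketch: the operator `L` below (a discrete second difference on a periodic grid) and the contrasting nonlinear map `N` are illustrative choices, not operators defined in the lecture.

```python
import numpy as np

def L(u):
    """Discrete second difference on a periodic grid (a linear operator)."""
    return np.roll(u, -1) - 2.0 * u + np.roll(u, 1)

def N(u):
    """A nonlinear operator for contrast: pointwise square."""
    return u ** 2

rng = np.random.default_rng(0)
X, Y = rng.normal(size=32), rng.normal(size=32)
a, b = 2.0, -3.0

# Superposition holds for L but fails for N.
linear_ok = np.allclose(L(a * X + b * Y), a * L(X) + b * L(Y))
nonlinear_ok = np.allclose(N(a * X + b * Y), a * N(X) + b * N(Y))
print(linear_ok, nonlinear_ok)
```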

The Reality: Nonlinearity

Most real systems are nonlinear.

But whenever possible, we try to:

Linearize: Approximate near an operating point
Find linear subsystems: Isolate what we can
Perturb about equilibria: Study small deviations

Nonlinearity creates the hard problems: turbulence, chaos, no closed-form solutions.
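As a small illustration of linearizing near an operating point, the sketch below takes $\sin\theta$ (the restoring term of a pendulum, used here as an assumed example) and replaces it by its first-order Taylor expansion about $\theta_0 = 0$. The approximation is excellent for small deviations and poor for large ones.

```python
import numpy as np

def f(theta):
    """Nonlinear restoring term of a pendulum (illustrative choice)."""
    return np.sin(theta)

def f_linear(theta, theta0=0.0):
    """First-order Taylor expansion of f about the operating point theta0."""
    return np.sin(theta0) + np.cos(theta0) * (theta - theta0)

small, large = 0.1, 1.5     # deviations in radians
err_small = abs(f(small) - f_linear(small))
err_large = abs(f(large) - f_linear(large))
print(err_small, err_large)
```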

Case Study: Navier-Stokes Equations

The equations that describe fluid motion.

They combine:

                Newton's Law of Viscosity

                (empirical law)

                $\tau = \mu \frac{\partial u}{\partial y}$

                Conservation of Momentum

                (Newton's 2nd law for fluids)

                $\frac{D(\rho \mathbf{v})}{Dt} = \text{forces}$

The Equations

For an incompressible Newtonian fluid:

$\rho\left(\underbrace{\frac{\partial \mathbf{u}}{\partial t}}_{\text{local}} + \underbrace{\mathbf{u} \cdot \nabla \mathbf{u}}_{\text{convective}}\right) = \underbrace{-\nabla p}_{\text{pressure}} + \underbrace{\mu \nabla^2 \mathbf{u}}_{\text{viscous}} + \mathbf{f}$

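Together with the incompressibility constraint $\nabla \cdot \mathbf{u} = 0$
Solve for: velocity field $\mathbf{u}(x,y,z,t)$ and pressure $p(x,y,z,t)$
Given: density $\rho$, viscosity $\mu$, body forces $\mathbf{f}$, plus initial and boundary conditions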

Interpreting the Terms

Term	Physical Meaning
$\frac{\partial \mathbf{u}}{\partial t}$	Local acceleration (change at fixed point)
$\mathbf{u} \cdot \nabla \mathbf{u}$	Convective acceleration (nonlinear!)
$-\nabla p$	Pressure-gradient forces (push fluid from high to low pressure)
$\mu \nabla^2 \mathbf{u}$	Viscous forces (shearing between layers)

The Lagrangian Derivative

The combination:

$\frac{D\mathbf{u}}{Dt} = \frac{\partial \mathbf{u}}{\partial t} + \mathbf{u} \cdot \nabla \mathbf{u}$

is the material derivative.

$\frac{\partial \mathbf{u}}{\partial t}$: change at a fixed point
$\frac{D\mathbf{u}}{Dt}$: change following a moving particle

The Nonlinearity Problem

The term $\mathbf{u} \cdot \nabla \mathbf{u}$ is nonlinear in $\mathbf{u}$.

This single term:

Makes analytical solutions rare outside special cases
Is the main obstacle to proving global existence and smoothness of solutions in 3D (an open Millennium Prize problem!)
Gives rise to turbulence and chaos

Yet we build planes and predict weather by solving these equations numerically.

Why Study Navier-Stokes?

It's an archetypal equation because:

Appears everywhere (weather, aerodynamics, blood flow)
Contains many core PDE challenges (nonlinearity, coupled fields)
Full of open questions (basic math properties unproven)

Research labs such as DeepMind invest heavily in making Navier-Stokes solvers faster.

Building Block PDEs

Before tackling Navier-Stokes, understand simpler canonical PDEs.

                Advection

                Transport

                Diffusion

                Spreading

                Wave

                Oscillation

The Advection Equation

$\frac{\partial u}{\partial t} + c\frac{\partial u}{\partial x} = 0$

Solution: $u(x,t) = u_0(x - ct)$

The initial profile translates at speed $c$.

Physical: Quantity transported without changing shape (e.g., dye carried along by a uniform flow)
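The claim that a shifted profile solves the advection equation can be checked numerically. This sketch assumes a Gaussian initial profile and $c = 2$ (both illustrative), evaluates $u(x,t) = u_0(x - ct)$, and measures the residual $\partial_t u + c\,\partial_x u$ with central differences.

```python
import numpy as np

c = 2.0                                   # advection speed (illustrative)

def u0(x):
    return np.exp(-x ** 2)                # assumed initial profile

def u(x, t):
    return u0(x - c * t)                  # claimed solution: shifted profile

x = np.linspace(-5.0, 5.0, 201)
t, h = 0.7, 1e-5

u_t = (u(x, t + h) - u(x, t - h)) / (2 * h)     # central difference in time
u_x = (u(x + h, t) - u(x - h, t)) / (2 * h)     # central difference in space
residual = np.max(np.abs(u_t + c * u_x))        # should be close to zero

peak_position = x[np.argmax(u(x, t))]           # should sit near c * t
print(residual, peak_position)
```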

The Diffusion Equation

$\frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2}$

Solution: Gaussian spreads as $\sigma(t) = \sqrt{\sigma_0^2 + 2Dt}$

                    Key insight: Second derivative = curvature

                    • Local min → $u$ increases

                    • Local max → $u$ decreases
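The curvature picture translates directly into a simple explicit update: add $D\,\Delta t$ times the discrete second derivative at every grid point. The sketch below is one such scheme under assumed values ($D = 0.5$, $\sigma_0 = 0.5$, grid spacing 0.05), and it compares the measured width of the profile with $\sqrt{\sigma_0^2 + 2Dt}$.

```python
import numpy as np

D, sigma0 = 0.5, 0.5                       # illustrative values (assumed)
dx = 0.05
x = np.arange(-10.0, 10.0 + dx / 2, dx)
dt = 0.4 * dx ** 2 / D                     # explicit scheme needs D*dt/dx^2 <= 1/2
n_steps = 500

u = np.exp(-x ** 2 / (2 * sigma0 ** 2))    # Gaussian initial condition

for _ in range(n_steps):
    curvature = np.zeros_like(u)           # end points are left unchanged
    curvature[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx ** 2
    u = u + dt * D * curvature             # rises where curvature > 0, falls where < 0

t_end = n_steps * dt
mass = u.sum()
sigma_numeric = np.sqrt((x ** 2 * u).sum() / mass)   # mean is 0 by symmetry
sigma_exact = np.sqrt(sigma0 ** 2 + 2 * D * t_end)
print(sigma_numeric, sigma_exact)
```

The time step is tied to the grid spacing here because this particular explicit scheme becomes unstable when $D\,\Delta t / \Delta x^2$ exceeds $1/2$.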

The Wave Equation

$\frac{\partial^2 u}{\partial t^2} = v^2 \frac{\partial^2 u}{\partial x^2}$

Physical meaning: Oscillatory disturbances propagate at speed $v$.

Vibrating strings (guitar)
Sound waves in air
Electromagnetic waves

Note: Wave equation splits initial pulse into two counter-propagating waves (d'Alembert's solution)
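A sketch of d'Alembert's solution for a pulse released from rest: $u(x,t) = \tfrac{1}{2}\left[f(x - vt) + f(x + vt)\right]$. The Gaussian pulse $f$ and the speed $v = 1.5$ are assumptions for illustration. The code checks the wave-equation residual with central differences and locates the two counter-propagating peaks.

```python
import numpy as np

v = 1.5                                    # wave speed (illustrative)

def f(x):
    return np.exp(-x ** 2)                 # assumed initial displacement, zero initial velocity

def u(x, t):
    # d'Alembert: half the pulse moves right, half moves left
    return 0.5 * (f(x - v * t) + f(x + v * t))

x = np.linspace(-10.0, 10.0, 401)
t, h = 2.0, 1e-3

u_tt = (u(x, t + h) - 2 * u(x, t) + u(x, t - h)) / h ** 2
u_xx = (u(x + h, t) - 2 * u(x, t) + u(x - h, t)) / h ** 2
residual = np.max(np.abs(u_tt - v ** 2 * u_xx))   # should be close to zero

right = x > 0
left = x < 0
right_peak = x[right][np.argmax(u(x[right], t))]  # expected near +v*t
left_peak = x[left][np.argmax(u(x[left], t))]     # expected near -v*t
print(residual, left_peak, right_peak)
```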

Advection-Diffusion: Combined Effects

$\frac{\partial u}{\partial t} + c\frac{\partial u}{\partial x} = D\frac{\partial^2 u}{\partial x^2}$

Solution: Translates AND spreads

Example: Pollutant in a river
• Carried downstream (advection)
• Disperses over time (diffusion)

Parameters: c = 1.0, D = 0.5
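With the parameters above, "translates AND spreads" can be made quantitative: the center of the profile should move at speed $c$ while its width grows as in pure diffusion. The sketch below uses a simple explicit scheme with central differences, which is one possible choice and not necessarily what the slide animation used; the initial width $\sigma_0 = 0.5$ and the grid are assumed.

```python
import numpy as np

c, D, sigma0 = 1.0, 0.5, 0.5               # c and D as on the slide; sigma0 assumed
dx = 0.05
x = np.arange(-10.0, 10.0 + dx / 2, dx)
dt = 0.4 * dx ** 2 / D                     # small enough for this explicit scheme
n_steps = 500

u = np.exp(-x ** 2 / (2 * sigma0 ** 2))    # Gaussian pulse centered at x = 0

for _ in range(n_steps):
    u_x = np.zeros_like(u)
    u_xx = np.zeros_like(u)
    u_x[1:-1] = (u[2:] - u[:-2]) / (2 * dx)
    u_xx[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx ** 2
    u = u + dt * (-c * u_x + D * u_xx)     # advection term plus diffusion term

t_end = n_steps * dt
mass = u.sum()
mean = (x * u).sum() / mass                             # center of the pulse
spread = np.sqrt(((x - mean) ** 2 * u).sum() / mass)    # width of the pulse
print(mean, c * t_end, spread, np.sqrt(sigma0 ** 2 + 2 * D * t_end))
```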

Comparing the Three

■ Advection (translates)

■ Diffusion (spreads)

■ Advection-Diffusion (both)

Reaction-Diffusion

$\frac{\partial u}{\partial t} = D\frac{\partial^2 u}{\partial x^2} + R(u)$

Physical meaning: Diffusion + local reactions/growth

Chemical reactions spreading through a medium
Population dynamics with spatial spread
Turing patterns: Spots and stripes in biology

The nonlinear reaction term $R(u)$ creates rich pattern-forming behavior.

Case Study: Electronic Circuits

Another source of differential equations from simple laws.

Component	Law	Type
Resistor	$V = IR$	Algebraic
Capacitor	$I = C\frac{dV}{dt}$	Differential
Inductor	$V = L\frac{dI}{dt}$	Differential

RC and RLC Circuits

RC Circuit (first-order ODE):

$RC\frac{dV}{dt} + V = V_{\text{in}}$

Solution: exponential relaxation toward $V_{\text{in}}$ with time constant $\tau = RC$

RLC Circuit (second-order ODE):

$LC\frac{d^2V}{dt^2} + RC\frac{dV}{dt} + V = V_{\text{in}}$

Solutions: oscillations, damping, resonance
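For the RC circuit, a step input gives the closed form $V(t) = V_{\text{in}}\left(1 - e^{-t/RC}\right)$ when the capacitor starts uncharged. The sketch below integrates the first-order ODE numerically and compares; the component values are illustrative assumptions.

```python
import numpy as np

R, C, V_in = 1e3, 1e-6, 5.0     # 1 kOhm, 1 uF, 5 V step (illustrative values)
tau = R * C                     # time constant in seconds
dt = tau / 1000
n_steps = 5000                  # simulate five time constants

V = 0.0                         # capacitor initially uncharged
for _ in range(n_steps):
    V += dt * (V_in - V) / (R * C)   # rearranged from RC dV/dt + V = V_in

t_end = n_steps * dt
V_exact = V_in * (1.0 - np.exp(-t_end / tau))
print(V, V_exact)
```

After five time constants the capacitor voltage is within about 1% of $V_{\text{in}}$.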

Application: Synthesizers

Combinations of R, L, C components can shape signals.

Design circuits that produce specific waveforms
Approximate sounds (violin, piano, etc.)
Inverse problem: given a sound, what circuit produces it?

            Modern audio processing and voice transfer use similar principles—often modeled as differential equations.
        

Don't Forget: Units Matter

Physical quantities have units.

ML approach:

Normalize everything
Mean 0, variance 1
Units are lost

Science approach:

Units encode meaning
Dimensional consistency
Scale reveals dominance

Dimensional analysis: By checking units, you can catch errors, derive relationships, and identify which terms dominate at different scales.
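As one example of scale revealing dominance, consider the convective and viscous terms of Navier-Stokes. With a characteristic speed $U$ and length $L$, they scale as $\rho U^2 / L$ and $\mu U / L^2$, and their ratio $\rho U L / \mu$ is dimensionless (commonly called the Reynolds number). The sketch below uses approximate properties of water and two made-up scenarios; the specific speeds and lengths are assumptions for illustration.

```python
def term_ratio(rho, U, L, mu):
    """Ratio of convective (rho*U^2/L) to viscous (mu*U/L^2) scale estimates."""
    convective = rho * U ** 2 / L     # units: kg m^-2 s^-2
    viscous = mu * U / L ** 2         # units: kg m^-2 s^-2, so the ratio is dimensionless
    return convective / viscous

rho_water, mu_water = 1000.0, 1e-3    # kg/m^3 and Pa*s, approximate values for water

# Hypothetical scenarios: a meter-scale object at 1 m/s, a micron-scale one at 10 um/s.
ratio_large = term_ratio(rho_water, U=1.0, L=1.0, mu=mu_water)
ratio_small = term_ratio(rho_water, U=1e-5, L=1e-6, mu=mu_water)
print(ratio_large, ratio_small)
```

A large ratio means the nonlinear convective term dominates; a small ratio means viscosity dominates and the equations are nearly linear.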

Solving Differential Equations

Once we have a differential equation, what do we do with it?

Analytical
Closed-form solutions
(linear, simple boundary conditions)

Numerical
Discretize and compute
(nonlinear, complex)

The Detour Through Continuity

We started discrete: $\frac{\Delta u}{\Delta t}$

Went to continuous: $\frac{\partial u}{\partial t}$

Now go back to discrete for computation.

Why the detour?
Continuous formulation eliminates dependence on grid size, reveals mathematical structure, and allows powerful manipulations.

Differential Equations as Generative Models

Given initial conditions, predict the future:

Know the state at time $t$
Compute the change: $\frac{du}{dt} = f(u, t)$
Update: $u(t + \Delta t) = u(t) + \frac{du}{dt} \cdot \Delta t$
Repeat

            This is autoregressive—the output at each step becomes the input for the next. Like language models generating text token by token.
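The four steps above can be sketched as a short rollout loop. The function name `rollout` and the test dynamics $du/dt = -u$ are illustrative choices; the update itself is the forward Euler rule from the list.

```python
import numpy as np

def rollout(f, u0, t0, dt, n_steps):
    """Forward-Euler rollout: each new state is computed from the previous one."""
    u, t = u0, t0
    trajectory = [u]
    for _ in range(n_steps):
        u = u + f(u, t) * dt      # u(t + dt) = u(t) + du/dt * dt
        t = t + dt
        trajectory.append(u)      # the output becomes the next input
    return np.array(trajectory)

# Illustrative dynamics (an assumption): exponential decay du/dt = -u.
traj = rollout(lambda u, t: -u, u0=1.0, t0=0.0, dt=0.01, n_steps=100)
print(traj[-1], np.exp(-1.0))
```

As with autoregressive text generation, errors made at one step feed into every later step, which is why step size and stability matter.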
        

All Models Make Assumptions

Every model simplifies reality. Common assumptions:

Assumption	Reality
Newtonian fluid	Non-Newtonian (toothpaste, blood)
Linear response	Nonlinear at large amplitudes
Continuous medium	Discrete molecules
Deterministic	Stochastic/noisy

When Models Fail

"If a model doesn't work, ask: What assumptions did I make?"

Often we're not aware of our assumptions until we examine the building blocks.

Identifying violated assumptions is key to improving models.

"Assumption-Free" Models?

Deep learning promises to learn directly from data.

But there's no such thing as truly assumption-free:

Architecture choices are assumptions
Training data is an assumption
Optimization is an assumption

The inductive biases we discussed (space-time, conservation, linearity) help models generalize. Scientific ML encodes these explicitly.

Key Principles

Space and time are special: $u(x,t)$ is the natural language of physics
Change is fundamental: Derivatives quantify local change
Conservation laws: Flux in - flux out = accumulation
Linearity enables superposition: Solve parts, combine for whole
Simple laws → complex equations: Viscosity + momentum = Navier-Stokes
Nonlinearity is the challenge: Turbulence, chaos, no closed forms
Units matter: They encode meaning and reveal scales
All models make assumptions: Know yours

Looking Ahead

Next Lecture
Numerical methods
Discretization
Stability

Later
Nonlinear dynamics
Chaos
Strange attractors

Then
Deep learning for PDEs
PINNs, SINDy
Data-driven discovery

Building toward scientific machine learning

Questions?

See you next time!