Introduction

FeRx-NLME is a high-performance Nonlinear Mixed Effects (NLME) modeling engine written in Rust. It is designed for population pharmacokinetic (PopPK) analysis, implementing the same statistical methodology as established tools like NONMEM, Monolix, and Pumas.

Key Features

  • FOCE/FOCEI estimation -- First-Order Conditional Estimation with optional interaction, the gold standard for PopPK
  • SAEM estimation -- Stochastic Approximation EM for robust convergence on complex models
  • Analytical PK solutions -- Built-in one- and two-compartment models (IV bolus, oral, infusion) with numerical stability guarantees
  • ODE solver -- Dormand-Prince RK45 adaptive integrator for custom kinetic models (e.g. Michaelis-Menten)
  • Automatic differentiation -- Enzyme-based reverse-mode AD for fast, exact gradients
  • Simple model DSL -- Declarative .ferx model files that read like equations
  • NONMEM-compatible data -- Reads standard NONMEM CSV datasets directly
  • Covariate support -- Time-constant and time-varying covariates with automatic detection
  • Parallel estimation -- Per-subject computations parallelized via Rayon

How It Compares

| Feature | FeRx | NONMEM | Monolix | Pumas |
|---|---|---|---|---|
| Language | Rust | Fortran | C++ | Julia |
| FOCE/FOCEI | Yes | Yes | No | Yes |
| SAEM | Yes | Yes (NONMEM 7+) | Yes | Yes |
| Analytical PK | Yes | Yes | Yes | Yes |
| ODE models | Yes | Yes | Yes | Yes |
| Auto-diff | Yes (Enzyme) | No | No | Yes (ForwardDiff) |
| Open source | Yes | No | No | No |

Architecture Overview

FeRx uses a two-level optimization structure:

  • Outer loop: Optimizes population parameters (theta, omega, sigma) using NLopt or built-in BFGS
  • Inner loop: For each subject, finds empirical Bayes estimates (EBEs) of random effects by minimizing individual negative log-likelihood

Parameters are internally transformed for unconstrained optimization: theta and sigma are log-transformed, and omega uses Cholesky factorization to guarantee positive-definiteness.
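The two transforms can be sketched as follows. This is an illustrative example of the math described above, not the actual FeRx internals; the function names are made up.

```rust
/// Log-transform a positive parameter (theta or sigma) so the optimizer
/// can work on an unconstrained value `z`; the model sees `exp(z) > 0`.
fn to_unconstrained(theta: f64) -> f64 {
    theta.ln()
}

fn from_unconstrained(z: f64) -> f64 {
    z.exp()
}

/// Rebuild a 2x2 omega matrix from its Cholesky factor L (lower
/// triangular, stored as [l11, l21, l22]). omega = L * L^T is positive
/// semi-definite by construction for any real entries, which is why the
/// optimizer can vary the factor freely.
fn omega_from_cholesky(l: [f64; 3]) -> [[f64; 2]; 2] {
    let (l11, l21, l22) = (l[0], l[1], l[2]);
    [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]
}
```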

Project Structure

src/
  types.rs          -- Core data structures
  api.rs            -- Public API (fit, simulate, predict)
  parser/           -- .ferx model file parser
  pk/               -- Analytical PK solutions
  ode/              -- ODE solver and predictions
  estimation/       -- FOCE, FOCEI, SAEM algorithms
  stats/            -- Likelihood and residual computations
  io/               -- Data reading and output writing
  ad/               -- Automatic differentiation
  bin/run_model.rs  -- CLI binary

License

FeRx-NLME is released under the MIT License. Copyright © 2026 InsightRX, Inc.

Getting Started

This section covers installing FeRx and running your first model.

  • Installation -- Setting up the Rust toolchain and building FeRx
  • Quick Start -- Running a complete PopPK analysis in under 5 minutes

Installation

FeRx requires a nightly Rust toolchain with the Enzyme LLVM plugin for automatic differentiation. As of 2026, Enzyme is not yet distributed via rustup, so a one-time plugin build is required.

Pick your platform:

  • Linux — fully supported, one-time ~30 min plugin build
  • macOS — supported with caveats
  • Windows — not supported (see why below)

Linux

Tested on Ubuntu 22.04. Other distributions (Debian, Fedora, Arch) should work with adjustments to the package manager commands.

1. Install rustup + upstream nightly

Do not use snap's rustup — its filesystem confinement breaks on non-standard home directories (common on enterprise servers).

# Remove snap rustup if you had it:
sudo snap remove rustup

# Official installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"

rustup toolchain install nightly

2. Install system dependencies and matching LLVM

Check which LLVM major version nightly needs:

rustc +nightly --version --verbose | grep LLVM
# e.g. "LLVM version: 22.1.2" — major is 22

Use that major version (<MAJOR>) below.

sudo apt install -y cmake ninja-build clang libssl-dev pkg-config \
                    python3 build-essential curl git libzstd-dev

# Install matching LLVM from apt.llvm.org (Ubuntu's defaults lag behind):
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh <MAJOR>

# Fix GPG keyring permissions if apt warns:
sudo chmod 644 /etc/apt/trusted.gpg.d/apt.llvm.org.asc
sudo apt update

sudo apt install -y llvm-<MAJOR>-dev clang-<MAJOR>

For other distros, install the matching llvm-dev + clang packages from your package manager. The LLVM major version must exactly match what rustc reports.

3. Build the Enzyme plugin

git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build

cmake -G Ninja .. \
  -DLLVM_DIR=/usr/lib/llvm-<MAJOR>/lib/cmake/llvm \
  -DENZYME_CLANG=OFF \
  -DENZYME_FLANG=OFF
ninja
# 15–30 min

4. Install the plugin into nightly's sysroot

This location is not obvious — rustc looks in lib/rustlib/<target>/lib/, not just lib/. Despite the error wording ("folder not found"), it's searching for a file:

SYSROOT=$(rustc +nightly --print sysroot)
TARGET=x86_64-unknown-linux-gnu   # or aarch64-unknown-linux-gnu on ARM

cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.so \
   $SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.so

Note the filename rewrite: LLVMEnzyme-<N>.so → libEnzyme-<N>.so (with the lib prefix).

5. Register the toolchain as enzyme

ferx's build system pins to a toolchain named enzyme:

rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme --version

6. Verify

rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head

Expected: error[E0601]: main function not found. That's the success signal. If you see autodiff backend not found in the sysroot, the .so is missing or in the wrong place (step 4).

Multi-user servers

For shared Linux servers, stage the built toolchain in /opt/rust-nightly so all users can share one build:

sudo mkdir -p /opt/rust-nightly
sudo cp -a ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/. /opt/rust-nightly/
sudo chown -R root:root /opt/rust-nightly
sudo chmod -R a+rX /opt/rust-nightly
sudo chmod a+rx /opt/rust-nightly/bin/*

Each user then links the shared toolchain into their own rustup:

# per user, once
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain none
rustup toolchain link enzyme /opt/rust-nightly

For R users, add to ~/.Renviron:

PATH=/opt/rust-nightly/bin:${HOME}/.cargo/bin:${PATH}
RUSTUP_TOOLCHAIN=enzyme

macOS

Supported on both Intel and Apple Silicon, with caveats around LLVM installation.

1. Install Xcode Command Line Tools

xcode-select --install

2. Install rustup + upstream nightly

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"

rustup toolchain install nightly

3. Install matching LLVM via Homebrew

Check which LLVM major version nightly needs:

rustc +nightly --version --verbose | grep LLVM

Homebrew only ships a handful of LLVM versions at a time. If the version you need is on Homebrew:

brew install llvm@<MAJOR>
# Path will be /opt/homebrew/opt/llvm@<MAJOR> (Apple Silicon)
# or           /usr/local/opt/llvm@<MAJOR>     (Intel)

If Homebrew doesn't have your version, you'll need to build LLVM from source (much longer — consider sticking with a slightly older nightly that matches an available LLVM version).

4. Build the Enzyme plugin

git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build

# Adjust paths for your brew prefix (/opt/homebrew or /usr/local)
cmake -G Ninja .. \
  -DLLVM_DIR=$(brew --prefix)/opt/llvm@<MAJOR>/lib/cmake/llvm \
  -DENZYME_CLANG=OFF \
  -DENZYME_FLANG=OFF
ninja

You may need to install ninja if not present: brew install ninja cmake.

5. Install the plugin into nightly's sysroot

SYSROOT=$(rustc +nightly --print sysroot)

# On Apple Silicon:
TARGET=aarch64-apple-darwin
# On Intel:
# TARGET=x86_64-apple-darwin

# Note: on macOS the shared library extension is .dylib, not .so
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.dylib \
   $SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.dylib

6. Register and verify

rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
# Expect: error[E0601]: `main` function not found

macOS caveats

  • Apple Silicon (M1/M2/M3/M4): use aarch64-apple-darwin as TARGET. Intel Macs use x86_64-apple-darwin
  • System Integrity Protection (SIP): If you see code-signing errors loading the .dylib during rustc invocation, try sudo codesign --force --sign - <path-to-libEnzyme.dylib> — should be rare on dev machines
  • Nightly toolchain distribution mismatches: Apple Silicon nightlies occasionally lag x86_64 by a day or two; if LLVM version mismatches after rustup update, prefer installing a specific dated nightly

Windows

Windows is not currently supported. The blockers are:

  1. EnzymeAD plugin build on MSVC is not well-tested. The Enzyme project primarily targets Linux with LLVM clang/gcc. While Windows builds exist in theory, the toolchain integration (rustc sysroot path conventions, plugin discovery, MSVC vs MinGW linking) has known issues that haven't been worked through.

  2. Rust autodiff feature gate interactions with MSVC codegen. The -Z autodiff=Enable path has not been exercised on the x86_64-pc-windows-msvc target. Anecdotal reports from the Rust autodiff tracking issue show crashes or "backend not found" errors on Windows even when the plugin .dll is present.

  3. We haven't set up CI for Windows. Without a green CI signal we can't promise the build works.

Workarounds for Windows users

  • WSL2 (recommended): Install Ubuntu under WSL2 and follow the Linux instructions. Performance is near-native for CPU-bound workloads like model fitting.
  • Docker: Use the forthcoming ferx-nlme Docker image (coming soon) which ships with the toolchain pre-installed.
  • Remote dev: SSH into a Linux server or cloud VM.

Future Windows support

Once upstream Enzyme integration lands in rustup (tracked at rust-lang/rust autodiff tracking issue), Windows support should become straightforward since the plugin distribution problem goes away. No ETA; the feature is still experimental in rustc.

If you're a Windows developer who would like to help, a CI run against x86_64-pc-windows-msvc + contributed docs for that platform would be very welcome.


Building ferx-nlme from source

Once your Enzyme toolchain is set up:

git clone https://github.com/InsightRX/ferx-nlme
cd ferx-nlme

cargo build --release --features autodiff
# Binary at target/release/ferx

Build options

# Debug build (faster compile, slower runtime)
cargo build --features autodiff

# Quick type-check without building
cargo check --features autodiff

# Lints
cargo clippy --features autodiff

# CI build without autodiff (uses finite differences — no Enzyme needed)
cargo build --release --no-default-features --features ci

The ci feature is useful for development on machines without the full Enzyme toolchain — at the cost of much slower gradient computation.

Verify the build

cargo run --release --features autodiff --bin ferx -- examples/warfarin.ferx --simulate

Should print a successful model fit with parameter estimates.


Installing the ferx R package

Ensure the Enzyme toolchain is set up (above). Then:

devtools::install_github("InsightRX/ferx")

The R package's build is driven by its Makevars, which invokes cargo and resolves the enzyme toolchain via rustup. Both must be set up correctly in your shell/Renviron before calling install_github.

See the ferx R package README for API usage.

Cancelling a running fit (Ctrl-C)

ferx_fit() runs the estimator on a worker thread and polls for R interrupts on the main thread every ~100 ms, so Ctrl-C (or RStudio's red stop button) aborts the fit cleanly. The worker exits at the next safe checkpoint — typically within a second or two, but up to one inner-loop evaluation for heavy ODE models. The call then returns with an R error:

Error: ferx_fit: cancelled by user

Ctrl-Z (SIGTSTP) will not abort the fit — it suspends the whole R process to the shell. Use Ctrl-C instead.
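The cancellation mechanism can be sketched with a shared atomic flag that the worker polls at safe checkpoints. This is an illustrative pattern with made-up names (`run_fit`, `fit_with_cancel_after`), not the actual ferx worker code:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Worker: checks the flag between "evaluations" and exits cleanly when
// cancellation is requested, returning an error the caller can surface
// (in ferx's case, as an R error).
fn run_fit(cancel: Arc<AtomicBool>) -> Result<u32, &'static str> {
    let mut iterations = 0;
    for _ in 0..1000 {
        if cancel.load(Ordering::Relaxed) {
            return Err("cancelled by user");
        }
        thread::sleep(Duration::from_millis(1)); // stand-in for one inner-loop evaluation
        iterations += 1;
    }
    Ok(iterations)
}

// Driver: spawns the worker, then sets the flag (standing in for the
// main thread detecting an R interrupt) and joins.
fn fit_with_cancel_after(ms: u64) -> Result<u32, &'static str> {
    let cancel = Arc::new(AtomicBool::new(false));
    let worker = {
        let cancel = cancel.clone();
        thread::spawn(move || run_fit(cancel))
    };
    thread::sleep(Duration::from_millis(ms)); // simulate the user pressing Ctrl-C
    cancel.store(true, Ordering::Relaxed);
    worker.join().unwrap()
}
```

Because the flag is only checked at checkpoints, a heavy ODE model can take up to one inner-loop evaluation to notice the request, matching the latency described above.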


Dependencies

FeRx depends on these crates (managed automatically by Cargo):

| Crate | Purpose |
|---|---|
| nalgebra | Linear algebra (matrices, Cholesky) |
| nlopt | Nonlinear optimization (SLSQP, L-BFGS, MMA) |
| rayon | Parallel computation |
| rand, rand_distr | Random number generation (SAEM, SIR) |
| csv | CSV data file reading |
| regex | Model file expression parsing |

The nlopt crate requires the NLopt C library. Most platforms handle this automatically; if the build fails on NLopt, install it via your system package manager:

# macOS
brew install nlopt

# Ubuntu/Debian
sudo apt install libnlopt-dev

# Fedora
sudo dnf install NLopt-devel

Troubleshooting

"error: the option 'Z' is only accepted on the nightly compiler"

Your shell (or R) is finding a non-nightly rustc. Check rustc --version and your PATH. For R, verify Sys.which("rustc") and ~/.Renviron.

"autodiff backend not found in the sysroot: failed to find a libEnzyme-<N> folder"

Despite the wording ("folder"), rustc is searching for a file. Causes:

  • Wrong directory: the .so/.dylib is in <sysroot>/lib/ instead of <sysroot>/lib/rustlib/<target>/lib/
  • LLVM version mismatch: rebuild Enzyme against the LLVM version rustc reports
  • Filename: must be libEnzyme-<MAJOR>.so (Linux) or libEnzyme-<MAJOR>.dylib (macOS), with lib prefix

"custom toolchain 'enzyme' specified in override file ... is not installed"

Run rustup toolchain link enzyme <path-to-nightly-sysroot>.

Alternatively, this can be a permission issue: the user running toolchain link can't read the target path. On shared installs, run sudo chmod -R a+rX /opt/rust-nightly.

"incorrect value 'X' for unstable option 'autodiff'"

Valid autodiff values change between nightly builds. Test with:

rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head

If Enable is rejected, try LooseTypes, Inline, or PrintTA.

"Enzyme: cannot handle (forward) unknown intrinsic llvm.maximumnum"

Recent rustc lowers f64::max()/f64::min() to intrinsics Enzyme can't yet differentiate. This is a ferx-nlme code-level issue — AD-instrumented functions must use manual if expressions. Should not happen on released ferx-nlme versions; if it does, please file an issue.

"cargo is unavailable for the active toolchain" (info, not error)

Cargo wasn't copied into your linked toolchain. Either copy it (cp ~/.cargo/bin/cargo /opt/rust-nightly/bin/) or ignore — rustup falls back to nightly's cargo, which works.

Refreshing when upstream nightly rolls forward

If a new ferx-nlme release references stdlib items your cached toolchain doesn't have (e.g. autodiff_forward not found), rebuild /opt/rust-nightly against the current nightly, and rebuild Enzyme if the LLVM major version changed.

Quick Start

This walkthrough fits a one-compartment oral PK model to warfarin data.

1. Create a model file

Save the following as warfarin.ferx:

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

2. Prepare your data

Data must be in NONMEM CSV format. The minimum required columns are ID, TIME, and DV. Dosing records use EVID=1 with AMT:

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,1,14.42,0,.,.,0
1,2,17.56,0,.,.,0
1,4,15.23,0,.,.,0
1,8,10.15,0,.,.,0
1,12,6.75,0,.,.,0
1,24,2.24,0,.,.,0
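A row from this format can be classified by EVID, with `.` marking missing values. The sketch below is illustrative only (it is not the ferx io module, and it handles just the minimal columns shown above):

```rust
// Hypothetical record type: EVID=1 rows are dose records; everything
// else is treated as an observation here.
#[derive(Debug, PartialEq)]
enum Record {
    Dose { id: u32, time: f64, amt: f64 },
    Obs { id: u32, time: f64, dv: f64 },
}

// Look up a column by header name and parse one CSV row. Missing values
// (".") simply fail to parse, which is fine: a dose row never reads DV,
// and an observation row never reads AMT.
fn parse_row(header: &[&str], row: &str) -> Option<Record> {
    let get = |name: &str| -> Option<&str> {
        let i = header.iter().position(|h| *h == name)?;
        row.split(',').nth(i)
    };
    let id: u32 = get("ID")?.parse().ok()?;
    let time: f64 = get("TIME")?.parse().ok()?;
    match get("EVID") {
        Some("1") => Some(Record::Dose { id, time, amt: get("AMT")?.parse().ok()? }),
        _ => Some(Record::Obs { id, time, dv: get("DV")?.parse().ok()? }),
    }
}
```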

3. Run the fit

ferx warfarin.ferx --data warfarin.csv

4. Interpret the output

The console output shows the estimation progress and final results:

Starting FOCE estimation...
  10 subjects, 110 observations
  3 thetas, 3 etas, 1 sigmas
...
Final OFV = -280.1838

============================================================
NONLINEAR MIXED EFFECTS MODEL ESTIMATION
============================================================

Converged: YES
Estimation method: FOCE

--- Objective Function ---
OFV:  -280.1838
AIC:  -266.1838
BIC:  -247.2804

--- THETA Estimates ---
Parameter            Estimate           SE       %RSE
----------------------------------------------------
TVCL                 0.132735     0.014549       11.0
TVV                  7.694842     0.293028        3.8
TVKA                 0.757498     0.034986        4.6

--- OMEGA Estimates (variances) ---
  OMEGA(1,1) = 0.028584  (CV% = 16.9)  SE = 0.006394
  OMEGA(2,2) = 0.009613  (CV% = 9.8)   SE = 0.002165
  OMEGA(3,3) = 0.340868  (CV% = 58.4)  SE = 0.076351

--- SIGMA Estimates ---
  SIGMA(1) = 0.010638  SE = 0.000788

5. Output files

Three files are generated:

| File | Contents |
|---|---|
| warfarin-sdtab.csv | Per-observation diagnostics (PRED, IPRED, CWRES, IWRES, ETAs) |
| warfarin-fit.yaml | Parameter estimates with standard errors in YAML format |
| warfarin-timing.txt | Wall-clock estimation time |


Model File Reference

FeRx models are defined in .ferx files using a declarative DSL. Each file is organized into blocks, each denoted by a [block_name] header.

Block Overview

| Block | Required | Purpose |
|---|---|---|
| [parameters] | Yes | Define theta, omega, and sigma parameters |
| [individual_parameters] | Yes | Map population parameters to individual PK parameters |
| [structural_model] | Yes | Specify the PK model (analytical or ODE) |
| [error_model] | Yes | Define the residual error model |
| [odes] | If ODE | ODE right-hand-side equations |
| [fit_options] | No | Configure estimation method and optimizer |
| [simulation] | No | Define a simulation trial design |

Minimal Example

[parameters]
  theta TVCL(0.1, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  sigma ADD_ERR ~ 1.0

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)

[structural_model]
  pk one_cpt_iv_bolus(cl=CL, v=V)

[error_model]
  DV ~ additive(ADD_ERR)

Lines beginning with # are treated as comments.

Parameters

The [parameters] block defines all model parameters: fixed effects (theta), between-subject variability (omega), and residual error (sigma).

Theta (Fixed Effects)

theta NAME(initial_value, lower_bound, upper_bound)
  • NAME: Parameter name (used in [individual_parameters] expressions)
  • initial_value: Starting value for estimation
  • lower_bound: Lower bound constraint (must be > 0; parameters are log-transformed internally)
  • upper_bound: Upper bound constraint

Example:

theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)

Omega (Between-Subject Variability)

Diagonal omega

omega NAME ~ variance
  • NAME: Random effect name (used in [individual_parameters] as ETA_XXX)
  • variance: Initial variance estimate (diagonal element of the omega matrix)

Example:

omega ETA_CL ~ 0.07
omega ETA_V  ~ 0.02
omega ETA_KA ~ 0.40

Each variance represents the between-subject variability for that parameter. The coefficient of variation (CV%) is approximately sqrt(variance) * 100 for log-normally distributed parameters. For example, omega ETA_CL ~ 0.09 corresponds to ~30% CV.
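The approximation and the exact log-normal CV can be compared directly. A small sketch (hypothetical helper names):

```rust
/// Approximate CV% for a log-normally distributed parameter with
/// variance `omega2` on the log scale: sqrt(omega2) * 100.
fn cv_approx(omega2: f64) -> f64 {
    omega2.sqrt() * 100.0
}

/// Exact CV% of a log-normal distribution: sqrt(exp(omega2) - 1) * 100.
/// Agrees closely with the approximation for small variances.
fn cv_exact(omega2: f64) -> f64 {
    (omega2.exp() - 1.0).sqrt() * 100.0
}
```

For omega = 0.09, the approximation gives 30% while the exact value is about 30.7%, so the rule of thumb is adequate for typical between-subject variances.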

Block omega (correlated random effects)

To estimate correlations between random effects, use block_omega:

block_omega (NAME1, NAME2, ...) = [lower_triangle_values]

The values are the lower triangle of the covariance matrix, specified row-wise. For a 2x2 block:

block_omega (ETA_CL, ETA_V) = [var_CL, cov_CL_V, var_V]

For a 3x3 block:

block_omega (ETA_CL, ETA_V, ETA_KA) = [var_CL, cov_CL_V, var_V, cov_CL_KA, cov_V_KA, var_KA]

You can mix diagonal and block omega specifications. Diagonal omegas specify uncorrelated random effects, while block omegas estimate the full covariance sub-matrix:

block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40

This estimates a 3x3 omega where ETA_CL and ETA_V are correlated (2x2 block), but ETA_KA is uncorrelated with both.
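The row-wise lower-triangle layout can be expanded into a full symmetric matrix like this. An illustrative sketch, not the ferx parser:

```rust
// Expand row-wise lower-triangle values, as written in a block_omega
// line, into a full symmetric n x n matrix.
fn expand_lower_triangle(vals: &[f64], n: usize) -> Vec<Vec<f64>> {
    assert_eq!(vals.len(), n * (n + 1) / 2);
    let mut m = vec![vec![0.0; n]; n];
    let mut k = 0;
    for i in 0..n {
        for j in 0..=i {
            m[i][j] = vals[k]; // lower triangle
            m[j][i] = vals[k]; // mirrored upper triangle
            k += 1;
        }
    }
    m
}
```

With the 2x2 example above, `[0.09, 0.02, 0.04]` expands to var_CL = 0.09, cov_CL_V = 0.02 (both off-diagonal cells), and var_V = 0.04.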

Declaration order

The order of omega and block_omega lines in the [parameters] block determines the ETA indexing throughout the model: in the omega matrix and all output. For example:

block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40

produces ETA order [ETA_CL, ETA_V, ETA_KA] (indices 1, 2, 3), while:

omega ETA_KA ~ 0.40
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]

produces [ETA_KA, ETA_CL, ETA_V] (indices 1, 2, 3). The [individual_parameters] block should list assignments in the same order for clarity, though the parameter mapping is by name, not position.

Sigma (Residual Error)

sigma NAME ~ value
  • NAME: Residual error parameter name (referenced in [error_model])
  • value: Initial value

Example:

sigma PROP_ERR ~ 0.01
sigma ADD_ERR  ~ 1.0

The interpretation of sigma depends on the error model:

| Error Model | Sigma Meaning |
|---|---|
| Additive | Standard deviation of additive error |
| Proportional | Coefficient of proportional error |
| Combined | First sigma = proportional coefficient, second = additive SD |

Complete Examples

Diagonal omega (no correlations):

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta TVV(8.1, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_CL ~ 0.07
  omega ETA_V  ~ 0.02
  omega ETA_KA ~ 0.40

  sigma PROP_ERR ~ 0.01

Block omega (correlated CL and V):

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta TVV(8.1, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
  omega ETA_KA ~ 0.40

  sigma PROP_ERR ~ 0.01

Individual Parameters

The [individual_parameters] block defines how population parameters (theta), random effects (eta), and covariates combine to produce individual PK parameters.

Syntax

PARAM = expression

Each line assigns a PK parameter using an arithmetic expression that can reference:

  • Theta parameters -- names defined in [parameters] (e.g., TVCL, TVV)
  • Eta random effects -- names defined as omega parameters (e.g., ETA_CL, ETA_V)
  • Covariates -- column names from the data file (e.g., WT, CRCL)
  • Constants -- numeric literals (e.g., 70, 0.75)

Supported Operators and Functions

| Operator/Function | Example |
|---|---|
| +, -, *, / | TVCL * WT / 70 |
| ^ (power) | (WT/70)^0.75 |
| exp() | exp(ETA_CL) |
| log(), ln() | log(TVCL) |
| sqrt() | sqrt(WT) |
| abs() | abs(ETA_CL) |
| Parentheses | TVCL * (WT/70)^0.75 |

Common Parameterizations

Exponential (log-normal) random effects

The standard approach for PK parameters that must be positive:

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

Allometric scaling with covariates

[individual_parameters]
  CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)
  V  = TVV  * (WT/70)^1.0  * exp(ETA_V)

Estimated covariate effects

Use additional theta parameters for covariate coefficients:

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta THETA_WT(0.75, 0.01, 2.0)
  theta THETA_CRCL(0.5, 0.01, 2.0)

[individual_parameters]
  CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)

Covariate Detection

Any uppercase identifier in the expression that does not match a theta name or eta name is automatically treated as a covariate. The covariate value is read from the corresponding column in the data file.

For example, in CL = TVCL * (WT/70)^0.75 * exp(ETA_CL):

  • TVCL matches a theta parameter
  • ETA_CL matches an omega parameter
  • WT matches neither, so it is treated as a covariate column
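The detection rule amounts to a name lookup against the declared parameter sets. A minimal sketch of that logic (illustrative, not the actual parser):

```rust
use std::collections::HashSet;

// Classify an identifier from an [individual_parameters] expression:
// anything that is neither a theta nor an eta is assumed to be a
// covariate column in the data file.
fn classify(ident: &str, thetas: &HashSet<&str>, etas: &HashSet<&str>) -> &'static str {
    if thetas.contains(ident) {
        "theta"
    } else if etas.contains(ident) {
        "eta"
    } else {
        "covariate"
    }
}
```

One consequence of this rule: a misspelled parameter name (e.g. `TVC` instead of `TVCL`) silently becomes a covariate lookup, so a missing-column error at data-loading time is often really a typo in the model file.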

PK Parameter Names

The parameter names on the left side of each assignment must map to recognized PK parameter names:

| Name | PK Parameter |
|---|---|
| CL | Clearance |
| V or V1 | Volume of distribution (central compartment) |
| Q | Intercompartmental clearance |
| V2 | Peripheral volume |
| KA | Absorption rate constant |
| F | Bioavailability (default 1.0 if omitted) |

For ODE models, the parameter names are user-defined and passed as a flat vector to the ODE right-hand side function.

Structural Model

The [structural_model] block specifies the pharmacokinetic model used to generate predictions.

Analytical PK Models

For standard compartmental models, use the pk keyword with a model function:

pk MODEL_NAME(param=VALUE, param=VALUE, ...)

Available Models

| Model Function | Compartments | Route | Required Parameters |
|---|---|---|---|
| one_cpt_iv_bolus | 1 | IV bolus | cl, v |
| one_cpt_oral | 1 | Oral | cl, v, ka |
| one_cpt_infusion | 1 | IV infusion | cl, v |
| two_cpt_iv_bolus | 2 | IV bolus | cl, v1, q, v2 |
| two_cpt_oral | 2 | Oral | cl, v1, q, v2, ka |
| two_cpt_infusion | 2 | IV infusion | cl, v1, q, v2 |

Examples

One-compartment oral:

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

Two-compartment IV bolus:

[structural_model]
  pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)

Two-compartment oral:

[structural_model]
  pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)

Bioavailability

For oral models, bioavailability (F) defaults to 1.0. To estimate it, define an F parameter in [individual_parameters] -- it will be automatically used by the oral PK functions.

Dose Handling

Analytical models support:

  • Bolus doses: Instantaneous input (default when RATE=0 in data)
  • Infusions: Zero-order input (when RATE>0 in data)
  • Steady-state: Pre-computed steady-state concentrations (when SS=1 and II>0 in data)
  • Dose superposition: Multiple doses are handled by summing contributions from each dose event

Numerical Stability

The analytical solutions include special handling for:

  • Near-equal rate constants: When absorption and elimination rates are similar (KA ~ k), L'Hopital's rule is used to avoid division by zero
  • Two-compartment eigenvalues: Vieta's formula is used for robust computation of alpha and beta
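The near-equal case can be seen in the one-compartment oral solution. The sketch below is illustrative of the guard described above, not the exact ferx code: the general formula divides by (ka - k), so when the two rates coincide the limiting form is used instead.

```rust
/// One-compartment oral concentration at time t after a single dose.
/// k = CL/V is the elimination rate constant.
fn one_cpt_oral_conc(dose: f64, f: f64, cl: f64, v: f64, ka: f64, t: f64) -> f64 {
    let k = cl / v;
    if (ka - k).abs() < 1e-8 {
        // Limit ka -> k (L'Hopital): C(t) = (F*D*k*t / V) * exp(-k*t)
        f * dose * k * t / v * (-k * t).exp()
    } else {
        // General Bateman form: blows up numerically as ka -> k
        f * dose * ka / (v * (ka - k)) * ((-k * t).exp() - (-ka * t).exp())
    }
}
```

The two branches agree to within a fraction of a percent near the crossover, so predictions stay smooth as the optimizer moves KA through the elimination rate.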

ODE Models

For non-standard kinetics (e.g., saturable elimination), use the ODE specification:

[structural_model]
  ode(obs_cmt=COMPARTMENT_NAME, states=[state1, state2, ...])

See ODE Models for full ODE syntax.

Error Model

The [error_model] block defines the residual error structure, specifying how observed data (DV) relates to model predictions.

Syntax

DV ~ ERROR_TYPE(SIGMA_PARAMS)

Available Error Models

Additive

DV ~ additive(SIGMA_NAME)

The residual variance is constant across all predictions:

\[ \text{Var}(DV) = \sigma^2 \]

Use when measurement error is independent of concentration (e.g., assay with fixed precision).

Proportional

DV ~ proportional(SIGMA_NAME)

The residual variance scales with the predicted value:

\[ \text{Var}(DV) = (\sigma \cdot f)^2 \]

where \( f \) is the model prediction. Use when measurement error increases with concentration (most common in PK).

Combined

DV ~ combined(SIGMA_PROP, SIGMA_ADD)

Combines proportional and additive components:

\[ \text{Var}(DV) = (\sigma_1 \cdot f)^2 + \sigma_2^2 \]

Use when both proportional and additive error sources are present. Requires two sigma parameters defined in [parameters].
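The three variance structures, and the weighted residual they feed into, can be written out directly. Hypothetical helper names, sketching the formulas above:

```rust
// Var(DV) = sigma^2
fn var_additive(sigma: f64) -> f64 {
    sigma * sigma
}

// Var(DV) = (sigma * f)^2, where f is the individual prediction (IPRED)
fn var_proportional(sigma: f64, f: f64) -> f64 {
    (sigma * f).powi(2)
}

// Var(DV) = (sigma1 * f)^2 + sigma2^2
fn var_combined(sigma_prop: f64, sigma_add: f64, f: f64) -> f64 {
    (sigma_prop * f).powi(2) + sigma_add * sigma_add
}

/// Individual weighted residual: (DV - IPRED) / sqrt(Var)
fn iwres(dv: f64, ipred: f64, var: f64) -> f64 {
    (dv - ipred) / var.sqrt()
}
```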

Examples

Proportional error (most common):

[parameters]
  sigma PROP_ERR ~ 0.01

[error_model]
  DV ~ proportional(PROP_ERR)

Additive error:

[parameters]
  sigma ADD_ERR ~ 1.0

[error_model]
  DV ~ additive(ADD_ERR)

Combined error:

[parameters]
  sigma PROP_ERR ~ 0.1
  sigma ADD_ERR  ~ 0.5

[error_model]
  DV ~ combined(PROP_ERR, ADD_ERR)

Impact on Estimation

The error model affects:

  • Individual weighted residuals (IWRES): (DV - IPRED) / sqrt(Var)
  • Conditional weighted residuals (CWRES): Accounts for uncertainty in random effect estimates
  • Objective function value (OFV): The likelihood includes log(Var) terms, so the error model structure directly influences parameter estimates

ODE Models

For pharmacokinetic models without analytical solutions (e.g., saturable elimination, target-mediated drug disposition), FeRx provides an ODE solver.

Structural Model Declaration

[structural_model]
  ode(obs_cmt=OBSERVABLE_COMPARTMENT, states=[state1, state2, ...])
  • obs_cmt: The compartment whose concentration is observed (matched to DV)
  • states: List of state variable names (compartments)

ODE Equations

The [odes] block defines the right-hand side of the ODE system:

[odes]
  d/dt(state_name) = expression

Expressions can reference:

  • State variables by name
  • Individual parameters defined in [individual_parameters]
  • Arithmetic operators and functions (exp, log, sqrt, etc.)

Example: Michaelis-Menten Elimination

A one-compartment oral model with saturable (Michaelis-Menten) elimination:

[parameters]
  theta TVVMAX(10.0, 0.1, 1000.0)
  theta TVKM(2.0, 0.01, 100.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_VMAX ~ 0.09
  omega ETA_V    ~ 0.04

  sigma PROP_ERR ~ 0.1

[individual_parameters]
  VMAX = TVVMAX * exp(ETA_VMAX)
  KM   = TVKM
  V    = TVV * exp(ETA_V)
  KA   = TVKA

[structural_model]
  ode(obs_cmt=central, states=[depot, central])

[odes]
  d/dt(depot)   = -KA * depot
  d/dt(central) = KA * depot / V - VMAX * central / (KM + central)

[error_model]
  DV ~ proportional(PROP_ERR)
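The [odes] block above corresponds to a right-hand-side function of the kind the solver integrates. The signature below is illustrative (the actual ferx API passes parameters as a flat vector):

```rust
// States are [depot, central]; returns [d(depot)/dt, d(central)/dt].
fn mm_rhs(y: &[f64; 2], ka: f64, v: f64, vmax: f64, km: f64) -> [f64; 2] {
    let (depot, central) = (y[0], y[1]);
    [
        -ka * depot,                                      // d/dt(depot)
        ka * depot / v - vmax * central / (km + central), // d/dt(central)
    ]
}
```

Note the Michaelis-Menten term: elimination approaches the constant rate VMAX as `central` grows far beyond KM, and behaves linearly (rate ≈ VMAX/KM × central) well below it.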

Solver Details

FeRx uses a Dormand-Prince RK45 adaptive solver:

| Setting | Value |
|---|---|
| Method | Explicit Runge-Kutta 4(5) |
| Absolute tolerance | 1e-6 |
| Relative tolerance | 1e-4 |
| Max steps | 10,000 |
| Initial step size | 0.1 |
| Minimum step size | 1e-12 |

The solver automatically adapts step sizes based on local error estimates.
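The step-size controller for an embedded 4(5) pair typically scales the step by (tolerance / error)^(1/5), damped by a safety factor and clamped. A sketch of that standard logic (not the exact ferx solver code):

```rust
/// Propose the next step size from the current step `h` and the local
/// error estimate `err` of the embedded pair.
fn next_step(h: f64, err: f64, tol: f64, h_min: f64) -> f64 {
    let factor = 0.9 * (tol / err).powf(0.2); // 0.9 = safety factor; 1/5 for a 4(5) pair
    let factor = factor.clamp(0.2, 5.0);      // limit growth/shrink per step
    (h * factor).max(h_min)                   // respect the minimum step size
}
```

When the error exceeds the tolerance, the step shrinks (and in a full solver the step is also rejected and retried); when the error is well under tolerance, the step grows.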

Dose Handling

  • Bolus doses: Applied as instantaneous state changes at dose times. The dose amount is added to the target compartment: state[cmt] += amt
  • Compartment indexing: Compartments are 1-indexed in the data file (CMT=1 corresponds to the first state in the states list)
  • Multiple doses: The ODE is integrated in segments between dose events, with state discontinuities at each dose
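The segment-wise scheme can be sketched as follows. For clarity a fixed-step Euler integrator stands in for RK45, and all names are illustrative; the point is the structure: integrate up to each dose time, apply the bolus as a state jump, then continue.

```rust
// A dose event; cmt is 0-indexed here for simplicity (the data file
// uses 1-indexed CMT, as noted above).
struct Dose {
    time: f64,
    cmt: usize,
    amt: f64,
}

// Integrate from t0 to t1, splitting at dose events and applying each
// bolus as an instantaneous state change: y[cmt] += amt.
fn integrate(
    mut y: Vec<f64>,
    t0: f64,
    t1: f64,
    doses: &[Dose],
    rhs: impl Fn(&[f64]) -> Vec<f64>,
) -> Vec<f64> {
    let mut t = t0;
    let mut events: Vec<&Dose> = doses.iter().filter(|d| d.time >= t0 && d.time <= t1).collect();
    events.sort_by(|a, b| a.time.partial_cmp(&b.time).unwrap());
    for d in events {
        y = euler(y, t, d.time, &rhs); // integrate the segment before the dose
        y[d.cmt] += d.amt;             // bolus: instantaneous state jump
        t = d.time;
    }
    euler(y, t, t1, &rhs) // final segment after the last dose
}

// Fixed-step Euler, standing in for the adaptive RK45 integrator.
fn euler(mut y: Vec<f64>, t0: f64, t1: f64, rhs: &impl Fn(&[f64]) -> Vec<f64>) -> Vec<f64> {
    let n = 1000;
    let h = (t1 - t0) / n as f64;
    for _ in 0..n {
        let dy = rhs(&y);
        for i in 0..y.len() {
            y[i] += h * dy[i];
        }
    }
    y
}
```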

Limitations

  • Infusion doses (RATE > 0) are not currently supported for ODE models. Use bolus approximations or analytical models for infusions
  • The observable compartment contains the amount (not concentration). Divide by volume in the ODE equations if needed
  • Steady-state (SS=1) is not directly supported for ODE models

Fit Options

The optional [fit_options] block configures the estimation method and optimizer settings.

Syntax

[fit_options]
  key = value

General Options

| Key | Values | Default | Description |
|---|---|---|---|
| method | foce, focei, saem | foce | Estimation method |
| maxiter | integer | 500 | Maximum outer-loop iterations |
| covariance | true, false | true | Compute covariance matrix and standard errors |
| optimizer | slsqp, lbfgs, nlopt_lbfgs, mma, bfgs | slsqp | Optimization algorithm |
| global_search | true, false | false | Run gradient-free pre-search before local optimization |
| global_maxeval | integer | auto | Max evaluations for global search |
| bloq_method | drop, m3 | drop | How to handle rows with CENS=1; m3 enables Beal's M3 likelihood (see BLOQ example) |

Estimation Methods

FOCE (default)

method = foce

First-Order Conditional Estimation. Linearizes the model around the empirical Bayes estimates. Fast and reliable for most models.

FOCEI

method = focei

FOCE with Interaction. Includes the dependence of the residual error on random effects. More accurate than FOCE when the error model depends on individual predictions, but slightly slower.

SAEM

method = saem

Stochastic Approximation EM. Uses Metropolis-Hastings sampling instead of MAP optimization for random effects. More robust to local minima; recommended for complex models with many random effects.

SAEM-Specific Options

| Key | Default | Description |
|---|---|---|
| n_exploration | 150 | Phase 1 iterations (step size = 1) |
| n_convergence | 250 | Phase 2 iterations (step size = 1/k) |
| n_mh_steps | 3 | Metropolis-Hastings steps per subject per iteration |
| adapt_interval | 50 | Iterations between MH step-size adaptation |
| seed | 12345 | RNG seed for reproducibility |

SIR (Sampling Importance Resampling)

SIR provides non-parametric parameter uncertainty estimates as an optional post-estimation step. Requires covariance = true.

| Key | Default | Description |
|---|---|---|
| sir | false | Enable SIR uncertainty estimation |
| sir_samples | 1000 | Number of proposal samples (M) |
| sir_resamples | 250 | Number of resampled vectors (m) |
| sir_seed | 12345 | RNG seed for reproducibility |

See SIR documentation for details.

Optimizer Choices

| Optimizer | Description | Recommended For |
|---|---|---|
| slsqp | Sequential Least Squares Programming (NLopt) | General use (default) |
| bfgs | Built-in BFGS quasi-Newton | When NLopt is unavailable |
| lbfgs | Limited-memory BFGS | Large parameter spaces |
| nlopt_lbfgs | NLopt L-BFGS | Alternative L-BFGS |
| mma | Method of Moving Asymptotes (NLopt) | Constrained problems |

Setting global_search = true runs a gradient-free pre-search (NLopt CRS2-LM) before the local optimizer. This helps escape local minima on challenging datasets.

The number of global evaluations is auto-scaled based on the number of parameters and observations, or can be set explicitly with global_maxeval.

Examples

Standard FOCE with defaults:

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

FOCEI with global search:

[fit_options]
  method        = focei
  maxiter       = 500
  covariance    = true
  global_search = true

SAEM with custom settings:

[fit_options]
  method        = saem
  n_exploration = 200
  n_convergence = 300
  n_mh_steps    = 5
  seed          = 42
  covariance    = true

FOCEI with SIR uncertainty:

[fit_options]
  method        = focei
  covariance    = true
  sir           = true
  sir_samples   = 1000
  sir_resamples = 250
  sir_seed      = 42

Simulation

The optional [simulation] block defines a virtual clinical trial design for generating simulated data. This is useful for model validation and testing.

Syntax

[simulation]
  subjects = N
  dose = AMOUNT
  cmt = COMPARTMENT
  seed = SEED
  times = [t1, t2, t3, ...]

KeyDescription
subjectsNumber of virtual subjects to simulate
doseDose amount administered to each subject
cmtDosing compartment (1-indexed)
seedRandom seed for reproducible simulations
timesObservation time points

Example

[simulation]
  subjects = 100
  dose = 100
  cmt = 1
  seed = 12345
  times = [0.5, 1, 2, 4, 8, 12, 24]

This simulates 100 subjects, each receiving a dose of 100 units into compartment 1, with observations at 0.5, 1, 2, 4, 8, 12, and 24 hours.

Usage

Run a simulation-estimation study from the CLI:

ferx model.ferx --simulate

This will:

  1. Parse the model and [simulation] block
  2. Generate simulated data using the model's default parameters and random effects
  3. Fit the model to the simulated data
  4. Report parameter estimates and diagnostics

Simulation Process

  1. For each subject, random effects are sampled from \( \eta_i \sim N(0, \Omega) \)
  2. Individual parameters are computed using the [individual_parameters] equations
  3. Predictions are generated using the structural model
  4. Residual error is added: \( DV = IPRED + \epsilon \), where \( \epsilon \sim N(0, V) \)
  5. Observations below 0.001 are clipped to 0.001
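Steps 4 and 5 above can be sketched as a single helper (illustrative names, not FeRx's internal API); `eps` is assumed to be a draw from \( N(0, V) \):

```rust
/// Add residual error to an individual prediction and clip to the
/// simulation floor, as in steps 4-5 of the simulation process.
fn observe(ipred: f64, eps: f64) -> f64 {
    const FLOOR: f64 = 0.001; // observations below 0.001 are clipped
    (ipred + eps).max(FLOOR)
}
```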

Covariates in Simulation

Covariates can be specified per subject in the [simulation] block. Currently, the simulated population uses default covariate values (covariates are not randomized).

Estimation Methods

FeRx implements the following methods for estimating nonlinear mixed effects models and quantifying parameter uncertainty:

  • FOCE / FOCEI -- First-Order Conditional Estimation (with or without interaction). The workhorse of population PK, using nested optimization to find maximum likelihood estimates.

  • Gauss-Newton (BHHH) -- A fast alternative that exploits the nonlinear-least-squares structure of the FOCE objective. Converges in 10-30 iterations using the outer product of per-subject gradients as an approximate Hessian. Available as pure GN or a GN+FOCEI hybrid.

  • SAEM -- Stochastic Approximation Expectation-Maximization. Uses MCMC sampling for random effects, providing more robust convergence on complex models.

  • SIR -- Sampling Importance Resampling. An optional post-estimation step that provides non-parametric parameter uncertainty estimates (95% CIs), more robust than the asymptotic covariance matrix.

Quick Comparison

FeatureFOCE/FOCEIGauss-NewtonSAEM
Random effect estimationMAP (optimization)MAP (optimization)MCMC (sampling)
Convergence speedMedium (~50-100 evals)Fast (~10-30 iterations)Slower (~400 iterations)
Local minima robustnessCan get stuckCan get stuckMore robust
Gradient requiredYes (AD or FD)Yes (FD, per-subject)No (for E-step)
StochasticNoNoYes
Best forGeneral useFast iteration, well-conditioned modelsComplex models, many random effects

Choosing a Method

Start with FOCE for standard 1- or 2-compartment models with 2-4 random effects. It is deterministic and well-understood.

Try Gauss-Newton (gn or gn_hybrid) when you want faster convergence during model development, or when FOCE is slow to converge.

Switch to SAEM when:

  • FOCE fails to converge or produces implausible estimates
  • The model has many random effects (>4)
  • You suspect the FOCE solution is a local minimum
  • The model has complex nonlinear random effect structure

All methods produce comparable results on well-behaved models. The final OFV from SAEM and GN is computed using the FOCE approximation, so AIC/BIC values are directly comparable.

FOCE / FOCEI

First-Order Conditional Estimation (FOCE) and its interaction variant (FOCEI) are the primary estimation methods in FeRx. They use a two-level nested optimization to find maximum likelihood estimates of population parameters.

Algorithm Overview

Outer Loop (Population Parameters)

The outer loop optimizes the population parameters \( \theta \), \( \Omega \), and \( \sigma \) by minimizing the population objective function value (OFV):

\[ \text{OFV} = -2 \log L = \sum_{i=1}^{N} \text{OFV}_i \]

Parameters are internally transformed for unconstrained optimization:

  • Theta: Log-transformed (ensures positivity)
  • Omega: Cholesky factorization (ensures positive-definiteness)
  • Sigma: Log-transformed (ensures positivity)
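These transforms can be sketched as a round trip for a 2x2 omega block (hypothetical helper names, not FeRx's internal API): theta and sigma are packed via log, and omega via its Cholesky factor with a log-diagonal, so any unconstrained vector unpacks to a positive-definite matrix.

```rust
/// Theta/sigma transforms: log-pack, exp-unpack (ensures positivity).
fn pack_theta(t: f64) -> f64 { t.ln() }
fn unpack_theta(x: f64) -> f64 { x.exp() }

/// Pack a 2x2 covariance [[o11, o21], [o21, o22]] as [ln l11, l21, ln l22],
/// where L = [[l11, 0], [l21, l22]] is its Cholesky factor.
fn pack_omega(o11: f64, o21: f64, o22: f64) -> [f64; 3] {
    let l11 = o11.sqrt();
    let l21 = o21 / l11;
    let l22 = (o22 - l21 * l21).sqrt();
    [l11.ln(), l21, l22.ln()]
}

/// Rebuild Omega = L L^T; positive-definite for any packed vector.
fn unpack_omega(p: [f64; 3]) -> (f64, f64, f64) {
    let (l11, l21, l22) = (p[0].exp(), p[1], p[2].exp());
    (l11 * l11, l11 * l21, l21 * l21 + l22 * l22)
}
```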

Inner Loop (Empirical Bayes Estimates)

For each subject, the inner loop finds the empirical Bayes estimate (EBE) of the random effects \( \hat{\eta}_i \) by minimizing the individual negative log-likelihood:

\[ -\log p(\eta_i | y_i, \theta, \Omega, \sigma) = \frac{1}{2} \left[ \eta_i^T \Omega^{-1} \eta_i + \log|\Omega| + \sum_j \left( \frac{(y_{ij} - f_{ij})^2}{V_{ij}} + \log V_{ij} \right) \right] \]

The inner loop uses BFGS with automatic differentiation gradients, falling back to Nelder-Mead simplex if BFGS fails.
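For a diagonal omega, the individual objective above reduces to a short sum. The sketch below mirrors the formula term by term (illustrative helper, not FeRx's internal API); `f` holds the predictions \( f_{ij} \) and `v` the residual variances \( V_{ij} \):

```rust
/// Per-subject negative log-likelihood (up to a constant) for diagonal Omega:
/// 0.5 * [ sum_k eta_k^2/omega_k + log omega_k ] + 0.5 * sum_j [ (y-f)^2/V + log V ]
fn individual_nll(eta: &[f64], omega_diag: &[f64], y: &[f64], f: &[f64], v: &[f64]) -> f64 {
    let prior: f64 = eta
        .iter()
        .zip(omega_diag)
        .map(|(e, w)| e * e / w + w.ln())
        .sum();
    let data: f64 = y
        .iter()
        .zip(f)
        .zip(v)
        .map(|((yi, fi), vi)| (yi - fi).powi(2) / vi + vi.ln())
        .sum();
    0.5 * (prior + data)
}
```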

FOCE vs FOCEI

Standard FOCE

Uses linearized predictions around the EBEs:

\[ f_0 = f(\hat{\eta}) - H \hat{\eta} \]

where \( H \) is the Jacobian matrix \( \partial f / \partial \eta \). The per-subject objective is:

\[ \text{OFV}_i = (y - f_0)^T \tilde{R}^{-1} (y - f_0) + \log|\tilde{R}| \]

where \( \tilde{R} = H \Omega H^T + R(f_0) \).

FOCEI (Interaction)

Uses individual predictions directly without linearization:

\[ \text{OFV}_i = (y - \hat{f})^T V^{-1} (y - \hat{f}) + \hat{\eta}^T \Omega^{-1} \hat{\eta} + \log|\tilde{R}| \]

FOCEI is more accurate when the residual variance depends on the predicted value (proportional or combined error models), because it captures the interaction between random effects and residual error.

Optimizer Options

AlgorithmKeyDescription
SLSQPslsqpSequential Least Squares Programming. Handles bounds well. Default and recommended.
L-BFGSnlopt_lbfgsLimited-memory BFGS. Good for large parameter spaces.
MMAmmaMethod of Moving Asymptotes. Alternative constrained optimizer.

Built-in Algorithms

AlgorithmKeyDescription
BFGSbfgsQuasi-Newton with backtracking line search.
L-BFGSlbfgsMemory-efficient BFGS variant.

Global Search

Enable gradient-free pre-search to help escape local minima:

[fit_options]
  method         = foce
  global_search  = true
  global_maxeval = 2000

The pre-search uses NLopt's CRS2-LM algorithm (Controlled Random Search) to explore the parameter space before handing off to the local optimizer. The number of evaluations auto-scales with model complexity if global_maxeval is not set.

Covariance Step

When covariance = true, FeRx computes the variance-covariance matrix of the parameter estimates using a finite-difference Hessian at the converged solution. This provides:

  • Standard errors (SE) for all parameters
  • Relative standard errors (%RSE) for assessing estimation precision
  • Omega SEs via delta method from the Cholesky parameterization

Convergence

The outer loop terminates when any of:

  • The gradient norm falls below outer_gtol (default 1e-6)
  • The maximum number of iterations (maxiter) is reached
  • The optimizer reports convergence (NLopt XtolReached or FtolReached)

The inner loop terminates when the gradient norm falls below inner_tol (default 1e-8) or inner_maxiter (default 200) iterations are reached.

Warm Starting

The inner loop warm-starts EBE estimation from the previous outer iteration's EBEs. This significantly reduces computation time, especially in later iterations when parameters change slowly.

Configuration Example

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true
  optimizer  = slsqp

Gauss-Newton (BHHH)

The Gauss-Newton optimizer is an alternative to the standard FOCE/FOCEI approach that exploits the nonlinear-least-squares structure of the FOCE objective. It uses the BHHH (Berndt-Hall-Hall-Hausman) approximation to the Hessian and typically converges in 10-30 iterations, compared to 100+ evaluations for gradient-based methods like SLSQP.

This approach mirrors NONMEM's modified Gauss-Newton algorithm.

Variants

MethodKeyDescription
Pure Gauss-NewtongnFast convergence for well-conditioned problems
GN + FOCEI hybridgn_hybridRuns GN first, then polishes with FOCEI for robustness

Algorithm Overview

BHHH Hessian Approximation

The FOCE objective has the form:

\[ \text{OFV} = 2 \sum_{i=1}^{N} \text{NLL}_i \]

The BHHH approximation uses the outer product of per-subject gradients as an approximate Hessian:

\[ H_{\text{BHHH}} = 4 \sum_{i=1}^{N} g_i \, g_i^T \]

where \( g_i = \nabla_x \text{NLL}_i \) is the gradient of the negative log-likelihood for subject \( i \) with respect to the packed parameter vector. Per-subject gradients are computed via central finite differences.

This approximation is equivalent to the Gauss-Newton Hessian for log-likelihood objectives and is guaranteed positive semi-definite.
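As a minimal sketch (dense matrices, hypothetical helper name), the BHHH Hessian is just the scaled sum of outer products of per-subject gradients:

```rust
/// Build H_BHHH = 4 * sum_i g_i g_i^T from per-subject gradients.
/// Each outer product is PSD, so the sum is guaranteed PSD.
fn bhhh_hessian(grads: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = grads[0].len();
    let mut h = vec![vec![0.0; n]; n];
    for g in grads {
        for r in 0..n {
            for c in 0..n {
                h[r][c] += 4.0 * g[r] * g[c];
            }
        }
    }
    h
}
```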

Levenberg-Marquardt Damping

To improve stability away from the minimum, the Hessian is regularized:

\[ H_{\text{LM}} = H_{\text{BHHH}} + \lambda \, \text{diag}(H_{\text{BHHH}}) \]

The damping factor \( \lambda \) is adapted during optimization:

  • On successful step (OFV decreases): \( \lambda \leftarrow 0.3\,\lambda \) (minimum \( 10^{-6} \))
  • On rejected step (OFV increases): \( \lambda \leftarrow 10\,\lambda \)
  • If \( \lambda > 10^6 \), the algorithm stops

The Newton step is obtained by solving:

\[ H_{\text{LM}} \, \delta = -\nabla \text{OFV} \]

via Cholesky decomposition. If the Cholesky fails (singular Hessian), \( \lambda \) is increased and the iteration is retried.

Each step uses backtracking line search (up to 15 halvings). At each trial point, EBEs are re-estimated via the inner loop (warm-started from the previous EBEs), ensuring the FOCE objective is evaluated consistently.
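The damped solve can be sketched for a 2x2 system (illustrative helper, not FeRx's internal code): apply the diagonal damping, factor with Cholesky, then forward- and back-substitute.

```rust
/// Solve (H + lambda * diag(H)) * delta = -g for a symmetric 2x2 H
/// via Cholesky factorization L L^T.
fn lm_step(h: [[f64; 2]; 2], g: [f64; 2], lambda: f64) -> [f64; 2] {
    // Levenberg-Marquardt damping on the diagonal
    let a = h[0][0] * (1.0 + lambda);
    let b = h[0][1];
    let d = h[1][1] * (1.0 + lambda);
    // Cholesky factor L = [[l11, 0], [l21, l22]]
    let l11 = a.sqrt();
    let l21 = b / l11;
    let l22 = (d - l21 * l21).sqrt();
    // Forward solve L z = -g, then back solve L^T delta = z
    let z0 = -g[0] / l11;
    let z1 = (-g[1] - l21 * z0) / l22;
    let d1 = z1 / l22;
    let d0 = (z0 - l21 * d1) / l11;
    [d0, d1]
}
```

In the real algorithm, a failed factorization (negative pivot) would trigger an increase of lambda and a retry, as described above.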

Convergence

The algorithm converges when:

  • Relative OFV change \( < 10^{-6} \) for at least 3 iterations, or
  • Maximum iterations are reached

Hybrid Mode (GN + FOCEI)

When method = gn_hybrid, the algorithm runs in two phases:

  1. GN phase: Run Gauss-Newton iterations to quickly find the basin of the minimum
  2. FOCEI polish: Run up to 100 iterations of standard FOCEI optimization (via NLopt SLSQP) warm-started from the GN result

The final result is whichever phase produced the lower OFV. This combines the fast convergence of Gauss-Newton with the refined accuracy of FOCEI.

Configuration

[fit_options]
  method     = gn          # or gn_hybrid
  maxiter    = 100         # max GN iterations
  covariance = true

The initial Levenberg-Marquardt damping factor defaults to 0.01 and adapts automatically.

When to Use

Use gn when:

  • You want fast convergence and the model is well-conditioned
  • You are iterating on model development and need quick feedback

Use gn_hybrid when:

  • You want robustness against local minima
  • The model is complex (many parameters or random effects)
  • You want the speed of GN with a safety net

Stick with foce/focei when:

  • The model has convergence issues with GN (rare)
  • You need exact compatibility with standard FOCE/FOCEI results

SAEM

Stochastic Approximation Expectation-Maximization (SAEM) is an alternative estimation method that uses MCMC sampling for random effects instead of MAP optimization. It is more robust to local minima and can handle complex random effect structures.

Algorithm Overview

SAEM replaces the deterministic inner loop of FOCE with stochastic sampling, following the Monolix convention with a two-phase step-size schedule.

References

  • Delyon, Lavielle, Moulines (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, 94--128.
  • Kuhn & Lavielle (2004). Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM: Probability and Statistics 8:115--131.

Two-Phase Schedule

Phase 1: Exploration (iterations 1 to K1)

Step size \( \gamma_k = 1 \). The algorithm explores the parameter space rapidly, with the sufficient statistics being fully replaced each iteration. This allows fast movement toward the basin of the MLE.

Default: 150 iterations.

Phase 2: Convergence (iterations K1+1 to K1+K2)

Step size \( \gamma_k = 1/(k - K_1) \). The algorithm performs a decreasing-weight average, which guarantees almost-sure convergence to the MLE under regularity conditions.

Default: 250 iterations.
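The two-phase step-size schedule amounts to a one-line function (sketch with illustrative names). With the defaults K1 = 150, it reproduces the gamma values shown in the SAEM log output later in this chapter:

```rust
/// SAEM step size: gamma_k = 1 during exploration (k <= K1),
/// then 1/(k - K1) during the convergence phase.
fn gamma(k: usize, k1: usize) -> f64 {
    if k <= k1 { 1.0 } else { 1.0 / ((k - k1) as f64) }
}
```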

Per-Iteration Steps

Each SAEM iteration consists of:

1. E-Step: Metropolis-Hastings Sampling

For each subject, run n_mh_steps Metropolis-Hastings iterations to sample from the conditional distribution of random effects:

\[ p(\eta_i | y_i, \theta, \Omega, \sigma) \]

Proposal: \( \eta_{\text{prop}} = \eta_{\text{current}} + \delta_i \cdot L \cdot z \), where \( z \sim N(0, I) \) and \( L = \text{chol}(\Omega) \).

Acceptance: Accept with probability \( \min(1, \exp(\text{NLL}_{\text{current}} - \text{NLL}_{\text{prop}})) \).

The MH sampling is parallelized across subjects using Rayon.
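Because the random-walk proposal is symmetric, the acceptance rule reduces to a ratio of likelihoods; a minimal sketch (illustrative helper name):

```rust
/// Metropolis-Hastings acceptance probability for a symmetric proposal:
/// min(1, exp(NLL_current - NLL_proposal)). Downhill moves always accept.
fn accept_prob(nll_current: f64, nll_proposal: f64) -> f64 {
    (nll_current - nll_proposal).exp().min(1.0)
}
```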

2. Stochastic Approximation Update

Update the sufficient statistic for \( \Omega \):

\[ S_2 \leftarrow (1 - \gamma_k) \cdot S_2 + \gamma_k \cdot \frac{1}{N} \sum_{i=1}^{N} \eta_i \eta_i^T \]

3. M-Step for Omega (Closed Form)

\[ \Omega_k = S_2 \]

4. M-Step for Theta and Sigma (Optimization)

Minimize the conditional observation negative log-likelihood with ETAs held fixed:

\[ \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left[ \frac{1}{2} \log V_{ij} + \frac{1}{2} \frac{(y_{ij} - f_{ij})^2}{V_{ij}} \right] \]

This is optimized over \( \theta \) and \( \sigma \) in log-space using NLopt SLSQP with finite-difference gradients.

5. Adaptive MH Step Sizes

Every adapt_interval iterations, the per-subject step sizes \( \delta_i \) are adjusted:

  • If acceptance rate > 40%: increase \( \delta_i \) by 10% (up to 5.0)
  • If acceptance rate < 40%: decrease \( \delta_i \) by 10% (down to 0.01)

This targets an acceptance rate around 40%, balancing exploration and mixing.
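The adaptation rule above can be sketched as (illustrative helper, using the documented 10% adjustment and the 0.01/5.0 bounds):

```rust
/// Adapt a per-subject MH step size toward a ~40% acceptance rate.
fn adapt_step(delta: f64, acceptance_rate: f64) -> f64 {
    if acceptance_rate > 0.40 {
        (delta * 1.10).min(5.0) // accepting too often: take bigger steps
    } else {
        (delta * 0.90).max(0.01) // rejecting too often: take smaller steps
    }
}
```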

Post-SAEM Finalization

After the SAEM iterations complete:

  1. EBE Refinement: Run the standard FOCE inner loop (BFGS optimization) warm-started from the SAEM ETAs to obtain final empirical Bayes estimates
  2. FOCE OFV: Compute the objective function using the FOCE/Laplace approximation, so AIC and BIC are directly comparable with FOCE results
  3. Covariance Step: Optionally compute standard errors via finite-difference Hessian (same method as FOCE)
  4. Diagnostics: Compute PRED, IPRED, CWRES, IWRES for each subject

Configuration

[fit_options]
  method        = saem
  n_exploration = 150      # Phase 1 iterations
  n_convergence = 250      # Phase 2 iterations
  n_mh_steps    = 3        # MH steps per subject per iteration
  adapt_interval = 50      # Step-size adaptation frequency
  seed          = 12345    # RNG seed for reproducibility
  covariance    = true     # Compute standard errors

Tuning Guide

Not Converging

  • Increase n_exploration (e.g., 300) to give more time for basin finding
  • Increase n_convergence (e.g., 500) for a longer averaging window
  • Increase n_mh_steps (e.g., 5-10) for better mixing in the E-step

Slow Convergence

  • Decrease n_exploration and n_convergence if parameters stabilize early
  • Use adapt_interval = 25 for faster step-size adaptation

Reproducibility

  • Always set seed for reproducible results
  • Different seeds will produce slightly different estimates due to the stochastic nature of the algorithm

Output

The SAEM iteration progress is printed to stderr:

SAEM: 10 subjects, 3 ETAs, 400 total iter (150 explore + 250 converge)
  SAEM iter    1/400 [explore] gamma=1.000  condNLL=95.244
  SAEM iter   50/400 [explore] gamma=1.000  condNLL=56.705
  SAEM iter  150/400 [explore] gamma=1.000  condNLL=46.071
  SAEM iter  200/400 [converge] gamma=0.020  condNLL=36.799
  SAEM iter  400/400 [converge] gamma=0.004  condNLL=38.096
SAEM iterations complete. Computing final EBEs and OFV...

The condNLL is the conditional observation negative log-likelihood (not the final OFV). It should generally decrease during the exploration phase and stabilize during convergence.

Sampling Importance Resampling (SIR)

SIR is an optional post-estimation step that provides non-parametric parameter uncertainty estimates. It produces 95% confidence intervals that are more robust than the asymptotic covariance matrix, particularly for models with:

  • Non-normal parameter distributions
  • Boundary estimates (parameters near constraints)
  • Small datasets where asymptotic assumptions may not hold

How It Works

SIR uses the maximum likelihood estimates and their covariance matrix as a proposal distribution, then reweights samples based on the actual likelihood:

  1. Sample: Draw M parameter vectors from a multivariate normal distribution centered on the ML estimates, using the estimation covariance matrix
  2. Importance weighting: For each sample, compute the objective function value (OFV) and calculate an importance weight based on the ratio of the true likelihood to the proposal density
  3. Resample: Draw m vectors (with replacement) proportional to the importance weights

The resampled vectors approximate the true parameter uncertainty distribution. Confidence intervals are derived from their empirical percentiles.
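The weighting step and the ESS diagnostic discussed below can be sketched as follows (illustrative helpers; the log weights are assumed to be log-likelihood minus log proposal density per sample):

```rust
/// Normalize unnormalized log importance weights, subtracting the max
/// before exponentiating for numerical stability.
fn normalize_weights(log_w: &[f64]) -> Vec<f64> {
    let m = log_w.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let w: Vec<f64> = log_w.iter().map(|lw| (lw - m).exp()).collect();
    let s: f64 = w.iter().sum();
    w.into_iter().map(|wi| wi / s).collect()
}

/// Effective sample size: 1 / sum(w_i^2) for normalized weights.
/// Equals M for uniform weights; collapses toward 1 for degenerate ones.
fn effective_sample_size(w: &[f64]) -> f64 {
    1.0 / w.iter().map(|wi| wi * wi).sum::<f64>()
}
```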

Enabling SIR

Add sir = true to the [fit_options] block. The covariance step must also be enabled (it provides the proposal distribution):

[fit_options]
  method     = focei
  covariance = true
  sir        = true

Options

KeyDefaultDescription
sirfalseEnable/disable SIR
sir_samples1000Number of proposal samples (M). Higher values give more reliable weights but take longer
sir_resamples250Number of resampled vectors (m). Must be less than sir_samples
sir_seed12345RNG seed for reproducibility

Output

SIR adds the following to the estimation output:

  • 95% CI for each theta, omega, and sigma parameter (2.5th and 97.5th percentiles)
  • Effective sample size (ESS): a diagnostic indicating how well the proposal distribution matches the true uncertainty. ESS close to M indicates a good match; ESS much less than m suggests the proposal is a poor fit

Diagnostics

The effective sample size (ESS) is the primary diagnostic:

  • ESS > m (resamples): excellent — the proposal distribution is well-matched
  • ESS between 100 and m: adequate for most purposes
  • ESS < 100: the proposal may be a poor fit; consider a different estimation method or increasing sir_samples

Computational Cost

SIR evaluates the inner loop (EBE optimization) for each of the M proposal samples. With the default M=1000, this is roughly 3-10x the cost of the estimation step itself. The computation is parallelized across samples and warm-started from the ML EBEs to minimize runtime.

The resampling step itself is negligible.

Example

[fit_options]
  method        = focei
  covariance    = true
  sir           = true
  sir_samples   = 1000
  sir_resamples = 250
  sir_seed      = 42

Reference

Dosne A-G, Bergstrand M, Karlsson MO. "Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling." J Pharmacokinet Pharmacodyn. 2017;44(6):539-562. doi:10.1007/s10928-017-9542-0

Data Format

FeRx reads data in NONMEM-compatible CSV format. This is the standard format used across population PK tools.

Required Columns

ColumnTypeDescription
IDstring/numericSubject identifier
TIMEnumericTime relative to first dose
DVnumericDependent variable (observed concentration)

Optional Standard Columns

ColumnTypeDefaultDescription
EVIDinteger0Event ID: 0 = observation, 1 = dose, 4 = reset + dose
AMTnumeric0Dose amount (only for EVID=1 or EVID=4)
CMTinteger1Compartment number (1-indexed)
RATEnumeric0Infusion rate. 0 = bolus dose
MDVinteger0Missing DV flag. 1 = DV should be ignored
IInumeric0Interdose interval for repeated dosing
SSinteger0Steady-state flag. 1 = assume steady state
CENSinteger0Censoring flag. 1 = observation is below LLOQ; DV carries the LLOQ value. Paired with bloq_method = m3 in [fit_options] to enable likelihood-based handling — see BLOQ example.

Covariate Columns

Any column not in the standard set above is automatically treated as a covariate. Covariate values are:

  • Time-constant: The first non-missing value for each subject is used
  • Time-varying: If values change over time for a subject, Last Observation Carried Forward (LOCF) is applied

Covariate names in the data file are matched case-insensitively to names used in [individual_parameters] expressions.
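The LOCF rule can be sketched as a small fill function (illustrative, not FeRx's internal reader; assumes at least one non-missing value per subject). Leading missing values take the first observed value, consistent with the first-non-missing rule above:

```rust
/// Last Observation Carried Forward over one subject's covariate column.
fn locf(values: &[Option<f64>]) -> Vec<f64> {
    // Seed with the first non-missing value for any leading gaps
    let first = values.iter().flatten().next().copied().unwrap();
    let mut last = first;
    values
        .iter()
        .map(|v| {
            if let Some(x) = v {
                last = *x; // carry this value forward
            }
            last
        })
        .collect()
}
```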

Event Types (EVID)

EVIDMeaning
0Observation record. DV is used for estimation.
1Dosing record. AMT is administered to compartment CMT.
4Reset and dose. All compartment amounts are reset to zero before dosing.

Example Dataset

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,WT,CRCL
1,0,.,1,100,1,0,1,70,95
1,0.5,9.49,0,.,.,.,0,70,95
1,1,14.42,0,.,.,.,0,70,95
1,2,17.56,0,.,.,.,0,70,95
1,4,15.23,0,.,.,.,0,70,95
1,8,10.15,0,.,.,.,0,70,95
2,0,.,1,150,1,0,1,85,110
2,0.5,14.2,0,.,.,.,0,85,110
2,1,21.3,0,.,.,.,0,85,110

Key points:

  • Dose records (EVID=1) have MDV=1 and DV=. (missing)
  • Observation records (EVID=0) have MDV=0 and a valid DV
  • Covariates (WT, CRCL) are included as extra columns
  • Missing values can be represented as . or left empty

Infusion Doses

For IV infusions, set RATE to the infusion rate (amount per time unit):

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV
1,0,.,1,500,1,50,1

This administers 500 units at a rate of 50 units/hour (duration = 10 hours).

Steady-State Dosing

For steady-state simulations, set SS=1 and II to the dosing interval:

ID,TIME,DV,EVID,AMT,CMT,SS,II,MDV
1,0,.,1,100,1,1,12,1
1,0.5,25.3,0,.,.,.,.,0

This assumes the subject has reached steady state with 100 units every 12 hours before the observation at TIME=0.5.

Multiple Doses

Multiple doses are supported as separate rows:

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,12,.,1,100,1,1
1,12.5,15.2,0,.,.,0
1,24,.,1,100,1,1
1,24.5,18.1,0,.,.,0

Column Name Case

Column names are case-insensitive. ID, Id, and id are all recognized. Covariate columns preserve their case as declared in the CSV header.

CLI Reference

The ferx command-line tool runs population PK estimation from model files and data.

Usage

ferx <model.ferx> --data <data.csv>
ferx <model.ferx> --simulate

Commands

Fit with Data

ferx model.ferx --data data.csv

Parses the model file, reads the data, and runs the estimation method specified in [fit_options] (defaults to FOCE).

Simulate and Fit

ferx model.ferx --simulate

Parses the model file, generates simulated data from the [simulation] block, and fits the model to the simulated data. Requires a [simulation] block in the model file.

Output Files

Three files are generated, named after the model file:

FileContents
{model}-sdtab.csvPer-observation diagnostics
{model}-fit.yamlParameter estimates and standard errors
{model}-timing.txtWall-clock estimation time

See Output Files for detailed format descriptions.

Console Output

Progress

The estimation progress is printed to stderr, including:

  • Model and data summary (subjects, observations, parameters)
  • Optimizer iterations with OFV values (FOCE) or condNLL values (SAEM)
  • Covariance step status
  • Final parameter table

Result Summary

A brief summary is printed to stdout:

Fit completed!
OFV: -280.1838
Elapsed: 0.496s
  TVCL = 0.132735
  TVV = 7.694842
  TVKA = 0.757498

Exit Codes

CodeMeaning
0Success
1Error (parse failure, data error, convergence failure)

Examples

# One-compartment oral warfarin model
ferx examples/warfarin.ferx --data data/warfarin.csv

# Two-compartment IV with FOCE
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv

# Simulation-estimation study
ferx examples/warfarin.ferx --simulate

Building

# Build the ferx binary
cargo build --release --features autodiff

# Run directly via cargo
cargo run --release --features autodiff --bin ferx -- model.ferx --data data.csv

Output Files

Each model run produces three output files.

sdtab CSV ({model}-sdtab.csv)

A CSV file with per-observation diagnostics, one row per observation per subject.

Columns

ColumnDescription
IDSubject identifier
TIMEObservation time
DVObserved value
PREDPopulation prediction (eta = 0)
IPREDIndividual prediction (eta = EBE)
CWRESConditional weighted residual
IWRESIndividual weighted residual
ETA1, ETA2, ...Empirical Bayes estimates of random effects

Residual Definitions

IWRES (Individual Weighted Residual): \[ \text{IWRES}_j = \frac{y_j - \text{IPRED}_j}{\sqrt{V_j}} \] where \( V_j \) is the residual variance evaluated at the individual prediction.

CWRES (Conditional Weighted Residual): \[ \text{CWRES}_j = \frac{y_j - f_{0,j}}{\sqrt{\tilde{R}_{jj}}} \] where \( f_0 = f(\hat{\eta}) - H\hat{\eta} \) is the linearized population prediction and \( \tilde{R} = H\Omega H^T + R \) is the conditional variance.
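Per observation, both residuals are simple standardizations once their variance terms are in hand (illustrative helpers mirroring the definitions above):

```rust
/// IWRES: residual from the individual prediction, scaled by the
/// residual standard deviation sqrt(V_j).
fn iwres(y: f64, ipred: f64, v: f64) -> f64 {
    (y - ipred) / v.sqrt()
}

/// CWRES: residual from the linearized population prediction f0,
/// scaled by the conditional standard deviation sqrt(R~_jj).
fn cwres(y: f64, f0: f64, r_tilde_jj: f64) -> f64 {
    (y - f0) / r_tilde_jj.sqrt()
}
```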

Example

ID,TIME,DV,PRED,IPRED,CWRES,IWRES,ETA1,ETA2,ETA3
1,0.5,9.49,10.12,9.55,-0.23,-0.06,0.15,-0.08,0.32
1,1.0,14.42,14.87,14.35,0.18,0.05,0.15,-0.08,0.32

Fit YAML ({model}-fit.yaml)

A YAML file containing parameter estimates, standard errors, and model diagnostics.

Structure

model:
  converged: true
  method: FOCE
objective_function:
  ofv: -280.1838
  aic: -266.1838
  bic: -247.2804
data:
  n_subjects: 10
  n_observations: 110
  n_parameters: 7
theta:
  TVCL:
    estimate: 0.132735
    se: 0.014549
    rse_pct: 11.0
  TVV:
    estimate: 7.694842
    se: 0.293028
    rse_pct: 3.8
omega:
  omega_11:
    variance: 0.028584
    cv_pct: 16.9
    se: 0.006394
  omega_22:
    variance: 0.009613
    cv_pct: 9.8
    se: 0.002165
sigma:
  sigma_1:
    estimate: 0.010638
    se: 0.000788

Key Fields

  • ofv: Objective Function Value (-2 log-likelihood)
  • aic: Akaike Information Criterion (OFV + 2p)
  • bic: Bayesian Information Criterion (OFV + p * ln(n))
  • se: Standard error from the covariance step
  • rse_pct: Relative standard error as percentage (SE/estimate * 100)
  • cv_pct: Coefficient of variation for omega (sqrt(variance) * 100)
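The information criteria follow directly from the OFV (sketch; the test below reproduces the ofv/aic/bic values in the YAML example, with p = 7 and n = 110):

```rust
/// AIC = OFV + 2p, where OFV is -2 log-likelihood and p the parameter count.
fn aic(ofv: f64, p: usize) -> f64 {
    ofv + 2.0 * p as f64
}

/// BIC = OFV + p * ln(n), with n the number of observations.
fn bic(ofv: f64, p: usize, n: usize) -> f64 {
    ofv + p as f64 * (n as f64).ln()
}
```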

Timing File ({model}-timing.txt)

A single-line text file with the wall-clock estimation time:

elapsed_seconds=0.496000

This measures only the estimation step (not parsing or data reading).

Examples

FeRx includes several example models in the examples/ directory with corresponding datasets in data/.

ExampleModelRouteFeatures
One-Compartment Oral1-cptOralBasic PopPK, FOCE and SAEM
Two-Compartment IV2-cptIV bolusMulti-compartment dynamics
Covariates2-cptOralWeight and renal function effects
ODE Model1-cptOralMichaelis-Menten elimination, ODE solver
BLOQ (M3 method)1-cptOralLikelihood-based handling of censored observations

Running Examples

# One-compartment oral (warfarin)
ferx examples/warfarin.ferx --data data/warfarin.csv

# Two-compartment IV
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

# Covariates model
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv

# ODE model (Michaelis-Menten)
ferx examples/mm_oral.ferx --data data/mm_oral.csv

# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv

# BLOQ (M3 method)
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv

One-Compartment Oral (Warfarin)

This example fits a one-compartment oral pharmacokinetic model to warfarin concentration data.

Model File (warfarin.ferx)

# One-compartment oral PK model (warfarin)

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

Model Description

  • Structure: One-compartment model with first-order absorption (KA) and first-order elimination (CL/V)
  • Random effects: Log-normal on CL, V, and KA (exponential eta model)
  • Error model: Proportional residual error
  • Parameters:
    • TVCL: Typical value of clearance (L/h)
    • TVV: Typical value of volume of distribution (L)
    • TVKA: Typical value of absorption rate constant (1/h)

Running

ferx examples/warfarin.ferx --data data/warfarin.csv

Expected Results

--- THETA Estimates ---
TVCL                 0.132735
TVV                  7.694842
TVKA                 0.757498

--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584  (CV% = 16.9)
OMEGA(2,2) = 0.009613  (CV% = 9.8)
OMEGA(3,3) = 0.340868  (CV% = 58.4)

--- SIGMA Estimates ---
SIGMA(1) = 0.010638

SAEM Variant

The same model can be run with SAEM by changing the [fit_options]:

[fit_options]
  method        = saem
  n_exploration = 150
  n_convergence = 250
  n_mh_steps    = 3
  seed          = 12345
  covariance    = true

ferx examples/warfarin_saem.ferx --data data/warfarin.csv

Two-Compartment IV Bolus

This example fits a two-compartment IV bolus model with four random effects.

Model File (two_cpt_iv.ferx)

# Two-compartment IV bolus PK model

[parameters]
  theta TVCL(5.0, 0.01, 100.0)
  theta TVV1(50.0, 0.1, 1000.0)
  theta TVQ(10.0, 0.01, 200.0)
  theta TVV2(100.0, 0.1, 5000.0)

  omega ETA_CL ~ 0.09
  omega ETA_V1 ~ 0.04
  omega ETA_Q  ~ 0.04
  omega ETA_V2 ~ 0.09

  sigma PROP_ERR ~ 0.04

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V1 = TVV1 * exp(ETA_V1)
  Q  = TVQ  * exp(ETA_Q)
  V2 = TVV2 * exp(ETA_V2)

[structural_model]
  pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 500
  covariance = true

Model Description

  • Structure: Two-compartment model with central (V1) and peripheral (V2) compartments connected by intercompartmental clearance (Q)
  • Route: Intravenous bolus (dose goes directly into central compartment)
  • Random effects: Log-normal on all four PK parameters
  • Parameters:
    • CL: Systemic clearance (L/h)
    • V1: Central volume of distribution (L)
    • Q: Intercompartmental clearance (L/h)
    • V2: Peripheral volume of distribution (L)

The bi-exponential concentration profile is characterized by a rapid distribution phase (alpha) followed by a slower elimination phase (beta).

Running

ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

Notes

  • Two-compartment models have more parameters and may need more iterations to converge
  • The global_search = true option can help if convergence is difficult
  • Consider reducing random effects (e.g., fixing Q or V2 variability) if the model is overparameterized

Covariate Model

This example demonstrates a two-compartment oral model with body weight (WT) and creatinine clearance (CRCL) as covariates on clearance.

Model File (two_cpt_oral_cov.ferx)

# Two-compartment oral PK model with covariates

[parameters]
  theta TVCL(5.0, 0.01, 100.0)
  theta TVV1(50.0, 0.1, 1000.0)
  theta TVQ(10.0, 0.01, 200.0)
  theta TVV2(100.0, 0.1, 5000.0)
  theta TVKA(1.0, 0.01, 50.0)
  theta THETA_WT(0.75, 0.01, 2.0)
  theta THETA_CRCL(0.5, 0.01, 2.0)

  omega ETA_CL ~ 0.09
  omega ETA_V1 ~ 0.04
  omega ETA_Q  ~ 0.04
  omega ETA_V2 ~ 0.09
  omega ETA_KA ~ 0.25

  sigma PROP_ERR ~ 0.04

[individual_parameters]
  CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
  V1 = TVV1 * exp(ETA_V1)
  Q  = TVQ  * exp(ETA_Q)
  V2 = TVV2 * exp(ETA_V2)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true

Covariate Effects

The clearance equation includes two covariate effects:

CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
  • (WT/70)^THETA_WT: Allometric scaling of clearance with body weight, centered at 70 kg. THETA_WT is estimated (expected ~0.75 for CL).
  • (CRCL/100)^THETA_CRCL: Renal function effect on clearance, centered at 100 mL/min. THETA_CRCL is estimated.
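The clearance equation reads directly as code. A plain-Rust sketch (illustrative only; FeRx generates the equivalent closure from [individual_parameters]):

```rust
/// CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
fn individual_cl(
    tvcl: f64, theta_wt: f64, theta_crcl: f64, // population parameters
    wt: f64, crcl: f64,                        // subject covariates
    eta_cl: f64,                               // subject random effect
) -> f64 {
    tvcl * (wt / 70.0).powf(theta_wt) * (crcl / 100.0).powf(theta_crcl) * eta_cl.exp()
}
```

At the reference covariates (WT = 70 kg, CRCL = 100 mL/min) with eta = 0, both covariate terms are exactly 1 and CL reduces to TVCL, which is what makes the centered typical value interpretable.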

Data Requirements

The dataset must include WT and CRCL columns:

ID,TIME,DV,EVID,AMT,CMT,MDV,WT,CRCL
1,0,.,1,100,1,1,72.5,105
1,0.5,12.3,0,.,.,0,72.5,105
1,1.0,18.7,0,.,.,0,72.5,105

Covariate columns are automatically detected -- any column not in the standard NONMEM set is treated as a covariate. The names in the data file must match those used in [individual_parameters] (case-insensitive).

Running

ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv

Notes

  • FOCEI is used here because the proportional error model creates an interaction between random effects and residual error
  • Covariate centering (dividing by 70 for weight, 100 for CRCL) improves numerical stability and makes the typical value (TVCL) interpretable as the clearance for a 70 kg patient with CRCL of 100 mL/min
  • The estimated covariate exponents (THETA_WT, THETA_CRCL) have standard errors that can be used to test significance
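As a concrete instance of the last point, a Wald test on a covariate exponent needs only the estimate and its standard error. A plain-Rust sketch (the 0.75 / 0.15 numbers in the test are made up, not fit output):

```rust
/// Wald z statistic and 95% confidence interval for one estimate.
fn wald(est: f64, se: f64) -> (f64, (f64, f64)) {
    let z = est / se;                       // compare |z| against 1.96
    (z, (est - 1.96 * se, est + 1.96 * se)) // normal-approximation CI
}
```

A CI that excludes 0 (or, for an allometric exponent, one that excludes a fixed reference such as 0.75) indicates a significant effect at the 5% level.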

ODE Model (Michaelis-Menten)

This example demonstrates a one-compartment oral model with saturable (Michaelis-Menten) elimination, solved using the ODE integrator.

Model File (mm_oral.ferx)

# One-compartment oral model with Michaelis-Menten elimination

[parameters]
  theta TVVMAX(10.0, 0.1, 1000.0)
  theta TVKM(2.0, 0.01, 100.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_VMAX ~ 0.09
  omega ETA_V    ~ 0.04

  sigma PROP_ERR ~ 0.1

[individual_parameters]
  VMAX = TVVMAX * exp(ETA_VMAX)
  KM   = TVKM
  V    = TVV * exp(ETA_V)
  KA   = TVKA

[structural_model]
  ode(obs_cmt=central, states=[depot, central])

[odes]
  d/dt(depot)   = -KA * depot
  d/dt(central) = KA * depot / V - VMAX * central / (KM + central)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true

Model Description

Why Use ODEs

Michaelis-Menten (saturable) elimination cannot be described by a simple exponential function. At high concentrations, elimination approaches a maximum rate (VMAX), leading to nonlinear pharmacokinetics. This requires numerical ODE integration.

ODE System

d/dt(depot)   = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
  • depot: Absorption compartment. First-order absorption at rate KA.
  • central: Central compartment. Drug enters from depot and is eliminated by Michaelis-Menten kinetics.
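Translated line-for-line into plain Rust (a sketch of what the [odes] block compiles to; the function name and signature are illustrative, not the FeRx internals):

```rust
/// dy/dt for the Michaelis-Menten system.
/// y[0] = depot amount; y[1] = central (in concentration units,
/// because the absorption input is divided by V).
fn mm_rhs(y: [f64; 2], ka: f64, v: f64, vmax: f64, km: f64) -> [f64; 2] {
    [
        -ka * y[0],                                // d/dt(depot)
        ka * y[0] / v - vmax * y[1] / (km + y[1]), // d/dt(central)
    ]
}
```

Note the saturation behavior: for y[1] much smaller than KM the elimination term is approximately (VMAX/KM) * y[1] (first-order), while for y[1] much larger than KM it approaches the constant rate VMAX (zero-order).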

Parameters

  • VMAX: Maximum elimination rate (amount/time)
  • KM: Michaelis-Menten constant (concentration at half-maximal rate)
  • V: Volume of distribution
  • KA: First-order absorption rate constant

Note that only VMAX and V carry random effects; KM and KA are fixed at their typical values, with no between-subject variability.

Structural Model Declaration

[structural_model]
  ode(obs_cmt=central, states=[depot, central])
  • obs_cmt=central: The central compartment is observed (matched to DV in the data)
  • states=[depot, central]: Two state variables (compartments)

Running

ferx examples/mm_oral.ferx --data data/mm_oral.csv

Notes

  • ODE models are slower than analytical models because each prediction requires numerical integration
  • The ODE solver uses adaptive step-size control (Dormand-Prince RK45) for accuracy
  • For standard first-order kinetics, use the analytical one_cpt_oral model instead -- it is much faster
  • The central state in this model is a concentration, not an amount: the absorption input is divided by V (KA * depot / V), so the value matched to DV via obs_cmt=central is already a concentration and KM is expressed in concentration units

BLOQ (M3 method)

This example shows how to fit a model to data that contains observations below the assay lower limit of quantification (BLOQ) using Beal's M3 method. Instead of dropping BLOQ rows — which biases terminal-phase parameter estimates — each censored observation contributes P(y < LLOQ | θ, η) = Φ((LLOQ − f)/√V) to the likelihood.
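The censored contribution needs nothing more than a normal CDF. A self-contained sketch (the erf approximation is the standard Abramowitz-Stegun formula 7.1.26, used here only because Rust's std has no erf; FeRx's internal implementation may differ):

```rust
/// Standard normal CDF via the Abramowitz-Stegun erf approximation
/// (absolute error below 1.5e-7).
fn phi(x: f64) -> f64 {
    let z = x / std::f64::consts::SQRT_2;
    let t = 1.0 / (1.0 + 0.3275911 * z.abs());
    let poly = t * (0.254829592
        + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    let erf_abs = 1.0 - poly * (-z * z).exp();
    let erf = if z < 0.0 { -erf_abs } else { erf_abs };
    0.5 * (1.0 + erf)
}

/// -2 log-likelihood contribution of one CENS=1 row:
/// -2 ln Phi((LLOQ - f) / sqrt(V)), with f the model prediction
/// and V the residual variance at f.
fn m3_contribution(lloq: f64, f: f64, v: f64) -> f64 {
    -2.0 * phi((lloq - f) / v.sqrt()).ln()
}
```

When the prediction sits exactly at the LLOQ, Phi = 0.5 and the row contributes 2 ln 2 ≈ 1.386 to the OFV; predictions well below the LLOQ contribute almost nothing, while predictions well above it are heavily penalized.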

Dataset (data/warfarin-bloq.csv)

This is the warfarin dataset with an added CENS column. Ten late-time observations that originally fell below an assay LLOQ of 2.0 µg/mL have been marked CENS=1 and their DV cell set to the LLOQ value:

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,CENS
1,0,.,1,100,1,0,1,0
1,0.5,5.3653,0,.,1,0,0,0
...
1,96,2.5019,0,.,1,0,0,0
1,120,2,0,.,1,0,0,1

NONMEM CENS convention: when CENS=1, the row is censored and DV carries the LLOQ value — not the true (unobserved) concentration. Rows with no CENS column, or CENS=0, are treated as ordinary quantified observations.

Model File (examples/warfarin-bloq.ferx)

# One-compartment oral PK model (warfarin) with M3 BLOQ likelihood.

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method       = focei
  maxiter      = 300
  covariance   = true
  bloq_method  = m3

The only changes relative to a plain fit are bloq_method = m3 in [fit_options] and the added CENS column in the CSV. Nothing else in the model changes.

Running

ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv

Expected Results

--- Objective Function ---
OFV:  -213.9104

--- THETA Estimates ---
TVCL                 0.126263
TVV                  7.629401
TVKA                 1.083302

--- OMEGA Estimates ---
  OMEGA(1,1) = 0.031381  (CV% = 17.7)
  OMEGA(2,2) = 0.009723  (CV% = 9.9)
  OMEGA(3,3) = 0.419327  (CV% = 64.8)

--- SIGMA Estimates ---
  SIGMA(1) = 0.010766

The diagnostic table (warfarin-bloq-sdtab.csv) gains a CENS column, and the IWRES / CWRES cells for censored rows are written as empty (a weighted Gaussian residual is undefined when the observed value is censored).

Method Notes

  • Activation requires both pieces: the CENS column in the data file and bloq_method = m3 in [fit_options]. Without the option, CENS=1 rows are treated as ordinary observations at the LLOQ value, which biases the fit.
  • FOCE is auto-promoted to FOCEI on affected subjects. Mixing linearized Gaussian residuals with non-linearized log Φ terms produces inconsistent OFVs near the LLOQ boundary, so when method = foce and a subject has any CENS=1 row, that subject is evaluated with η-interaction. A notice is written to FitResult.warnings; set method = focei explicitly to silence it.
  • Gauss-Newton caveat. With method = gn or gn_hybrid, the BHHH information-matrix approximation degrades as the BLOQ fraction grows (each censored row carries less Fisher information than its Gaussian counterpart). A warning is emitted; for >20% censoring prefer focei.
  • SAEM optimizes θ/σ with M3 in the M-step directly, so no special handling is required beyond setting bloq_method = m3.

Rust API

FeRx-NLME can be used as a Rust library for embedding population PK estimation in your own applications.

Adding as a Dependency

[dependencies]
ferx-nlme = { path = "../ferx-nlme", features = ["autodiff"] }

Quick Example

use ferx_nlme::*;
use std::path::Path;

fn main() -> Result<(), String> {
    // Parse model and data
    let parsed = parse_full_model_file(Path::new("model.ferx"))?;
    let population = read_nonmem_csv(Path::new("data.csv"), None)?;

    // Build initial parameters and options
    let (init_params, options) = build_fit_inputs(&parsed)?;

    // Run estimation
    let result = fit(&parsed.model, &population, &init_params, &options)?;

    // Access results
    println!("OFV: {}", result.ofv);
    for (name, val) in result.theta_names.iter().zip(result.theta.iter()) {
        println!("  {} = {:.6}", name, val);
    }

    Ok(())
}

API Sections

  • Core Types -- CompiledModel, Population, FitResult, FitOptions, etc.
  • Fitting Functions -- fit(), fit_from_files(), run_model_with_data()
  • Simulation -- simulate(), simulate_with_seed(), predict()
  • Parsing -- parse_model_file(), parse_full_model_file(), read_nonmem_csv()

Core Types

The main types exported by ferx_nlme. All types are available via use ferx_nlme::*.

CompiledModel

A parsed and compiled model ready for estimation. Created by parse_model_file() or parse_full_model_file().

#![allow(unused)]
fn main() {
pub struct CompiledModel {
    pub name: String,
    pub pk_model: PkModel,
    pub error_model: ErrorModel,
    pub pk_param_fn: PkParamFn,     // (theta, eta, covariates) -> PkParams
    pub n_theta: usize,
    pub n_eta: usize,
    pub n_epsilon: usize,
    pub theta_names: Vec<String>,
    pub eta_names: Vec<String>,
    pub default_params: ModelParameters,
    pub tv_fn: Option<...>,          // Typical values function (for AD)
    pub pk_indices: Vec<usize>,      // Maps eta index -> PK parameter index
    pub ode_spec: Option<OdeSpec>,   // ODE specification (if ODE model)
}
}

Key fields:

  • pk_param_fn: Closure that maps (theta, eta, covariates) to PK parameters. Generated by the parser from [individual_parameters].
  • default_params: Initial parameter values from [parameters] block.
  • ode_spec: Present only for ODE models; None for analytical models.

Population and Subject

#![allow(unused)]
fn main() {
pub struct Population {
    pub subjects: Vec<Subject>,
    pub covariate_names: Vec<String>,
    pub dv_column: String,
}

pub struct Subject {
    pub id: String,
    pub doses: Vec<DoseEvent>,
    pub obs_times: Vec<f64>,
    pub observations: Vec<f64>,
    pub obs_cmts: Vec<usize>,
    pub covariates: HashMap<String, f64>,      // Time-constant
    pub tvcov: HashMap<String, Vec<f64>>,       // Time-varying (LOCF)
}
}

ModelParameters

#![allow(unused)]
fn main() {
pub struct ModelParameters {
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub theta_lower: Vec<f64>,
    pub theta_upper: Vec<f64>,
    pub omega: OmegaMatrix,
    pub sigma: SigmaVector,
}
}

OmegaMatrix

Between-subject variability matrix with pre-computed Cholesky factor.

#![allow(unused)]
fn main() {
pub struct OmegaMatrix {
    pub matrix: DMatrix<f64>,       // Full omega matrix
    pub chol: DMatrix<f64>,         // Lower Cholesky factor L (Omega = L*L')
    pub eta_names: Vec<String>,
    pub diagonal: bool,             // True if omega is diagonal
}
}

Constructors:

  • OmegaMatrix::from_matrix(m, names, diagonal) -- From a full matrix (computes Cholesky, regularizes if needed)
  • OmegaMatrix::from_diagonal(variances, names) -- From diagonal variances

FitOptions

#![allow(unused)]
fn main() {
pub struct FitOptions {
    pub method: EstimationMethod,      // Foce, FoceI, or Saem
    pub outer_maxiter: usize,          // Default: 500
    pub outer_gtol: f64,               // Default: 1e-6
    pub inner_maxiter: usize,          // Default: 200
    pub inner_tol: f64,                // Default: 1e-8
    pub run_covariance_step: bool,     // Default: true
    pub interaction: bool,             // FOCEI flag
    pub verbose: bool,                 // Default: true
    pub optimizer: Optimizer,          // Default: Slsqp
    pub lbfgs_memory: usize,           // Default: 5
    pub global_search: bool,           // Default: false
    pub global_maxeval: usize,         // Default: 0 (auto)
    // SAEM-specific
    pub saem_n_exploration: usize,     // Default: 150
    pub saem_n_convergence: usize,     // Default: 250
    pub saem_n_mh_steps: usize,        // Default: 3
    pub saem_adapt_interval: usize,    // Default: 50
    pub saem_seed: Option<u64>,        // Default: None
}
}

Use FitOptions::default() for standard FOCE settings.

FitResult

#![allow(unused)]
fn main() {
pub struct FitResult {
    pub method: EstimationMethod,
    pub converged: bool,
    pub ofv: f64,
    pub aic: f64,
    pub bic: f64,
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub omega: DMatrix<f64>,
    pub sigma: Vec<f64>,
    pub covariance_matrix: Option<DMatrix<f64>>,
    pub se_theta: Option<Vec<f64>>,
    pub se_omega: Option<Vec<f64>>,
    pub se_sigma: Option<Vec<f64>>,
    pub subjects: Vec<SubjectResult>,
    pub n_obs: usize,
    pub n_subjects: usize,
    pub n_parameters: usize,
    pub n_iterations: usize,
    pub interaction: bool,
    pub warnings: Vec<String>,
}
}

SubjectResult

Per-subject diagnostics from the fit.

#![allow(unused)]
fn main() {
pub struct SubjectResult {
    pub id: String,
    pub eta: DVector<f64>,
    pub ipred: Vec<f64>,    // Individual predictions
    pub pred: Vec<f64>,     // Population predictions
    pub iwres: Vec<f64>,    // Individual weighted residuals
    pub cwres: Vec<f64>,    // Conditional weighted residuals
    pub ofv_contribution: f64,
}
}

Enums

#![allow(unused)]
fn main() {
pub enum EstimationMethod { Foce, FoceI, Saem }
pub enum PkModel { OneCptIvBolus, OneCptOral, OneCptInfusion,
                   TwoCptIvBolus, TwoCptOral, TwoCptInfusion }
pub enum ErrorModel { Additive, Proportional, Combined }
pub enum Optimizer { Bfgs, Lbfgs, Slsqp, NloptLbfgs, Mma }
}

DoseEvent

#![allow(unused)]
fn main() {
pub struct DoseEvent {
    pub time: f64,
    pub amt: f64,
    pub cmt: usize,
    pub rate: f64,
    pub duration: f64,
    pub ss: bool,
    pub ii: f64,
}
}

Fitting Functions

fit()

The primary estimation entry point. Runs FOCE, FOCEI, or SAEM depending on options.method.

#![allow(unused)]
fn main() {
pub fn fit(
    model: &CompiledModel,
    population: &Population,
    init_params: &ModelParameters,
    options: &FitOptions,
) -> Result<FitResult, String>
}

Parameters:

  • model: Compiled model from parse_model_file() or parse_full_model_file()
  • population: Population data from read_nonmem_csv()
  • init_params: Initial parameter values
  • options: Estimation configuration

Returns: FitResult with parameter estimates, standard errors, and per-subject diagnostics.

Example:

#![allow(unused)]
fn main() {
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let options = FitOptions::default();

let result = fit(&model, &population, &model.default_params, &options)?;
println!("OFV: {:.4}", result.ofv);
}

fit_from_files()

Convenience wrapper that handles parsing and data reading.

#![allow(unused)]
fn main() {
pub fn fit_from_files(
    model_path: &str,
    data_path: &str,
    covariate_columns: Option<&[&str]>,
    options: Option<FitOptions>,
) -> Result<FitResult, String>
}

Example:

#![allow(unused)]
fn main() {
let result = fit_from_files(
    "model.ferx",
    "data.csv",
    None,          // Auto-detect covariates
    None,          // Default options
)?;
}

run_model_with_data()

Full pipeline: parse model file, read data, fit. Returns both the fit result and the population.

#![allow(unused)]
fn main() {
pub fn run_model_with_data(
    model_path: &str,
    data_path: &str,
) -> Result<(FitResult, Population), String>
}

Uses the [fit_options] from the model file.

run_model_simulate()

Simulation-estimation: parse model, generate data from [simulation] block, fit.

#![allow(unused)]
fn main() {
pub fn run_model_simulate(
    model_path: &str,
) -> Result<(FitResult, Population), String>
}

Requires a [simulation] block in the model file.

build_fit_inputs()

Extracts initial parameters and fit options from a parsed model, so that parsing and estimation can be run (and timed) as separate steps.

#![allow(unused)]
fn main() {
pub fn build_fit_inputs(
    parsed: &ParsedModel,
) -> Result<(ModelParameters, FitOptions), String>
}

Example:

#![allow(unused)]
fn main() {
let parsed = parse_full_model_file(Path::new("model.ferx"))?;
let (init_params, options) = build_fit_inputs(&parsed)?;

let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let result = fit(&parsed.model, &population, &init_params, &options)?;
}

Simulation

simulate()

Generate simulated observations from a model with random effects and residual error.

#![allow(unused)]
fn main() {
pub fn simulate(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
) -> Vec<SimulationResult>
}

Parameters:

  • model: Compiled model
  • population: Template population (dose events and observation times are used; DV values are ignored)
  • params: True parameter values for simulation
  • n_sim: Number of simulation replicates

Returns: Vector of SimulationResult, one per observation per subject per replicate.

Example:

#![allow(unused)]
fn main() {
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;

// Simulate 1000 replicates
let sims = simulate(&model, &population, &model.default_params, 1000);

for sim in &sims[..5] {
    println!("Sim {}, ID {}, TIME {}, IPRED {:.3}, DV {:.3}",
             sim.sim, sim.id, sim.time, sim.ipred, sim.dv_sim);
}
}

simulate_with_seed()

Same as simulate() but with a fixed random seed for reproducibility.

#![allow(unused)]
fn main() {
pub fn simulate_with_seed(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
    seed: u64,
) -> Vec<SimulationResult>
}

predict()

Population predictions without random effects (eta = 0). No simulation noise is added.

#![allow(unused)]
fn main() {
pub fn predict(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
) -> Vec<PredictionResult>
}

Returns: Vector of PredictionResult with population-level predictions.

Example:

#![allow(unused)]
fn main() {
let preds = predict(&model, &population, &model.default_params);
for p in &preds {
    println!("ID {}, TIME {}, PRED {:.3}", p.id, p.time, p.pred);
}
}

Result Types

#![allow(unused)]
fn main() {
pub struct SimulationResult {
    pub sim: usize,     // Replicate number (1-indexed)
    pub id: String,     // Subject ID
    pub time: f64,      // Observation time
    pub ipred: f64,     // Individual prediction (no residual error)
    pub dv_sim: f64,    // Simulated observation (with residual error)
}

pub struct PredictionResult {
    pub id: String,
    pub time: f64,
    pub pred: f64,      // Population prediction (eta = 0)
}
}

Simulation Process

For each replicate and each subject:

  1. Sample random effects: \( \eta_i \sim N(0, \Omega) \) using the Cholesky factor \( L \): \( \eta = L \cdot z \), where \( z \sim N(0, I) \)
  2. Compute individual PK parameters via pk_param_fn(theta, eta, covariates)
  3. Generate predictions using the structural model
  4. Add residual error: \( DV = IPRED + \sqrt{V} \cdot \epsilon \), where \( \epsilon \sim N(0, 1) \) and \( V \) is the residual variance from the error model
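Step 1 is just a matrix-vector product with the precomputed Cholesky factor. A 2x2 sketch in plain Rust (FeRx uses nalgebra's DMatrix; this hand-rolled version only illustrates the algebra):

```rust
/// Lower Cholesky factor L of a 2x2 covariance matrix (Omega = L * L').
fn cholesky_2x2(omega: [[f64; 2]; 2]) -> [[f64; 2]; 2] {
    let l11 = omega[0][0].sqrt();
    let l21 = omega[1][0] / l11;
    let l22 = (omega[1][1] - l21 * l21).sqrt();
    [[l11, 0.0], [l21, l22]]
}

/// eta = L * z turns independent standard normals z into
/// correlated random effects with covariance Omega.
fn sample_eta(l: [[f64; 2]; 2], z: [f64; 2]) -> [f64; 2] {
    [l[0][0] * z[0], l[1][0] * z[0] + l[1][1] * z[1]]
}
```

For a diagonal Omega with variances 0.09 and 0.04, L is diagonal with entries 0.3 and 0.2, so each eta is simply its standard deviation times an independent z.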

Parsing

Model Parsing

parse_model_file()

Parse a .ferx file into a CompiledModel. Only the core model blocks are processed ([parameters], [individual_parameters], [structural_model], [error_model]).

#![allow(unused)]
fn main() {
pub fn parse_model_file(path: &Path) -> Result<CompiledModel, String>
}

parse_model_string()

Parse a model from a string instead of a file.

#![allow(unused)]
fn main() {
pub fn parse_model_string(content: &str) -> Result<CompiledModel, String>
}

Example:

#![allow(unused)]
fn main() {
let model_str = r#"
[parameters]
  theta TVCL(0.1, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  omega ETA_CL ~ 0.09
  sigma ADD_ERR ~ 1.0

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV

[structural_model]
  pk one_cpt_iv_bolus(cl=CL, v=V)

[error_model]
  DV ~ additive(ADD_ERR)
"#;

let model = parse_model_string(model_str)?;
}

parse_full_model_file()

Parse a complete model file including [fit_options] and [simulation] blocks.

#![allow(unused)]
fn main() {
pub fn parse_full_model_file(path: &Path) -> Result<ParsedModel, String>
}

Returns a ParsedModel which contains:

#![allow(unused)]
fn main() {
pub struct ParsedModel {
    pub model: CompiledModel,
    pub simulation: Option<SimulationSpec>,
    pub fit_options: FitOptions,
}
}

Data Parsing

read_nonmem_csv()

Read a NONMEM-format CSV file into a Population.

#![allow(unused)]
fn main() {
pub fn read_nonmem_csv(
    path: &Path,
    covariate_columns: Option<&[&str]>,
) -> Result<Population, String>
}

Parameters:

  • path: Path to the CSV file
  • covariate_columns: Optional list of covariate column names. If None, all non-standard columns are auto-detected as covariates.

Example:

#![allow(unused)]
fn main() {
// Auto-detect covariates
let pop = read_nonmem_csv(Path::new("data.csv"), None)?;

// Explicit covariate list
let pop = read_nonmem_csv(
    Path::new("data.csv"),
    Some(&["WT", "CRCL", "AGE"]),
)?;

println!("{} subjects, {} observations",
         pop.subjects.len(), pop.n_obs());
}

Data Processing Details

  • Column names are matched case-insensitively
  • Standard NONMEM columns (ID, TIME, DV, EVID, AMT, CMT, RATE, MDV, II, SS) are recognized automatically
  • Missing values (. or an empty cell) are parsed as missing rather than zero, following the NONMEM convention
  • Rows with EVID=1 are treated as dose events
  • Rows with EVID=0 and MDV=0 are treated as observations
  • Time-constant covariates use the first non-missing value per subject
  • Time-varying covariates use Last Observation Carried Forward (LOCF)
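The LOCF rule can be sketched as a small lookup (illustrative; the real reader stores per-record values in Subject::tvcov rather than searching at prediction time):

```rust
/// Value of a time-varying covariate at time t under LOCF:
/// the value from the latest record with record-time <= t
/// (falling back to the first value before any record).
fn locf(times: &[f64], values: &[f64], t: f64) -> f64 {
    let mut current = values[0];
    for (&ti, &vi) in times.iter().zip(values) {
        if ti <= t {
            current = vi; // carry this value forward
        } else {
            break; // times are sorted; nothing later applies
        }
    }
    current
}
```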

Frequently Asked Questions

Do I need to use MU-referencing in my model definitions, like in NONMEM / nlmixr2?

No. FeRx does not require MU-referencing.

In NONMEM and nlmixr2, MU-referencing is a convention in which each random effect ETA(i) is linearly associated with a single MU_i term (typically MU_i = LOG(THETA(i))), so individual parameters look like:

MU_1 = LOG(THETA(1))
CL = EXP(MU_1 + ETA(1))

This structure is required by those tools' SAEM implementations because they rely on conjugate Gibbs updates that are only valid when the relationship between MU_i and ETA(i) is strictly linear (and typically on a log scale). If you deviate — for example by writing CL = THETA(1) * EXP(ETA(1)) without going through an intermediate MU_1, or by mixing multiple etas into one parameter — NONMEM SAEM will either reject the model or silently produce biased estimates.

FeRx's SAEM implementation uses Metropolis-Hastings sampling for the E-step rather than Gibbs, which does not require MU-referencing. You can write individual parameters in any form you like:

# All of these work fine in ferx:
CL = TVCL * exp(ETA_CL)
CL = TVCL + ETA_CL                              # additive eta
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)          # with covariates
CL = TVCL * exp(ETA_CL + ETA_CL_OCC)            # multiple etas
VMAX = TVVMAX * exp(ETA_VMAX)
KM   = TVKM                                      # no eta at all

The FOCE / FOCEI estimators have no MU-referencing requirement in any NLME tool — they use MAP optimization over etas regardless of parameterization. This is equally true in ferx.
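For reference, the MAP problem solved for each subject's etas takes the standard form (notation added here, consistent with the Simulation Process section: \( f_{ij} \) is the individual prediction and \( V_{ij} \) the residual variance):

```latex
\hat{\eta}_i \;=\; \arg\min_{\eta_i}\;
  \sum_j \left[ \frac{\bigl(y_{ij} - f_{ij}(\eta_i)\bigr)^2}{V_{ij}(\eta_i)}
              + \ln V_{ij}(\eta_i) \right]
  \;+\; \eta_i^{\top} \Omega^{-1} \eta_i
```

Nothing in this objective requires eta to enter \( f_{ij} \) linearly, which is why FOCE / FOCEI tolerates arbitrary parameterizations.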

Performance implication

The main tradeoff is that MH sampling is slightly less efficient per iteration than conjugate Gibbs for MU-referenced models. In practice:

  • For models with a few random effects, the difference is negligible
  • For models with many (>10) random effects, conjugate Gibbs would converge in fewer iterations — but in exchange you gain flexibility to write models that don't fit the MU-referenced mold

If you have a NONMEM model that uses MU-referencing and want to port it to ferx, you can drop the MU intermediate step and write the individual parameters directly — the results will be equivalent.