Introduction
FeRx-NLME is a high-performance Nonlinear Mixed Effects (NLME) modeling engine written in Rust. It is designed for population pharmacokinetic (PopPK) analysis, implementing the same statistical methodology as established tools like NONMEM, Monolix, and Pumas.
Key Features
- FOCE/FOCEI estimation -- First-Order Conditional Estimation with optional interaction, the gold standard for PopPK
- SAEM estimation -- Stochastic Approximation EM for robust convergence on complex models
- Analytical PK solutions -- Built-in one- and two-compartment models (IV bolus, oral, infusion) with numerical stability guarantees
- ODE solver -- Dormand-Prince RK45 adaptive integrator for custom kinetic models (e.g. Michaelis-Menten)
- Automatic differentiation -- Enzyme-based reverse-mode AD for fast, exact gradients
- Simple model DSL -- Declarative .ferx model files that read like equations
- NONMEM-compatible data -- Reads standard NONMEM CSV datasets directly
- Covariate support -- Time-constant and time-varying covariates with automatic detection
- Parallel estimation -- Per-subject computations parallelized via Rayon
How It Compares
| Feature | FeRx | NONMEM | Monolix | Pumas |
|---|---|---|---|---|
| Language | Rust | Fortran | C++ | Julia |
| FOCE/FOCEI | Yes | Yes | No | Yes |
| SAEM | Yes | Yes | Yes | Yes |
| Analytical PK | Yes | Yes | Yes | Yes |
| ODE models | Yes | Yes | Yes | Yes |
| Auto-diff | Yes (Enzyme) | No | No | Yes (ForwardDiff) |
| Open source | Yes | No | No | No |
Architecture Overview
FeRx uses a two-level optimization structure:
- Outer loop: Optimizes population parameters (theta, omega, sigma) using NLopt or built-in BFGS
- Inner loop: For each subject, finds empirical Bayes estimates (EBEs) of random effects by minimizing individual negative log-likelihood
Parameters are internally transformed for unconstrained optimization: theta and sigma are log-transformed, and omega uses Cholesky factorization to guarantee positive-definiteness.
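As a rough sketch of the packed, unconstrained parameter vector (notation illustrative, not the exact internal layout):
\[ x = \bigl[\, \log\theta_1, \ldots, \log\theta_p,\; \operatorname{vech}(L),\; \log\sigma_1, \ldots, \log\sigma_q \,\bigr], \qquad \Omega = L L^T \]
where L is a lower-triangular Cholesky factor (diagonal kept positive), so any real-valued x maps back to positive thetas and sigmas and a positive-definite omega matrix.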
Project Structure
src/
types.rs -- Core data structures
api.rs -- Public API (fit, simulate, predict)
parser/ -- .ferx model file parser
pk/ -- Analytical PK solutions
ode/ -- ODE solver and predictions
estimation/ -- FOCE, FOCEI, SAEM algorithms
stats/ -- Likelihood and residual computations
io/ -- Data reading and output writing
ad/ -- Automatic differentiation
bin/run_model.rs -- CLI binary
License
FeRx-NLME is released under the MIT License. Copyright © 2026 InsightRX, Inc.
Getting Started
This section covers installing FeRx and running your first model.
- Installation -- Setting up the Rust toolchain and building FeRx
- Quick Start -- Running a complete PopPK analysis in under 5 minutes
Installation
FeRx requires a nightly Rust toolchain with the Enzyme LLVM plugin for automatic differentiation. As of 2026, Enzyme is not yet distributed via rustup, so a one-time plugin build is required.
Pick your platform:
- Linux — fully supported, one-time ~30 min plugin build
- macOS — supported with caveats
- Windows — not supported (see why below)
Linux
Tested on Ubuntu 22.04. Other distributions (Debian, Fedora, Arch) should work with adjustments to the package manager commands.
1. Install rustup + upstream nightly
Do not use snap's rustup — its filesystem confinement breaks on non-standard home directories (common on enterprise servers).
# Remove snap rustup if you had it:
sudo snap remove rustup
# Official installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install nightly
2. Install system dependencies and matching LLVM
Check which LLVM major version nightly needs:
rustc +nightly --version --verbose | grep LLVM
# e.g. "LLVM version: 22.1.2" — major is 22
Use that major version (<MAJOR>) below.
sudo apt install -y cmake ninja-build clang libssl-dev pkg-config \
python3 build-essential curl git libzstd-dev
# Install matching LLVM from apt.llvm.org (Ubuntu's defaults lag behind):
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh <MAJOR>
# Fix GPG keyring permissions if apt warns:
sudo chmod 644 /etc/apt/trusted.gpg.d/apt.llvm.org.asc
sudo apt update
sudo apt install -y llvm-<MAJOR>-dev clang-<MAJOR>
For other distros, install the matching llvm-dev + clang packages from your package manager. The LLVM major version must exactly match what rustc reports.
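One way to double-check that the two versions agree before building Enzyme (the llvm-config-<MAJOR> binary name follows the apt.llvm.org/Debian convention and may differ on other distros):
rustc +nightly --version --verbose | grep 'LLVM version'
llvm-config-<MAJOR> --version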
3. Build the Enzyme plugin
git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build
cmake -G Ninja .. \
-DLLVM_DIR=/usr/lib/llvm-<MAJOR>/lib/cmake/llvm \
-DENZYME_CLANG=OFF \
-DENZYME_FLANG=OFF
ninja
# 15–30 min
4. Install the plugin into nightly's sysroot
This location is not obvious — rustc looks in lib/rustlib/<target>/lib/, not just lib/. Despite the error wording ("folder not found"), it's searching for a file:
SYSROOT=$(rustc +nightly --print sysroot)
TARGET=x86_64-unknown-linux-gnu # or aarch64-unknown-linux-gnu on ARM
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.so \
$SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.so
Note the filename rewrite: LLVMEnzyme-<N>.so → libEnzyme-<N>.so (with lib prefix).
5. Register the toolchain as enzyme
ferx's build system pins to a toolchain named enzyme:
rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme --version
6. Verify
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
Expected: error[E0601]: main function not found. That's the success signal. If you see autodiff backend not found in the sysroot, the .so is missing or in the wrong place (step 4).
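If you do see the "backend not found" error, it can help to confirm the plugin file landed where step 4 put it (with TARGET set as in step 4):
ls "$(rustc +enzyme --print sysroot)/lib/rustlib/$TARGET/lib/" | grep Enzyme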
Multi-user servers
For shared Linux servers, stage the built toolchain in /opt/rust-nightly so all users can share one build:
sudo mkdir -p /opt/rust-nightly
sudo cp -a ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/. /opt/rust-nightly/
sudo chown -R root:root /opt/rust-nightly
sudo chmod -R a+rX /opt/rust-nightly
sudo chmod a+rx /opt/rust-nightly/bin/*
Each user then links the shared toolchain into their own rustup:
# per user, once
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain none
rustup toolchain link enzyme /opt/rust-nightly
For R users, add to ~/.Renviron:
PATH=/opt/rust-nightly/bin:${HOME}/.cargo/bin:${PATH}
RUSTUP_TOOLCHAIN=enzyme
macOS
Supported on both Intel and Apple Silicon, with caveats around LLVM installation.
1. Install Xcode Command Line Tools
xcode-select --install
2. Install rustup + upstream nightly
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install nightly
3. Install matching LLVM via Homebrew
Check which LLVM major version nightly needs:
rustc +nightly --version --verbose | grep LLVM
Homebrew only ships a handful of LLVM versions at a time. If the version you need is on Homebrew:
brew install llvm@<MAJOR>
# Path will be /opt/homebrew/opt/llvm@<MAJOR> (Apple Silicon)
# or /usr/local/opt/llvm@<MAJOR> (Intel)
If Homebrew doesn't have your version, you'll need to build LLVM from source (much longer — consider sticking with a slightly older nightly that matches an available LLVM version).
4. Build the Enzyme plugin
git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build
# Adjust paths for your brew prefix (/opt/homebrew or /usr/local)
cmake -G Ninja .. \
-DLLVM_DIR=$(brew --prefix)/opt/llvm@<MAJOR>/lib/cmake/llvm \
-DENZYME_CLANG=OFF \
-DENZYME_FLANG=OFF
ninja
You may need to install ninja if not present: brew install ninja cmake.
5. Install the plugin into nightly's sysroot
SYSROOT=$(rustc +nightly --print sysroot)
# On Apple Silicon:
TARGET=aarch64-apple-darwin
# On Intel:
# TARGET=x86_64-apple-darwin
# Note: on macOS the shared library extension is .dylib, not .so
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.dylib \
$SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.dylib
6. Register and verify
rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
# Expect: error[E0601]: `main` function not found
macOS caveats
- Apple Silicon (M1/M2/M3/M4): use aarch64-apple-darwin as TARGET. Intel Macs use x86_64-apple-darwin
- System Integrity Protection (SIP): If you see code-signing errors loading the .dylib during rustc invocation, try sudo codesign --force --sign - <path-to-libEnzyme.dylib> — should be rare on dev machines
- Nightly toolchain distribution mismatches: Apple Silicon nightlies occasionally lag x86_64 by a day or two; if the LLVM version mismatches after rustup update, prefer installing a specific dated nightly
Windows
Windows is not currently supported. The blockers are:
- EnzymeAD plugin build on MSVC is not well-tested. The Enzyme project primarily targets Linux with LLVM clang/gcc. While Windows builds exist in theory, the toolchain integration (rustc sysroot path conventions, plugin discovery, MSVC vs MinGW linking) has known issues that haven't been worked through.
- Rust autodiff feature gate interactions with MSVC codegen. The -Z autodiff=Enable path has not been exercised on the x86_64-pc-windows-msvc target. Anecdotal reports from the Rust autodiff tracking issue show crashes or "backend not found" errors on Windows even when the plugin .dll is present.
- We haven't set up CI for Windows. Without a green CI signal we can't promise the build works.
Workarounds for Windows users
- WSL2 (recommended): Install Ubuntu under WSL2 and follow the Linux instructions. Performance is near-native for CPU-bound workloads like model fitting.
- Docker: Use the forthcoming ferx-nlme Docker image, which will ship with the toolchain pre-installed.
- Remote dev: SSH into a Linux server or cloud VM.
Future Windows support
Once upstream Enzyme integration lands in rustup (tracked at rust-lang/rust autodiff tracking issue), Windows support should become straightforward since the plugin distribution problem goes away. No ETA; the feature is still experimental in rustc.
If you're a Windows developer who would like to help, a CI run against x86_64-pc-windows-msvc + contributed docs for that platform would be very welcome.
Building ferx-nlme from source
Once your Enzyme toolchain is set up:
git clone https://github.com/InsightRX/ferx-nlme
cd ferx-nlme
cargo build --release --features autodiff
# Binary at target/release/ferx
Build options
# Debug build (faster compile, slower runtime)
cargo build --features autodiff
# Quick type-check without building
cargo check --features autodiff
# Lints
cargo clippy --features autodiff
# CI build without autodiff (uses finite differences — no Enzyme needed)
cargo build --release --no-default-features --features ci
The ci feature is useful for development on machines without the full Enzyme toolchain — at the cost of much slower gradient computation.
Verify the build
cargo run --release --features autodiff --bin ferx -- examples/warfarin.ferx --simulate
Should print a successful model fit with parameter estimates.
Installing the ferx R package
Ensure the Enzyme toolchain is set up (above). Then:
devtools::install_github("InsightRX/ferx")
The R package's build is driven by its Makevars, which invokes cargo and resolves the enzyme toolchain via rustup. Both must be set up correctly in your shell/Renviron before calling install_github.
See the ferx R package README for API usage.
Cancelling a running fit (Ctrl-C)
ferx_fit() runs the estimator on a worker thread and polls for R interrupts on the main thread every ~100 ms, so Ctrl-C (or RStudio's red stop button) aborts the fit cleanly. The worker exits at the next safe checkpoint — typically within a second or two, but up to one inner-loop evaluation for heavy ODE models. The call then returns with an R error:
Error: ferx_fit: cancelled by user
Ctrl-Z (SIGTSTP) will not abort the fit — it suspends the whole R process to the shell. Use Ctrl-C instead.
Dependencies
FeRx depends on these crates (managed automatically by Cargo):
| Crate | Purpose |
|---|---|
nalgebra | Linear algebra (matrices, Cholesky) |
nlopt | Nonlinear optimization (SLSQP, L-BFGS, MMA) |
rayon | Parallel computation |
rand, rand_distr | Random number generation (SAEM, SIR) |
csv | CSV data file reading |
regex | Model file expression parsing |
The nlopt crate requires the NLopt C library. Most platforms handle this automatically; if the build fails on NLopt, install it via your system package manager:
# macOS
brew install nlopt
# Ubuntu/Debian
sudo apt install libnlopt-dev
# Fedora
sudo dnf install NLopt-devel
Troubleshooting
"error: the option 'Z' is only accepted on the nightly compiler"
Your shell (or R) is finding a non-nightly rustc. Check rustc --version and your PATH. For R, verify Sys.which("rustc") and ~/.Renviron.
"autodiff backend not found in the sysroot: failed to find a libEnzyme-<N> folder"
Despite the wording ("folder"), rustc is searching for a file. Causes:
- Wrong directory: the .so/.dylib is in <sysroot>/lib/ instead of <sysroot>/lib/rustlib/<target>/lib/
- LLVM version mismatch: rebuild Enzyme against the LLVM version rustc reports
- Filename: must be libEnzyme-<MAJOR>.so (Linux) or libEnzyme-<MAJOR>.dylib (macOS), with the lib prefix
"custom toolchain 'enzyme' specified in override file ... is not installed"
Run rustup toolchain link enzyme <path-to-nightly-sysroot>.
"not a directory: '/<path>/lib'" from rustup toolchain link
Permission issue — the user running toolchain link can't read the target path. On shared installs, run sudo chmod -R a+rX /opt/rust-nightly.
"incorrect value 'X' for unstable option 'autodiff'"
Valid autodiff values change between nightly builds. Test with:
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
If Enable is rejected, try LooseTypes, Inline, or PrintTA.
"Enzyme: cannot handle (forward) unknown intrinsic llvm.maximumnum"
Recent rustc lowers f64::max()/f64::min() to intrinsics Enzyme can't yet differentiate. This is a ferx-nlme code-level issue — AD-instrumented functions must use manual if expressions. Should not happen on released ferx-nlme versions; if it does, please file an issue.
"cargo is unavailable for the active toolchain" (info, not error)
Cargo wasn't copied into your linked toolchain. Either copy it (cp ~/.cargo/bin/cargo /opt/rust-nightly/bin/) or ignore — rustup falls back to nightly's cargo, which works.
Refreshing when upstream nightly rolls forward
If a new ferx-nlme release references stdlib items your cached toolchain doesn't have (e.g. autodiff_forward not found), rebuild /opt/rust-nightly against the current nightly, and rebuild Enzyme if the LLVM major version changed.
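A minimal refresh sequence might look like this, assuming the shared /opt/rust-nightly layout from the multi-user section (adjust paths for a single-user install):
rustup update nightly
rustc +nightly --version --verbose | grep LLVM   # did the LLVM major version change?
# If it changed, rebuild Enzyme against the new LLVM (step 3) and re-install the plugin (step 4)
sudo cp -a ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/. /opt/rust-nightly/
sudo chmod -R a+rX /opt/rust-nightly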
Quick Start
This walkthrough fits a one-compartment oral PK model to warfarin data.
1. Create a model file
Save the following as warfarin.ferx:
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 300
covariance = true
2. Prepare your data
Data must be in NONMEM CSV format. The minimum required columns are ID, TIME, and DV. Dosing records use EVID=1 with AMT:
ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,1,14.42,0,.,.,0
1,2,17.56,0,.,.,0
1,4,15.23,0,.,.,0
1,8,10.15,0,.,.,0
1,12,6.75,0,.,.,0
1,24,2.24,0,.,.,0
3. Run the fit
ferx warfarin.ferx --data warfarin.csv
4. Interpret the output
The console output shows the estimation progress and final results:
Starting FOCE estimation...
10 subjects, 110 observations
3 thetas, 3 etas, 1 sigmas
...
Final OFV = -280.1838
============================================================
NONLINEAR MIXED EFFECTS MODEL ESTIMATION
============================================================
Converged: YES
Estimation method: FOCE
--- Objective Function ---
OFV: -280.1838
AIC: -266.1838
BIC: -247.2804
--- THETA Estimates ---
Parameter Estimate SE %RSE
----------------------------------------------------
TVCL 0.132735 0.014549 11.0
TVV 7.694842 0.293028 3.8
TVKA 0.757498 0.034986 4.6
--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584 (CV% = 16.9) SE = 0.006394
OMEGA(2,2) = 0.009613 (CV% = 9.8) SE = 0.002165
OMEGA(3,3) = 0.340868 (CV% = 58.4) SE = 0.076351
--- SIGMA Estimates ---
SIGMA(1) = 0.010638 SE = 0.000788
5. Output files
Three files are generated:
| File | Contents |
|---|---|
warfarin-sdtab.csv | Per-observation diagnostics (PRED, IPRED, CWRES, IWRES, ETAs) |
warfarin-fit.yaml | Parameter estimates with standard errors in YAML format |
warfarin-timing.txt | Wall-clock estimation time |
Next Steps
- Learn the full Model File Reference
- Explore Estimation Methods (FOCE, FOCEI, SAEM)
- See more Examples
Model File Reference
FeRx models are defined in .ferx files using a declarative DSL. Each file is organized into blocks, each denoted by a [block_name] header.
Block Overview
| Block | Required | Purpose |
|---|---|---|
[parameters] | Yes | Define theta, omega, and sigma parameters |
[individual_parameters] | Yes | Map population parameters to individual PK parameters |
[structural_model] | Yes | Specify the PK model (analytical or ODE) |
[error_model] | Yes | Define the residual error model |
[odes] | If ODE | ODE right-hand-side equations |
[fit_options] | No | Configure estimation method and optimizer |
[simulation] | No | Define a simulation trial design |
Minimal Example
[parameters]
theta TVCL(0.1, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
sigma ADD_ERR ~ 1.0
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
[structural_model]
pk one_cpt_iv_bolus(cl=CL, v=V)
[error_model]
DV ~ additive(ADD_ERR)
Lines beginning with # are treated as comments.
Parameters
The [parameters] block defines all model parameters: fixed effects (theta), between-subject variability (omega), and residual error (sigma).
Theta (Fixed Effects)
theta NAME(initial_value, lower_bound, upper_bound)
- NAME: Parameter name (used in [individual_parameters] expressions)
- initial_value: Starting value for estimation
- lower_bound: Lower bound constraint (must be > 0; parameters are log-transformed internally)
- upper_bound: Upper bound constraint
Example:
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
Omega (Between-Subject Variability)
Diagonal omega
omega NAME ~ variance
- NAME: Random effect name (used in [individual_parameters] as ETA_XXX)
- variance: Initial variance estimate (diagonal element of the omega matrix)
Example:
omega ETA_CL ~ 0.07
omega ETA_V ~ 0.02
omega ETA_KA ~ 0.40
Each variance represents the between-subject variability for that parameter. The coefficient of variation (CV%) is approximately sqrt(variance) * 100 for log-normally distributed parameters. For example, omega ETA_CL ~ 0.09 corresponds to ~30% CV.
Block omega (correlated random effects)
To estimate correlations between random effects, use block_omega:
block_omega (NAME1, NAME2, ...) = [lower_triangle_values]
The values are the lower triangle of the covariance matrix, specified row-wise. For a 2x2 block:
block_omega (ETA_CL, ETA_V) = [var_CL, cov_CL_V, var_V]
For a 3x3 block:
block_omega (ETA_CL, ETA_V, ETA_KA) = [var_CL, cov_CL_V, var_V, cov_CL_KA, cov_V_KA, var_KA]
Visualising the lower triangle
For an N×N block you supply N·(N+1)/2 values, walked row by row through the lower triangle (row 1 has 1 value, row 2 has 2, …, row N has N). Drawing it out as a matrix makes the layout obvious:
ETA_1 ETA_2 ETA_3 ETA_4
┌ ┐
ETA_1 │ v[0] │
ETA_2 │ v[1] v[2] │
ETA_3 │ v[3] v[4] v[5] │
ETA_4 │ v[6] v[7] v[8] v[9] │
└ ┘
The diagonal entries are the variances; the off-diagonals are the covariances (which the optimiser estimates — set them to 0.0 as initial values and the fit will recover correlations from the data).
Converting diagonal omegas to a block
A common pattern: you have all-diagonal omegas and want to replace them with a single fully-correlated block. Each omega variance becomes a diagonal entry; off-diagonals start at 0.0. For example:
omega ETA_CL ~ 0.1
omega ETA_V1 ~ 0.1
omega ETA_Q ~ 0.1
omega ETA_V2 ~ 0.1
becomes a 4×4 block (10 values: 1+2+3+4):
block_omega (ETA_CL, ETA_V1, ETA_Q, ETA_V2) = [
0.1,
0.0, 0.1,
0.0, 0.0, 0.1,
0.0, 0.0, 0.0, 0.1
]
(Multi-line and single-line layouts are equivalent — line breaks inside [ ... ] are ignored.) Reading the rows: row 1 is var(CL), row 2 is cov(CL,V1) var(V1), row 3 is cov(CL,Q) cov(V1,Q) var(Q), row 4 is cov(CL,V2) cov(V1,V2) cov(Q,V2) var(V2).
Block omega regularises poorly-identified parameters by letting them share variability — see the optimisation FAQ for a worked example where switching from diagonal to block_omega(3) rescues a fit that diagonal-omega couldn't converge.
Mixing diagonal and block
You can mix diagonal and block omega specifications. Diagonal omegas specify uncorrelated random effects, while block omegas estimate the full covariance sub-matrix:
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
This estimates a 3x3 omega where ETA_CL and ETA_V are correlated (2x2 block), but ETA_KA is uncorrelated with both.
Declaration order
The order of omega and block_omega lines in the [parameters] block determines the ETA indexing throughout the model: in the omega matrix and all output. For example:
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
produces ETA order [ETA_CL, ETA_V, ETA_KA] (indices 1, 2, 3), while:
omega ETA_KA ~ 0.40
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
produces [ETA_KA, ETA_CL, ETA_V] (indices 1, 2, 3). The [individual_parameters] block should list assignments in the same order for clarity, though the parameter mapping is by name, not position.
Kappa (Inter-Occasion Variability)
Inter-Occasion Variability (IOV) is declared with kappa (independent per-parameter) or block_kappa (correlated across parameters). Kappa parameters must be paired with iov_column in [fit_options] and an occasion column in the dataset.
Diagonal kappa — Option A
kappa NAME ~ variance
kappa NAME ~ variance FIX
Each kappa line adds one diagonal element to the IOV omega matrix. Occasions are independent.
Example:
kappa KAPPA_CL ~ 0.05
kappa KAPPA_V ~ 0.03
kappa KAPPA_KA ~ 0.02 FIX
Block kappa — Option B (correlated IOV)
block_kappa (NAME1, NAME2, ...) = [lower_triangle_values]
block_kappa (NAME1, NAME2, ...) = [lower_triangle_values] FIX
Mirrors the block_omega syntax — values are the lower triangle of the IOV covariance matrix, row-wise. Use this when IOV effects are expected to be correlated (e.g. correlated occasion shifts in CL and V).
Example (2×2 block):
block_kappa (KAPPA_CL, KAPPA_V) = [0.05, 0.01, 0.03]
where 0.05 = Var(KAPPA_CL), 0.01 = Cov(KAPPA_CL, KAPPA_V), 0.03 = Var(KAPPA_V).
Using kappas in individual parameters
Reference kappa names exactly like BSV etas in [individual_parameters]:
[individual_parameters]
CL = TVCL * exp(ETA_CL + KAPPA_CL)
V = TVV * exp(ETA_V + KAPPA_V)
KA = TVKA * exp(ETA_KA) # no IOV on absorption
Kappas can be combined freely — a parameter can carry BSV only, IOV only, or both.
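For instance, a sketch of the three combinations (parameter names illustrative):
[individual_parameters]
CL = TVCL * exp(ETA_CL + KAPPA_CL)   # BSV + IOV
V = TVV * exp(KAPPA_V)               # IOV only
KA = TVKA * exp(ETA_KA)              # BSV only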
Mixed diagonal and block kappa
You can mix kappa (uncorrelated) and block_kappa (correlated) declarations in the same model:
block_kappa (KAPPA_CL, KAPPA_V) = [0.05, 0.01, 0.03]
kappa KAPPA_KA ~ 0.10
A name may not appear in both kappa and block_kappa — this is a parse error.
Declaration order
Kappa declaration order determines the IOV omega matrix layout and the kappa column order in output. Kappas follow all BSV etas in the internal indexing: if a model has 3 BSV etas and 2 kappas, the kappas sit at indices 4 and 5.
Parameter-level correlation in output
When a block_omega (or block_kappa) is estimated, ferx reports a parameter-level correlation for each off-diagonal pair in the console and YAML output. This differs from the eta-level (normal-scale) correlation ω_ij / √(ω_ii · ω_jj):
| Eta parameterization | Reported parameter-level correlation |
|---|---|
| Both etas lognormal (THETA * exp(ETA)) | (exp(ω_ij) − 1) / √((exp(ω_ii) − 1)(exp(ω_jj) − 1)) |
| Both etas additive (THETA + ETA) | ω_ij / √(ω_ii · ω_jj) (same as eta-level) |
| Mixed or complex expressions | Falls back to eta-level; a warning is added to FitResult.warnings |
The formula for lognormal pairs is the standard bivariate lognormal identity and reflects the correlation between the actual PK/PD parameters (e.g. CL and V) rather than their underlying normal variates.
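As a worked illustration of the two formulas, take the block used elsewhere in this chapter, block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04], with both parameters log-normal:
\[ \text{eta-level: } \frac{0.02}{\sqrt{0.09 \cdot 0.04}} \approx 0.333, \qquad \text{parameter-level: } \frac{e^{0.02} - 1}{\sqrt{(e^{0.09} - 1)(e^{0.04} - 1)}} \approx 0.326 \]
The two are close here because the variances are small; the gap widens as the omega variances grow.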
The result is exposed as FitResult.omega_param_corr (BSV) and FitResult.omega_iov_param_corr (IOV), and is used wherever ferx prints a corr or correlation value for a block omega pair.
Sigma (Residual Error)
sigma NAME ~ value
- NAME: Residual error parameter name (referenced in [error_model])
- value: Initial value
Example:
sigma PROP_ERR ~ 0.01
sigma ADD_ERR ~ 1.0
The interpretation of sigma depends on the error model:
| Error Model | Sigma Meaning |
|---|---|
| Additive | Standard deviation of additive error |
| Proportional | Coefficient of proportional error |
| Combined | First sigma = proportional coefficient, second = additive SD |
Complete Examples
Diagonal omega (no correlations):
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_CL ~ 0.07
omega ETA_V ~ 0.02
omega ETA_KA ~ 0.40
sigma PROP_ERR ~ 0.01
Block omega (correlated CL and V):
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
sigma PROP_ERR ~ 0.01
Individual Parameters
The [individual_parameters] block defines how population parameters (theta), random effects (eta), and covariates combine to produce individual PK parameters.
Syntax
PARAM = expression
Each line assigns a PK parameter using an arithmetic expression that can reference:
- Theta parameters -- names defined in [parameters] (e.g., TVCL, TVV)
- Eta random effects -- names defined as omega parameters (e.g., ETA_CL, ETA_V)
- Covariates -- column names from the data file (e.g., WT, CRCL)
- Constants -- numeric literals (e.g., 70, 0.75)
Supported Operators and Functions
| Operator/Function | Example |
|---|---|
+, -, *, / | TVCL * WT / 70 |
^ (power) | (WT/70)^0.75 |
exp() | exp(ETA_CL) |
log(), ln() | log(TVCL) |
sqrt() | sqrt(WT) |
abs() | abs(ETA_CL) |
| Parentheses | TVCL * (WT/70)^0.75 |
Comparisons (in if conditions) | WT > 70, SEX == 1, AGE != 0 |
| Logical (in if conditions) | && (and), \|\| (or), ! (not) |
| Inline conditional | if (SEX == 1) TVCL * 1.5 else TVCL |
Conditional Logic (if / else)
Two forms are supported and may be combined freely.
Block form
[individual_parameters]
if (WT > 70) {
CL = TVCL * (WT / 70)^0.75 * exp(ETA_CL)
} else if (SEX == 1) {
CL = TVCL * 1.2 * exp(ETA_CL)
} else {
CL = TVCL * exp(ETA_CL)
}
V = TVV * exp(ETA_V)
The block form is appropriate when the body contains multiple statements or
when several alternative branches are needed. Conditions support comparison
operators (<, <=, >, >=, ==, !=) and logical operators (&&,
||, !); parentheses group sub-conditions.
Inline (ternary) form
[individual_parameters]
CL = if (SEX == 1) TVCL * 1.5 else TVCL
V = TVV * exp(ETA_V)
The inline form produces a value and can appear anywhere an expression is
allowed. Both then and else branches are required.
Interaction with mu-referencing
When the assignment to a parameter is wrapped in an if block, the
(ETA → THETA) relationship is no longer unconditional, so ferx skips
mu-reference detection for that parameter. Unconditional assignments in the
same block continue to be detected normally.
Tip: if you want mu-referencing for a covariate-adjusted parameter, keep the assignment unconditional and bury the conditional inside the covariate term (e.g. CL = TVCL * (if (WT > 70) (WT / 70)^0.75 else 1.0) * exp(ETA_CL)).
Common Parameterizations
Exponential (log-normal) random effects
The standard approach for PK parameters that must be positive:
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
Allometric scaling with covariates
[individual_parameters]
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)
V = TVV * (WT/70)^1.0 * exp(ETA_V)
Estimated covariate effects
Use additional theta parameters for covariate coefficients:
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta THETA_WT(0.75, 0.01, 2.0)
theta THETA_CRCL(0.5, 0.01, 2.0)
[individual_parameters]
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
Logit-normal bioavailability
Use inv_logit(logit(THETA_F) + ETA_F) to constrain bioavailability to (0, 1).
The starting value for THETA_F is set directly on the (0, 1) scale — whatever
fraction you specify is what the optimiser uses as the typical F:
[parameters]
theta THETA_F(0.70, 0.001, 0.999) # typical bioavailability = 70%
omega ETA_F ~ 0.10 # BSV on the logit scale
[individual_parameters]
F = inv_logit(logit(THETA_F) + ETA_F)
- When ETA_F = 0, F_i = THETA_F exactly (the logit and inv_logit cancel).
- ETA_F shifts each individual's F on the logit scale, symmetrically around the typical value.
- The estimated THETA_F in the output is directly interpretable as the typical bioavailability.
- omega ETA_F is variance on the logit scale — not the variance of F itself.
The alternative form inv_logit(THETA_F + ETA_F) is also supported, where
THETA_F is on the logit scale (e.g., logit(0.70) ≈ 0.847). This is
less readable but may be useful when comparing with NONMEM models.
Automatic MU-referencing
When a line matches one of the patterns
PARAM = THETA * exp(ETA) # multiplicative / log-normal
PARAM = THETA * <anything> * exp(ETA) # multiplicative with covariate terms
PARAM = exp(log(THETA) + ETA) # canonical MU form
PARAM = THETA + ETA # additive
ferx records the (ETA → THETA) mapping and uses it to re-centre the inner-loop ETA search at each outer iteration — reproducing NONMEM / nlmixr2's MU-referencing behaviour without requiring you to write an explicit MU_i line. See the FAQ for details on which patterns are and are not detected, and for the mu_referencing = false escape hatch.
Covariate Detection
Any uppercase identifier in the expression that does not match a theta name or eta name is automatically treated as a covariate. The covariate value is read from the corresponding column in the data file.
For example, in CL = TVCL * (WT/70)^0.75 * exp(ETA_CL):
- TVCL matches a theta parameter
- ETA_CL matches an omega parameter
- WT matches neither, so it is treated as a covariate column
PK Parameter Names
The parameter names on the left side of each assignment must map to recognized PK parameter names:
| Name | PK Parameter |
|---|---|
CL | Clearance |
V or V1 | Volume of distribution (central compartment) |
Q | Intercompartmental clearance |
V2 | Peripheral volume |
KA | Absorption rate constant |
F | Bioavailability (default 1.0 if omitted) |
For ODE models, the parameter names are user-defined and passed as a flat vector to the ODE right-hand side function.
Structural Model
The [structural_model] block specifies the pharmacokinetic model used to generate predictions.
Analytical PK Models
For standard compartmental models, use the pk keyword with a model function:
pk MODEL_NAME(param=VALUE, param=VALUE, ...)
Available Models
| Model Function | Compartments | Route | Required Parameters |
|---|---|---|---|
one_cpt_iv_bolus | 1 | IV bolus | cl, v |
one_cpt_oral | 1 | Oral | cl, v, ka |
one_cpt_infusion | 1 | IV infusion | cl, v |
two_cpt_iv_bolus | 2 | IV bolus | cl, v1, q, v2 |
two_cpt_oral | 2 | Oral | cl, v1, q, v2, ka |
two_cpt_infusion | 2 | IV infusion | cl, v1, q, v2 |
Examples
One-compartment oral:
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
Two-compartment IV bolus:
[structural_model]
pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)
Two-compartment oral:
[structural_model]
pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)
Bioavailability
For oral models, bioavailability (F) defaults to 1.0. To estimate it, define an F parameter in [individual_parameters] -- it will be automatically used by the oral PK functions.
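For example, a minimal sketch of an estimated bioavailability (names illustrative; the logit-normal form from Individual Parameters keeps F inside (0, 1)):
[parameters]
theta THETA_F(0.70, 0.001, 0.999)
omega ETA_F ~ 0.05
[individual_parameters]
F = inv_logit(logit(THETA_F) + ETA_F)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)   # F is picked up automatically once defined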
Dose Handling
Analytical models support:
- Bolus doses: Instantaneous input (default when RATE=0 in data)
- Infusions: Zero-order input (when RATE>0 in data)
- Steady-state: Pre-computed steady-state concentrations (when SS=1 and II>0 in data)
- Dose superposition: Multiple doses are handled by summing contributions from each dose event (see the example data rows below)
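For reference, hypothetical dose records of each type (standard NONMEM columns; values are illustrative):
ID,TIME,DV,EVID,AMT,RATE,SS,II,CMT,MDV
1,0,.,1,100,0,0,0,1,1
2,0,.,1,100,25,0,0,1,1
3,0,.,1,100,0,1,12,1,1
Row 1 is a bolus (RATE=0), row 2 is a 4-hour infusion (duration = AMT/RATE = 100/25), and row 3 is a steady-state dose with a 12-hour dosing interval (SS=1, II=12).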
Numerical Stability
The analytical solutions include special handling for:
- Near-equal rate constants: When absorption and elimination rates are similar (KA ~ k), L'Hopital's rule is used to avoid division by zero
- Two-compartment eigenvalues: Vieta's formula is used for robust computation of alpha and beta
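As a sketch of the Vieta-based computation in standard two-compartment micro-constant notation (not FeRx-specific symbols):
\[ \alpha + \beta = k_{10} + k_{12} + k_{21}, \qquad \alpha \, \beta = k_{10} \, k_{21} \]
The larger root \( \alpha \) is taken from the quadratic formula and the smaller one recovered as \( \beta = k_{10} k_{21} / \alpha \), which avoids the catastrophic cancellation of subtracting two nearly equal numbers when the roots are far apart.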
ODE Models
For non-standard kinetics (e.g., saturable elimination), use the ODE specification:
[structural_model]
ode(obs_cmt=COMPARTMENT_NAME, states=[state1, state2, ...])
See ODE Models for full ODE syntax.
Error Model
The [error_model] block defines the residual error structure, specifying how observed data (DV) relates to model predictions.
Syntax
DV ~ ERROR_TYPE(SIGMA_PARAMS)
Available Error Models
Additive
DV ~ additive(SIGMA_NAME)
The residual variance is constant across all predictions:
\[ \text{Var}(DV) = \sigma^2 \]
Use when measurement error is independent of concentration (e.g., assay with fixed precision).
Proportional
DV ~ proportional(SIGMA_NAME)
The residual variance scales with the predicted value:
\[ \text{Var}(DV) = (\sigma \cdot f)^2 \]
where \( f \) is the model prediction. Use when measurement error increases with concentration (most common in PK).
Combined
DV ~ combined(SIGMA_PROP, SIGMA_ADD)
Combines proportional and additive components:
\[ \text{Var}(DV) = (\sigma_1 \cdot f)^2 + \sigma_2^2 \]
Use when both proportional and additive error sources are present. Requires two sigma parameters defined in [parameters].
Sigma scale
All sigma parameters are estimated and reported on the standard-deviation scale, not the variance scale. This is true for both proportional and additive components, and for both elements of a combined error model.
In particular:
| Error model | Initial value sigma = X means … |
|---|---|
proportional | The residual SD scales as X · f, i.e. CV% = X · 100 |
additive | The residual SD is X in the units of DV (no CV interpretation) |
combined | First sigma is proportional (CV-style), second is additive (units of DV) |
So sigma PROP_ERR ~ 0.1 for a proportional model is a 10% CV initial value, not 1%. Likewise, the SE on a proportional sigma is on the SD scale — multiply by 100 for an SE in CV-percentage points.
The fitted YAML emits both the SD (estimate) and the variance (variance: estimate²); for proportional components it also emits cv_pct = estimate · 100 so downstream tooling does not have to re-derive it.
Examples
Proportional error (most common):
[parameters]
sigma PROP_ERR ~ 0.01
[error_model]
DV ~ proportional(PROP_ERR)
Additive error:
[parameters]
sigma ADD_ERR ~ 1.0
[error_model]
DV ~ additive(ADD_ERR)
Combined error:
[parameters]
sigma PROP_ERR ~ 0.1
sigma ADD_ERR ~ 0.5
[error_model]
DV ~ combined(PROP_ERR, ADD_ERR)
Impact on Estimation
The error model affects:
- Individual weighted residuals (IWRES): (DV - IPRED) / sqrt(Var)
- Conditional weighted residuals (CWRES): Accounts for uncertainty in random effect estimates
- Objective function value (OFV): The likelihood includes log(Var) terms, so the error model structure directly influences parameter estimates
ODE Models
For pharmacokinetic models without analytical solutions (e.g., saturable elimination, target-mediated drug disposition), FeRx provides an ODE solver.
Structural Model Declaration
[structural_model]
ode(obs_cmt=OBSERVABLE_COMPARTMENT, states=[state1, state2, ...])
- obs_cmt: The compartment whose concentration is observed (matched to DV)
- states: List of state variable names (compartments)
ODE Equations
The [odes] block defines the right-hand side of the ODE system:
[odes]
d/dt(state_name) = expression
Expressions can reference:
- State variables by name
- Individual parameters defined in [individual_parameters]
- Arithmetic operators and functions (exp, log, sqrt, etc.)
- Conditional logic with the same if (cond) { ... } else { ... } and inline if (cond) expr else expr syntax described in Individual Parameters. For example, you can switch between linear and saturable elimination based on the central amount:
[odes]
d/dt(depot) = -KA * depot
if (central > KM_THRESHOLD) {
    d/dt(central) = KA * depot - VMAX * central / (KM + central)
} else {
    d/dt(central) = KA * depot - CL_LIN * central
}
Each d/dt(state) reachable from any branch counts as defined; states that aren't assigned in the firing branch this step receive a derivative of 0.
Example: Michaelis-Menten Elimination
A one-compartment oral model with saturable (Michaelis-Menten) elimination:
[parameters]
theta TVVMAX(10.0, 0.1, 1000.0)
theta TVKM(2.0, 0.01, 100.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_VMAX ~ 0.09
omega ETA_V ~ 0.04
sigma PROP_ERR ~ 0.1
[individual_parameters]
VMAX = TVVMAX * exp(ETA_VMAX)
KM = TVKM
V = TVV * exp(ETA_V)
KA = TVKA
[structural_model]
ode(obs_cmt=central, states=[depot, central])
[odes]
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
[error_model]
DV ~ proportional(PROP_ERR)
Solver Details
FeRx uses a Dormand-Prince RK45 adaptive solver:
| Setting | Value |
|---|---|
| Method | Explicit Runge-Kutta 4(5) |
| Absolute tolerance | 1e-6 |
| Relative tolerance | 1e-4 |
| Max steps | 10,000 |
| Initial step size | 0.1 |
| Minimum step size | 1e-12 |
The solver automatically adapts step sizes based on local error estimates.
Dose Handling
- Bolus doses: Applied as instantaneous state changes at dose times. The dose amount is added to the target compartment: state[cmt] += amt
- Infusion doses (RATE > 0): Treated as a continuous zero-order input. The integrator's timeline is broken at the infusion's end (time + amt/rate), and +rate is added to the target compartment's derivative for every segment fully spanned by the infusion. Overlapping infusions on the same compartment sum their rates
- Compartment indexing: Compartments are 1-indexed in the data file (CMT=1 corresponds to the first state in the states list)
- Multiple doses: The ODE is integrated in segments between dose events, with state discontinuities at each bolus
Limitations
- The observable compartment contains the amount (not concentration). Divide by volume in the ODE equations if needed
- Steady-state (SS=1) is not directly supported for ODE models
Fit Options
The optional [fit_options] block configures the estimation method and optimizer settings.
Syntax
[fit_options]
key = value
General Options
| Key | Values | Default | Description |
|---|---|---|---|
method | foce, focei, saem | foce | Estimation method |
maxiter | integer | 500 | Maximum outer loop iterations |
covariance | true, false | true | Compute covariance matrix and standard errors |
optimizer | slsqp, lbfgs, nlopt_lbfgs, mma, bfgs, bobyqa, trust_region | slsqp | Optimization algorithm |
inner_maxiter | integer | 200 | Max iterations for the inner (per-subject EBE) optimizer |
inner_tol | float | 1e-4 | Gradient-norm convergence tolerance for the inner (per-subject EBE) optimizer. The default of 1e-4 matches the precision of typical NLME engines (NONMEM's default inner-loop SIGDIGITS is ~3, equivalent to ~1e-3). Tighter values (e.g. 1e-6, 1e-8) over-converge the EBE relative to the Sheiner–Beal linearisation error and can slow FOCEI fits by 10–15× without measurable change in the final OFV. Use a tighter value only if you're comparing post-hoc EBE values across runs at high precision. |
steihaug_max_iters | integer | 50 | Max CG iterations for the Steihaug subproblem (only used when optimizer = trust_region) |
global_search | true, false | false | Run NLopt CRS2-LM (Controlled Random Search with Local Mutation) as a gradient-free global pre-search before the local optimizer. CRS2-LM samples within the parameter bounds; the local optimizer (e.g. bobyqa, slsqp) starts from the best point found. Useful for poorly-identified models — when the local optimizer can land in a degenerate basin (collapsed ETA, V/Q swap, parameters at bounds) from a far-from-truth start, the global pre-search usually escapes it. Adds the pre-search budget on top of the local optimisation, but typically more efficient than running multiple full fits from scratch. Requires a full NLopt build (e.g. brew install nlopt or apt install libnlopt-dev); a clear warning is emitted if CRS2-LM is unavailable. |
global_maxeval | integer | 200 * (n_params + 1) | Maximum evaluations of the FOCE objective during the global pre-search. Each eval is a full inner-loop pass over all subjects, so this is the dominant cost of global_search = true. The default (0 → auto) is empirically enough to escape bad basins on 10–20 parameter PK models without dominating the wall time of the subsequent local refine. |
bloq_method | drop, m3 | drop | How to handle rows with CENS=1. m3 enables Beal's M3 likelihood (see BLOQ example). |
mu_referencing | true, false | true | Re-centre inner-loop ETA estimates on the current population mean (auto-detected from [individual_parameters]). See the FAQ entry for details. Set false to reproduce pre-automatic-mu behaviour. |
iov_column | string | — | Name of the occasion column in the dataset (e.g. OCC). Required when the model uses kappa or block_kappa declarations. The column must contain integer occasion indices. Case-insensitive. Only supported with foce / focei — not saem. See IOV documentation. |
optimizer_trace | true, false | false | Write a per-iteration CSV to /tmp/ferx_trace_<pid>_<ts>.csv. The path is stored in FitResult::trace_path. Useful for diagnosing convergence problems or comparing optimizers. See Optimizer Trace. |
Estimation Methods
FOCE (default)
method = foce
First-Order Conditional Estimation. Linearizes the model around the empirical Bayes estimates. Fast and reliable for most models.
FOCEI
method = focei
FOCE with Interaction. Includes the dependence of the residual error on random effects. More accurate than FOCE when the error model depends on individual predictions, but slightly slower.
SAEM
method = saem
Stochastic Approximation EM. Uses Metropolis-Hastings sampling instead of MAP optimization for random effects. More robust to local minima; recommended for complex models with many random effects.
SAEM-Specific Options
| Key | Default | Description |
|---|---|---|
n_exploration | 150 | Phase 1 iterations (step size = 1) |
n_convergence | 250 | Phase 2 iterations (step size = 1/k) |
n_mh_steps | 3 | Metropolis-Hastings steps per subject per iteration |
adapt_interval | 50 | Iterations between MH step-size adaptation |
seed | 12345 | RNG seed for reproducibility |
SIR (Sampling Importance Resampling)
SIR provides non-parametric parameter uncertainty estimates as an optional post-estimation step. Requires covariance = true.
| Key | Default | Description |
|---|---|---|
sir | false | Enable SIR uncertainty estimation |
sir_samples | 1000 | Number of proposal samples (M) |
sir_resamples | 250 | Number of resampled vectors (m) |
sir_seed | 12345 | RNG seed for reproducibility |
See SIR documentation for details.
Optimizer Choices
| Optimizer | Description | Recommended For |
|---|---|---|
slsqp | Sequential Least Squares Programming (NLopt) | General use (default) |
bfgs | Built-in BFGS quasi-Newton | When NLopt is unavailable |
lbfgs | Limited-memory BFGS | Large parameter spaces |
nlopt_lbfgs | NLopt L-BFGS | Alternative L-BFGS |
mma | Method of Moving Asymptotes (NLopt) | Constrained problems |
bobyqa | NLopt BOBYQA — derivative-free quadratic interpolation | Noisy or non-smooth objectives where FD gradients are unreliable |
trust_region | Newton trust-region with Steihaug CG subproblem (argmin) | Well-conditioned problems where second-order curvature helps convergence; tune steihaug_max_iters |
Notes:
- bobyqa does not use gradients, so it is robust to small discontinuities in the FOCE surface caused by EBE re-estimation, but it converges more slowly than gradient-based methods on smooth problems.
- trust_region uses a finite-difference Hessian of the OFV-at-fixed-EBEs. Each Hessian costs O(n²) OFV evaluations, so it is fastest when the number of packed parameters is small. Increase steihaug_max_iters when the parameter count exceeds the default of 50.
Parameter Scaling and EBE Convergence
| Key | Default | Description |
|---|---|---|
scale_params | true | Scale packed parameters to O(1) before passing to the optimizer. Improves conditioning when log-transformed parameters span several orders of magnitude. Set to false to disable (reproduces pre-PR2 behaviour). |
max_unconverged_frac | 0.1 | Fraction of subjects (with at least min_obs_for_convergence_check observations) allowed to have unconverged EBEs before the outer optimizer rejects the step (returns OFV = ∞). Set to 1.0 to disable the guard. |
min_obs_for_convergence_check | 2 | Subjects with fewer than this many observations are excluded from the max_unconverged_frac check (they still run normally). |
Options That Don't Apply to the Selected Method
If you set an option that the chosen estimation method doesn't consume
(e.g. n_convergence with method = focei, or optimizer with
method = saem), fit() emits a warning listing the option, the selected
method, and the keys that are available for that method. The option is
ignored — the fit still proceeds.
For a chained fit (method = [saem, focei]), an option is kept if it applies
to any stage in the chain, so SAEM and FOCEI keys can be mixed without
warnings.
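A sketch of such a chained fit, with keys split across the two stages as described above (values illustrative):
[fit_options]
method = [saem, focei]
n_exploration = 200    # consumed by the SAEM stage
n_mh_steps = 5         # consumed by the SAEM stage
optimizer = slsqp      # consumed by the FOCEI stage
covariance = true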
Global Search
Setting global_search = true runs a gradient-free pre-search (NLopt CRS2-LM) before the local optimizer. This helps escape local minima on challenging datasets.
The number of global evaluations is auto-scaled based on the number of parameters and observations, or can be set explicitly with global_maxeval.
Examples
Standard FOCE with defaults:
[fit_options]
method = foce
maxiter = 300
covariance = true
FOCEI with global search:
[fit_options]
method = focei
maxiter = 500
covariance = true
global_search = true
SAEM with custom settings:
[fit_options]
method = saem
n_exploration = 200
n_convergence = 300
n_mh_steps = 5
seed = 42
covariance = true
FOCEI with SIR uncertainty:
[fit_options]
method = focei
covariance = true
sir = true
sir_samples = 1000
sir_resamples = 250
sir_seed = 42
Derivative-free BOBYQA fit:
[fit_options]
method = foce
optimizer = bobyqa
maxiter = 300
inner_maxiter = 100
inner_tol = 1e-6
Trust-region with tuned CG subproblem:
[fit_options]
method = foce
optimizer = trust_region
maxiter = 200
steihaug_max_iters = 30
FOCE with Inter-Occasion Variability:
[fit_options]
method = foce
iov_column = OCC
covariance = true
Enable optimizer trace and EBE guard:
[fit_options]
method = foce
optimizer_trace = true
max_unconverged_frac = 0.5
min_obs_for_convergence_check = 3
Optimizer Trace
When optimizer_trace = true, a CSV is written to /tmp/ferx_trace_<pid>_<ts>.csv and the path is stored in FitResult::trace_path. Each row is one outer iteration.
| Column | Populated by | Description |
|---|---|---|
iter | all | Iteration number |
method | all | foce, focei, gn, gn_hybrid, saem |
phase | gn_hybrid, saem | focei (polish) or explore/converge |
ofv | all | Objective function value |
wall_ms | all | Wall time for this iteration (ms) |
grad_norm | BFGS, NLopt gradient-mode | Gradient norm |
step_norm | BFGS | Step size |
inner_iter_count | (reserved) | Reserved for future per-iteration inner-loop count; currently NA |
optimizer | FOCE/FOCEI | Active NLopt algorithm |
lm_lambda | GN | Levenberg-Marquardt damping factor |
ofv_delta | GN | Change in OFV from previous iteration |
step_accepted | GN | Whether the GN step was accepted |
cond_nll | SAEM | Conditional observation NLL |
gamma | SAEM | SAEM step-size (1 in exploration, 1/k in convergence) |
mh_accept_rate | SAEM | Mean Metropolis-Hastings acceptance rate across subjects |
n_ebe_unconverged | FOCE/FOCEI | Subjects whose inner optimizer did not converge |
n_ebe_fallback | FOCE/FOCEI | Subjects that fell back to Nelder-Mead |
Unused columns contain NA. The trace is buffered and flushed when the fit ends; if a run is killed mid-iteration the buffered tail may be lost.
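One way to eyeball a finished trace from the shell (path as reported in FitResult::trace_path; the column and less utilities are assumed to be available):
column -s, -t < /tmp/ferx_trace_<pid>_<ts>.csv | less -S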
Simulation
The optional [simulation] block defines a virtual clinical trial design for generating simulated data. This is useful for model validation and testing.
Syntax
[simulation]
subjects = N
dose = AMOUNT
cmt = COMPARTMENT
seed = SEED
times = [t1, t2, t3, ...]
| Key | Description |
|---|---|
subjects | Number of virtual subjects to simulate |
dose | Dose amount administered to each subject |
cmt | Dosing compartment (1-indexed) |
seed | Random seed for reproducible simulations |
times | Observation time points |
Example
[simulation]
subjects = 100
dose = 100
cmt = 1
seed = 12345
times = [0.5, 1, 2, 4, 8, 12, 24]
This simulates 100 subjects, each receiving a dose of 100 units into compartment 1, with observations at 0.5, 1, 2, 4, 8, 12, and 24 hours.
Usage
Run a simulation-estimation study from the CLI:
ferx model.ferx --simulate
This will:
- Parse the model and [simulation] block
- Generate simulated data using the model's default parameters and random effects
- Fit the model to the simulated data
- Report parameter estimates and diagnostics
Simulation Process
- For each subject, random effects are sampled from \( \eta_i \sim N(0, \Omega) \)
- Individual parameters are computed using the [individual_parameters] equations
- Predictions are generated using the structural model
- Residual error is added: \( DV = IPRED + \epsilon \), where \( \epsilon \sim N(0, V) \)
- Observations below 0.001 are clipped to 0.001
Covariates in Simulation
Covariates can be specified per subject in the [simulation] block. Currently, the simulated population uses default covariate values (covariates are not randomized).
Estimation Methods
FeRx implements the following estimation and uncertainty methods for nonlinear mixed effects models:
- FOCE / FOCEI -- First-Order Conditional Estimation (with or without interaction). The workhorse of population PK, using nested optimization to find maximum likelihood estimates.
- Gauss-Newton (BHHH) -- A fast alternative that exploits the nonlinear-least-squares structure of the FOCE objective. Converges in 10-30 iterations using the outer product of per-subject gradients as an approximate Hessian. Available as pure GN or a GN+FOCEI hybrid.
- SAEM -- Stochastic Approximation Expectation-Maximization. Uses MCMC sampling for random effects, providing more robust convergence on complex models.
- SIR -- Sampling Importance Resampling. An optional post-estimation step that provides non-parametric parameter uncertainty estimates (95% CIs), more robust than the asymptotic covariance matrix.
Quick Comparison
| Feature | FOCE/FOCEI | Gauss-Newton | SAEM |
|---|---|---|---|
| Random effect estimation | MAP (optimization) | MAP (optimization) | MCMC (sampling) |
| Convergence speed | Medium (~50-100 evals) | Fast (~10-30 iterations) | Slower (~400 iterations) |
| Local minima robustness | Can get stuck | Can get stuck | More robust |
| Gradient required | Yes (AD or FD) | Yes (FD, per-subject) | No (for E-step) |
| Stochastic | No | No | Yes |
| Best for | General use | Fast iteration, well-conditioned models | Complex models, many random effects |
Choosing a Method
Start with FOCE for standard 1- or 2-compartment models with 2-4 random effects. It is deterministic and well-understood.
Try Gauss-Newton (gn or gn_hybrid) when you want faster convergence during model development, or when FOCE is slow to converge.
Switch to SAEM when:
- FOCE fails to converge or produces implausible estimates
- The model has many random effects (>4)
- You suspect the FOCE solution is a local minimum
- The model has complex nonlinear random effect structure
All methods produce comparable results on well-behaved models. The final OFV from SAEM and GN is computed using the FOCE approximation, so AIC/BIC values are directly comparable.
FOCE / FOCEI
First-Order Conditional Estimation (FOCE) and its interaction variant (FOCEI) are the primary estimation methods in FeRx. They use a two-level nested optimization to find maximum likelihood estimates of population parameters.
Algorithm Overview
Outer Loop (Population Parameters)
The outer loop optimizes the population parameters \( \theta \), \( \Omega \), and \( \sigma \) by minimizing the population objective function value (OFV):
\[ \text{OFV} = -2 \log L = \sum_{i=1}^{N} \text{OFV}_i \]
Parameters are internally transformed for unconstrained optimization:
- Theta: Log-transformed (ensures positivity)
- Omega: Cholesky factorization (ensures positive-definiteness)
- Sigma: Log-transformed (ensures positivity)
Inner Loop (Empirical Bayes Estimates)
For each subject, the inner loop finds the empirical Bayes estimate (EBE) of the random effects \( \hat{\eta}_i \) by minimizing the individual negative log-likelihood:
\[ -\log p(\eta_i | y_i, \theta, \Omega, \sigma) = \frac{1}{2} \left[ \eta_i^T \Omega^{-1} \eta_i + \log|\Omega| + \sum_j \left( \frac{(y_{ij} - f_{ij})^2}{V_{ij}} + \log V_{ij} \right) \right] \]
The inner loop uses BFGS with automatic differentiation gradients, falling back to Nelder-Mead simplex if BFGS fails.
FOCE vs FOCEI
Standard FOCE
Uses linearized predictions around the EBEs:
\[ f_0 = f(\hat{\eta}) - H \hat{\eta} \]
where \( H \) is the Jacobian matrix \( \partial f / \partial \eta \). The per-subject objective is:
\[ \text{OFV}_i = (y - f_0)^T \tilde{R}^{-1} (y - f_0) + \log|\tilde{R}| \]
where \( \tilde{R} = H \Omega H^T + R(f_0) \).
FOCEI (Interaction)
Uses individual predictions directly without linearization:
\[ \text{OFV}_i = (y - \hat{f})^T V^{-1} (y - \hat{f}) + \hat{\eta}^T \Omega^{-1} \hat{\eta} + \log|\tilde{R}| \]
FOCEI is more accurate when the residual variance depends on the predicted value (proportional or combined error models), because it captures the interaction between random effects and residual error.
Optimizer Options
NLopt Algorithms (Recommended)
| Algorithm | Key | Description |
|---|---|---|
| SLSQP | slsqp | Sequential Least Squares Programming. Handles bounds well. Default and recommended. |
| L-BFGS | nlopt_lbfgs | Limited-memory BFGS. Good for large parameter spaces. |
| MMA | mma | Method of Moving Asymptotes. Alternative constrained optimizer. |
| BOBYQA | bobyqa | Derivative-free trust-region via quadratic interpolation. Useful when FD gradients are unreliable (e.g. noisy FOCE surface). |
Built-in Algorithms
| Algorithm | Key | Description |
|---|---|---|
| BFGS | bfgs | Quasi-Newton with backtracking line search. |
| L-BFGS | lbfgs | Memory-efficient BFGS variant. |
Newton Trust-Region (argmin)
| Algorithm | Key | Description |
|---|---|---|
| Trust region | trust_region | Newton trust-region with Steihaug conjugate-gradient subproblem. Uses a finite-difference Hessian of the OFV-at-fixed-EBEs. Tune the CG budget with steihaug_max_iters (default 50). |
Global Search
Enable gradient-free pre-search to help escape local minima:
[fit_options]
method = foce
global_search = true
global_maxeval = 2000
The pre-search uses NLopt's CRS2-LM algorithm (Controlled Random Search) to explore the parameter space before handing off to the local optimizer. The number of evaluations auto-scales with model complexity if global_maxeval is not set.
Covariance Step
When covariance = true, FeRx computes the variance-covariance matrix of the parameter estimates using a finite-difference Hessian at the converged solution. This provides:
- Standard errors (SE) for all parameters
- Relative standard errors (%RSE) for assessing estimation precision
- Omega SEs via delta method from the Cholesky parameterization
Convergence
The outer loop terminates when any of:
- The gradient norm falls below outer_gtol (default 1e-6)
- The maximum number of iterations (maxiter) is reached
- The optimizer reports convergence (NLopt XtolReached or FtolReached)
The inner loop terminates when the gradient norm falls below inner_tol (default 1e-4) or after inner_maxiter (default 200) iterations.
Warm Starting
The inner loop warm-starts EBE estimation from the previous outer iteration's EBEs. This significantly reduces computation time, especially in later iterations when parameters change slowly.
Configuration Example
[fit_options]
method = focei
maxiter = 500
covariance = true
optimizer = slsqp
Gauss-Newton (BHHH)
The Gauss-Newton optimizer is an alternative to the standard FOCE/FOCEI approach that exploits the nonlinear-least-squares structure of the FOCE objective. It uses the BHHH (Berndt-Hall-Hall-Hausman) approximation to the Hessian and typically converges in 10-30 iterations, compared to 100+ evaluations for gradient-based methods like SLSQP.
This approach mirrors NONMEM's modified Gauss-Newton algorithm.
Variants
| Method | Key | Description |
|---|---|---|
| Pure Gauss-Newton | gn | Fast convergence for well-conditioned problems |
| GN + FOCEI hybrid | gn_hybrid | Runs GN first, then polishes with FOCEI for robustness |
Algorithm Overview
BHHH Hessian Approximation
The FOCE objective has the form:
\[ \text{OFV} = 2 \sum_{i=1}^{N} \text{NLL}_i \]
The BHHH approximation uses the outer product of per-subject gradients as an approximate Hessian:
\[ H_{\text{BHHH}} = 4 \sum_{i=1}^{N} g_i \, g_i^T \]
where \( g_i = \nabla_x \text{NLL}_i \) is the gradient of the negative log-likelihood for subject \( i \) with respect to the packed parameter vector. Per-subject gradients are computed via central finite differences.
This approximation is equivalent to the Gauss-Newton Hessian for log-likelihood objectives and is guaranteed positive semi-definite.
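To make the accumulation concrete, here is a minimal sketch using nalgebra; the function name and the shape of the per-subject gradient inputs are illustrative, not FeRx internals.

```rust
use nalgebra::{DMatrix, DVector};

// Sketch: H_BHHH = 4 * sum_i g_i g_i^T, where g_i is subject i's NLL gradient
// with respect to the packed parameter vector.
fn bhhh_hessian(subject_grads: &[DVector<f64>], n_params: usize) -> DMatrix<f64> {
    let mut h = DMatrix::<f64>::zeros(n_params, n_params);
    for g in subject_grads {
        h += g * g.transpose(); // outer product, accumulated across subjects
    }
    h * 4.0
}
```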
Levenberg-Marquardt Damping
To improve stability away from the minimum, the Hessian is regularized:
\[ H_{\text{LM}} = H_{\text{BHHH}} + \lambda \, \text{diag}(H_{\text{BHHH}}) \]
The damping factor \( \lambda \) is adapted during optimization:
- On successful step (OFV decreases): \( \lambda \leftarrow 0.3 \, \lambda \) (minimum \( 10^{-6} \))
- On rejected step (OFV increases): \( \lambda \leftarrow 10 \, \lambda \)
- If \( \lambda > 10^6 \), the algorithm stops
The Newton step is obtained by solving:
\[ H_{\text{LM}} \, \delta = -\nabla \text{OFV} \]
via Cholesky decomposition. If the Cholesky fails (singular Hessian), \( \lambda \) is increased and the iteration is retried.
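A sketch of the damped step with the retry-on-singular behavior described above; the adaptation constants come from this page, while the names and exact control flow are illustrative.

```rust
use nalgebra::{Cholesky, DMatrix, DVector};

// Solve (H_BHHH + lambda * diag(H_BHHH)) delta = -grad, increasing lambda
// whenever the Cholesky factorization fails; give up once lambda > 1e6.
fn lm_step(h: &DMatrix<f64>, grad: &DVector<f64>, lambda: &mut f64) -> Option<DVector<f64>> {
    let rhs = grad.scale(-1.0);
    while *lambda <= 1e6 {
        let mut damped = h.clone();
        for i in 0..h.nrows() {
            damped[(i, i)] += *lambda * h[(i, i)];
        }
        match Cholesky::new(damped) {
            Some(chol) => return Some(chol.solve(&rhs)),
            None => *lambda *= 10.0, // singular Hessian: damp harder and retry
        }
    }
    None
}
```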
Line Search
Each step uses backtracking line search (up to 15 halvings). At each trial point, EBEs are re-estimated via the inner loop (warm-started from the previous EBEs), ensuring the FOCE objective is evaluated consistently.
Convergence
The algorithm converges when:
- Relative OFV change \( < 10^{-6} \) for at least 3 iterations, or
- Maximum iterations are reached
Hybrid Mode (GN + FOCEI)
When method = gn_hybrid, the algorithm runs in two phases:
- GN phase: Run Gauss-Newton iterations to quickly find the basin of the minimum
- FOCEI polish: Run up to 100 iterations of standard FOCEI optimization (via NLopt SLSQP) warm-started from the GN result
The final result is whichever phase produced the lower OFV. This combines the fast convergence of Gauss-Newton with the refined accuracy of FOCEI.
Configuration
[fit_options]
method = gn # or gn_hybrid
maxiter = 100 # max GN iterations
covariance = true
The initial Levenberg-Marquardt damping factor defaults to 0.01 and adapts automatically.
When to Use
Use gn when:
- You want fast convergence and the model is well-conditioned
- You are iterating on model development and need quick feedback
Use gn_hybrid when:
- You want robustness against local minima
- The model is complex (many parameters or random effects)
- You want the speed of GN with a safety net
Stick with foce/focei when:
- The model has convergence issues with GN (rare)
- You need exact compatibility with standard FOCE/FOCEI results
SAEM
Stochastic Approximation Expectation-Maximization (SAEM) is an alternative estimation method that uses MCMC sampling for random effects instead of MAP optimization. It is more robust to local minima and can handle complex random effect structures.
Algorithm Overview
SAEM replaces the deterministic inner loop of FOCE with stochastic sampling, following the Monolix convention with a two-phase step-size schedule.
References
- Delyon, Lavielle, Moulines (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, 94--128.
- Kuhn & Lavielle (2004). Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM: Probability and Statistics 8:115--131.
Two-Phase Schedule
Phase 1: Exploration (iterations 1 to K1)
Step size \( \gamma_k = 1 \). The algorithm explores the parameter space rapidly, with the sufficient statistics being fully replaced each iteration. This allows fast movement toward the basin of the MLE.
Default: 150 iterations.
Phase 2: Convergence (iterations K1+1 to K1+K2)
Step size \( \gamma_k = 1/(k - K_1) \). The algorithm performs a decreasing-weight average, which guarantees almost-sure convergence to the MLE under regularity conditions.
Default: 250 iterations.
Per-Iteration Steps
Each SAEM iteration consists of:
1. E-Step: Metropolis-Hastings Sampling
For each subject, run n_mh_steps Metropolis-Hastings iterations to sample from the conditional distribution of random effects:
\[ p(\eta_i | y_i, \theta, \Omega, \sigma) \]
Proposal: symmetric random walk in deviation space, \( \eta_{\text{prop}} = \eta_{\text{current}} + \delta_i \cdot L \cdot z \), with \( z \sim N(0, I) \) and \( L = \text{chol}(\Omega) \). The schedule is identical across both phases — only the SA step size \( \gamma_k \) changes between exploration and convergence.
The MH kernel is symmetric in \( \eta \), so the proposal density cancels and the acceptance log-ratio is the difference of individual_nll values, which encodes the prior \( N(0, \Omega) \) plus the observation likelihood.
Acceptance: Accept with probability \( \min(1, \exp(\text{NLL}_{\text{current}} - \text{NLL}_{\text{prop}})) \).
The MH sampling is parallelized across subjects using Rayon.
2. Stochastic Approximation Update
Update the sufficient statistic for \( \Omega \):
\[ S_2 \leftarrow (1 - \gamma_k) \cdot S_2 + \gamma_k \cdot \frac{1}{N} \sum_{i=1}^{N} \eta_i \eta_i^T \]
3. M-Step for Omega (Closed Form)
\[ \Omega_k = S_2 \]
with structurally-zero entries (cross-block off-diagonals, standalone-vs-block off-diagonals, and all off-diagonals in a fully-diagonal Ω) zeroed out — the SA accumulator \( (1/N) \sum_i \eta_i \eta_i^T \) is dense by construction, but the model declares which entries are free parameters. Without this projection the chain feeds spurious sampling correlations into the next iteration's MH proposal Cholesky and Ω drifts toward rank-deficiency.
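The update and the structural projection together look roughly like the sketch below; the free-entry mask is an assumed input standing in for whatever the model declarations produce.

```rust
use nalgebra::{DMatrix, DVector};

// One SA update of the Omega sufficient statistic, followed by zeroing of
// entries the model does not declare as free parameters.
fn saem_omega_step(
    s2: &mut DMatrix<f64>,      // running sufficient statistic S_2
    etas: &[DVector<f64>],      // post-MH eta sample, one per subject
    gamma_k: f64,               // SA step size for this iteration
    free_mask: &DMatrix<bool>,  // true where an Omega entry is estimated
) -> DMatrix<f64> {
    let dim = s2.nrows();
    let mut emp = DMatrix::<f64>::zeros(dim, dim);
    for eta in etas {
        emp += eta * eta.transpose();
    }
    emp /= etas.len() as f64; // (1/N) sum_i eta_i eta_i^T

    s2.scale_mut(1.0 - gamma_k); // S_2 <- (1 - gamma_k) S_2 + gamma_k * emp
    *s2 += emp * gamma_k;

    let mut omega = s2.clone(); // M-step: Omega_k = S_2, projected
    for i in 0..dim {
        for j in 0..dim {
            if !free_mask[(i, j)] {
                omega[(i, j)] = 0.0;
            }
        }
    }
    omega
}
```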
4. M-Step for Theta and Sigma (Optimization)
Minimize the conditional observation negative log-likelihood with ETAs held fixed:
\[ \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left[ \frac{1}{2} \log V_{ij} + \frac{1}{2} \frac{(y_{ij} - f_{ij})^2}{V_{ij}} \right] \]
When mu_referencing = true (the default), ferx detects lognormal parameters from the [individual_parameters] block and applies the closed-form EM update for those thetas instead of running NLopt:
\[ \log \theta_j \leftarrow \log \theta_j + \gamma_k \cdot \overline{\eta_j} \]
where \( \overline{\eta_j} = (1/N) \sum_i \eta_{i,j} \) is the empirical mean of the post-MH random effects for the eta paired with \( \theta_j \). For a log-mu-referenced model where \( \log P_i = \log \theta + \eta_i \) with \( \eta_i \sim N(0, \omega^2) \), this is exactly the M-step that maximises the complete-data log-likelihood, scaled by the SA step size \( \gamma_k \). After the update the etas are re-centred by the same shift so they remain deviations from the new \( \log \theta_j \).
NLopt still runs for any remaining thetas (non-mu-referenced) and for sigma — the closed-form-updated thetas are pinned at their new values for the NLopt call.
When mu_referencing = false, the full NLopt M-step runs for all thetas as before.
The number of NLopt evaluations saved is stored in FitResult::saem_mu_ref_m_step_evals_saved, accumulated across SAEM iterations as 2 × mstep_maxiter × n_mu_ref_pairs per outer step (one finite-difference probe pair per pinned mu-ref dimension, capped at mstep_maxiter NLopt gradient requests). The field is None when mu-referencing is off or method ≠ SAEM.
5. Adaptive MH Step Sizes
Every adapt_interval iterations, the per-subject step sizes \( \delta_i \) are adjusted:
- If acceptance rate > 40%: increase \( \delta_i \) by 10% (up to 5.0)
- If acceptance rate < 40%: decrease \( \delta_i \) by 10% (down to 0.01)
This targets an acceptance rate around 40%, balancing exploration and mixing.
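In code, the rule is just a clamped multiplicative update; this sketch assumes acceptance counts are tracked per subject over the adaptation window.

```rust
// Adjust a subject's MH step size toward ~40% acceptance.
fn adapt_step_size(delta: f64, accepted: usize, proposed: usize) -> f64 {
    let rate = accepted as f64 / proposed.max(1) as f64;
    if rate > 0.40 {
        (delta * 1.10).min(5.0)  // accepting too often: widen the proposal
    } else {
        (delta * 0.90).max(0.01) // accepting too rarely: narrow the proposal
    }
}
```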
Post-SAEM Finalization
After the SAEM iterations complete:
- EBE Refinement: Run the standard FOCE inner loop (BFGS optimization) warm-started from the SAEM ETAs to obtain final empirical Bayes estimates
- FOCE OFV: Compute the objective function using the FOCE/Laplace approximation, so AIC and BIC are directly comparable with FOCE results
- Covariance Step: Optionally compute standard errors via finite-difference Hessian (same method as FOCE)
- Diagnostics: Compute PRED, IPRED, CWRES, IWRES for each subject
Configuration
[fit_options]
method = saem
n_exploration = 150 # Phase 1 iterations
n_convergence = 250 # Phase 2 iterations
n_mh_steps = 3 # MH steps per subject per iteration
adapt_interval = 50 # Step-size adaptation frequency
seed = 12345 # RNG seed for reproducibility
covariance = true # Compute standard errors
Tuning Guide
Not Converging
- Increase n_exploration (e.g., 300) to give more time for basin finding
- Increase n_convergence (e.g., 500) for a longer averaging window
- Increase n_mh_steps (e.g., 5-10) for better mixing in the E-step
Slow Convergence
- Decrease n_exploration and n_convergence if parameters stabilize early
- Use adapt_interval = 25 for faster step-size adaptation
Reproducibility
- Always set seed for reproducible results
- Different seeds will produce slightly different estimates due to the stochastic nature of the algorithm
Output
The SAEM iteration progress is printed to stderr:
SAEM: 10 subjects, 3 ETAs, 400 total iter (150 explore + 250 converge)
SAEM iter 1/400 [explore] gamma=1.000 condNLL=95.244
SAEM iter 50/400 [explore] gamma=1.000 condNLL=56.705
SAEM iter 150/400 [explore] gamma=1.000 condNLL=46.071
SAEM iter 200/400 [converge] gamma=0.020 condNLL=36.799
SAEM iter 400/400 [converge] gamma=0.004 condNLL=38.096
SAEM iterations complete. Computing final EBEs and OFV...
The condNLL is the conditional observation negative log-likelihood (not the final OFV). It should generally decrease during the exploration phase and stabilize during convergence.
Sampling Importance Resampling (SIR)
SIR is an optional post-estimation step that provides non-parametric parameter uncertainty estimates. It produces 95% confidence intervals that are more robust than the asymptotic covariance matrix, particularly for models with:
- Non-normal parameter distributions
- Boundary estimates (parameters near constraints)
- Small datasets where asymptotic assumptions may not hold
How It Works
SIR uses the maximum likelihood estimates and their covariance matrix as a proposal distribution, then reweights samples based on the actual likelihood:
- Sample: Draw M parameter vectors from a multivariate normal distribution centered on the ML estimates, using the estimation covariance matrix
- Importance weighting: For each sample, compute the objective function value (OFV) and calculate an importance weight based on the ratio of the true likelihood to the proposal density
- Resample: Draw m vectors (with replacement) proportional to the importance weights
The resampled vectors approximate the true parameter uncertainty distribution. Confidence intervals are derived from their empirical percentiles.
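A schematic of the weighting and resampling steps, with each sample's OFV and proposal log-density supplied as inputs and plain inverse-CDF resampling from user-provided uniforms; this illustrates the idea rather than the engine's code path.

```rust
// w_i is proportional to exp(-OFV_i / 2) / q(x_i); resample indices with
// probability proportional to the normalized weights.
fn sir_resample(ofv: &[f64], proposal_logpdf: &[f64], uniforms: &[f64]) -> Vec<usize> {
    // Log-weights, shifted by their maximum for numerical stability.
    let logw: Vec<f64> = ofv
        .iter()
        .zip(proposal_logpdf)
        .map(|(o, q)| -0.5 * o - q)
        .collect();
    let max = logw.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let w: Vec<f64> = logw.iter().map(|l| (l - max).exp()).collect();
    let total: f64 = w.iter().sum();

    // Inverse-CDF draw for each resample (one uniform per resampled vector).
    uniforms
        .iter()
        .map(|u| {
            let target = u * total;
            let mut cum = 0.0;
            for (i, wi) in w.iter().enumerate() {
                cum += wi;
                if cum >= target {
                    return i;
                }
            }
            w.len() - 1
        })
        .collect()
}
```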
Enabling SIR
Add sir = true to the [fit_options] block. The covariance step must also be enabled (it provides the proposal distribution):
[fit_options]
method = focei
covariance = true
sir = true
Options
| Key | Default | Description |
|---|---|---|
| sir | false | Enable/disable SIR |
| sir_samples | 1000 | Number of proposal samples (M). Higher values give more reliable weights but take longer |
| sir_resamples | 250 | Number of resampled vectors (m). Must be less than sir_samples |
| sir_seed | 12345 | RNG seed for reproducibility |
Output
SIR adds the following to the estimation output:
- 95% CI for each theta, omega, and sigma parameter (2.5th and 97.5th percentiles)
- Effective sample size (ESS): a diagnostic indicating how well the proposal distribution matches the true uncertainty. ESS close to M indicates a good match; ESS much less than m suggests the proposal is a poor fit
Diagnostics
The effective sample size (ESS) is the primary diagnostic:
- ESS > m (resamples): excellent — the proposal distribution is well-matched
- ESS between 100 and m: adequate for most purposes
- ESS < 100: the proposal may be a poor fit; consider a different estimation method or increasing sir_samples
Computational Cost
SIR evaluates the inner loop (EBE optimization) for each of the M proposal samples. With the default M=1000, this is roughly 3-10x the cost of the estimation step itself. The computation is parallelized across samples and warm-started from the ML EBEs to minimize runtime.
The resampling step itself is negligible.
Example
[fit_options]
method = focei
covariance = true
sir = true
sir_samples = 1000
sir_resamples = 250
sir_seed = 42
Reference
Dosne A-G, Bergstrand M, Karlsson MO. "Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling." J Pharmacokinet Pharmacodyn. 2017;44(6):539-562. doi:10.1007/s10928-017-9542-0
Inter-Occasion Variability (IOV)
Inter-Occasion Variability (IOV) models the fact that a subject's PK parameters may shift between occasions (treatment periods, study visits, dosing intervals). Unlike between-subject variability (BSV), which is fixed for a subject across the whole study, IOV is a random effect that is re-drawn for each occasion.
Concepts
Occasions
An occasion is a distinct time window during which the same kappa value applies. Occasions are identified by an integer column in the dataset (e.g. OCC). All rows — dose records and observation records — that share the same occasion index belong to one occasion.
Kappa parameters
Kappa (κ) is the IOV random effect, analogous to eta (η) for BSV. Each kappa is drawn independently per occasion:
\[ \kappa_{ik} \sim \mathcal{N}(0, \Omega_\text{IOV}) \]
where i indexes subjects and k indexes occasions. The IOV omega matrix Ω_IOV is estimated alongside the BSV omega.
Individual parameters
Kappas enter the individual-parameter expressions in the same way as etas:
CL = TVCL * exp(ETA_CL + KAPPA_CL)
At each occasion k, the effective individual CL is TVCL * exp(ETA_CL_i + KAPPA_CL_ik). The BSV eta captures stable between-subject differences; the kappa captures occasion-to-occasion fluctuation within a subject.
Option A: Diagonal IOV (independent kappas)
Each kappa is declared independently, giving a diagonal Ω_IOV:
[parameters]
...
kappa KAPPA_CL ~ 0.05
kappa KAPPA_V ~ 0.03
The two kappas are uncorrelated across occasions. This is the most common formulation and corresponds to NONMEM's IOV Option A.
Use FIX to hold a kappa variance constant during estimation:
kappa KAPPA_CL ~ 0.05 FIX
Option B: Block IOV (correlated kappas)
When occasion effects on different parameters are expected to covary, use block_kappa:
[parameters]
...
block_kappa (KAPPA_CL, KAPPA_V) = [0.05, 0.01, 0.03]
Values are the lower triangle of Ω_IOV: Var(KAPPA_CL), Cov(KAPPA_CL, KAPPA_V), Var(KAPPA_V). This mirrors the block_omega syntax for BSV.
Mixed Option A + B
kappa and block_kappa declarations can be combined freely — uncorrelated kappas can coexist with a correlated block:
block_kappa (KAPPA_CL, KAPPA_V) = [0.05, 0.01, 0.03]
kappa KAPPA_KA ~ 0.10
A name may not appear in both a kappa and a block_kappa line.
Model File Setup
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09 # BSV
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
kappa KAPPA_CL ~ 0.05 # IOV (Option A)
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL + KAPPA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
iov_column = OCC
See examples/warfarin_iov.ferx for a complete runnable example.
Algorithm Details
Inner loop
When a model has kappa declarations and the subject has occasion labels, the inner optimizer runs find_ebe_iov instead of the standard find_ebe. It jointly optimizes over:
\[ p = [\underbrace{\eta_1, \ldots, \eta_{n_\eta}}_{\text{BSV}},\ \underbrace{\kappa_{1,1}, \ldots, \kappa_{1,n_\kappa}}_{\text{occasion 1}},\ \ldots,\ \underbrace{\kappa_{K,1}, \ldots, \kappa_{K,n_\kappa}}_{\text{occasion K}}] \]
The joint negative log-posterior is:
\[ -\log p(p \mid y_i) = \frac{1}{2}\left[ \eta^T \Omega^{-1} \eta + \log|\Omega| + \sum_{k=1}^{K} \kappa_k^T \Omega_\text{IOV}^{-1} \kappa_k + K \log|\Omega_\text{IOV}| + \sum_{j} \left(\frac{(y_{ij} - f_{ij}^{(k_j)})^2}{V_{ij}} + \log V_{ij}\right) \right] \]
where \( k_j \) is the occasion of observation j and \( f_{ij}^{(k)} = f(\theta, \eta_i, \kappa_{ik}) \).
Optimization uses BFGS; the gradient is computed by finite differences (no AD path for IOV).
Outer loop
The IOV omega is packed after the sigma block in the optimizer vector. For Option A (diagonal Ω_IOV) only the log-diagonal entries are appended; for Option B the full Cholesky lower triangle is appended (log-diagonals plus off-diagonals as-is), mirroring the BSV omega packing:
\[ x = [\log\theta,\ \text{chol}(\Omega_\text{BSV}),\ \log\sigma,\ \text{chol}(\Omega_\text{IOV})] \]
The per-subject FOCE objective uses a block-diagonal omega that stacks the BSV block with K copies of Ω_IOV:
\[ \Omega_\text{full} = \text{blockdiag}(\Omega_\text{BSV},\ \Omega_\text{IOV},\ \ldots,\ \Omega_\text{IOV}) \]
The H-matrix (Jacobian ∂f/∂η) covers BSV etas only — kappa columns are not included in the FOCE linearization.
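Assembling the block-diagonal matrix is straightforward with nalgebra; a sketch, assuming the two omega blocks are already available as dense matrices.

```rust
use nalgebra::DMatrix;

// Omega_full = blockdiag(Omega_BSV, Omega_IOV, ..., Omega_IOV) with K copies.
fn block_diag_omega(bsv: &DMatrix<f64>, iov: &DMatrix<f64>, n_occasions: usize) -> DMatrix<f64> {
    let (nb, nk) = (bsv.nrows(), iov.nrows());
    let dim = nb + n_occasions * nk;
    let mut full = DMatrix::<f64>::zeros(dim, dim);
    full.view_mut((0, 0), (nb, nb)).copy_from(bsv);
    for k in 0..n_occasions {
        let off = nb + k * nk;
        full.view_mut((off, off), (nk, nk)).copy_from(iov);
    }
    full
}
```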
Covariance step
When covariance = true, standard errors for the IOV omega diagonal (or its Cholesky elements) are computed via the finite-difference Hessian at convergence, using the same procedure as for BSV omega.
Output
The following IOV-related fields are populated on FitResult and printed by print_results(). The fit YAML gains an omega_iov: section listing the kappa variances and (for Option B) the off-diagonal covariances/correlations.
| Field | Description |
|---|---|
| omega_iov | Estimated IOV covariance matrix (Option<DMatrix<f64>>) |
| kappa_names | Names of kappa parameters in declaration order |
| kappa_fixed | Per-kappa FIX flags |
| se_kappa | Standard errors for IOV omega elements (requires covariance = true) |
| ebe_kappas | Per-subject, per-occasion kappa EBEs. ebe_kappas[i][k] is the kappa vector for subject i, occasion k |
When the dataset has occasion labels the sdtab CSV gains an OCC column (one row per observation, the subject's occasion index for that row).
Note: shrinkage_kappa exists as a placeholder field on FitResult but is not yet computed — it is always returned as an empty Vec. Use ebe_kappas directly if you need a manual shrinkage diagnostic.
Limitations
- SAEM is not supported with IOV models — an error is returned directing you to use foce or focei.
- Automatic differentiation (AD) is not used for IOV inner-loop gradients; finite differences are used instead. For large models this is slower than the AD path used for BSV-only models.
- The occasion column must contain non-negative integers. Non-integer or negative values are treated as missing (OCC = 0), and a single summary warning is emitted.
- Per-kappa shrinkage is not yet computed (see Output note above).
- Kappa EBEs are not yet emitted as columns in the sdtab CSV; access them via FitResult.ebe_kappas.
Comparison to NONMEM
| Concept | NONMEM | FeRx |
|---|---|---|
| Diagonal IOV | OMEGA BLOCK(n) SAME with one kappa per block | kappa NAME ~ variance (Option A) |
| Correlated IOV | OMEGA BLOCK(n) shared across occasions | block_kappa (N1, N2) = [...] (Option B) |
| Occasion column | OCC in $INPUT; referenced in $PK | iov_column = OCC in [fit_options] |
| Kappa in PK | ETA(n) with SAME block | KAPPA_xxx in [individual_parameters] |
Data Format
FeRx reads data in NONMEM-compatible CSV format. This is the standard format used across population PK tools.
Required Columns
| Column | Type | Description |
|---|---|---|
| ID | string/numeric | Subject identifier |
| TIME | numeric | Time relative to first dose |
| DV | numeric | Dependent variable (observed concentration) |
Optional Standard Columns
| Column | Type | Default | Description |
|---|---|---|---|
| EVID | integer | 0 | Event ID: 0 = observation, 1 = dose, 4 = reset + dose |
| AMT | numeric | 0 | Dose amount (only for EVID=1 or EVID=4) |
| CMT | integer | 1 | Compartment number (1-indexed) |
| RATE | numeric | 0 | Infusion rate. 0 = bolus dose |
| MDV | integer | 0 | Missing DV flag. 1 = DV should be ignored |
| II | numeric | 0 | Interdose interval for repeated dosing |
| SS | integer | 0 | Steady-state flag. 1 = assume steady state |
| CENS | integer | 0 | Censoring flag. 1 = observation is below LLOQ; DV carries the LLOQ value. Paired with bloq_method = m3 in [fit_options] to enable likelihood-based handling — see BLOQ example. |
Occasion Column (IOV)
When using Inter-Occasion Variability (IOV), add an occasion-index column to the dataset and specify its name with iov_column in [fit_options]. The column:
- Contains integer occasion indices (e.g. 1, 2, 3…) — one per row
- Applies to both dose rows and observation rows
- Is excluded from covariate auto-detection
Example dataset with OCC column:
ID,TIME,DV,EVID,AMT,CMT,MDV,OCC
1,0,.,1,100,1,1,1
1,1,9.5,0,.,.,0,1
1,2,7.3,0,.,.,0,1
1,24,.,1,100,1,1,2
1,25,10.1,0,.,.,0,2
1,26,8.2,0,.,.,0,2
The occasion index can be any positive integer; indices do not need to start at 1 or be consecutive, but different values denote different occasions, each with its own kappa EBE.
See IOV documentation for full details.
Covariate Columns
Any column not in the standard set above is automatically treated as a covariate. Covariate values are:
- Time-constant: The first non-missing value for each subject is used
- Time-varying: If values change over time for a subject, Last Observation Carried Forward (LOCF) is applied per event (NONMEM-equivalent: [individual_parameters] is re-evaluated at each dose and observation row using that row's covariate values; see the sketch below)
Covariate names in the data file are matched case-insensitively to names used in [individual_parameters] expressions.
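The per-event LOCF snapshot amounts to carrying the last non-missing value forward over the rows of a subject; a minimal sketch with illustrative names (this is not the FeRx reader API).

```rust
use std::collections::HashMap;

// For each data row (in time order), record the most recent non-missing value
// of every covariate; that snapshot is what the row's evaluation would use.
fn locf_snapshots(rows: &[HashMap<String, Option<f64>>]) -> Vec<HashMap<String, f64>> {
    let mut current: HashMap<String, f64> = HashMap::new();
    let mut out = Vec::with_capacity(rows.len());
    for row in rows {
        for (name, value) in row {
            if let Some(v) = value {
                current.insert(name.clone(), *v);
            }
        }
        out.push(current.clone());
    }
    out
}
```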
Time-varying covariate scope
Time-varying covariates are supported for all analytical structural models and ODE-defined models:
- 1-compartment IV bolus (one_cpt_iv_bolus)
- 1-compartment infusion (one_cpt_infusion)
- 1-compartment oral (one_cpt_oral)
- 2-compartment IV bolus (two_cpt_iv_bolus)
- 2-compartment infusion (two_cpt_infusion)
- 2-compartment oral (two_cpt_oral)
- 3-compartment IV bolus (three_cpt_iv_bolus)
- 3-compartment infusion (three_cpt_infusion)
- 3-compartment oral (three_cpt_oral)
- All ODE-defined models (via [odes])
For oral models, the bolus dose into compartment 1 is interpreted as the depot (NONMEM ADVAN2/ADVAN4/ADVAN12 convention) and observation read-out reads the central compartment.
The autodiff (Enzyme) gradient fast path is also event-driven for all analytical models — TV-cov subjects keep AD-accelerated gradients and an AD-accelerated H-matrix Jacobian (forward-mode), so neither the inner-loop gradient nor the per-iteration Jacobian falls back to finite-differences.
Infusion routing on the event-driven path:
- IV models: central infusion (cmt=1) for all 1/2/3-cpt; peripheral infusion for 2-cpt (cmt=2) and 3-cpt (cmt=2 → periph1, cmt=3 → periph2). Steady-state amounts per channel are computed by linear superposition over the channels.
- Oral models: central infusion (cmt=2) is supported; peripheral infusion is rare clinically and still panics with a clear message (tracked as a follow-up).
Event Types (EVID)
| EVID | Meaning |
|---|---|
| 0 | Observation record. DV is used for estimation. |
| 1 | Dosing record. AMT is administered to compartment CMT. |
| 2 | Other event (typically a covariate-change marker). The compartment state is unchanged but the rate matrix is refreshed from this row's covariate values — matching NONMEM's "$PK runs at every record" semantics. Only meaningful when at least one covariate is time-varying; for time-constant data EVID=2 rows are skipped (they would be no-ops). |
| 4 | Reset and dose. All compartment amounts are reset to zero before dosing. |
Example Dataset
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,WT,CRCL
1,0,.,1,100,1,0,1,70,95
1,0.5,9.49,0,.,.,.,0,70,95
1,1,14.42,0,.,.,.,0,70,95
1,2,17.56,0,.,.,.,0,70,95
1,4,15.23,0,.,.,.,0,70,95
1,8,10.15,0,.,.,.,0,70,95
2,0,.,1,150,1,0,1,85,110
2,0.5,14.2,0,.,.,.,0,85,110
2,1,21.3,0,.,.,.,0,85,110
Key points:
- Dose records (EVID=1) have MDV=1 and DV=. (missing)
- Observation records (EVID=0) have MDV=0 and a valid DV
- Covariates (WT, CRCL) are included as extra columns
- Missing values can be represented as . or left empty
Infusion Doses
For IV infusions, set RATE to the infusion rate (amount per time unit):
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV
1,0,.,1,500,1,50,1
This administers 500 units at a rate of 50 units/hour (duration = 10 hours).
Steady-State Dosing
For steady-state simulations, set SS=1 and II to the dosing interval:
ID,TIME,DV,EVID,AMT,CMT,SS,II,MDV
1,0,.,1,100,1,1,12,1
1,0.5,25.3,0,.,.,.,.,0
This assumes the subject has reached steady state with 100 units every 12 hours before the observation at TIME=0.5.
Multiple Doses
Multiple doses are supported as separate rows:
ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,12,.,1,100,1,1
1,12.5,15.2,0,.,.,0
1,24,.,1,100,1,1
1,24.5,18.1,0,.,.,0
Column Name Case
Column names are case-insensitive. ID, Id, and id are all recognized. Covariate columns preserve their case as declared in the CSV header.
CLI Reference
The ferx command-line tool runs population PK estimation from model files and data.
Usage
ferx <model.ferx> --data <data.csv>
ferx <model.ferx> --simulate
Commands
Fit with Data
ferx model.ferx --data data.csv
Parses the model file, reads the data, and runs the estimation method specified in [fit_options] (defaults to FOCE).
Simulate and Fit
ferx model.ferx --simulate
Parses the model file, generates simulated data from the [simulation] block, and fits the model to the simulated data. Requires a [simulation] block in the model file.
Output Files
Three files are generated, named after the model file:
| File | Contents |
|---|---|
| {model}-sdtab.csv | Per-observation diagnostics |
| {model}-fit.yaml | Parameter estimates and standard errors |
| {model}-timing.txt | Wall-clock estimation time |
See Output Files for detailed format descriptions.
Console Output
Progress
The estimation progress is printed to stderr, including:
- Model and data summary (subjects, observations, parameters)
- Optimizer iterations with OFV values (FOCE) or condNLL values (SAEM)
- Covariance step status
- Final parameter table
Result Summary
A brief summary is printed to stdout:
Fit completed!
OFV: -280.1838
Elapsed: 0.496s
TVCL = 0.132735
TVV = 7.694842
TVKA = 0.757498
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Error (parse failure, data error, convergence failure) |
Examples
# One-compartment oral warfarin model
ferx examples/warfarin.ferx --data data/warfarin.csv
# Two-compartment IV with FOCE
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
# Simulation-estimation study
ferx examples/warfarin.ferx --simulate
Building
# Build the ferx binary
cargo build --release --features autodiff
# Run directly via cargo
cargo run --release --features autodiff --bin ferx -- model.ferx --data data.csv
Output Files
Each model run produces three output files.
sdtab CSV ({model}-sdtab.csv)
A CSV file with per-observation diagnostics, one row per observation per subject.
Columns
| Column | Description |
|---|---|
| ID | Subject identifier |
| TIME | Observation time |
| DV | Observed value |
| PRED | Population prediction (eta = 0) |
| IPRED | Individual prediction (eta = EBE) |
| CWRES | Conditional weighted residual |
| IWRES | Individual weighted residual |
| ETA1, ETA2, ... | Empirical Bayes estimates of random effects |
| EBE_OFV | Each subject's contribution to the total OFV |
| N_OBS | Number of observations for the subject |
Residual Definitions
IWRES (Individual Weighted Residual): \[ \text{IWRES}_j = \frac{y_j - \text{IPRED}_j}{\sqrt{V_j}} \] where \( V_j \) is the residual variance evaluated at the individual prediction.
CWRES (Conditional Weighted Residual): \[ \text{CWRES}_j = \frac{y_j - f_{0,j}}{\sqrt{\tilde{R}_{jj}}} \] where \( f_0 = f(\hat{\eta}) - H\hat{\eta} \) is the linearized population prediction and \( \tilde{R} = H\Omega H^T + R \) is the conditional variance.
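The IWRES definition translates directly into code; a sketch where the residual-variance closure stands in for whichever error model applies.

```rust
// IWRES_j = (y_j - IPRED_j) / sqrt(V_j), with V_j evaluated at the individual
// prediction (e.g. sigma^2 * ipred^2 for a proportional error model).
fn iwres(dv: &[f64], ipred: &[f64], resid_var: impl Fn(f64) -> f64) -> Vec<f64> {
    dv.iter()
        .zip(ipred)
        .map(|(y, f)| (y - f) / resid_var(*f).sqrt())
        .collect()
}
```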
Example
ID,TIME,DV,PRED,IPRED,CWRES,IWRES,ETA1,ETA2,ETA3
1,0.5,9.49,10.12,9.55,-0.23,-0.06,0.15,-0.08,0.32
1,1.0,14.42,14.87,14.35,0.18,0.05,0.15,-0.08,0.32
Fit YAML ({model}-fit.yaml)
A YAML file containing parameter estimates, standard errors, and model diagnostics.
Structure
model:
converged: true
method: FOCE
objective_function:
ofv: -280.1838
aic: -266.1838
bic: -247.2804
data:
n_subjects: 10
n_observations: 110
n_parameters: 7
theta:
TVCL:
estimate: 0.132735
se: 0.014549
rse_pct: 11.0
TVV:
estimate: 7.694842
se: 0.293028
rse_pct: 3.8
omega:
omega_11:
variance: 0.028584
cv_pct: 16.9
se: 0.006394
omega_22:
variance: 0.009613
cv_pct: 9.8
se: 0.002165
sigma:
sigma_1:
estimate: 0.010638
se: 0.000788
Key Fields
- ofv: Objective Function Value (-2 log-likelihood)
- aic: Akaike Information Criterion (OFV + 2p)
- bic: Bayesian Information Criterion (OFV + p * ln(n))
- se: Standard error from the covariance step
- rse_pct: Relative standard error as percentage (SE/estimate * 100)
- cv_pct: Coefficient of variation for omega (sqrt(variance) * 100)
FitResult Fields (Rust API / console)
The following fields are populated on the FitResult struct returned by fit() and printed by print_results(). They are not currently written to the fit YAML — read them programmatically or from the console summary.
Shrinkage
Two shrinkage metrics are reported after every fit:
ETA shrinkage (per random effect — shrinkage_eta: Vec<f64>):
\[ \text{shrinkage}_k = 1 - \frac{\text{SD}(\hat{\eta}_k)}{\sqrt{\omega_{kk}}} \]
A value near 1 means the individual EBEs are all pulled toward zero — the data are not informative about that random effect. A value near 0 means the ETAs are spread in a way consistent with the prior omega.
EPS shrinkage (scalar — shrinkage_eps: f64):
\[ \text{shrinkage}_\varepsilon = 1 - \text{SD}(\text{IWRES}) \]
Values of NaN indicate a zero-variance omega component (ETA) or fewer than two valid residuals (EPS).
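A sketch of the ETA-shrinkage computation from the per-subject EBEs; the sample SD here uses an n-1 denominator, which is an assumption rather than a statement about the engine.

```rust
// shrinkage_k = 1 - SD(EBE_k) / sqrt(omega_kk), one value per random effect.
fn eta_shrinkage(ebes_k: &[f64], omega_kk: f64) -> f64 {
    let n = ebes_k.len() as f64;
    let mean = ebes_k.iter().sum::<f64>() / n;
    let var = ebes_k.iter().map(|e| (e - mean).powi(2)).sum::<f64>() / (n - 1.0);
    1.0 - var.sqrt() / omega_kk.sqrt()
}
```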
Covariance Status
covariance_status: CovarianceStatus takes one of three values:
| Value | Meaning |
|---|---|
| Computed | Covariance step succeeded; SE values are valid |
| Failed | Hessian was singular or inversion failed; SE fields are None |
| NotRequested | covariance = false was set; SE fields are None |
Run Record Fields
| Field | Description |
|---|---|
| model_name | Name from the .ferx file (or "Unnamed") |
| ferx_version | Version of ferx-nlme that produced the result |
| wall_time_secs | Wall-clock time for the complete fit (seconds) |
| gradient_method_inner | Gradient method used in the inner (EBE) loop, e.g. FiniteDifference |
| gradient_method_outer | Gradient method used in the outer loop, e.g. FiniteDifference or AutoDiff |
| uses_ode_solver | true if the model uses the ODE solver, false for analytical PK |
| n_threads_used | Number of Rayon threads used during estimation |
| nlopt_missing_algorithms | NLopt algorithms that were requested but unavailable in this build (empty when all available) |
| covariance_n_evals_estimated | Estimated number of OFV evaluations the covariance step will run, populated only when run_covariance_step = true and n_parameters > 30 |
EBE Convergence Diagnostics
Counters from the inner-loop (EBE) optimizer, useful for diagnosing problematic fits. Always 0 for SAEM (which uses MH sampling rather than EBE optimization).
| Field | Description |
|---|---|
| ebe_convergence_warnings | Number of outer iterations in which at least one subject had an unconverged EBE |
| max_unconverged_subjects | Worst-case unconverged-subject count seen in a single outer iteration |
| total_ebe_fallbacks | Total number of times the Nelder-Mead fallback was invoked across all subjects and outer iterations |
Timing File ({model}-timing.txt)
A single-line text file with the wall-clock estimation time:
elapsed_seconds=0.496000
This measures only the estimation step (not parsing or data reading).
Optimizer Trace CSV
When optimizer_trace = true is set in [fit_options], a CSV is written to /tmp/ferx_trace_<pid>_<ts>.csv. The path is also stored in FitResult::trace_path.
Each row is one outer iteration. See the fit-options trace table for the full column reference.
Example use in R (with the ferx package):
fit <- ferx_fit("model.ferx", "data.csv", optimizer_trace = TRUE)
trace <- read.csv(fit$trace_path)
plot(trace$iter, trace$ofv, type = "l", xlab = "Iteration", ylab = "OFV")
Examples
FeRx includes several example models in the examples/ directory with corresponding datasets in data/.
| Example | Model | Route | Features |
|---|---|---|---|
| One-Compartment Oral | 1-cpt | Oral | Basic PopPK, FOCE and SAEM |
| Two-Compartment IV | 2-cpt | IV bolus | Multi-compartment dynamics |
| Covariates | 2-cpt | Oral | Weight and renal function effects |
| ODE Model | 1-cpt | Oral | Michaelis-Menten elimination, ODE solver |
| BLOQ (M3 method) | 1-cpt | Oral | Likelihood-based handling of censored observations |
Running Examples
# One-compartment oral (warfarin)
ferx examples/warfarin.ferx --data data/warfarin.csv
# Two-compartment IV
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
# Covariates model
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv
# ODE model (Michaelis-Menten)
ferx examples/mm_oral.ferx --data data/mm_oral.csv
# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
# BLOQ (M3 method)
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv
One-Compartment Oral (Warfarin)
This example fits a one-compartment oral pharmacokinetic model to warfarin concentration data.
Model File (warfarin.ferx)
# One-compartment oral PK model (warfarin)
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 300
covariance = true
Model Description
- Structure: One-compartment model with first-order absorption (KA) and first-order elimination (CL/V)
- Random effects: Log-normal on CL, V, and KA (exponential eta model)
- Error model: Proportional residual error
- Parameters:
- TVCL: Typical value of clearance (L/h)
- TVV: Typical value of volume of distribution (L)
- TVKA: Typical value of absorption rate constant (1/h)
Running
ferx examples/warfarin.ferx --data data/warfarin.csv
Expected Results
--- THETA Estimates ---
TVCL 0.132735
TVV 7.694842
TVKA 0.757498
--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584 (CV% = 16.9)
OMEGA(2,2) = 0.009613 (CV% = 9.8)
OMEGA(3,3) = 0.340868 (CV% = 58.4)
--- SIGMA Estimates ---
SIGMA(1) = 0.010638
SAEM Variant
The same model can be run with SAEM by changing the [fit_options]:
[fit_options]
method = saem
n_exploration = 150
n_convergence = 250
n_mh_steps = 3
seed = 12345
covariance = true
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
Two-Compartment IV Bolus
This example fits a two-compartment IV bolus model with four random effects.
Model File (two_cpt_iv.ferx)
# Two-compartment IV bolus PK model
[parameters]
theta TVCL(5.0, 0.01, 100.0)
theta TVV1(50.0, 0.1, 1000.0)
theta TVQ(10.0, 0.01, 200.0)
theta TVV2(100.0, 0.1, 5000.0)
omega ETA_CL ~ 0.09
omega ETA_V1 ~ 0.04
omega ETA_Q ~ 0.04
omega ETA_V2 ~ 0.09
sigma PROP_ERR ~ 0.04
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V1 = TVV1 * exp(ETA_V1)
Q = TVQ * exp(ETA_Q)
V2 = TVV2 * exp(ETA_V2)
[structural_model]
pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 500
covariance = true
Model Description
- Structure: Two-compartment model with central (V1) and peripheral (V2) compartments connected by intercompartmental clearance (Q)
- Route: Intravenous bolus (dose goes directly into central compartment)
- Random effects: Log-normal on all four PK parameters
- Parameters:
- CL: Systemic clearance (L/h)
- V1: Central volume of distribution (L)
- Q: Intercompartmental clearance (L/h)
- V2: Peripheral volume of distribution (L)
The bi-exponential concentration profile is characterized by a rapid distribution phase (alpha) followed by a slower elimination phase (beta).
Running
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
Notes
- Two-compartment models have more parameters and may need more iterations to converge
- The global_search = true option can help if convergence is difficult
- Consider reducing random effects (e.g., fixing Q or V2 variability) if the model is overparameterized
Covariate Model
This example demonstrates a two-compartment oral model with body weight (WT) and creatinine clearance (CRCL) as covariates on clearance.
Model File (two_cpt_oral_cov.ferx)
# Two-compartment oral PK model with covariates
[parameters]
theta TVCL(5.0, 0.01, 100.0)
theta TVV1(50.0, 0.1, 1000.0)
theta TVQ(10.0, 0.01, 200.0)
theta TVV2(100.0, 0.1, 5000.0)
theta TVKA(1.0, 0.01, 50.0)
theta THETA_WT(0.75, 0.01, 2.0)
theta THETA_CRCL(0.5, 0.01, 2.0)
omega ETA_CL ~ 0.09
omega ETA_V1 ~ 0.04
omega ETA_Q ~ 0.04
omega ETA_V2 ~ 0.09
omega ETA_KA ~ 0.25
sigma PROP_ERR ~ 0.04
[individual_parameters]
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
V1 = TVV1 * exp(ETA_V1)
Q = TVQ * exp(ETA_Q)
V2 = TVV2 * exp(ETA_V2)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 500
covariance = true
Covariate Effects
The clearance equation includes two covariate effects:
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
- (WT/70)^THETA_WT: Allometric scaling of clearance with body weight, centered at 70 kg. THETA_WT is estimated (expected ~0.75 for CL).
- (CRCL/100)^THETA_CRCL: Renal function effect on clearance, centered at 100 mL/min. THETA_CRCL is estimated.
Data Requirements
The dataset must include WT and CRCL columns:
ID,TIME,DV,EVID,AMT,CMT,MDV,WT,CRCL
1,0,.,1,100,1,1,72.5,105
1,0.5,12.3,0,.,.,0,72.5,105
1,1.0,18.7,0,.,.,0,72.5,105
Covariate columns are automatically detected -- any column not in the standard NONMEM set is treated as a covariate. The names in the data file must match those used in [individual_parameters] (case-insensitive).
Running
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv
Notes
- FOCEI is used here because the proportional error model creates an interaction between random effects and residual error
- Covariate centering (dividing by 70 for weight, 100 for CRCL) improves numerical stability and makes the typical value (TVCL) interpretable as the clearance for a 70 kg patient with CRCL of 100 mL/min
- The estimated covariate exponents (THETA_WT, THETA_CRCL) have standard errors that can be used to test significance
ODE Model (Michaelis-Menten)
This example demonstrates a one-compartment oral model with saturable (Michaelis-Menten) elimination, solved using the ODE integrator.
Model File (mm_oral.ferx)
# One-compartment oral model with Michaelis-Menten elimination
[parameters]
theta TVVMAX(10.0, 0.1, 1000.0)
theta TVKM(2.0, 0.01, 100.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_VMAX ~ 0.09
omega ETA_V ~ 0.04
sigma PROP_ERR ~ 0.1
[individual_parameters]
VMAX = TVVMAX * exp(ETA_VMAX)
KM = TVKM
V = TVV * exp(ETA_V)
KA = TVKA
[structural_model]
ode(obs_cmt=central, states=[depot, central])
[odes]
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 500
covariance = true
Model Description
Why Use ODEs
Michaelis-Menten (saturable) elimination cannot be described by a simple exponential function. At high concentrations, elimination approaches a maximum rate (VMAX), leading to nonlinear pharmacokinetics. This requires numerical ODE integration.
ODE System
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
- depot: Absorption compartment. First-order absorption at rate KA.
- central: Central compartment. Drug enters from depot and is eliminated by Michaelis-Menten kinetics.
Parameters
- VMAX: Maximum elimination rate (amount/time)
- KM: Michaelis-Menten constant (concentration at half-maximal rate)
- V: Volume of distribution
- KA: First-order absorption rate constant
Note that only VMAX and V have random effects. KM and KA are treated as population parameters (no between-subject variability).
Structural Model Declaration
[structural_model]
ode(obs_cmt=central, states=[depot, central])
- obs_cmt=central: The central compartment is observed (matched to DV in the data)
- states=[depot, central]: Two state variables (compartments)
Running
ferx examples/mm_oral.ferx --data data/mm_oral.csv
Notes
- ODE models are slower than analytical models because each prediction requires numerical integration
- The ODE solver uses adaptive step-size control (Dormand-Prince RK45) for accuracy
- For standard first-order kinetics, use the analytical one_cpt_oral model instead -- it is much faster
- The central compartment in this model contains the drug amount. If you want concentration as the observed variable, divide by V in the ODE equations (as done here with KA * depot / V)
BLOQ (M3 method)
This example shows how to fit a model to data that contains observations below
the assay lower limit of quantification (BLOQ) using Beal's M3 method.
Instead of dropping BLOQ rows — which biases terminal-phase parameter estimates
— each censored observation contributes
P(y < LLOQ | θ, η) = Φ((LLOQ − f)/√V) to the likelihood.
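The censored contribution is a single normal-CDF evaluation; the sketch below uses the Zelen and Severo approximation to Φ so it stays dependency-free, which is an illustration choice rather than the engine's routine.

```rust
// Standard normal CDF via the Zelen & Severo rational approximation
// (Abramowitz & Stegun 26.2.17; absolute error below ~1e-7).
fn phi(x: f64) -> f64 {
    let t = 1.0 / (1.0 + 0.2316419 * x.abs());
    let pdf = (-x * x / 2.0).exp() / (2.0 * std::f64::consts::PI).sqrt();
    let poly = t
        * (0.319381530
            + t * (-0.356563782
                + t * (1.781477937 + t * (-1.821255978 + t * 1.330274429))));
    let upper = 1.0 - pdf * poly;
    if x >= 0.0 { upper } else { 1.0 - upper }
}

// -2 log-likelihood contribution of one censored row:
// the row enters as P(y < LLOQ) = Phi((LLOQ - f) / sqrt(V)).
fn m3_censored_contribution(lloq: f64, f_pred: f64, resid_var: f64) -> f64 {
    -2.0 * phi((lloq - f_pred) / resid_var.sqrt()).ln()
}
```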
Dataset (data/warfarin-bloq.csv)
This is the warfarin dataset with an added CENS column. Ten late-time
observations that originally fell below an assay LLOQ of 2.0 µg/mL have been
marked CENS=1 and their DV cell set to the LLOQ value:
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,CENS
1,0,.,1,100,1,0,1,0
1,0.5,5.3653,0,.,1,0,0,0
...
1,96,2.5019,0,.,1,0,0,0
1,120,2,0,.,1,0,0,1
NONMEM CENS convention: when CENS=1, the row is censored and DV
carries the LLOQ value — not the true (unobserved) concentration. Rows with no
CENS column, or CENS=0, are treated as ordinary quantified observations.
Model File (examples/warfarin-bloq.ferx)
# One-compartment oral PK model (warfarin) with M3 BLOQ likelihood.
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 300
covariance = true
bloq_method = m3
The only two lines that differ from a plain fit are bloq_method = m3 in
[fit_options] and the added CENS column in the CSV. Nothing else in the
model changes.
Running
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv
Expected Results
--- Objective Function ---
OFV: -213.9104
--- THETA Estimates ---
TVCL 0.126263
TVV 7.629401
TVKA 1.083302
--- OMEGA Estimates ---
OMEGA(1,1) = 0.031381 (CV% = 17.7)
OMEGA(2,2) = 0.009723 (CV% = 9.9)
OMEGA(3,3) = 0.419327 (CV% = 64.8)
--- SIGMA Estimates ---
SIGMA(1) = 0.010766
The diagnostic table (warfarin-bloq-sdtab.csv) gains a CENS column, and the
IWRES / CWRES cells for censored rows are written as empty (a weighted
Gaussian residual is undefined when the observed value is censored).
Method Notes
- Activation requires both pieces: the CENS column in the data file and bloq_method = m3 in [fit_options]. Without the option, CENS=1 rows are treated as ordinary observations at the LLOQ value, which biases the fit.
- FOCE is auto-promoted to FOCEI on affected subjects. Mixing linearized Gaussian residuals with non-linearized log Φ terms produces inconsistent OFVs near the LLOQ boundary, so when method = foce and a subject has any CENS=1 row, that subject is evaluated with η-interaction. A notice is written to FitResult.warnings; set method = focei explicitly to silence it.
- Gauss-Newton caveat. With method = gn or gn_hybrid, the BHHH information-matrix approximation degrades as the BLOQ fraction grows (each censored row carries less Fisher information than its Gaussian counterpart). A warning is emitted; for >20% censoring prefer focei.
- SAEM optimizes θ/σ with M3 in the M-step directly, so no special handling is required beyond setting bloq_method = m3.
Inter-Occasion Variability (IOV)
This example extends the one-compartment oral warfarin model to account for occasion-to-occasion variability in clearance. The complete model file is examples/warfarin_iov.ferx.
When to use IOV
Use IOV when:
- Subjects have data from multiple study periods (crossover designs, multiple visits)
- You expect within-subject PK variability across periods that is larger than pure residual error
- NONMEM models used OMEGA BLOCK SAME syntax for one or more parameters
Dataset
The dataset must include an occasion-index column. Add an integer OCC column (or any name you choose — tell FeRx via iov_column):
ID,TIME,DV,EVID,AMT,CMT,MDV,OCC
1,0,.,1,100,1,1,1
1,1,9.49,0,.,.,0,1
1,2,14.42,0,.,.,0,1
1,3,17.56,0,.,.,0,1
1,24,.,1,100,1,1,2
1,25,10.1,0,.,.,0,2
1,26,8.2,0,.,.,0,2
Occasion 1 = first period, occasion 2 = second period. Dose records and observation records in the same period share the same OCC value.
Model File (Option A — diagonal IOV)
This is the contents of examples/warfarin_iov.ferx:
model warfarin_iov
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09 # between-subject variability in CL
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
kappa KAPPA_CL ~ 0.01 # inter-occasion variability in CL
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL + KAPPA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
iov_column = OCC
covariance = false
The key additions compared to a standard model:
- kappa KAPPA_CL ~ 0.01 — declares IOV on CL with starting variance 0.01
- + KAPPA_CL in the CL expression — kappa enters just like a BSV eta
- iov_column = OCC — tells FeRx which dataset column carries occasion labels
Set covariance = true if you want standard errors on the IOV variance (the shipped example leaves it off to keep the demo fast).
Model File (Option B — correlated IOV)
If you have IOV on both CL and V and suspect they covary across occasions, use block_kappa:
[parameters]
...
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
block_kappa (KAPPA_CL, KAPPA_V) = [0.05, 0.01, 0.03]
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL + KAPPA_CL)
V = TVV * exp(ETA_V + KAPPA_V)
KA = TVKA * exp(ETA_KA)
The three values in [0.05, 0.01, 0.03] are the lower triangle of Ω_IOV:
- 0.05 = Var(KAPPA_CL)
- 0.01 = Cov(KAPPA_CL, KAPPA_V)
- 0.03 = Var(KAPPA_V)
Running the Fit
The repository does not ship a warfarin dataset with occasion labels — point --data at your own NONMEM-format CSV that includes an OCC column (see Dataset above):
ferx examples/warfarin_iov.ferx --data path/to/your_occ_data.csv
Or via the Rust API:
#![allow(unused)] fn main() { let result = fit_from_files("examples/warfarin_iov.ferx", "path/to/your_occ_data.csv")?; println!("Omega IOV: {:?}", result.omega_iov); }
Interpreting Output
The console summary gains a KAPPA (IOV) Estimates block, and the fit YAML gains an omega_iov: section.
Console output (Option A, single kappa on CL):
--- KAPPA (IOV) Estimates ---
KAPPA_CL = 0.041200 (CV% = 20.3) SE = 0.008000
For Option B with off-diagonal correlations a --- Correlations --- block follows the diagonal entries.
Per-subject, per-occasion kappa EBEs are returned on FitResult.ebe_kappas. They are not yet emitted as columns in the sdtab CSV — access them via the Rust API if you need per-occasion plots:
#![allow(unused)] fn main() { for (i, subject_kappas) in result.ebe_kappas.iter().enumerate() { for (k, kappa_vec) in subject_kappas.iter().enumerate() { println!("subject {} occasion {}: {:?}", i + 1, k + 1, kappa_vec); } } }
Note: per-kappa shrinkage is not yet computed (shrinkage_kappa is always returned empty). If you need a shrinkage diagnostic for IOV, derive it manually from ebe_kappas and omega_iov.
Tips
- Start with IOV on CL only — it is the most commonly occasion-sensitive parameter. Add IOV on other parameters only when the OFV improvement justifies it.
- Compare OFV between BSV-only and IOV models — the difference is a likelihood-ratio test for the added IOV variance(s). Note that for a variance parameter at the boundary (H₀: σ² = 0) the asymptotic null is a 50:50 mixture of a point mass at 0 and χ²₁, not a plain χ²₁; at the 5% level the critical OFV drop for one added variance is about 2.71 rather than 3.84.
- Use FOCEI for proportional errors — the interaction term matters more when individual predictions vary across occasions.
- SAEM is not supported with IOV — use foce or focei.
Rust API
FeRx-NLME can be used as a Rust library for embedding population PK estimation in your own applications.
Adding as a Dependency
[dependencies]
ferx-nlme = { path = "../ferx-nlme", features = ["autodiff"] }
Quick Example
use ferx_nlme::*; use std::path::Path; fn main() -> Result<(), String> { // Parse model and data let parsed = parse_full_model_file(Path::new("model.ferx"))?; let population = read_nonmem_csv(Path::new("data.csv"), None)?; // Build initial parameters and options let (init_params, options) = build_fit_inputs(&parsed)?; // Run estimation let result = fit(&parsed.model, &population, &init_params, &options)?; // Access results println!("OFV: {}", result.ofv); for (name, val) in result.theta_names.iter().zip(result.theta.iter()) { println!(" {} = {:.6}", name, val); } Ok(()) }
API Sections
- Core Types -- CompiledModel, Population, FitResult, FitOptions, etc.
- Fitting Functions -- fit(), fit_from_files(), run_model_with_data()
- Simulation -- simulate(), simulate_with_seed(), predict()
- Parsing -- parse_model_file(), parse_full_model_file(), read_nonmem_csv()
Core Types
The main types exported by ferx_nlme. All types are available via use ferx_nlme::*.
CompiledModel
A parsed and compiled model ready for estimation. Created by parse_model_file() or parse_full_model_file().
#![allow(unused)] fn main() { pub struct CompiledModel { pub name: String, pub pk_model: PkModel, pub error_model: ErrorModel, pub pk_param_fn: PkParamFn, // (theta, eta, covariates) -> PkParams pub n_theta: usize, pub n_eta: usize, pub n_epsilon: usize, pub theta_names: Vec<String>, pub eta_names: Vec<String>, pub default_params: ModelParameters, pub tv_fn: Option<...>, // Typical values function (for AD) pub pk_indices: Vec<usize>, // Maps eta index -> PK parameter index pub ode_spec: Option<OdeSpec>, // ODE specification (if ODE model) } }
Key fields:
- pk_param_fn: Closure that maps (theta, eta, covariates) to PK parameters. Generated by the parser from [individual_parameters].
- default_params: Initial parameter values from the [parameters] block.
- ode_spec: Present only for ODE models; None for analytical models.
Population and Subject
#![allow(unused)] fn main() { pub struct Population { pub subjects: Vec<Subject>, pub covariate_names: Vec<String>, pub dv_column: String, } pub struct Subject { pub id: String, pub doses: Vec<DoseEvent>, pub obs_times: Vec<f64>, pub observations: Vec<f64>, pub obs_cmts: Vec<usize>, pub covariates: HashMap<String, f64>, // Subject-static fallback (first values) pub dose_covariates: Vec<HashMap<String, f64>>, // Per-dose LOCF snapshot (empty if no TV cov) pub obs_covariates: Vec<HashMap<String, f64>>, // Per-obs LOCF snapshot (empty if no TV cov) pub cens: Vec<u8>, pub occasions: Vec<u32>, pub dose_occasions: Vec<u32>, } }
dose_covariates and obs_covariates are populated only when at least one
covariate varies within the subject — the no-TV-cov fast paths use
covariates directly. Helpers subject.has_tv_covariates(),
subject.dose_cov(k), and subject.obs_cov(j) abstract over this.
ModelParameters
#![allow(unused)] fn main() { pub struct ModelParameters { pub theta: Vec<f64>, pub theta_names: Vec<String>, pub theta_lower: Vec<f64>, pub theta_upper: Vec<f64>, pub omega: OmegaMatrix, pub sigma: SigmaVector, } }
OmegaMatrix
Between-subject variability matrix with pre-computed Cholesky factor.
#![allow(unused)] fn main() { pub struct OmegaMatrix { pub matrix: DMatrix<f64>, // Full omega matrix pub chol: DMatrix<f64>, // Lower Cholesky factor L (Omega = L*L') pub eta_names: Vec<String>, pub diagonal: bool, // True if omega is diagonal } }
Constructors:
- OmegaMatrix::from_matrix(m, names, diagonal) -- From a full matrix (computes Cholesky, regularizes if needed)
- OmegaMatrix::from_diagonal(variances, names) -- From diagonal variances
FitOptions
#![allow(unused)] fn main() { pub struct FitOptions { pub method: EstimationMethod, // Foce, FoceI, or Saem pub outer_maxiter: usize, // Default: 500 pub outer_gtol: f64, // Default: 1e-6 pub inner_maxiter: usize, // Default: 200 pub inner_tol: f64, // Default: 1e-8 pub run_covariance_step: bool, // Default: true pub interaction: bool, // FOCEI flag pub verbose: bool, // Default: true pub optimizer: Optimizer, // Default: Slsqp pub lbfgs_memory: usize, // Default: 5 pub global_search: bool, // Default: false pub global_maxeval: usize, // Default: 0 (auto) // SAEM-specific pub saem_n_exploration: usize, // Default: 150 pub saem_n_convergence: usize, // Default: 250 pub saem_n_mh_steps: usize, // Default: 3 pub saem_adapt_interval: usize, // Default: 50 pub saem_seed: Option<u64>, // Default: None } }
Use FitOptions::default() for standard FOCE settings.
FitResult
#![allow(unused)] fn main() { pub struct FitResult { pub method: EstimationMethod, pub converged: bool, pub ofv: f64, pub aic: f64, pub bic: f64, pub theta: Vec<f64>, pub theta_names: Vec<String>, pub omega: DMatrix<f64>, pub sigma: Vec<f64>, pub covariance_matrix: Option<DMatrix<f64>>, pub se_theta: Option<Vec<f64>>, pub se_omega: Option<Vec<f64>>, pub se_sigma: Option<Vec<f64>>, pub subjects: Vec<SubjectResult>, pub n_obs: usize, pub n_subjects: usize, pub n_parameters: usize, pub n_iterations: usize, pub interaction: bool, pub warnings: Vec<String>, } }
SubjectResult
Per-subject diagnostics from the fit.
#![allow(unused)] fn main() { pub struct SubjectResult { pub id: String, pub eta: DVector<f64>, pub ipred: Vec<f64>, // Individual predictions pub pred: Vec<f64>, // Population predictions pub iwres: Vec<f64>, // Individual weighted residuals pub cwres: Vec<f64>, // Conditional weighted residuals pub ofv_contribution: f64, } }
Enums
#![allow(unused)] fn main() { pub enum EstimationMethod { Foce, FoceI, Saem } pub enum PkModel { OneCptIvBolus, OneCptOral, OneCptInfusion, TwoCptIvBolus, TwoCptOral, TwoCptInfusion } pub enum ErrorModel { Additive, Proportional, Combined } pub enum Optimizer { Bfgs, Lbfgs, Slsqp, NloptLbfgs, Mma } }
DoseEvent
#![allow(unused)] fn main() { pub struct DoseEvent { pub time: f64, pub amt: f64, pub cmt: usize, pub rate: f64, pub duration: f64, pub ss: bool, pub ii: f64, } }
Fitting Functions
fit()
The primary estimation entry point. Runs FOCE, FOCEI, or SAEM depending on options.method.
#![allow(unused)] fn main() { pub fn fit( model: &CompiledModel, population: &Population, init_params: &ModelParameters, options: &FitOptions, ) -> Result<FitResult, String> }
Parameters:
- model: Compiled model from parse_model_file() or parse_full_model_file()
- population: Population data from read_nonmem_csv()
- init_params: Initial parameter values
- options: Estimation configuration
Returns: FitResult with parameter estimates, standard errors, and per-subject diagnostics.
Example:
#![allow(unused)] fn main() { let model = parse_model_file(Path::new("model.ferx"))?; let population = read_nonmem_csv(Path::new("data.csv"), None)?; let options = FitOptions::default(); let result = fit(&model, &population, &model.default_params, &options)?; println!("OFV: {:.4}", result.ofv); }
fit_from_files()
Convenience wrapper that handles parsing and data reading.
#![allow(unused)] fn main() { pub fn fit_from_files( model_path: &str, data_path: &str, covariate_columns: Option<&[&str]>, options: Option<FitOptions>, ) -> Result<FitResult, String> }
Example:
#![allow(unused)] fn main() { let result = fit_from_files( "model.ferx", "data.csv", None, // Auto-detect covariates None, // Default options )?; }
run_model_with_data()
Full pipeline: parse model file, read data, fit. Returns both the fit result and the population.
pub fn run_model_with_data(
    model_path: &str,
    data_path: &str,
) -> Result<(FitResult, Population), String>
Uses the [fit_options] from the model file.
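Example (a minimal sketch based on the signature above; the returned Population can be reused, for instance in a later predict() call):

let (result, population) = run_model_with_data("model.ferx", "data.csv")?;
println!("Fitted {} subjects, OFV {:.4}", result.n_subjects, result.ofv);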
run_model_simulate()
Simulation-estimation: parse model, generate data from [simulation] block, fit.
pub fn run_model_simulate(
    model_path: &str,
) -> Result<(FitResult, Population), String>
Requires a [simulation] block in the model file.
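Example (a minimal sketch; sim_model.ferx is a hypothetical file name for a model that contains a [simulation] block):

let (result, simulated_pop) = run_model_simulate("sim_model.ferx")?;
println!("Re-estimated OFV {:.4} from {} simulated subjects",
         result.ofv, simulated_pop.subjects.len());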
build_fit_inputs()
Extracts the initial parameters and fit options from a parsed model, so that parsing and estimation can be run (and timed) as separate steps.
pub fn build_fit_inputs(
    parsed: &ParsedModel,
) -> Result<(ModelParameters, FitOptions), String>
Example:
let parsed = parse_full_model_file(Path::new("model.ferx"))?;
let (init_params, options) = build_fit_inputs(&parsed)?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;

let result = fit(&parsed.model, &population, &init_params, &options)?;
Simulation
simulate()
Generate simulated observations from a model with random effects and residual error.
pub fn simulate(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
) -> Vec<SimulationResult>
Parameters:
- model: Compiled model
- population: Template population (dose events and observation times are used; DV values are ignored)
- params: True parameter values for simulation
- n_sim: Number of simulation replicates
Returns: Vector of SimulationResult, one per observation per subject per replicate.
Example:
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;

// Simulate 1000 replicates
let sims = simulate(&model, &population, &model.default_params, 1000);

for sim in &sims[..5] {
    println!("Sim {}, ID {}, TIME {}, IPRED {:.3}, DV {:.3}",
             sim.sim, sim.id, sim.time, sim.ipred, sim.dv_sim);
}
simulate_with_seed()
Same as simulate() but with a fixed random seed for reproducibility.
pub fn simulate_with_seed(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
    seed: u64,
) -> Vec<SimulationResult>
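Example (a minimal sketch; with a fixed seed, repeated calls return identical simulated observations):

let sims_a = simulate_with_seed(&model, &population, &model.default_params, 200, 12345);
let sims_b = simulate_with_seed(&model, &population, &model.default_params, 200, 12345);
assert_eq!(sims_a[0].dv_sim, sims_b[0].dv_sim);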
predict()
Population predictions without random effects (eta = 0). No simulation noise is added.
pub fn predict(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
) -> Vec<PredictionResult>
Returns: Vector of PredictionResult with population-level predictions.
Example:
let preds = predict(&model, &population, &model.default_params);

for p in &preds {
    println!("ID {}, TIME {}, PRED {:.3}", p.id, p.time, p.pred);
}
Result Types
pub struct SimulationResult {
    pub sim: usize,      // Replicate number (1-indexed)
    pub id: String,      // Subject ID
    pub time: f64,       // Observation time
    pub ipred: f64,      // Individual prediction (no residual error)
    pub dv_sim: f64,     // Simulated observation (with residual error)
}

pub struct PredictionResult {
    pub id: String,
    pub time: f64,
    pub pred: f64,       // Population prediction (eta = 0)
}
Simulation Process
For each replicate and each subject:
1. Sample random effects: \( \eta_i \sim N(0, \Omega) \) using the Cholesky factor \( L \) of \( \Omega \): \( \eta_i = L \cdot z \), where \( z \sim N(0, I) \)
2. Compute individual PK parameters via pk_param_fn(theta, eta, covariates)
3. Generate predictions using the structural model
4. Add residual error: \( DV = IPRED + \sqrt{V} \cdot \epsilon \), where \( \epsilon \sim N(0, 1) \) and \( V \) is the residual variance from the error model (sketched in Rust below)
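The sketch below mirrors these four steps for a single subject and a single replicate. It is illustrative only: it uses the nalgebra, rand, and rand_distr crates directly, assumes an additive error model, and takes the individual predictions (ipred) as given rather than calling the model's pk_param_fn and structural model.

use nalgebra::{DMatrix, DVector};
use rand::Rng;
use rand_distr::StandardNormal;

fn simulate_one(omega: &DMatrix<f64>, ipred: &[f64], add_err_sd: f64) -> (DVector<f64>, Vec<f64>) {
    let mut rng = rand::thread_rng();

    // Step 1: eta ~ N(0, Omega) via the Cholesky factor, eta = L * z with z ~ N(0, I)
    let l = omega.clone().cholesky().expect("omega must be positive-definite").l();
    let z = DVector::from_fn(omega.nrows(), |_, _| rng.sample::<f64, _>(StandardNormal));
    let eta = &l * z;

    // Steps 2-3: in the real engine, individual parameters and IPRED come from
    // pk_param_fn(theta, eta, covariates) and the structural model; here `ipred`
    // is supplied by the caller.

    // Step 4: DV = IPRED + sqrt(V) * eps, with V = add_err_sd^2 for an additive error model
    let dv_sim: Vec<f64> = ipred
        .iter()
        .map(|&p| p + add_err_sd * rng.sample::<f64, _>(StandardNormal))
        .collect();

    (eta, dv_sim)
}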
Parsing
Model Parsing
parse_model_file()
Parse a .ferx file into a CompiledModel. Only the core model blocks are processed ([parameters], [individual_parameters], [structural_model], [error_model]).
pub fn parse_model_file(path: &Path) -> Result<CompiledModel, String>
parse_model_string()
Parse a model from a string instead of a file.
pub fn parse_model_string(content: &str) -> Result<CompiledModel, String>
Example:
let model_str = r#"
[parameters]
theta TVCL(0.1, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
omega ETA_CL ~ 0.09
sigma ADD_ERR ~ 1.0

[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV

[structural_model]
pk one_cpt_iv_bolus(cl=CL, v=V)

[error_model]
DV ~ additive(ADD_ERR)
"#;

let model = parse_model_string(model_str)?;
parse_full_model_file()
Parse a complete model file including [fit_options] and [simulation] blocks.
pub fn parse_full_model_file(path: &Path) -> Result<ParsedModel, String>
Returns a ParsedModel which contains:
pub struct ParsedModel {
    pub model: CompiledModel,
    pub simulation: Option<SimulationSpec>,
    pub fit_options: FitOptions,
}
Data Parsing
read_nonmem_csv()
Read a NONMEM-format CSV file into a Population.
pub fn read_nonmem_csv(
    path: &Path,
    covariate_columns: Option<&[&str]>,
) -> Result<Population, String>
Parameters:
- path: Path to the CSV file
- covariate_columns: Optional list of covariate column names. If None, all non-standard columns are auto-detected as covariates.
Example:
// Auto-detect covariates
let pop = read_nonmem_csv(Path::new("data.csv"), None)?;

// Explicit covariate list
let pop = read_nonmem_csv(
    Path::new("data.csv"),
    Some(&["WT", "CRCL", "AGE"]),
)?;

println!("{} subjects, {} observations", pop.subjects.len(), pop.n_obs());
Data Processing Details
- Column names are matched case-insensitively
- Standard NONMEM columns (ID, TIME, DV, EVID, AMT, CMT, RATE, MDV, II, SS) are recognized automatically (a minimal example dataset is shown below)
- Missing values (., empty string) are handled as missing data
- Rows with EVID=1 are treated as dose events
- Rows with EVID=0 and MDV=0 are treated as observations
- Time-constant covariates use the first non-missing value per subject
- Time-varying covariates use Last Observation Carried Forward (LOCF) per event: [individual_parameters] is re-evaluated at each dose and observation row using that row's covariate values (NONMEM-equivalent semantics). Currently supported on 1- and 2-compartment IV bolus / infusion models and all ODE-defined models; oral and 3-compartment models fall back to a single first-row snapshot
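To make these rules concrete, here is a minimal hypothetical dataset with one dose row and two observation rows plus a WT covariate column, written to a temporary file and read back. The file name and values are illustrative only.

use std::path::Path;

let csv = "\
ID,TIME,AMT,DV,EVID,MDV,WT
1,0,100,.,1,1,70
1,1,.,4.02,0,0,70
1,8,.,1.15,0,0,70
";
std::fs::write("example.csv", csv).expect("could not write example file");

// WT is not a standard NONMEM column, so it is auto-detected as a covariate.
let pop = read_nonmem_csv(Path::new("example.csv"), None)?;
println!("{} subjects, {} observations", pop.subjects.len(), pop.n_obs());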
Frequently Asked Questions
Do I need to use MU-referencing in my model definitions, like in NONMEM / nlmixr2?
No. You do not need to write an explicit MU_i intermediate variable — ferx detects the same structure automatically from the right-hand side of each [individual_parameters] line.
In NONMEM and nlmixr2, MU-referencing is a convention in which each random effect ETA(i) is linearly associated with a single MU_i term (typically MU_i = LOG(THETA(i))), so individual parameters look like:
MU_1 = LOG(THETA(1))
CL = EXP(MU_1 + ETA(1))
This structure is required by NONMEM's SAEM implementation, whose conjugate-Gibbs E-step is only valid when the MU_i → ETA(i) relationship is strictly linear (typically on a log scale). Deviating from it — for example writing CL = THETA(1) * EXP(ETA(1)) without going through an intermediate MU_1 — causes NONMEM SAEM to reject the model or silently produce biased estimates.
How ferx handles it
In ferx, you simply write the individual-parameter line in the form that makes sense for your model. The parser inspects each line and records which THETA acts as the "anchor" for each ETA, then uses that information to re-centre the estimation search at every outer iteration. The following patterns are detected automatically:
CL = TVCL * exp(ETA_CL) # multiplicative exp — most common
CL = exp(log(TVCL) + ETA_CL) # canonical MU form
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL) # with covariate adjustment
CL = TVCL + ETA_CL # additive eta (linear shift)
The parser records ETA_CL → (TVCL, log_transformed) and at each outer step computes the shift
mu[i] = log(theta[i]) # for multiplicative/exp patterns
mu[i] = theta[i] # for additive patterns
The inner optimizer (and the SAEM exploration-phase MH proposal) is then centred on this mu shift rather than on eta = 0. This matches the convergence behaviour that NONMEM / nlmixr2 get from an explicit MU_i = LOG(THETA(i)) block, without requiring you to write it.
Patterns that ferx does not auto-detect (and for which it therefore falls back to the plain eta = 0 start) include:

- Parameters with two or more free thetas in the product (CL = TVCL * TVSCALE * exp(ETA_CL)): the anchor is ambiguous.
- Compound eta expressions inside the exp() (CL = TVCL * exp(ETA_CL + ETA_OCC)).
- Non-standard transforms the parser doesn't recognise.
A missing mu-reference is not an error; it just means that the eta for that parameter is initialised to zero at each outer step, the same as the pre-automatic behaviour. When a fit runs, ferx records the list of auto-detected mu-referenced ETAs in FitResult.warnings (as a line starting with mu-ref: …) so you can confirm what the parser picked up.
Why it matters
Mu-referencing mostly affects convergence speed and robustness, not the final MLE:
- FOCE / FOCEI: each outer step re-optimises ETA by BFGS. With a mu-shift, the warm-started BFGS sees a much better starting point when THETA moves away from the initial estimate, so fewer inner iterations are needed and pathological steps are less likely.
- SAEM: during the exploration phase, Metropolis–Hastings proposals are centred on mu_k instead of on the current chain state. This helps the chain escape the (incorrect) eta = 0 basin when TVCL is still far from the true value. During the convergence phase the proposal reverts to a symmetric random walk so that detailed balance holds.
Because the estimator is MLE in both cases, models that converge without mu-referencing will converge to the same estimates with it — just (usually) in fewer iterations.
Disabling it
If you want to benchmark against the pre-2026 behaviour, set:
[fit_options]
mu_referencing = false
or, from the Rust API, FitOptions { mu_referencing: false, .. }. The default is true.
Porting from NONMEM
If you have a NONMEM model that uses an explicit MU_1 = LOG(THETA(1)) line, just drop the MU_i intermediate and write the individual parameter directly — ferx will detect the equivalent mu-reference automatically and the fit will be equivalent.
Can I use if / else statements like in NONMEM or nlmixr2?
Yes. ferx supports two forms of conditional logic in both
[individual_parameters] and [odes] blocks:
# Block form
if (WT > 70) {
CL = TVCL * (WT / 70)^0.75 * exp(ETA_CL)
} else if (SEX == 1) {
CL = TVCL * 1.2 * exp(ETA_CL)
} else {
CL = TVCL * exp(ETA_CL)
}
# Inline (ternary) form
CL = if (SEX == 1) TVCL * 1.5 else TVCL
Conditions support comparisons (<, <=, >, >=, ==, !=) and
logical operators (&&, ||, !). Either form may be combined with
arbitrary arithmetic, including covariate references and other ETAs.
Compared to NONMEM: NONMEM uses Fortran-style IF (cond) THEN ... ELSE ... ENDIF and Fortran comparison aliases (.GT., .LE., etc.). ferx uses
C-style braces and operator symbols. The semantics are otherwise the same —
exactly one branch fires per evaluation, the rest are ignored.
Compared to nlmixr2: nlmixr2's if (cond) { ... } else { ... } syntax
is a near-verbatim match for ferx's block form. Drop the <- assignments
in favour of = and the model translates directly.
Caveat: when a parameter is assigned inside an if block, ferx skips mu-reference detection for that parameter (the ETA → THETA link is no longer unconditional). See Individual Parameters for the workaround if you need both conditional logic and mu-referencing on the same parameter.
A note on == and !=: these operators do an exact bitwise float
comparison. They work as you'd expect for integer-coded covariates
(SEX == 1, STUDY != 3) but are unreliable on continuous values, where
floating-point round-off can flip the result. For continuous thresholds
prefer a bracketed comparison (WT >= 70 && WT < 80) over WT == 75.
Which outer optimizer should I pick?
slsqp (the default) is the right choice for most models — it is fast, handles
box constraints cleanly, and behaves well on the log-transformed parameter
scale that ferx uses internally.
Reach for a different optimizer when SLSQP misbehaves:
- bobyqa: derivative-free; good when FOCE's finite-difference gradients are noisy and SLSQP stalls or oscillates. Slower per iteration on smooth problems, but often converges when gradient-based methods give up.
- trust_region: second-order Newton trust-region. Can be faster near convergence because it uses curvature information; tune the CG budget with steihaug_max_iters (default 50) if you have more than ~50 packed parameters.
- lbfgs / bfgs: fall back to these only when NLopt is unavailable.
See Fit Options for the full list.
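The outer optimizer can also be set from the Rust API through the optimizer field of FitOptions. A minimal sketch using a variant from the Optimizer enum shown in the Types section (how the bobyqa and trust_region choices above map onto enum variants is not covered here):

let options = FitOptions {
    optimizer: Optimizer::NloptLbfgs,
    ..FitOptions::default()
};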