Introduction
FeRx-NLME is a high-performance Nonlinear Mixed Effects (NLME) modeling engine written in Rust. It is designed for population pharmacokinetic (PopPK) analysis, implementing the same statistical methodology as established tools like NONMEM, Monolix, and Pumas.
Key Features
- FOCE/FOCEI estimation -- First-Order Conditional Estimation with optional interaction, the gold standard for PopPK
- SAEM estimation -- Stochastic Approximation EM for robust convergence on complex models
- Analytical PK solutions -- Built-in one- and two-compartment models (IV bolus, oral, infusion) with numerical stability guarantees
- ODE solver -- Dormand-Prince RK45 adaptive integrator for custom kinetic models (e.g. Michaelis-Menten)
- Automatic differentiation -- Enzyme-based reverse-mode AD for fast, exact gradients
- Simple model DSL -- Declarative `.ferx` model files that read like equations
- NONMEM-compatible data -- Reads standard NONMEM CSV datasets directly
- Covariate support -- Time-constant and time-varying covariates with automatic detection
- Parallel estimation -- Per-subject computations parallelized via Rayon
How It Compares
| Feature | FeRx | NONMEM | Monolix | Pumas |
|---|---|---|---|---|
| Language | Rust | Fortran | C++ | Julia |
| FOCE/FOCEI | Yes | Yes | No | Yes |
| SAEM | Yes | Yes (NONMEM 7+) | Yes | Yes |
| Analytical PK | Yes | Yes | Yes | Yes |
| ODE models | Yes | Yes | Yes | Yes |
| Auto-diff | Yes (Enzyme) | No | No | Yes (ForwardDiff) |
| Open source | Yes | No | No | No |
Architecture Overview
FeRx uses a two-level optimization structure:
- Outer loop: Optimizes population parameters (theta, omega, sigma) using NLopt or built-in BFGS
- Inner loop: For each subject, finds empirical Bayes estimates (EBEs) of random effects by minimizing individual negative log-likelihood
Parameters are internally transformed for unconstrained optimization: theta and sigma are log-transformed, and omega uses Cholesky factorization to guarantee positive-definiteness.
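As an illustration (sketch code with invented function names, not the engine's internal API), the two transforms look like this for a positive theta and a 2x2 omega:

```rust
// Sketch of FeRx's unconstrained parameterization: theta/sigma are
// optimized on the log scale, and omega is rebuilt from its Cholesky
// factor L so that Omega = L * L^T is positive-definite for any
// real-valued optimizer iterate. Illustrative only.

fn theta_to_unconstrained(theta: f64) -> f64 {
    theta.ln()
}

fn theta_from_unconstrained(u: f64) -> f64 {
    u.exp() // always > 0, so positivity needs no explicit handling
}

/// Rebuild a 2x2 omega from the free Cholesky entries [l11, l21, l22].
fn omega_from_cholesky(l: [f64; 3]) -> [[f64; 2]; 2] {
    let (l11, l21, l22) = (l[0], l[1], l[2]);
    // Omega = L * L^T with L = [[l11, 0], [l21, l22]]
    [
        [l11 * l11, l11 * l21],
        [l21 * l11, l21 * l21 + l22 * l22],
    ]
}

fn main() {
    // round-trip: optimizer space -> parameter space
    let u = theta_to_unconstrained(0.134);
    assert!((theta_from_unconstrained(u) - 0.134).abs() < 1e-12);
    // any real-valued Cholesky entries yield a valid covariance
    let omega = omega_from_cholesky([0.3, 0.1, 0.2]);
    assert!(omega[0][0] * omega[1][1] - omega[0][1] * omega[1][0] > 0.0); // det > 0
}
```

Because `Omega = L * L^T` holds for any real-valued entries of `L`, the outer optimizer can move freely in the unconstrained space without ever producing an invalid covariance matrix.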
Project Structure
src/
types.rs -- Core data structures
api.rs -- Public API (fit, simulate, predict)
parser/ -- .ferx model file parser
pk/ -- Analytical PK solutions
ode/ -- ODE solver and predictions
estimation/ -- FOCE, FOCEI, SAEM algorithms
stats/ -- Likelihood and residual computations
io/ -- Data reading and output writing
ad/ -- Automatic differentiation
bin/run_model.rs -- CLI binary
License
FeRx-NLME is released under the MIT License. Copyright © 2026 InsightRX, Inc.
Getting Started
This section covers installing FeRx and running your first model.
- Installation -- Setting up the Rust toolchain and building FeRx
- Quick Start -- Running a complete PopPK analysis in under 5 minutes
Installation
FeRx requires a nightly Rust toolchain with the Enzyme LLVM plugin for automatic differentiation. As of 2026, Enzyme is not yet distributed via rustup, so a one-time plugin build is required.
Pick your platform:
- Linux — fully supported, one-time ~30 min plugin build
- macOS — supported with caveats
- Windows — not supported (see why below)
Linux
Tested on Ubuntu 22.04. Other distributions (Debian, Fedora, Arch) should work with adjustments to the package manager commands.
1. Install rustup + upstream nightly
Do not use snap's rustup — its filesystem confinement breaks on non-standard home directories (common on enterprise servers).
# Remove snap rustup if you had it:
sudo snap remove rustup
# Official installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install nightly
2. Install system dependencies and matching LLVM
Check which LLVM major version nightly needs:
rustc +nightly --version --verbose | grep LLVM
# e.g. "LLVM version: 22.1.2" — major is 22
Use that major version (<MAJOR>) below.
sudo apt install -y cmake ninja-build clang libssl-dev pkg-config \
python3 build-essential curl git libzstd-dev
# Install matching LLVM from apt.llvm.org (Ubuntu's defaults lag behind):
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh <MAJOR>
# Fix GPG keyring permissions if apt warns:
sudo chmod 644 /etc/apt/trusted.gpg.d/apt.llvm.org.asc
sudo apt update
sudo apt install -y llvm-<MAJOR>-dev clang-<MAJOR>
For other distros, install the matching llvm-dev + clang packages from your package manager. The LLVM major version must exactly match what rustc reports.
3. Build the Enzyme plugin
git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build
cmake -G Ninja .. \
-DLLVM_DIR=/usr/lib/llvm-<MAJOR>/lib/cmake/llvm \
-DENZYME_CLANG=OFF \
-DENZYME_FLANG=OFF
ninja
# 15–30 min
4. Install the plugin into nightly's sysroot
This location is not obvious — rustc looks in lib/rustlib/<target>/lib/, not just lib/. Despite the error wording ("folder not found"), it's searching for a file:
SYSROOT=$(rustc +nightly --print sysroot)
TARGET=x86_64-unknown-linux-gnu # or aarch64-unknown-linux-gnu on ARM
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.so \
$SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.so
Note the filename rewrite: LLVMEnzyme-<N>.so → libEnzyme-<N>.so (with lib prefix).
5. Register the toolchain as enzyme
ferx's build system pins to a toolchain named enzyme:
rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme --version
6. Verify
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
Expected: error[E0601]: main function not found. That's the success signal. If you see autodiff backend not found in the sysroot, the .so is missing or in the wrong place (step 4).
Multi-user servers
For shared Linux servers, stage the built toolchain in /opt/rust-nightly so all users can share one build:
sudo mkdir -p /opt/rust-nightly
sudo cp -a ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/. /opt/rust-nightly/
sudo chown -R root:root /opt/rust-nightly
sudo chmod -R a+rX /opt/rust-nightly
sudo chmod a+rx /opt/rust-nightly/bin/*
Each user then links the shared toolchain into their own rustup:
# per user, once
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain none
rustup toolchain link enzyme /opt/rust-nightly
For R users, add to ~/.Renviron:
PATH=/opt/rust-nightly/bin:${HOME}/.cargo/bin:${PATH}
RUSTUP_TOOLCHAIN=enzyme
macOS
Supported on both Intel and Apple Silicon, with caveats around LLVM installation.
1. Install Xcode Command Line Tools
xcode-select --install
2. Install rustup + upstream nightly
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"
rustup toolchain install nightly
3. Install matching LLVM via Homebrew
Check which LLVM major version nightly needs:
rustc +nightly --version --verbose | grep LLVM
Homebrew only ships a handful of LLVM versions at a time. If the version you need is on Homebrew:
brew install llvm@<MAJOR>
# Path will be /opt/homebrew/opt/llvm@<MAJOR> (Apple Silicon)
# or /usr/local/opt/llvm@<MAJOR> (Intel)
If Homebrew doesn't have your version, you'll need to build LLVM from source (much longer — consider sticking with a slightly older nightly that matches an available LLVM version).
4. Build the Enzyme plugin
git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build
# Adjust paths for your brew prefix (/opt/homebrew or /usr/local)
cmake -G Ninja .. \
-DLLVM_DIR=$(brew --prefix)/opt/llvm@<MAJOR>/lib/cmake/llvm \
-DENZYME_CLANG=OFF \
-DENZYME_FLANG=OFF
ninja
You may need to install ninja if not present: brew install ninja cmake.
5. Install the plugin into nightly's sysroot
SYSROOT=$(rustc +nightly --print sysroot)
# On Apple Silicon:
TARGET=aarch64-apple-darwin
# On Intel:
# TARGET=x86_64-apple-darwin
# Note: on macOS the shared library extension is .dylib, not .so
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.dylib \
$SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.dylib
6. Register and verify
rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
# Expect: error[E0601]: `main` function not found
macOS caveats
- Apple Silicon (M1/M2/M3/M4): use `aarch64-apple-darwin` as `TARGET`. Intel Macs use `x86_64-apple-darwin`
- System Integrity Protection (SIP): If you see code-signing errors loading the `.dylib` during `rustc` invocation, try `sudo codesign --force --sign - <path-to-libEnzyme.dylib>` — should be rare on dev machines
- Nightly toolchain distribution mismatches: Apple Silicon nightlies occasionally lag x86_64 by a day or two; if the LLVM version mismatches after `rustup update`, prefer installing a specific dated nightly
Windows
Windows is not currently supported. The blockers are:
- EnzymeAD plugin build on MSVC is not well-tested. The Enzyme project primarily targets Linux with LLVM clang/gcc. While Windows builds exist in theory, the toolchain integration (rustc sysroot path conventions, plugin discovery, MSVC vs MinGW linking) has known issues that haven't been worked through.
- Rust autodiff feature gate interactions with MSVC codegen. The `-Z autodiff=Enable` path has not been exercised on the `x86_64-pc-windows-msvc` target. Anecdotal reports from the Rust autodiff tracking issue show crashes or "backend not found" errors on Windows even when the plugin `.dll` is present.
- We haven't set up CI for Windows. Without a green CI signal we can't promise the build works.
Workarounds for Windows users
- WSL2 (recommended): Install Ubuntu under WSL2 and follow the Linux instructions. Performance is near-native for CPU-bound workloads like model fitting.
- Docker: Use the forthcoming ferx-nlme Docker image, which will ship with the toolchain pre-installed.
- Remote dev: SSH into a Linux server or cloud VM.
Future Windows support
Once upstream Enzyme integration lands in rustup (tracked at rust-lang/rust autodiff tracking issue), Windows support should become straightforward since the plugin distribution problem goes away. No ETA; the feature is still experimental in rustc.
If you're a Windows developer who would like to help, a CI run against x86_64-pc-windows-msvc + contributed docs for that platform would be very welcome.
Building ferx-nlme from source
Once your Enzyme toolchain is set up:
git clone https://github.com/InsightRX/ferx-nlme
cd ferx-nlme
cargo build --release --features autodiff
# Binary at target/release/ferx
Build options
# Debug build (faster compile, slower runtime)
cargo build --features autodiff
# Quick type-check without building
cargo check --features autodiff
# Lints
cargo clippy --features autodiff
# CI build without autodiff (uses finite differences — no Enzyme needed)
cargo build --release --no-default-features --features ci
The ci feature is useful for development on machines without the full Enzyme toolchain — at the cost of much slower gradient computation.
Verify the build
cargo run --release --features autodiff --bin ferx -- examples/warfarin.ferx --simulate
Should print a successful model fit with parameter estimates.
Installing the ferx R package
Ensure the Enzyme toolchain is set up (above). Then:
devtools::install_github("InsightRX/ferx")
The R package's build is driven by its Makevars, which invokes cargo and resolves the enzyme toolchain via rustup. Both must be set up correctly in your shell/Renviron before calling install_github.
See the ferx R package README for API usage.
Cancelling a running fit (Ctrl-C)
ferx_fit() runs the estimator on a worker thread and polls for R interrupts on the main thread every ~100 ms, so Ctrl-C (or RStudio's red stop button) aborts the fit cleanly. The worker exits at the next safe checkpoint — typically within a second or two, but up to one inner-loop evaluation for heavy ODE models. The call then returns with an R error:
Error: ferx_fit: cancelled by user
Ctrl-Z (SIGTSTP) will not abort the fit — it suspends the whole R process to the shell. Use Ctrl-C instead.
Dependencies
FeRx depends on these crates (managed automatically by Cargo):
| Crate | Purpose |
|---|---|
| `nalgebra` | Linear algebra (matrices, Cholesky) |
| `nlopt` | Nonlinear optimization (SLSQP, L-BFGS, MMA) |
| `rayon` | Parallel computation |
| `rand`, `rand_distr` | Random number generation (SAEM, SIR) |
| `csv` | CSV data file reading |
| `regex` | Model file expression parsing |
The nlopt crate requires the NLopt C library. Most platforms handle this automatically; if the build fails on NLopt, install it via your system package manager:
# macOS
brew install nlopt
# Ubuntu/Debian
sudo apt install libnlopt-dev
# Fedora
sudo dnf install NLopt-devel
Troubleshooting
"error: the option 'Z' is only accepted on the nightly compiler"
Your shell (or R) is finding a non-nightly rustc. Check rustc --version and your PATH. For R, verify Sys.which("rustc") and ~/.Renviron.
"autodiff backend not found in the sysroot: failed to find a libEnzyme-<N> folder"
Despite the wording ("folder"), rustc is searching for a file. Causes:
- Wrong directory: the `.so`/`.dylib` is in `<sysroot>/lib/` instead of `<sysroot>/lib/rustlib/<target>/lib/`
- LLVM version mismatch: rebuild Enzyme against the LLVM version rustc reports
- Filename: must be `libEnzyme-<MAJOR>.so` (Linux) or `libEnzyme-<MAJOR>.dylib` (macOS), with the `lib` prefix
"custom toolchain 'enzyme' specified in override file ... is not installed"
Run rustup toolchain link enzyme <path-to-nightly-sysroot>.
"not a directory: '/<path>/lib'" from rustup toolchain link
Permission issue — the user running toolchain link can't read the target path. On shared installs, run sudo chmod -R a+rX /opt/rust-nightly.
"incorrect value 'X' for unstable option 'autodiff'"
Valid autodiff values change between nightly builds. Test with:
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
If Enable is rejected, try LooseTypes, Inline, or PrintTA.
"Enzyme: cannot handle (forward) unknown intrinsic llvm.maximumnum"
Recent rustc lowers f64::max()/f64::min() to intrinsics Enzyme can't yet differentiate. This is a ferx-nlme code-level issue — AD-instrumented functions must use manual if expressions. Should not happen on released ferx-nlme versions; if it does, please file an issue.
"cargo is unavailable for the active toolchain" (info, not error)
Cargo wasn't copied into your linked toolchain. Either copy it (cp ~/.cargo/bin/cargo /opt/rust-nightly/bin/) or ignore — rustup falls back to nightly's cargo, which works.
Refreshing when upstream nightly rolls forward
If a new ferx-nlme release references stdlib items your cached toolchain doesn't have (e.g. autodiff_forward not found), rebuild /opt/rust-nightly against the current nightly, and rebuild Enzyme if the LLVM major version changed.
Quick Start
This walkthrough fits a one-compartment oral PK model to warfarin data.
1. Create a model file
Save the following as warfarin.ferx:
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 300
covariance = true
2. Prepare your data
Data must be in NONMEM CSV format. The minimum required columns are ID, TIME, and DV. Dosing records use EVID=1 with AMT:
ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,1,14.42,0,.,.,0
1,2,17.56,0,.,.,0
1,4,15.23,0,.,.,0
1,8,10.15,0,.,.,0
1,12,6.75,0,.,.,0
1,24,2.24,0,.,.,0
3. Run the fit
ferx warfarin.ferx --data warfarin.csv
4. Interpret the output
The console output shows the estimation progress and final results:
Starting FOCE estimation...
10 subjects, 110 observations
3 thetas, 3 etas, 1 sigmas
...
Final OFV = -280.1838
============================================================
NONLINEAR MIXED EFFECTS MODEL ESTIMATION
============================================================
Converged: YES
Estimation method: FOCE
--- Objective Function ---
OFV: -280.1838
AIC: -266.1838
BIC: -247.2804
--- THETA Estimates ---
Parameter Estimate SE %RSE
----------------------------------------------------
TVCL 0.132735 0.014549 11.0
TVV 7.694842 0.293028 3.8
TVKA 0.757498 0.034986 4.6
--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584 (CV% = 16.9) SE = 0.006394
OMEGA(2,2) = 0.009613 (CV% = 9.8) SE = 0.002165
OMEGA(3,3) = 0.340868 (CV% = 58.4) SE = 0.076351
--- SIGMA Estimates ---
SIGMA(1) = 0.010638 SE = 0.000788
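The information criteria in this report follow from the OFV, assuming the conventional definitions AIC = OFV + 2p and BIC = OFV + p * ln(n), with p = 7 estimated parameters here (3 thetas + 3 omegas + 1 sigma) and n = 110 observations. A quick check in Rust:

```rust
// Recompute the AIC/BIC lines from the report above. These are the
// standard definitions; the printed values match them exactly.

fn aic(ofv: f64, p: usize) -> f64 {
    ofv + 2.0 * p as f64
}

fn bic(ofv: f64, p: usize, n: usize) -> f64 {
    ofv + (p as f64) * (n as f64).ln()
}

fn main() {
    let (ofv, p, n) = (-280.1838, 7, 110);
    assert!((aic(ofv, p) - (-266.1838)).abs() < 1e-9);
    assert!((bic(ofv, p, n) - (-247.2804)).abs() < 1e-3);
}
```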
5. Output files
Three files are generated:
| File | Contents |
|---|---|
| `warfarin-sdtab.csv` | Per-observation diagnostics (PRED, IPRED, CWRES, IWRES, ETAs) |
| `warfarin-fit.yaml` | Parameter estimates with standard errors in YAML format |
| `warfarin-timing.txt` | Wall-clock estimation time |
Next Steps
- Learn the full Model File Reference
- Explore Estimation Methods (FOCE, FOCEI, SAEM)
- See more Examples
Model File Reference
FeRx models are defined in .ferx files using a declarative DSL. Each file is organized into blocks, each denoted by a [block_name] header.
Block Overview
| Block | Required | Purpose |
|---|---|---|
| `[parameters]` | Yes | Define theta, omega, and sigma parameters |
| `[individual_parameters]` | Yes | Map population parameters to individual PK parameters |
| `[structural_model]` | Yes | Specify the PK model (analytical or ODE) |
| `[error_model]` | Yes | Define the residual error model |
| `[odes]` | If ODE | ODE right-hand-side equations |
| `[fit_options]` | No | Configure estimation method and optimizer |
| `[simulation]` | No | Define a simulation trial design |
Minimal Example
[parameters]
theta TVCL(0.1, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
sigma ADD_ERR ~ 1.0
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
[structural_model]
pk one_cpt_iv_bolus(cl=CL, v=V)
[error_model]
DV ~ additive(ADD_ERR)
Lines beginning with # are treated as comments.
Parameters
The [parameters] block defines all model parameters: fixed effects (theta), between-subject variability (omega), and residual error (sigma).
Theta (Fixed Effects)
theta NAME(initial_value, lower_bound, upper_bound)
- NAME: Parameter name (used in `[individual_parameters]` expressions)
- initial_value: Starting value for estimation
- lower_bound: Lower bound constraint (must be > 0; parameters are log-transformed internally)
- upper_bound: Upper bound constraint
Example:
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
Omega (Between-Subject Variability)
Diagonal omega
omega NAME ~ variance
- NAME: Random effect name (used in `[individual_parameters]` as `ETA_XXX`)
- variance: Initial variance estimate (diagonal element of the omega matrix)
Example:
omega ETA_CL ~ 0.07
omega ETA_V ~ 0.02
omega ETA_KA ~ 0.40
Each variance represents the between-subject variability for that parameter. The coefficient of variation (CV%) is approximately sqrt(variance) * 100 for log-normally distributed parameters. For example, omega ETA_CL ~ 0.09 corresponds to ~30% CV.
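The CV% approximation above can be checked against the exact log-normal CV, sqrt(exp(variance) - 1) * 100 (a quick sketch; these helper functions are illustrative, not part of FeRx):

```rust
// For CL = TVCL * exp(ETA_CL) with ETA_CL ~ N(0, omega2), the exact
// CV of the log-normal distribution is sqrt(exp(omega2) - 1); the
// sqrt(omega2) rule of thumb is its small-variance approximation.

fn cv_approx(omega2: f64) -> f64 {
    omega2.sqrt() * 100.0
}

fn cv_exact(omega2: f64) -> f64 {
    (omega2.exp() - 1.0).sqrt() * 100.0
}

fn main() {
    assert!((cv_approx(0.09) - 30.0).abs() < 1e-9); // ~30% CV, as stated
    assert!(cv_exact(0.09) > 30.0 && cv_exact(0.09) < 31.0); // exact is slightly higher
}
```

The two agree closely for small variances; they diverge noticeably above a variance of roughly 0.5.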
Block omega (correlated random effects)
To estimate correlations between random effects, use block_omega:
block_omega (NAME1, NAME2, ...) = [lower_triangle_values]
The values are the lower triangle of the covariance matrix, specified row-wise. For a 2x2 block:
block_omega (ETA_CL, ETA_V) = [var_CL, cov_CL_V, var_V]
For a 3x3 block:
block_omega (ETA_CL, ETA_V, ETA_KA) = [var_CL, cov_CL_V, var_V, cov_CL_KA, cov_V_KA, var_KA]
You can mix diagonal and block omega specifications. Diagonal omegas specify uncorrelated random effects, while block omegas estimate the full covariance sub-matrix:
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
This estimates a 3x3 omega where ETA_CL and ETA_V are correlated (2x2 block), but ETA_KA is uncorrelated with both.
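Assembling the full omega from these two lines can be sketched as follows (illustrative code; `assemble_omega` is not a FeRx function):

```rust
// Sketch of assembling the full 3x3 omega from a 2x2 block
// (lower triangle, row-wise) plus one diagonal entry, as in
//   block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
//   omega ETA_KA ~ 0.40

fn assemble_omega(block: [f64; 3], diag: f64) -> [[f64; 3]; 3] {
    let (var_cl, cov_cl_v, var_v) = (block[0], block[1], block[2]);
    [
        [var_cl, cov_cl_v, 0.0],
        [cov_cl_v, var_v, 0.0],
        [0.0, 0.0, diag], // ETA_KA: uncorrelated, zero off-diagonals
    ]
}

fn main() {
    let omega = assemble_omega([0.09, 0.02, 0.04], 0.40);
    for i in 0..3 {
        for j in 0..3 {
            assert!(omega[i][j] == omega[j][i]); // symmetric by construction
        }
    }
    assert!(omega[0][2] == 0.0); // ETA_CL and ETA_KA uncorrelated
}
```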
Declaration order
The order of omega and block_omega lines in the [parameters] block determines the ETA indexing throughout the model: in the omega matrix and all output. For example:
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
produces ETA order [ETA_CL, ETA_V, ETA_KA] (indices 1, 2, 3), while:
omega ETA_KA ~ 0.40
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
produces [ETA_KA, ETA_CL, ETA_V] (indices 1, 2, 3). The [individual_parameters] block should list assignments in the same order for clarity, though the parameter mapping is by name, not position.
Sigma (Residual Error)
sigma NAME ~ value
- NAME: Residual error parameter name (referenced in `[error_model]`)
- value: Initial value
Example:
sigma PROP_ERR ~ 0.01
sigma ADD_ERR ~ 1.0
The interpretation of sigma depends on the error model:
| Error Model | Sigma Meaning |
|---|---|
| Additive | Standard deviation of additive error |
| Proportional | Coefficient of proportional error |
| Combined | First sigma = proportional coefficient, second = additive SD |
Complete Examples
Diagonal omega (no correlations):
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_CL ~ 0.07
omega ETA_V ~ 0.02
omega ETA_KA ~ 0.40
sigma PROP_ERR ~ 0.01
Block omega (correlated CL and V):
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40
sigma PROP_ERR ~ 0.01
Individual Parameters
The [individual_parameters] block defines how population parameters (theta), random effects (eta), and covariates combine to produce individual PK parameters.
Syntax
PARAM = expression
Each line assigns a PK parameter using an arithmetic expression that can reference:
- Theta parameters -- names defined in `[parameters]` (e.g., `TVCL`, `TVV`)
- Eta random effects -- names defined as omega parameters (e.g., `ETA_CL`, `ETA_V`)
- Covariates -- column names from the data file (e.g., `WT`, `CRCL`)
- Constants -- numeric literals (e.g., `70`, `0.75`)
Supported Operators and Functions
| Operator/Function | Example |
|---|---|
| `+`, `-`, `*`, `/` | `TVCL * WT / 70` |
| `^` (power) | `(WT/70)^0.75` |
| `exp()` | `exp(ETA_CL)` |
| `log()`, `ln()` | `log(TVCL)` |
| `sqrt()` | `sqrt(WT)` |
| `abs()` | `abs(ETA_CL)` |
| Parentheses | `TVCL * (WT/70)^0.75` |
Common Parameterizations
Exponential (log-normal) random effects
The standard approach for PK parameters that must be positive:
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
Allometric scaling with covariates
[individual_parameters]
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)
V = TVV * (WT/70)^1.0 * exp(ETA_V)
Estimated covariate effects
Use additional theta parameters for covariate coefficients:
[parameters]
theta TVCL(0.134, 0.001, 10.0)
theta THETA_WT(0.75, 0.01, 2.0)
theta THETA_CRCL(0.5, 0.01, 2.0)
[individual_parameters]
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
Covariate Detection
Any uppercase identifier in the expression that does not match a theta name or eta name is automatically treated as a covariate. The covariate value is read from the corresponding column in the data file.
For example, in CL = TVCL * (WT/70)^0.75 * exp(ETA_CL):
- `TVCL` matches a theta parameter
- `ETA_CL` matches an omega parameter
- `WT` matches neither, so it is treated as a covariate column
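The resolution rule can be sketched as a simple lookup (illustrative; `classify` is not part of the FeRx API):

```rust
// Sketch of the name-resolution rule from this section: an identifier
// that matches neither a theta nor an eta name is treated as a
// covariate column to be read from the data file.

fn classify(ident: &str, thetas: &[&str], etas: &[&str]) -> &'static str {
    if thetas.contains(&ident) {
        "theta"
    } else if etas.contains(&ident) {
        "eta"
    } else {
        "covariate"
    }
}

fn main() {
    let thetas = ["TVCL"];
    let etas = ["ETA_CL"];
    assert_eq!(classify("TVCL", &thetas, &etas), "theta");
    assert_eq!(classify("ETA_CL", &thetas, &etas), "eta");
    assert_eq!(classify("WT", &thetas, &etas), "covariate"); // read from data
}
```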
PK Parameter Names
The parameter names on the left side of each assignment must map to recognized PK parameter names:
| Name | PK Parameter |
|---|---|
| `CL` | Clearance |
| `V` or `V1` | Volume of distribution (central compartment) |
| `Q` | Intercompartmental clearance |
| `V2` | Peripheral volume |
| `KA` | Absorption rate constant |
| `F` | Bioavailability (default 1.0 if omitted) |
For ODE models, the parameter names are user-defined and passed as a flat vector to the ODE right-hand side function.
Structural Model
The [structural_model] block specifies the pharmacokinetic model used to generate predictions.
Analytical PK Models
For standard compartmental models, use the pk keyword with a model function:
pk MODEL_NAME(param=VALUE, param=VALUE, ...)
Available Models
| Model Function | Compartments | Route | Required Parameters |
|---|---|---|---|
| `one_cpt_iv_bolus` | 1 | IV bolus | `cl`, `v` |
| `one_cpt_oral` | 1 | Oral | `cl`, `v`, `ka` |
| `one_cpt_infusion` | 1 | IV infusion | `cl`, `v` |
| `two_cpt_iv_bolus` | 2 | IV bolus | `cl`, `v1`, `q`, `v2` |
| `two_cpt_oral` | 2 | Oral | `cl`, `v1`, `q`, `v2`, `ka` |
| `two_cpt_infusion` | 2 | IV infusion | `cl`, `v1`, `q`, `v2` |
Examples
One-compartment oral:
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
Two-compartment IV bolus:
[structural_model]
pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)
Two-compartment oral:
[structural_model]
pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)
Bioavailability
For oral models, bioavailability (F) defaults to 1.0. To estimate it, define an F parameter in [individual_parameters] -- it will be automatically used by the oral PK functions.
Dose Handling
Analytical models support:
- Bolus doses: Instantaneous input (default when `RATE=0` in data)
- Infusions: Zero-order input (when `RATE>0` in data)
- Steady-state: Pre-computed steady-state concentrations (when `SS=1` and `II>0` in data)
- Dose superposition: Multiple doses are handled by summing contributions from each dose event
Numerical Stability
The analytical solutions include special handling for:
- Near-equal rate constants: When absorption and elimination rates are similar (KA ~ k), L'Hopital's rule is used to avoid division by zero
- Two-compartment eigenvalues: Vieta's formula is used for robust computation of alpha and beta
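The near-equal rate constant case can be illustrated with a sketch of the one-compartment oral solution (a simplified single-dose version, not FeRx's actual implementation; the `1e-8` switching threshold is an assumption for the example):

```rust
// One-compartment oral concentration after a single dose, with the
// KA -> k limit handled explicitly (the L'Hopital case mentioned
// above). The general formula divides by (ka - k), which becomes
// 0/0 as the rates approach each other.

fn conc_one_cpt_oral(dose: f64, f: f64, v: f64, cl: f64, ka: f64, t: f64) -> f64 {
    let k = cl / v; // elimination rate constant
    if (ka - k).abs() < 1e-8 {
        // limiting form as ka -> k, avoiding division by ~zero
        f * dose * k * t * (-k * t).exp() / v
    } else {
        f * dose * ka / (v * (ka - k)) * ((-k * t).exp() - (-ka * t).exp())
    }
}

fn main() {
    let stable = conc_one_cpt_oral(100.0, 1.0, 10.0, 1.0, 0.1, 2.0); // limit branch
    let general = conc_one_cpt_oral(100.0, 1.0, 10.0, 1.0, 0.1001, 2.0); // general branch
    // the two branches agree smoothly near KA ~ k
    assert!(((stable - general) / stable).abs() < 0.01);
}
```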
ODE Models
For non-standard kinetics (e.g., saturable elimination), use the ODE specification:
[structural_model]
ode(obs_cmt=COMPARTMENT_NAME, states=[state1, state2, ...])
See ODE Models for full ODE syntax.
Error Model
The [error_model] block defines the residual error structure, specifying how observed data (DV) relates to model predictions.
Syntax
DV ~ ERROR_TYPE(SIGMA_PARAMS)
Available Error Models
Additive
DV ~ additive(SIGMA_NAME)
The residual variance is constant across all predictions:
\[ \text{Var}(DV) = \sigma^2 \]
Use when measurement error is independent of concentration (e.g., assay with fixed precision).
Proportional
DV ~ proportional(SIGMA_NAME)
The residual variance scales with the predicted value:
\[ \text{Var}(DV) = (\sigma \cdot f)^2 \]
where \( f \) is the model prediction. Use when measurement error increases with concentration (most common in PK).
Combined
DV ~ combined(SIGMA_PROP, SIGMA_ADD)
Combines proportional and additive components:
\[ \text{Var}(DV) = (\sigma_1 \cdot f)^2 + \sigma_2^2 \]
Use when both proportional and additive error sources are present. Requires two sigma parameters defined in [parameters].
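The three variance formulas translate directly to code (an illustrative sketch, not FeRx internals):

```rust
// Residual variance under each error model, matching the formulas
// above; `f` is the model prediction.

fn var_additive(sigma: f64) -> f64 {
    sigma * sigma
}

fn var_proportional(sigma: f64, f: f64) -> f64 {
    (sigma * f).powi(2)
}

fn var_combined(sigma_prop: f64, sigma_add: f64, f: f64) -> f64 {
    (sigma_prop * f).powi(2) + sigma_add * sigma_add
}

fn main() {
    assert!((var_additive(0.5) - 0.25).abs() < 1e-12); // constant in f
    assert!((var_proportional(0.1, 10.0) - 1.0).abs() < 1e-12); // grows with f
    assert!((var_combined(0.1, 0.5, 10.0) - 1.25).abs() < 1e-12); // sum of both
}
```

Note how the combined model reduces to the additive model as the prediction approaches zero, which is why it behaves better than a purely proportional model for low concentrations.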
Examples
Proportional error (most common):
[parameters]
sigma PROP_ERR ~ 0.01
[error_model]
DV ~ proportional(PROP_ERR)
Additive error:
[parameters]
sigma ADD_ERR ~ 1.0
[error_model]
DV ~ additive(ADD_ERR)
Combined error:
[parameters]
sigma PROP_ERR ~ 0.1
sigma ADD_ERR ~ 0.5
[error_model]
DV ~ combined(PROP_ERR, ADD_ERR)
Impact on Estimation
The error model affects:
- Individual weighted residuals (IWRES): `(DV - IPRED) / sqrt(Var)`
- Conditional weighted residuals (CWRES): Accounts for uncertainty in random effect estimates
- Objective function value (OFV): The likelihood includes `log(Var)` terms, so the error model structure directly influences parameter estimates
ODE Models
For pharmacokinetic models without analytical solutions (e.g., saturable elimination, target-mediated drug disposition), FeRx provides an ODE solver.
Structural Model Declaration
[structural_model]
ode(obs_cmt=OBSERVABLE_COMPARTMENT, states=[state1, state2, ...])
- obs_cmt: The compartment whose concentration is observed (matched to DV)
- states: List of state variable names (compartments)
ODE Equations
The [odes] block defines the right-hand side of the ODE system:
[odes]
d/dt(state_name) = expression
Expressions can reference:
- State variables by name
- Individual parameters defined in `[individual_parameters]`
- Arithmetic operators and functions (`exp`, `log`, `sqrt`, etc.)
Example: Michaelis-Menten Elimination
A one-compartment oral model with saturable (Michaelis-Menten) elimination:
[parameters]
theta TVVMAX(10.0, 0.1, 1000.0)
theta TVKM(2.0, 0.01, 100.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_VMAX ~ 0.09
omega ETA_V ~ 0.04
sigma PROP_ERR ~ 0.1
[individual_parameters]
VMAX = TVVMAX * exp(ETA_VMAX)
KM = TVKM
V = TVV * exp(ETA_V)
KA = TVKA
[structural_model]
ode(obs_cmt=central, states=[depot, central])
[odes]
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
[error_model]
DV ~ proportional(PROP_ERR)
Solver Details
FeRx uses a Dormand-Prince RK45 adaptive solver:
| Setting | Value |
|---|---|
| Method | Explicit Runge-Kutta 4(5) |
| Absolute tolerance | 1e-6 |
| Relative tolerance | 1e-4 |
| Max steps | 10,000 |
| Initial step size | 0.1 |
| Minimum step size | 1e-12 |
The solver automatically adapts step sizes based on local error estimates.
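The accept/shrink/grow logic of such a controller can be sketched as follows. For brevity this example estimates local error by step doubling with classic RK4 rather than the actual Dormand-Prince embedded pair, but the control flow (compare error against `atol + rtol * |y|`, rescale the step) is the same idea, and the tolerances, initial step, and step cap mirror the table above:

```rust
// Adaptive step-size control sketch. NOT FeRx's solver: it uses RK4
// with step doubling for the error estimate instead of the embedded
// Dormand-Prince 4(5) pair, and handles a scalar ODE only.

fn rk4_step(f: fn(f64, f64) -> f64, t: f64, y: f64, h: f64) -> f64 {
    let k1 = f(t, y);
    let k2 = f(t + 0.5 * h, y + 0.5 * h * k1);
    let k3 = f(t + 0.5 * h, y + 0.5 * h * k2);
    let k4 = f(t + h, y + h * k3);
    y + h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
}

fn integrate_adaptive(f: fn(f64, f64) -> f64, mut t: f64, mut y: f64, t_end: f64) -> f64 {
    let (atol, rtol, h_min) = (1e-6, 1e-4, 1e-12); // tolerances from the table
    let mut h: f64 = 0.1;                          // initial step from the table
    for _ in 0..10_000 {                           // max steps from the table
        if t >= t_end {
            return y;
        }
        h = h.min(t_end - t);
        let full = rk4_step(f, t, y, h);
        let half = rk4_step(f, t + 0.5 * h, rk4_step(f, t, y, 0.5 * h), 0.5 * h);
        let err = (full - half).abs(); // local error estimate
        let tol = atol + rtol * y.abs();
        if err <= tol || h <= h_min {
            t += h;
            y = half; // keep the more accurate two-half-step result
        }
        // standard controller: scale h toward the target error, with limits
        let scale = 0.9 * (tol / err.max(1e-300)).powf(0.2);
        h = (h * scale.clamp(0.2, 5.0)).max(h_min);
    }
    y
}

fn main() {
    // dy/dt = -y, y(0) = 1, so y(1) should be close to exp(-1)
    let y = integrate_adaptive(|_t, y| -y, 0.0, 1.0, 1.0);
    assert!((y - (-1.0f64).exp()).abs() < 1e-3);
}
```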
Dose Handling
- Bolus doses: Applied as instantaneous state changes at dose times. The dose amount is added to the target compartment: `state[cmt] += amt`
- Compartment indexing: Compartments are 1-indexed in the data file (`CMT=1` corresponds to the first state in the `states` list)
- Multiple doses: The ODE is integrated in segments between dose events, with state discontinuities at each dose
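Segment-wise integration with bolus events can be sketched like this (illustrative fixed-step RK4 on a depot/central system for brevity; FeRx uses the adaptive solver described above):

```rust
// Sketch of segment-wise integration with bolus events: integrate
// between dose times, then apply state[cmt] += amt at each dose.
// Illustrative code, not FeRx internals.

fn deriv(y: &[f64; 2], ka: f64, k: f64) -> [f64; 2] {
    // depot -> central absorption, first-order elimination
    [-ka * y[0], ka * y[0] - k * y[1]]
}

fn integrate_segment(mut y: [f64; 2], ka: f64, k: f64, t0: f64, t1: f64, n: usize) -> [f64; 2] {
    let h = (t1 - t0) / n as f64;
    for _ in 0..n {
        // classic RK4 on the 2-state system
        let k1 = deriv(&y, ka, k);
        let y2 = [y[0] + 0.5 * h * k1[0], y[1] + 0.5 * h * k1[1]];
        let k2 = deriv(&y2, ka, k);
        let y3 = [y[0] + 0.5 * h * k2[0], y[1] + 0.5 * h * k2[1]];
        let k3 = deriv(&y3, ka, k);
        let y4 = [y[0] + h * k3[0], y[1] + h * k3[1]];
        let k4 = deriv(&y4, ka, k);
        for i in 0..2 {
            y[i] += h / 6.0 * (k1[i] + 2.0 * k2[i] + 2.0 * k3[i] + k4[i]);
        }
    }
    y
}

/// Two 100-unit doses into the depot (CMT=1 -> index 0) at t = 0 and t = 12.
fn simulate_two_doses(ka: f64, k: f64) -> [f64; 2] {
    let mut y = [0.0, 0.0];
    y[0] += 100.0;                                    // dose event at t = 0
    y = integrate_segment(y, ka, k, 0.0, 12.0, 1200); // segment 1
    y[0] += 100.0;                                    // dose event at t = 12
    integrate_segment(y, ka, k, 12.0, 24.0, 1200)     // segment 2
}

fn main() {
    let y = simulate_two_doses(1.0, 0.1);
    assert!(y[0] > 0.0 && y[1] > 0.0); // both amounts positive
    assert!(y[0] + y[1] < 200.0);      // elimination removes mass
}
```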
Limitations
- Infusion doses (`RATE > 0`) are not currently supported for ODE models. Use bolus approximations or analytical models for infusions
- The observable compartment contains the amount (not concentration). Divide by volume in the ODE equations if needed
- Steady-state (`SS=1`) is not directly supported for ODE models
Fit Options
The optional [fit_options] block configures the estimation method and optimizer settings.
Syntax
[fit_options]
key = value
General Options
| Key | Values | Default | Description |
|---|---|---|---|
| `method` | `foce`, `focei`, `saem` | `foce` | Estimation method |
| `maxiter` | integer | 500 | Maximum outer-loop iterations |
| `covariance` | `true`, `false` | `true` | Compute covariance matrix and standard errors |
| `optimizer` | `slsqp`, `lbfgs`, `nlopt_lbfgs`, `mma`, `bfgs` | `slsqp` | Optimization algorithm |
| `global_search` | `true`, `false` | `false` | Run gradient-free pre-search before local optimization |
| `global_maxeval` | integer | auto | Max evaluations for global search |
| `bloq_method` | `drop`, `m3` | `drop` | How to handle rows with `CENS=1`. `m3` enables Beal's M3 likelihood (see BLOQ example). |
Estimation Methods
FOCE (default)
method = foce
First-Order Conditional Estimation. Linearizes the model around the empirical Bayes estimates. Fast and reliable for most models.
FOCEI
method = focei
FOCE with Interaction. Includes the dependence of the residual error on random effects. More accurate than FOCE when the error model depends on individual predictions, but slightly slower.
SAEM
method = saem
Stochastic Approximation EM. Uses Metropolis-Hastings sampling instead of MAP optimization for random effects. More robust to local minima; recommended for complex models with many random effects.
SAEM-Specific Options
| Key | Default | Description |
|---|---|---|
| `n_exploration` | 150 | Phase 1 iterations (step size = 1) |
| `n_convergence` | 250 | Phase 2 iterations (step size = 1/k) |
| `n_mh_steps` | 3 | Metropolis-Hastings steps per subject per iteration |
| `adapt_interval` | 50 | Iterations between MH step-size adaptation |
| `seed` | 12345 | RNG seed for reproducibility |
SIR (Sampling Importance Resampling)
SIR provides non-parametric parameter uncertainty estimates as an optional post-estimation step. Requires covariance = true.
| Key | Default | Description |
|---|---|---|
| `sir` | `false` | Enable SIR uncertainty estimation |
| `sir_samples` | 1000 | Number of proposal samples (M) |
| `sir_resamples` | 250 | Number of resampled vectors (m) |
| `sir_seed` | 12345 | RNG seed for reproducibility |
See SIR documentation for details.
Optimizer Choices
| Optimizer | Description | Recommended For |
|---|---|---|
| `slsqp` | Sequential Least Squares Programming (NLopt) | General use (default) |
| `bfgs` | Built-in BFGS quasi-Newton | When NLopt is unavailable |
| `lbfgs` | Limited-memory BFGS | Large parameter spaces |
| `nlopt_lbfgs` | NLopt L-BFGS | Alternative L-BFGS |
| `mma` | Method of Moving Asymptotes (NLopt) | Constrained problems |
Global Search
Setting global_search = true runs a gradient-free pre-search (NLopt CRS2-LM) before the local optimizer. This helps escape local minima on challenging datasets.
The number of global evaluations is auto-scaled based on the number of parameters and observations, or can be set explicitly with global_maxeval.
Examples
Standard FOCE with defaults:
[fit_options]
method = foce
maxiter = 300
covariance = true
FOCEI with global search:
[fit_options]
method = focei
maxiter = 500
covariance = true
global_search = true
SAEM with custom settings:
[fit_options]
method = saem
n_exploration = 200
n_convergence = 300
n_mh_steps = 5
seed = 42
covariance = true
FOCEI with SIR uncertainty:
[fit_options]
method = focei
covariance = true
sir = true
sir_samples = 1000
sir_resamples = 250
sir_seed = 42
Simulation
The optional [simulation] block defines a virtual clinical trial design for generating simulated data. This is useful for model validation and testing.
Syntax
[simulation]
subjects = N
dose = AMOUNT
cmt = COMPARTMENT
seed = SEED
times = [t1, t2, t3, ...]
| Key | Description |
|---|---|
| subjects | Number of virtual subjects to simulate |
| dose | Dose amount administered to each subject |
| cmt | Dosing compartment (1-indexed) |
| seed | Random seed for reproducible simulations |
| times | Observation time points |
Example
[simulation]
subjects = 100
dose = 100
cmt = 1
seed = 12345
times = [0.5, 1, 2, 4, 8, 12, 24]
This simulates 100 subjects, each receiving a dose of 100 units into compartment 1, with observations at 0.5, 1, 2, 4, 8, 12, and 24 hours.
Usage
Run a simulation-estimation study from the CLI:
ferx model.ferx --simulate
This will:
- Parse the model and [simulation] block
- Generate simulated data using the model's default parameters and random effects
- Fit the model to the simulated data
- Report parameter estimates and diagnostics
Simulation Process
- For each subject, random effects are sampled from \( \eta_i \sim N(0, \Omega) \)
- Individual parameters are computed using the [individual_parameters] equations
- Predictions are generated using the structural model
- Residual error is added: \( DV = IPRED + \epsilon \), where \( \epsilon \sim N(0, V) \)
- Observations below 0.001 are clipped to 0.001
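The steps above can be sketched in Python for a one-compartment oral model. This is an illustrative stand-in, not FeRx code: the diagonal Omega, the proportional error form, and the parameter names are assumptions made for the sketch.

```python
import math
import random

def simulate_subject(theta, omega_diag, sigma_prop, dose, times, rng):
    # 1. Sample random effects eta_i ~ N(0, Omega) (diagonal Omega here)
    eta = [rng.gauss(0.0, math.sqrt(w)) for w in omega_diag]
    # 2. Individual parameters via the log-normal eta model
    cl = theta["TVCL"] * math.exp(eta[0])
    v = theta["TVV"] * math.exp(eta[1])
    ka = theta["TVKA"] * math.exp(eta[2])
    ke = cl / v
    obs = []
    for t in times:
        # 3. Structural model: one-compartment, first-order absorption
        ipred = (dose * ka / (v * (ka - ke))
                 * (math.exp(-ke * t) - math.exp(-ka * t)))
        # 4. Residual error: DV = IPRED * (1 + eps), eps ~ N(0, sigma_prop)
        dv = ipred * (1.0 + rng.gauss(0.0, math.sqrt(sigma_prop)))
        # 5. Clip observations below 0.001
        obs.append(max(dv, 0.001))
    return obs

rng = random.Random(12345)
theta = {"TVCL": 0.2, "TVV": 10.0, "TVKA": 1.5}
dv = simulate_subject(theta, [0.09, 0.04, 0.30], 0.02,
                      100.0, [0.5, 1.0, 2.0, 4.0, 8.0], rng)
```

Each subject draws a fresh eta, so repeated calls with the same `rng` produce a virtual population.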
Covariates in Simulation
Covariates can be specified per subject in the [simulation] block. Currently, the simulated population uses default covariate values (covariates are not randomized).
Estimation Methods
FeRx implements several estimation methods for nonlinear mixed effects models:
- FOCE / FOCEI -- First-Order Conditional Estimation (with or without interaction). The workhorse of population PK, using nested optimization to find maximum likelihood estimates.
- Gauss-Newton (BHHH) -- A fast alternative that exploits the nonlinear-least-squares structure of the FOCE objective. Converges in 10-30 iterations using the outer product of per-subject gradients as an approximate Hessian. Available as pure GN or a GN+FOCEI hybrid.
- SAEM -- Stochastic Approximation Expectation-Maximization. Uses MCMC sampling for random effects, providing more robust convergence on complex models.
- SIR -- Sampling Importance Resampling. An optional post-estimation step that provides non-parametric parameter uncertainty estimates (95% CIs), more robust than the asymptotic covariance matrix.
Quick Comparison
| Feature | FOCE/FOCEI | Gauss-Newton | SAEM |
|---|---|---|---|
| Random effect estimation | MAP (optimization) | MAP (optimization) | MCMC (sampling) |
| Convergence speed | Medium (~50-100 evals) | Fast (~10-30 iterations) | Slower (~400 iterations) |
| Local minima robustness | Can get stuck | Can get stuck | More robust |
| Gradient required | Yes (AD or FD) | Yes (FD, per-subject) | No (for E-step) |
| Stochastic | No | No | Yes |
| Best for | General use | Fast iteration, well-conditioned models | Complex models, many random effects |
Choosing a Method
Start with FOCE for standard 1- or 2-compartment models with 2-4 random effects. It is deterministic and well-understood.
Try Gauss-Newton (gn or gn_hybrid) when you want faster convergence during model development, or when FOCE is slow to converge.
Switch to SAEM when:
- FOCE fails to converge or produces implausible estimates
- The model has many random effects (>4)
- You suspect the FOCE solution is a local minimum
- The model has complex nonlinear random effect structure
All methods produce comparable results on well-behaved models. The final OFV from SAEM and GN is computed using the FOCE approximation, so AIC/BIC values are directly comparable.
FOCE / FOCEI
First-Order Conditional Estimation (FOCE) and its interaction variant (FOCEI) are the primary estimation methods in FeRx. They use a two-level nested optimization to find maximum likelihood estimates of population parameters.
Algorithm Overview
Outer Loop (Population Parameters)
The outer loop optimizes the population parameters \( \theta \), \( \Omega \), and \( \sigma \) by minimizing the population objective function value (OFV):
\[ \text{OFV} = -2 \log L = \sum_{i=1}^{N} \text{OFV}_i \]
Parameters are internally transformed for unconstrained optimization:
- Theta: Log-transformed (ensures positivity)
- Omega: Cholesky factorization (ensures positive-definiteness)
- Sigma: Log-transformed (ensures positivity)
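A minimal sketch of these transforms for a 2x2 Omega. This is illustrative Python, not FeRx's internal parameter packing; function names are invented for the example.

```python
import math

def pack_theta(theta):
    # theta > 0  ->  unconstrained via log
    return [math.log(t) for t in theta]

def unpack_theta(x):
    return [math.exp(v) for v in x]

def pack_omega_2x2(omega):
    # Omega = L L^T with lower-triangular L; optimize the entries of L
    l11 = math.sqrt(omega[0][0])
    l21 = omega[1][0] / l11
    l22 = math.sqrt(omega[1][1] - l21 * l21)
    return [l11, l21, l22]

def unpack_omega_2x2(l11, l21, l22):
    # L L^T is positive semi-definite by construction for any real entries
    return [[l11 * l11, l11 * l21],
            [l11 * l21, l21 * l21 + l22 * l22]]

lvec = pack_omega_2x2([[0.09, 0.02], [0.02, 0.04]])
back = unpack_omega_2x2(*lvec)   # round-trips to the original Omega
```

The optimizer works on the packed vector, so any step it takes maps back to a valid (positive, positive-definite) parameter set.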
Inner Loop (Empirical Bayes Estimates)
For each subject, the inner loop finds the empirical Bayes estimate (EBE) of the random effects \( \hat{\eta}_i \) by minimizing the individual negative log-likelihood:
\[ -\log p(\eta_i | y_i, \theta, \Omega, \sigma) = \frac{1}{2} \left[ \eta_i^T \Omega^{-1} \eta_i + \log|\Omega| + \sum_j \left( \frac{(y_{ij} - f_{ij})^2}{V_{ij}} + \log V_{ij} \right) \right] \]
The inner loop uses BFGS with automatic differentiation gradients, falling back to Nelder-Mead simplex if BFGS fails.
FOCE vs FOCEI
Standard FOCE
Uses linearized predictions around the EBEs:
\[ f_0 = f(\hat{\eta}) - H \hat{\eta} \]
where \( H \) is the Jacobian matrix \( \partial f / \partial \eta \). The per-subject objective is:
\[ \text{OFV}_i = (y - f_0)^T \tilde{R}^{-1} (y - f_0) + \log|\tilde{R}| \]
where \( \tilde{R} = H \Omega H^T + R(f_0) \).
FOCEI (Interaction)
Uses individual predictions directly without linearization:
\[ \text{OFV}_i = (y - \hat{f})^T V^{-1} (y - \hat{f}) + \hat{\eta}^T \Omega^{-1} \hat{\eta} + \log|\tilde{R}| \]
FOCEI is more accurate when the residual variance depends on the predicted value (proportional or combined error models), because it captures the interaction between random effects and residual error.
Optimizer Options
NLopt Algorithms (Recommended)
| Algorithm | Key | Description |
|---|---|---|
| SLSQP | slsqp | Sequential Least Squares Programming. Handles bounds well. Default and recommended. |
| L-BFGS | nlopt_lbfgs | Limited-memory BFGS. Good for large parameter spaces. |
| MMA | mma | Method of Moving Asymptotes. Alternative constrained optimizer. |
Built-in Algorithms
| Algorithm | Key | Description |
|---|---|---|
| BFGS | bfgs | Quasi-Newton with backtracking line search. |
| L-BFGS | lbfgs | Memory-efficient BFGS variant. |
Global Search
Enable gradient-free pre-search to help escape local minima:
[fit_options]
method = foce
global_search = true
global_maxeval = 2000
The pre-search uses NLopt's CRS2-LM algorithm (Controlled Random Search) to explore the parameter space before handing off to the local optimizer. The number of evaluations auto-scales with model complexity if global_maxeval is not set.
Covariance Step
When covariance = true, FeRx computes the variance-covariance matrix of the parameter estimates using a finite-difference Hessian at the converged solution. This provides:
- Standard errors (SE) for all parameters
- Relative standard errors (%RSE) for assessing estimation precision
- Omega SEs via delta method from the Cholesky parameterization
Convergence
The outer loop terminates when any of:
- The gradient norm falls below outer_gtol (default 1e-6)
- The maximum number of iterations (maxiter) is reached
- The optimizer reports convergence (NLopt XtolReached or FtolReached)
The inner loop terminates when the gradient norm falls below inner_tol (default 1e-8) or inner_maxiter (default 200) iterations are reached.
Warm Starting
The inner loop warm-starts EBE estimation from the previous outer iteration's EBEs. This significantly reduces computation time, especially in later iterations when parameters change slowly.
Configuration Example
[fit_options]
method = focei
maxiter = 500
covariance = true
optimizer = slsqp
Gauss-Newton (BHHH)
The Gauss-Newton optimizer is an alternative to the standard FOCE/FOCEI approach that exploits the nonlinear-least-squares structure of the FOCE objective. It uses the BHHH (Berndt-Hall-Hall-Hausman) approximation to the Hessian and typically converges in 10-30 iterations, compared to 100+ evaluations for gradient-based methods like SLSQP.
This approach mirrors NONMEM's modified Gauss-Newton algorithm.
Variants
| Method | Key | Description |
|---|---|---|
| Pure Gauss-Newton | gn | Fast convergence for well-conditioned problems |
| GN + FOCEI hybrid | gn_hybrid | Runs GN first, then polishes with FOCEI for robustness |
Algorithm Overview
BHHH Hessian Approximation
The FOCE objective has the form:
\[ \text{OFV} = 2 \sum_{i=1}^{N} \text{NLL}_i \]
The BHHH approximation uses the outer product of per-subject gradients as an approximate Hessian:
\[ H_{\text{BHHH}} = 4 \sum_{i=1}^{N} g_i \, g_i^T \]
where \( g_i = \nabla_x \text{NLL}_i \) is the gradient of the negative log-likelihood for subject \( i \) with respect to the packed parameter vector. Per-subject gradients are computed via central finite differences.
This approximation is equivalent to the Gauss-Newton Hessian for log-likelihood objectives and is guaranteed positive semi-definite.
Levenberg-Marquardt Damping
To improve stability away from the minimum, the Hessian is regularized:
\[ H_{\text{LM}} = H_{\text{BHHH}} + \lambda \, \text{diag}(H_{\text{BHHH}}) \]
The damping factor \( \lambda \) is adapted during optimization:
- On successful step (OFV decreases): \( \lambda \leftarrow 0.3 \, \lambda \) (minimum \( 10^{-6} \))
- On rejected step (OFV increases): \( \lambda \leftarrow 10 \, \lambda \)
- If \( \lambda > 10^6 \), the algorithm stops
The Newton step is obtained by solving:
\[ H_{\text{LM}} \, \delta = -\nabla \text{OFV} \]
via Cholesky decomposition. If the Cholesky fails (singular Hessian), \( \lambda \) is increased and the iteration is retried.
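The BHHH build-up and damped Newton solve can be sketched for a toy 2-parameter problem. This is a Python stand-in: the per-subject gradients are made up, and the closed-form 2x2 solve stands in for the Cholesky solve described above.

```python
def bhhh_hessian(per_subject_grads):
    # H = 4 * sum_i g_i g_i^T (sum of outer products, guaranteed PSD)
    n = len(per_subject_grads[0])
    h = [[0.0] * n for _ in range(n)]
    for g in per_subject_grads:
        for a in range(n):
            for b in range(n):
                h[a][b] += 4.0 * g[a] * g[b]
    return h

def lm_damped(h, lam):
    # H_LM = H + lambda * diag(H)
    n = len(h)
    return [[h[a][b] + (lam * h[a][a] if a == b else 0.0)
             for b in range(n)] for a in range(n)]

def solve_newton_2x2(h, grad):
    # Solve H_LM * delta = -grad (closed form for 2x2)
    det = h[0][0] * h[1][1] - h[0][1] * h[1][0]
    return [(-grad[0] * h[1][1] + grad[1] * h[0][1]) / det,
            (-grad[1] * h[0][0] + grad[0] * h[1][0]) / det]

grads = [[0.5, -0.2], [0.3, 0.1], [-0.1, 0.4]]      # made-up g_i values
h = bhhh_hessian(grads)
g = [sum(gr[k] for gr in grads) for k in range(2)]  # total gradient
hlm = lm_damped(h, 0.01)                            # initial lambda = 0.01
step = solve_newton_2x2(hlm, g)
```

On a successful step the caller would shrink `lam` and repeat; on a rejected step it would grow `lam` and re-solve, as described above.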
Line Search
Each step uses backtracking line search (up to 15 halvings). At each trial point, EBEs are re-estimated via the inner loop (warm-started from the previous EBEs), ensuring the FOCE objective is evaluated consistently.
Convergence
The algorithm converges when:
- Relative OFV change \( < 10^{-6} \) for at least 3 iterations, or
- Maximum iterations are reached
Hybrid Mode (GN + FOCEI)
When method = gn_hybrid, the algorithm runs in two phases:
- GN phase: Run Gauss-Newton iterations to quickly find the basin of the minimum
- FOCEI polish: Run up to 100 iterations of standard FOCEI optimization (via NLopt SLSQP) warm-started from the GN result
The final result is whichever phase produced the lower OFV. This combines the fast convergence of Gauss-Newton with the refined accuracy of FOCEI.
Configuration
[fit_options]
method = gn # or gn_hybrid
maxiter = 100 # max GN iterations
covariance = true
The initial Levenberg-Marquardt damping factor defaults to 0.01 and adapts automatically.
When to Use
Use gn when:
- You want fast convergence and the model is well-conditioned
- You are iterating on model development and need quick feedback
Use gn_hybrid when:
- You want robustness against local minima
- The model is complex (many parameters or random effects)
- You want the speed of GN with a safety net
Stick with foce/focei when:
- The model has convergence issues with GN (rare)
- You need exact compatibility with standard FOCE/FOCEI results
SAEM
Stochastic Approximation Expectation-Maximization (SAEM) is an alternative estimation method that uses MCMC sampling for random effects instead of MAP optimization. It is more robust to local minima and can handle complex random effect structures.
Algorithm Overview
SAEM replaces the deterministic inner loop of FOCE with stochastic sampling, following the Monolix convention with a two-phase step-size schedule.
References
- Delyon, Lavielle, Moulines (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, 94--128.
- Kuhn & Lavielle (2004). Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM: Probability and Statistics 8:115--131.
Two-Phase Schedule
Phase 1: Exploration (iterations 1 to K1)
Step size \( \gamma_k = 1 \). The algorithm explores the parameter space rapidly, with the sufficient statistics being fully replaced each iteration. This allows fast movement toward the basin of the MLE.
Default: 150 iterations.
Phase 2: Convergence (iterations K1+1 to K1+K2)
Step size \( \gamma_k = 1/(k - K_1) \). The algorithm performs a decreasing-weight average, which guarantees almost-sure convergence to the MLE under regularity conditions.
Default: 250 iterations.
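The schedule itself is one line. With \( K_1 = 150 \), this reproduces the gamma values visible in the example SAEM log later in this chapter (1.000 while exploring, 0.020 at iteration 200, 0.004 at iteration 400):

```python
def gamma(k, k1):
    # Phase 1 (k <= K1): gamma_k = 1; Phase 2 (k > K1): gamma_k = 1/(k - K1)
    return 1.0 if k <= k1 else 1.0 / (k - k1)

sched = [gamma(k, 150) for k in (1, 150, 151, 200, 400)]
```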
Per-Iteration Steps
Each SAEM iteration consists of:
1. E-Step: Metropolis-Hastings Sampling
For each subject, run n_mh_steps Metropolis-Hastings iterations to sample from the conditional distribution of random effects:
\[ p(\eta_i | y_i, \theta, \Omega, \sigma) \]
Proposal: \( \eta_{\text{prop}} = \eta_{\text{current}} + \delta_i \cdot L \cdot z \), where \( z \sim N(0, I) \) and \( L = \text{chol}(\Omega) \).
Acceptance: Accept with probability \( \min(1, \exp(\text{NLL}_{\text{current}} - \text{NLL}_{\text{prop}})) \).
The MH sampling is parallelized across subjects using Rayon.
2. Stochastic Approximation Update
Update the sufficient statistic for \( \Omega \):
\[ S_2 \leftarrow (1 - \gamma_k) \cdot S_2 + \gamma_k \cdot \frac{1}{N} \sum_{i=1}^{N} \eta_i \eta_i^T \]
3. M-Step for Omega (Closed Form)
\[ \Omega_k = S_2 \]
4. M-Step for Theta and Sigma (Optimization)
Minimize the conditional observation negative log-likelihood with ETAs held fixed:
\[ \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left[ \frac{1}{2} \log V_{ij} + \frac{1}{2} \frac{(y_{ij} - f_{ij})^2}{V_{ij}} \right] \]
This is optimized over \( \theta \) and \( \sigma \) in log-space using NLopt SLSQP with finite-difference gradients.
5. Adaptive MH Step Sizes
Every adapt_interval iterations, the per-subject step sizes \( \delta_i \) are adjusted:
- If acceptance rate > 40%: increase \( \delta_i \) by 10% (up to 5.0)
- If acceptance rate < 40%: decrease \( \delta_i \) by 10% (down to 0.01)
This targets an acceptance rate around 40%, balancing exploration and mixing.
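Steps 1 and 2 can be sketched with a toy one-dimensional eta. Everything here is a hand-written stand-in: the scalar model, the numbers, and the single random effect are illustrative, not FeRx internals.

```python
import math
import random

def neg_log_post(eta, y, theta, omega, sigma):
    # -log p(eta | y) up to a constant: prior term + one observation term
    f = theta * math.exp(eta)                 # toy individual prediction
    return 0.5 * eta * eta / omega + 0.5 * (y - f) ** 2 / sigma

def mh_step(eta, delta, y, theta, omega, sigma, rng):
    # Proposal: eta_prop = eta + delta * L * z, with L = sqrt(omega) in 1-D
    prop = eta + delta * math.sqrt(omega) * rng.gauss(0.0, 1.0)
    log_alpha = (neg_log_post(eta, y, theta, omega, sigma)
                 - neg_log_post(prop, y, theta, omega, sigma))
    # Accept with probability min(1, exp(NLL_current - NLL_prop))
    if log_alpha >= 0.0 or rng.random() < math.exp(log_alpha):
        return prop
    return eta

rng = random.Random(1)
eta, s2, gamma_k = 0.0, 0.09, 0.5
for _ in range(3):                            # n_mh_steps = 3
    eta = mh_step(eta, 1.0, 12.0, 10.0, 0.09, 1.0, rng)
# Stochastic approximation: S2 <- (1 - gamma_k) * S2 + gamma_k * eta^2
s2 = (1.0 - gamma_k) * s2 + gamma_k * eta * eta
```

In FeRx the sampler is multivariate and parallelized across subjects, and `s2` is the matrix sufficient statistic that becomes the next Omega in the M-step.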
Post-SAEM Finalization
After the SAEM iterations complete:
- EBE Refinement: Run the standard FOCE inner loop (BFGS optimization) warm-started from the SAEM ETAs to obtain final empirical Bayes estimates
- FOCE OFV: Compute the objective function using the FOCE/Laplace approximation, so AIC and BIC are directly comparable with FOCE results
- Covariance Step: Optionally compute standard errors via finite-difference Hessian (same method as FOCE)
- Diagnostics: Compute PRED, IPRED, CWRES, IWRES for each subject
Configuration
[fit_options]
method = saem
n_exploration = 150 # Phase 1 iterations
n_convergence = 250 # Phase 2 iterations
n_mh_steps = 3 # MH steps per subject per iteration
adapt_interval = 50 # Step-size adaptation frequency
seed = 12345 # RNG seed for reproducibility
covariance = true # Compute standard errors
Tuning Guide
Not Converging
- Increase n_exploration (e.g., 300) to give more time for basin finding
- Increase n_convergence (e.g., 500) for a longer averaging window
- Increase n_mh_steps (e.g., 5-10) for better mixing in the E-step
Slow Convergence
- Decrease n_exploration and n_convergence if parameters stabilize early
- Use adapt_interval = 25 for faster step-size adaptation
Reproducibility
- Always set seed for reproducible results
- Different seeds will produce slightly different estimates due to the stochastic nature of the algorithm
Output
The SAEM iteration progress is printed to stderr:
SAEM: 10 subjects, 3 ETAs, 400 total iter (150 explore + 250 converge)
SAEM iter 1/400 [explore] gamma=1.000 condNLL=95.244
SAEM iter 50/400 [explore] gamma=1.000 condNLL=56.705
SAEM iter 150/400 [explore] gamma=1.000 condNLL=46.071
SAEM iter 200/400 [converge] gamma=0.020 condNLL=36.799
SAEM iter 400/400 [converge] gamma=0.004 condNLL=38.096
SAEM iterations complete. Computing final EBEs and OFV...
The condNLL is the conditional observation negative log-likelihood (not the final OFV). It should generally decrease during the exploration phase and stabilize during convergence.
Sampling Importance Resampling (SIR)
SIR is an optional post-estimation step that provides non-parametric parameter uncertainty estimates. It produces 95% confidence intervals that are more robust than the asymptotic covariance matrix, particularly for models with:
- Non-normal parameter distributions
- Boundary estimates (parameters near constraints)
- Small datasets where asymptotic assumptions may not hold
How It Works
SIR uses the maximum likelihood estimates and their covariance matrix as a proposal distribution, then reweights samples based on the actual likelihood:
- Sample: Draw M parameter vectors from a multivariate normal distribution centered on the ML estimates, using the estimation covariance matrix
- Importance weighting: For each sample, compute the objective function value (OFV) and calculate an importance weight based on the ratio of the true likelihood to the proposal density
- Resample: Draw m vectors (with replacement) proportional to the importance weights
The resampled vectors approximate the true parameter uncertainty distribution. Confidence intervals are derived from their empirical percentiles.
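The three steps can be sketched for a single scalar parameter. The stand-in log-likelihood below plays the role of FeRx's OFV evaluation; all names and numbers are illustrative.

```python
import math
import random

def sir(mle, se, loglik, n_samples, n_resamples, rng):
    # 1. Sample M proposals from N(mle, se^2)
    props = [rng.gauss(mle, se) for _ in range(n_samples)]
    # 2. Importance weights: log-likelihood minus log proposal density
    #    (both up to constants, which cancel after normalization)
    logw = [loglik(p) + 0.5 * ((p - mle) / se) ** 2 for p in props]
    mx = max(logw)
    w = [math.exp(lw - mx) for lw in logw]
    # 3. Resample m vectors with replacement, proportional to weights
    res = sorted(rng.choices(props, weights=w, k=n_resamples))
    # 95% CI from the empirical 2.5th / 97.5th percentiles
    return res[int(0.025 * n_resamples)], res[int(0.975 * n_resamples)]

rng = random.Random(42)
# Stand-in "true" log-likelihood: uncertainty is N(0.13, 0.02^2)
loglik = lambda p: -0.5 * ((p - 0.13) / 0.02) ** 2
lo, hi = sir(0.13, 0.03, loglik, 1000, 250, rng)
```

Because the proposal (sd 0.03) is deliberately wider than the true uncertainty (sd 0.02), the reweighting tightens the interval, which is exactly what SIR corrects for.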
Enabling SIR
Add sir = true to the [fit_options] block. The covariance step must also be enabled (it provides the proposal distribution):
[fit_options]
method = focei
covariance = true
sir = true
Options
| Key | Default | Description |
|---|---|---|
| sir | false | Enable/disable SIR |
| sir_samples | 1000 | Number of proposal samples (M). Higher values give more reliable weights but take longer |
| sir_resamples | 250 | Number of resampled vectors (m). Must be less than sir_samples |
| sir_seed | 12345 | RNG seed for reproducibility |
Output
SIR adds the following to the estimation output:
- 95% CI for each theta, omega, and sigma parameter (2.5th and 97.5th percentiles)
- Effective sample size (ESS): a diagnostic indicating how well the proposal distribution matches the true uncertainty. ESS close to M indicates a good match; ESS much less than m suggests the proposal is a poor fit
Diagnostics
The effective sample size (ESS) is the primary diagnostic:
- ESS > m (resamples): excellent -- the proposal distribution is well-matched
- ESS between 100 and m: adequate for most purposes
- ESS < 100: the proposal may be a poor fit; consider a different estimation method or increasing sir_samples
Computational Cost
SIR evaluates the inner loop (EBE optimization) for each of the M proposal samples. With the default M=1000, this is roughly 3-10x the cost of the estimation step itself. The computation is parallelized across samples and warm-started from the ML EBEs to minimize runtime.
The resampling step itself is negligible.
Example
[fit_options]
method = focei
covariance = true
sir = true
sir_samples = 1000
sir_resamples = 250
sir_seed = 42
Reference
Dosne A-G, Bergstrand M, Karlsson MO. "Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling." J Pharmacokinet Pharmacodyn. 2017;44(6):539-562. doi:10.1007/s10928-017-9542-0
Data Format
FeRx reads data in NONMEM-compatible CSV format. This is the standard format used across population PK tools.
Required Columns
| Column | Type | Description |
|---|---|---|
| ID | string/numeric | Subject identifier |
| TIME | numeric | Time relative to first dose |
| DV | numeric | Dependent variable (observed concentration) |
Optional Standard Columns
| Column | Type | Default | Description |
|---|---|---|---|
| EVID | integer | 0 | Event ID: 0 = observation, 1 = dose, 4 = reset + dose |
| AMT | numeric | 0 | Dose amount (only for EVID=1 or EVID=4) |
| CMT | integer | 1 | Compartment number (1-indexed) |
| RATE | numeric | 0 | Infusion rate. 0 = bolus dose |
| MDV | integer | 0 | Missing DV flag. 1 = DV should be ignored |
| II | numeric | 0 | Interdose interval for repeated dosing |
| SS | integer | 0 | Steady-state flag. 1 = assume steady state |
| CENS | integer | 0 | Censoring flag. 1 = observation is below LLOQ; DV carries the LLOQ value. Paired with bloq_method = m3 in [fit_options] to enable likelihood-based handling -- see BLOQ example. |
Covariate Columns
Any column not in the standard set above is automatically treated as a covariate. Covariate values are:
- Time-constant: The first non-missing value for each subject is used
- Time-varying: If values change over time for a subject, Last Observation Carried Forward (LOCF) is applied
Covariate names in the data file are matched case-insensitively to names used in [individual_parameters] expressions.
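These two rules can be sketched with a pair of illustrative helpers (not FeRx's data reader; back-filling leading missing values from the first observation is an assumption of this sketch).

```python
STANDARD = {"ID", "TIME", "DV", "EVID", "AMT", "CMT",
            "RATE", "MDV", "II", "SS", "CENS"}

def covariate_columns(header):
    # Anything outside the standard NONMEM set is a covariate
    # (matched case-insensitively)
    return [c for c in header if c.upper() not in STANDARD]

def locf(values):
    # Last Observation Carried Forward over a subject's records
    out, last = [], None
    for v in values:
        if v is not None:
            last = v
        out.append(last)
    # Leading missings take the first observed value (sketch assumption)
    first = next(v for v in out if v is not None)
    return [first if v is None else v for v in out]

cols = covariate_columns(["ID", "TIME", "DV", "EVID", "AMT", "WT", "CRCL"])
wt = locf([None, 70.0, None, 72.0, None])
```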
Event Types (EVID)
| EVID | Meaning |
|---|---|
| 0 | Observation record. DV is used for estimation. |
| 1 | Dosing record. AMT is administered to compartment CMT. |
| 4 | Reset and dose. All compartment amounts are reset to zero before dosing. |
Example Dataset
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,WT,CRCL
1,0,.,1,100,1,0,1,70,95
1,0.5,9.49,0,.,.,.,0,70,95
1,1,14.42,0,.,.,.,0,70,95
1,2,17.56,0,.,.,.,0,70,95
1,4,15.23,0,.,.,.,0,70,95
1,8,10.15,0,.,.,.,0,70,95
2,0,.,1,150,1,0,1,85,110
2,0.5,14.2,0,.,.,.,0,85,110
2,1,21.3,0,.,.,.,0,85,110
Key points:
- Dose records (EVID=1) have MDV=1 and DV=. (missing)
- Observation records (EVID=0) have MDV=0 and a valid DV
- Covariates (WT, CRCL) are included as extra columns
- Missing values can be represented as . or left empty
Infusion Doses
For IV infusions, set RATE to the infusion rate (amount per time unit):
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV
1,0,.,1,500,1,50,1
This administers 500 units at a rate of 50 units/hour (duration = 10 hours).
Steady-State Dosing
For steady-state simulations, set SS=1 and II to the dosing interval:
ID,TIME,DV,EVID,AMT,CMT,SS,II,MDV
1,0,.,1,100,1,1,12,1
1,0.5,25.3,0,.,.,.,.,0
This assumes the subject has reached steady state with 100 units every 12 hours before the observation at TIME=0.5.
Multiple Doses
Multiple doses are supported as separate rows:
ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,12,.,1,100,1,1
1,12.5,15.2,0,.,.,0
1,24,.,1,100,1,1
1,24.5,18.1,0,.,.,0
Column Name Case
Column names are case-insensitive. ID, Id, and id are all recognized. Covariate columns preserve their case as declared in the CSV header.
CLI Reference
The ferx command-line tool runs population PK estimation from model files and data.
Usage
ferx <model.ferx> --data <data.csv>
ferx <model.ferx> --simulate
Commands
Fit with Data
ferx model.ferx --data data.csv
Parses the model file, reads the data, and runs the estimation method specified in [fit_options] (defaults to FOCE).
Simulate and Fit
ferx model.ferx --simulate
Parses the model file, generates simulated data from the [simulation] block, and fits the model to the simulated data. Requires a [simulation] block in the model file.
Output Files
Three files are generated, named after the model file:
| File | Contents |
|---|---|
| {model}-sdtab.csv | Per-observation diagnostics |
| {model}-fit.yaml | Parameter estimates and standard errors |
| {model}-timing.txt | Wall-clock estimation time |
See Output Files for detailed format descriptions.
Console Output
Progress
The estimation progress is printed to stderr, including:
- Model and data summary (subjects, observations, parameters)
- Optimizer iterations with OFV values (FOCE) or condNLL values (SAEM)
- Covariance step status
- Final parameter table
Result Summary
A brief summary is printed to stdout:
Fit completed!
OFV: -280.1838
Elapsed: 0.496s
TVCL = 0.132735
TVV = 7.694842
TVKA = 0.757498
Exit Codes
| Code | Meaning |
|---|---|
| 0 | Success |
| 1 | Error (parse failure, data error, convergence failure) |
Examples
# One-compartment oral warfarin model
ferx examples/warfarin.ferx --data data/warfarin.csv
# Two-compartment IV with FOCE
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
# Simulation-estimation study
ferx examples/warfarin.ferx --simulate
Building
# Build the ferx binary
cargo build --release --features autodiff
# Run directly via cargo
cargo run --release --features autodiff --bin ferx -- model.ferx --data data.csv
Output Files
Each model run produces three output files.
sdtab CSV ({model}-sdtab.csv)
A CSV file with per-observation diagnostics, one row per observation per subject.
Columns
| Column | Description |
|---|---|
| ID | Subject identifier |
| TIME | Observation time |
| DV | Observed value |
| PRED | Population prediction (eta = 0) |
| IPRED | Individual prediction (eta = EBE) |
| CWRES | Conditional weighted residual |
| IWRES | Individual weighted residual |
| ETA1, ETA2, ... | Empirical Bayes estimates of random effects |
Residual Definitions
IWRES (Individual Weighted Residual): \[ \text{IWRES}_j = \frac{y_j - \text{IPRED}_j}{\sqrt{V_j}} \] where \( V_j \) is the residual variance evaluated at the individual prediction.
CWRES (Conditional Weighted Residual): \[ \text{CWRES}_j = \frac{y_j - f_{0,j}}{\sqrt{\tilde{R}_{jj}}} \] where \( f_0 = f(\hat{\eta}) - H\hat{\eta} \) is the linearized population prediction and \( \tilde{R} = H\Omega H^T + R \) is the conditional variance.
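For a proportional error model, \( V_j = \sigma \cdot \text{IPRED}_j^2 \), so IWRES reduces to a one-liner. The numbers below reuse the first row of the example sdtab and recover its IWRES of about -0.06; this is a sketch, not FeRx's diagnostics code (CWRES additionally needs the linearization matrices H and Omega, omitted here).

```python
import math

def iwres(dv, ipred, sigma_prop):
    # Proportional error: V_j = sigma * IPRED_j^2
    v = sigma_prop * ipred * ipred
    return (dv - ipred) / math.sqrt(v)

# DV = 9.49, IPRED = 9.55, sigma = 0.010638 (example values)
r = iwres(9.49, 9.55, 0.010638)   # about -0.061
```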
Example
ID,TIME,DV,PRED,IPRED,CWRES,IWRES,ETA1,ETA2,ETA3
1,0.5,9.49,10.12,9.55,-0.23,-0.06,0.15,-0.08,0.32
1,1.0,14.42,14.87,14.35,0.18,0.05,0.15,-0.08,0.32
Fit YAML ({model}-fit.yaml)
A YAML file containing parameter estimates, standard errors, and model diagnostics.
Structure
model:
converged: true
method: FOCE
objective_function:
ofv: -280.1838
aic: -266.1838
bic: -247.2804
data:
n_subjects: 10
n_observations: 110
n_parameters: 7
theta:
TVCL:
estimate: 0.132735
se: 0.014549
rse_pct: 11.0
TVV:
estimate: 7.694842
se: 0.293028
rse_pct: 3.8
omega:
omega_11:
variance: 0.028584
cv_pct: 16.9
se: 0.006394
omega_22:
variance: 0.009613
cv_pct: 9.8
se: 0.002165
sigma:
sigma_1:
estimate: 0.010638
se: 0.000788
Key Fields
- ofv: Objective Function Value (-2 log-likelihood)
- aic: Akaike Information Criterion (OFV + 2p)
- bic: Bayesian Information Criterion (OFV + p * ln(n))
- se: Standard error from the covariance step
- rse_pct: Relative standard error as percentage (SE/estimate * 100)
- cv_pct: Coefficient of variation for omega (sqrt(variance) * 100)
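These relationships can be checked directly against the example YAML above (p = 7 parameters, n = 110 observations):

```python
import math

ofv = -280.1838              # -2 log-likelihood at the optimum
p, n = 7, 110                # parameters, observations
aic = ofv + 2 * p            # OFV + 2p      -> -266.1838
bic = ofv + p * math.log(n)  # OFV + p*ln(n) -> about -247.2804
```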
Timing File ({model}-timing.txt)
A single-line text file with the wall-clock estimation time:
elapsed_seconds=0.496000
This measures only the estimation step (not parsing or data reading).
Examples
FeRx includes several example models in the examples/ directory with corresponding datasets in data/.
| Example | Model | Route | Features |
|---|---|---|---|
| One-Compartment Oral | 1-cpt | Oral | Basic PopPK, FOCE and SAEM |
| Two-Compartment IV | 2-cpt | IV bolus | Multi-compartment dynamics |
| Covariates | 2-cpt | Oral | Weight and renal function effects |
| ODE Model | 1-cpt | Oral | Michaelis-Menten elimination, ODE solver |
| BLOQ (M3 method) | 1-cpt | Oral | Likelihood-based handling of censored observations |
Running Examples
# One-compartment oral (warfarin)
ferx examples/warfarin.ferx --data data/warfarin.csv
# Two-compartment IV
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
# Covariates model
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv
# ODE model (Michaelis-Menten)
ferx examples/mm_oral.ferx --data data/mm_oral.csv
# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
# BLOQ (M3 method)
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv
One-Compartment Oral (Warfarin)
This example fits a one-compartment oral pharmacokinetic model to warfarin concentration data.
Model File (warfarin.ferx)
# One-compartment oral PK model (warfarin)
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 300
covariance = true
Model Description
- Structure: One-compartment model with first-order absorption (KA) and first-order elimination (CL/V)
- Random effects: Log-normal on CL, V, and KA (exponential eta model)
- Error model: Proportional residual error
- Parameters:
- TVCL: Typical value of clearance (L/h)
- TVV: Typical value of volume of distribution (L)
- TVKA: Typical value of absorption rate constant (1/h)
Running
ferx examples/warfarin.ferx --data data/warfarin.csv
Expected Results
--- THETA Estimates ---
TVCL 0.132735
TVV 7.694842
TVKA 0.757498
--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584 (CV% = 16.9)
OMEGA(2,2) = 0.009613 (CV% = 9.8)
OMEGA(3,3) = 0.340868 (CV% = 58.4)
--- SIGMA Estimates ---
SIGMA(1) = 0.010638
SAEM Variant
The same model can be run with SAEM by changing the [fit_options]:
[fit_options]
method = saem
n_exploration = 150
n_convergence = 250
n_mh_steps = 3
seed = 12345
covariance = true
ferx examples/warfarin_saem.ferx --data data/warfarin.csv
Two-Compartment IV Bolus
This example fits a two-compartment IV bolus model with four random effects.
Model File (two_cpt_iv.ferx)
# Two-compartment IV bolus PK model
[parameters]
theta TVCL(5.0, 0.01, 100.0)
theta TVV1(50.0, 0.1, 1000.0)
theta TVQ(10.0, 0.01, 200.0)
theta TVV2(100.0, 0.1, 5000.0)
omega ETA_CL ~ 0.09
omega ETA_V1 ~ 0.04
omega ETA_Q ~ 0.04
omega ETA_V2 ~ 0.09
sigma PROP_ERR ~ 0.04
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V1 = TVV1 * exp(ETA_V1)
Q = TVQ * exp(ETA_Q)
V2 = TVV2 * exp(ETA_V2)
[structural_model]
pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = foce
maxiter = 500
covariance = true
Model Description
- Structure: Two-compartment model with central (V1) and peripheral (V2) compartments connected by intercompartmental clearance (Q)
- Route: Intravenous bolus (dose goes directly into central compartment)
- Random effects: Log-normal on all four PK parameters
- Parameters:
- CL: Systemic clearance (L/h)
- V1: Central volume of distribution (L)
- Q: Intercompartmental clearance (L/h)
- V2: Peripheral volume of distribution (L)
The bi-exponential concentration profile is characterized by a rapid distribution phase (alpha) followed by a slower elimination phase (beta).
Running
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv
Notes
- Two-compartment models have more parameters and may need more iterations to converge
- The global_search = true option can help if convergence is difficult
- Consider reducing random effects (e.g., fixing Q or V2 variability) if the model is overparameterized
Covariate Model
This example demonstrates a two-compartment oral model with body weight (WT) and creatinine clearance (CRCL) as covariates on clearance.
Model File (two_cpt_oral_cov.ferx)
# Two-compartment oral PK model with covariates
[parameters]
theta TVCL(5.0, 0.01, 100.0)
theta TVV1(50.0, 0.1, 1000.0)
theta TVQ(10.0, 0.01, 200.0)
theta TVV2(100.0, 0.1, 5000.0)
theta TVKA(1.0, 0.01, 50.0)
theta THETA_WT(0.75, 0.01, 2.0)
theta THETA_CRCL(0.5, 0.01, 2.0)
omega ETA_CL ~ 0.09
omega ETA_V1 ~ 0.04
omega ETA_Q ~ 0.04
omega ETA_V2 ~ 0.09
omega ETA_KA ~ 0.25
sigma PROP_ERR ~ 0.04
[individual_parameters]
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
V1 = TVV1 * exp(ETA_V1)
Q = TVQ * exp(ETA_Q)
V2 = TVV2 * exp(ETA_V2)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 500
covariance = true
Covariate Effects
The clearance equation includes two covariate effects:
CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
- (WT/70)^THETA_WT: Allometric scaling of clearance with body weight, centered at 70 kg. THETA_WT is estimated (expected ~0.75 for CL).
- (CRCL/100)^THETA_CRCL: Renal function effect on clearance, centered at 100 mL/min. THETA_CRCL is estimated.
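Concretely, a subject's clearance can be reproduced outside the engine. This sketch (plain Rust; the helper function is hypothetical, not a FeRx API) evaluates the equation above for a reference subject and a smaller subject with impaired renal function:

```rust
// Sketch: evaluate CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
// outside the engine. The function name is hypothetical.
fn individual_cl(tvcl: f64, theta_wt: f64, theta_crcl: f64,
                 wt: f64, crcl: f64, eta_cl: f64) -> f64 {
    tvcl * (wt / 70.0).powf(theta_wt) * (crcl / 100.0).powf(theta_crcl) * eta_cl.exp()
}

fn main() {
    // Reference subject (WT = 70 kg, CRCL = 100 mL/min, eta = 0):
    // both covariate terms equal 1, so CL equals TVCL.
    let cl_ref = individual_cl(5.0, 0.75, 0.5, 70.0, 100.0, 0.0);
    println!("reference CL = {cl_ref}");

    // A 50 kg subject with CRCL = 60 mL/min has a lower typical clearance.
    let cl_low = individual_cl(5.0, 0.75, 0.5, 50.0, 60.0, 0.0);
    println!("CL = {cl_low:.3}");
}
```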
Data Requirements
The dataset must include WT and CRCL columns:
ID,TIME,DV,EVID,AMT,CMT,MDV,WT,CRCL
1,0,.,1,100,1,1,72.5,105
1,0.5,12.3,0,.,.,0,72.5,105
1,1.0,18.7,0,.,.,0,72.5,105
Covariate columns are automatically detected -- any column not in the standard NONMEM set is treated as a covariate. The names in the data file must match those used in [individual_parameters] (case-insensitive).
Running
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv
Notes
- FOCEI is used here because the proportional error model creates an interaction between random effects and residual error
- Covariate centering (dividing by 70 for weight, 100 for CRCL) improves numerical stability and makes the typical value (TVCL) interpretable as the clearance for a 70 kg patient with CRCL of 100 mL/min
- The estimated covariate exponents (THETA_WT, THETA_CRCL) have standard errors that can be used to test significance
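As a sketch of that significance test (with illustrative numbers, not FeRx output): a Wald z-statistic compares the estimated exponent to zero, its value under "no covariate effect":

```rust
// Sketch of a Wald test for a covariate exponent. The estimate and SE
// here are illustrative placeholders, not FeRx results.
fn wald_z(estimate: f64, null_value: f64, se: f64) -> f64 {
    (estimate - null_value) / se
}

fn main() {
    let (theta_crcl, se) = (0.48, 0.11); // illustrative numbers
    // Test H0: THETA_CRCL = 0 (no renal-function effect on CL)
    let z = wald_z(theta_crcl, 0.0, se);
    if z.abs() > 1.96 {
        println!("significant at the 5% level (z = {z:.2})");
    }
}
```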
ODE Model (Michaelis-Menten)
This example demonstrates a one-compartment oral model with saturable (Michaelis-Menten) elimination, solved using the ODE integrator.
Model File (mm_oral.ferx)
# One-compartment oral model with Michaelis-Menten elimination
[parameters]
theta TVVMAX(10.0, 0.1, 1000.0)
theta TVKM(2.0, 0.01, 100.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)
omega ETA_VMAX ~ 0.09
omega ETA_V ~ 0.04
sigma PROP_ERR ~ 0.1
[individual_parameters]
VMAX = TVVMAX * exp(ETA_VMAX)
KM = TVKM
V = TVV * exp(ETA_V)
KA = TVKA
[structural_model]
ode(obs_cmt=central, states=[depot, central])
[odes]
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 500
covariance = true
Model Description
Why Use ODEs
Michaelis-Menten (saturable) elimination cannot be described by a simple exponential function. At high concentrations, elimination approaches a maximum rate (VMAX), leading to nonlinear pharmacokinetics. This requires numerical ODE integration.
ODE System
d/dt(depot) = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
- depot: Absorption compartment. First-order absorption at rate KA.
- central: Central compartment. Drug enters from depot and is eliminated by Michaelis-Menten kinetics.
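For readers who want the system as executable code, here is the right-hand side as a plain Rust function of the state vector, a sketch of the function an RK45 step evaluates (not the FeRx-internal representation):

```rust
// Sketch: right-hand side of the ODE system above for state [depot, central].
// This is what an adaptive integrator evaluates at each step; not FeRx code.
fn mm_rhs(y: &[f64; 2], ka: f64, v: f64, vmax: f64, km: f64) -> [f64; 2] {
    let (depot, central) = (y[0], y[1]);
    [
        -ka * depot,                                       // d/dt(depot)
        ka * depot / v - vmax * central / (km + central),  // d/dt(central)
    ]
}

fn main() {
    // State just after a 100 mg oral dose: everything in the depot.
    // KA = 1, V = 10, VMAX = 10, KM = 2 (the initial estimates above).
    let dy = mm_rhs(&[100.0, 0.0], 1.0, 10.0, 10.0, 2.0);
    println!("d/dt(depot) = {}, d/dt(central) = {}", dy[0], dy[1]);
}
```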
Parameters
- VMAX: Maximum elimination rate (amount/time)
- KM: Michaelis-Menten constant (concentration at half-maximal rate)
- V: Volume of distribution
- KA: First-order absorption rate constant
Note that only VMAX and V have random effects. KM and KA are treated as population parameters (no between-subject variability).
Structural Model Declaration
[structural_model]
ode(obs_cmt=central, states=[depot, central])
- `obs_cmt=central`: The `central` compartment is observed (matched to DV in the data)
- `states=[depot, central]`: Two state variables (compartments)
Running
ferx examples/mm_oral.ferx --data data/mm_oral.csv
Notes
- ODE models are slower than analytical models because each prediction requires numerical integration
- The ODE solver uses adaptive step-size control (Dormand-Prince RK45) for accuracy
- For standard first-order kinetics, use the analytical `one_cpt_oral` model instead -- it is much faster
- The `central` state in this model is expressed in concentration units, because the absorption input is divided by V (`KA * depot / V`), so it can be matched directly to DV. If your states track amounts instead, divide by V when forming the observed prediction
BLOQ (M3 method)
This example shows how to fit a model to data that contains observations below
the assay lower limit of quantification (BLOQ) using Beal's M3 method.
Instead of dropping BLOQ rows — which biases terminal-phase parameter estimates
— each censored observation contributes
P(y < LLOQ | θ, η) = Φ((LLOQ − f)/√V) to the likelihood.
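The censored-row contribution is easy to compute directly. The sketch below (plain Rust; the `erf` approximation is Abramowitz & Stegun 7.1.26, not FeRx code) evaluates the -2 log-likelihood contribution -2 log Φ((LLOQ − f)/√V) for one row:

```rust
// Sketch: M3 contribution of one BLOQ row, -2 * log Phi((LLOQ - f)/sqrt(V)),
// with Phi built from an erf approximation (Abramowitz & Stegun 7.1.26,
// absolute error ~1.5e-7). Illustrative only.
fn erf(x: f64) -> f64 {
    let sign = if x < 0.0 { -1.0 } else { 1.0 };
    let x = x.abs();
    let t = 1.0 / (1.0 + 0.3275911 * x);
    let poly = t * (0.254829592
        + t * (-0.284496736
        + t * (1.421413741
        + t * (-1.453152027 + t * 1.061405429))));
    sign * (1.0 - poly * (-x * x).exp())
}

fn std_normal_cdf(z: f64) -> f64 {
    0.5 * (1.0 + erf(z / std::f64::consts::SQRT_2))
}

/// -2 * log-likelihood contribution of one censored observation under M3.
fn m3_contribution(lloq: f64, f: f64, variance: f64) -> f64 {
    let z = (lloq - f) / variance.sqrt();
    -2.0 * std_normal_cdf(z).ln()
}

fn main() {
    // Model predicts f = 1.5 with residual SD 0.3; LLOQ = 2.0.
    // The prediction is well below the LLOQ, so P(y < LLOQ) is near 1
    // and the contribution is small.
    println!("{:.4}", m3_contribution(2.0, 1.5, 0.09));
}
```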
Dataset (data/warfarin-bloq.csv)
This is the warfarin dataset with an added CENS column. Ten late-time
observations that originally fell below an assay LLOQ of 2.0 µg/mL have been
marked CENS=1 and their DV cell set to the LLOQ value:
ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,CENS
1,0,.,1,100,1,0,1,0
1,0.5,5.3653,0,.,1,0,0,0
...
1,96,2.5019,0,.,1,0,0,0
1,120,2,0,.,1,0,0,1
NONMEM CENS convention: when CENS=1, the row is censored and DV
carries the LLOQ value — not the true (unobserved) concentration. Rows with no
CENS column, or CENS=0, are treated as ordinary quantified observations.
Model File (examples/warfarin-bloq.ferx)
# One-compartment oral PK model (warfarin) with M3 BLOQ likelihood.
[parameters]
theta TVCL(0.2, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
theta TVKA(1.5, 0.01, 50.0)
omega ETA_CL ~ 0.09
omega ETA_V ~ 0.04
omega ETA_KA ~ 0.30
sigma PROP_ERR ~ 0.02
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV * exp(ETA_V)
KA = TVKA * exp(ETA_KA)
[structural_model]
pk one_cpt_oral(cl=CL, v=V, ka=KA)
[error_model]
DV ~ proportional(PROP_ERR)
[fit_options]
method = focei
maxiter = 300
covariance = true
bloq_method = m3
The only additions relative to a plain fit are `bloq_method = m3` in `[fit_options]` and the `CENS` column in the CSV. Nothing else in the model changes.
Running
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv
Expected Results
--- Objective Function ---
OFV: -213.9104
--- THETA Estimates ---
TVCL 0.126263
TVV 7.629401
TVKA 1.083302
--- OMEGA Estimates ---
OMEGA(1,1) = 0.031381 (CV% = 17.7)
OMEGA(2,2) = 0.009723 (CV% = 9.9)
OMEGA(3,3) = 0.419327 (CV% = 64.8)
--- SIGMA Estimates ---
SIGMA(1) = 0.010766
The diagnostic table (warfarin-bloq-sdtab.csv) gains a CENS column, and the
IWRES / CWRES cells for censored rows are written as empty (a weighted
Gaussian residual is undefined when the observed value is censored).
Method Notes
- Activation requires both pieces: the `CENS` column in the data file and `bloq_method = m3` in `[fit_options]`. Without the option, `CENS=1` rows are treated as ordinary observations at the LLOQ value, which biases the fit.
- FOCE is auto-promoted to FOCEI on affected subjects. Mixing linearized Gaussian residuals with non-linearized `log Φ` terms produces inconsistent OFVs near the LLOQ boundary, so when `method = foce` and a subject has any `CENS=1` row, that subject is evaluated with η-interaction. A notice is written to `FitResult.warnings`; set `method = focei` explicitly to silence it.
- Gauss-Newton caveat. With `method = gn` or `gn_hybrid`, the BHHH information-matrix approximation degrades as the BLOQ fraction grows (each censored row carries less Fisher information than its Gaussian counterpart). A warning is emitted; for >20% censoring prefer `focei`.
- SAEM optimizes θ/σ with M3 in the M-step directly, so no special handling is required beyond setting `bloq_method = m3`.
Rust API
FeRx-NLME can be used as a Rust library for embedding population PK estimation in your own applications.
Adding as a Dependency
[dependencies]
ferx-nlme = { path = "../ferx-nlme", features = ["autodiff"] }
Quick Example
```rust
use ferx_nlme::*;
use std::path::Path;

fn main() -> Result<(), String> {
    // Parse model and data
    let parsed = parse_full_model_file(Path::new("model.ferx"))?;
    let population = read_nonmem_csv(Path::new("data.csv"), None)?;

    // Build initial parameters and options
    let (init_params, options) = build_fit_inputs(&parsed)?;

    // Run estimation
    let result = fit(&parsed.model, &population, &init_params, &options)?;

    // Access results
    println!("OFV: {}", result.ofv);
    for (name, val) in result.theta_names.iter().zip(result.theta.iter()) {
        println!("  {} = {:.6}", name, val);
    }
    Ok(())
}
```
API Sections
- Core Types -- `CompiledModel`, `Population`, `FitResult`, `FitOptions`, etc.
- Fitting Functions -- `fit()`, `fit_from_files()`, `run_model_with_data()`
- Simulation -- `simulate()`, `simulate_with_seed()`, `predict()`
- Parsing -- `parse_model_file()`, `parse_full_model_file()`, `read_nonmem_csv()`
Core Types
The main types exported by ferx_nlme. All types are available via use ferx_nlme::*.
CompiledModel
A parsed and compiled model ready for estimation. Created by parse_model_file() or parse_full_model_file().
```rust
pub struct CompiledModel {
    pub name: String,
    pub pk_model: PkModel,
    pub error_model: ErrorModel,
    pub pk_param_fn: PkParamFn,          // (theta, eta, covariates) -> PkParams
    pub n_theta: usize,
    pub n_eta: usize,
    pub n_epsilon: usize,
    pub theta_names: Vec<String>,
    pub eta_names: Vec<String>,
    pub default_params: ModelParameters,
    pub tv_fn: Option<...>,              // Typical values function (for AD)
    pub pk_indices: Vec<usize>,          // Maps eta index -> PK parameter index
    pub ode_spec: Option<OdeSpec>,       // ODE specification (if ODE model)
}
```
Key fields:
- `pk_param_fn`: Closure that maps (theta, eta, covariates) to PK parameters. Generated by the parser from `[individual_parameters]`.
- `default_params`: Initial parameter values from the `[parameters]` block.
- `ode_spec`: Present only for ODE models; `None` for analytical models.
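For intuition, here is the kind of closure `pk_param_fn` stands for, written by hand for the model `CL = TVCL * exp(ETA_CL); V = TVV`. The concrete `PkParamFn` signature in FeRx may differ; this only illustrates the (theta, eta, covariates) -> parameters mapping the parser generates:

```rust
use std::collections::HashMap;

// Sketch of a hand-written pk_param_fn-style closure. The real type
// alias in FeRx may differ; this shows only the shape of the mapping.
fn make_param_fn() -> impl Fn(&[f64], &[f64], &HashMap<String, f64>) -> Vec<f64> {
    |theta, eta, _cov| {
        let cl = theta[0] * eta[0].exp(); // CL = TVCL * exp(ETA_CL)
        let v = theta[1];                 // V = TVV (no random effect)
        vec![cl, v]
    }
}

fn main() {
    let f = make_param_fn();
    let params = f(&[0.134, 8.0], &[0.1], &HashMap::new());
    println!("CL = {:.4}, V = {}", params[0], params[1]);
}
```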
Population and Subject
```rust
pub struct Population {
    pub subjects: Vec<Subject>,
    pub covariate_names: Vec<String>,
    pub dv_column: String,
}

pub struct Subject {
    pub id: String,
    pub doses: Vec<DoseEvent>,
    pub obs_times: Vec<f64>,
    pub observations: Vec<f64>,
    pub obs_cmts: Vec<usize>,
    pub covariates: HashMap<String, f64>,   // Time-constant
    pub tvcov: HashMap<String, Vec<f64>>,   // Time-varying (LOCF)
}
```
ModelParameters
```rust
pub struct ModelParameters {
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub theta_lower: Vec<f64>,
    pub theta_upper: Vec<f64>,
    pub omega: OmegaMatrix,
    pub sigma: SigmaVector,
}
```
OmegaMatrix
Between-subject variability matrix with pre-computed Cholesky factor.
```rust
pub struct OmegaMatrix {
    pub matrix: DMatrix<f64>,   // Full omega matrix
    pub chol: DMatrix<f64>,     // Lower Cholesky factor L (Omega = L*L')
    pub eta_names: Vec<String>,
    pub diagonal: bool,         // True if omega is diagonal
}
```
Constructors:
- `OmegaMatrix::from_matrix(m, names, diagonal)` -- From a full matrix (computes Cholesky, regularizes if needed)
- `OmegaMatrix::from_diagonal(variances, names)` -- From diagonal variances
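To make the `Omega = L * L'` bookkeeping concrete, here is a minimal hand-rolled 2x2 Cholesky (plain Rust, not the FeRx implementation, which operates on nalgebra matrices and regularizes non-positive-definite input):

```rust
// Sketch: Cholesky factorization of a 2x2 omega matrix, Omega = L * L'.
// Returns None when the matrix is not positive-definite, which is the
// case where from_matrix would have to regularize.
fn chol2(omega: [[f64; 2]; 2]) -> Option<[[f64; 2]; 2]> {
    let l11 = omega[0][0].sqrt();
    if !l11.is_finite() || l11 <= 0.0 {
        return None;
    }
    let l21 = omega[1][0] / l11;
    let d = omega[1][1] - l21 * l21;
    if d <= 0.0 {
        return None; // not positive-definite
    }
    Some([[l11, 0.0], [l21, d.sqrt()]])
}

fn main() {
    // Diagonal omega with variances 0.09 and 0.04 (as in the model files):
    // L is simply diag(0.3, 0.2).
    let l = chol2([[0.09, 0.0], [0.0, 0.04]]).unwrap();
    println!("L = [[{}, {}], [{}, {}]]", l[0][0], l[0][1], l[1][0], l[1][1]);
}
```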
FitOptions
```rust
pub struct FitOptions {
    pub method: EstimationMethod,     // Foce, FoceI, or Saem
    pub outer_maxiter: usize,         // Default: 500
    pub outer_gtol: f64,              // Default: 1e-6
    pub inner_maxiter: usize,         // Default: 200
    pub inner_tol: f64,               // Default: 1e-8
    pub run_covariance_step: bool,    // Default: true
    pub interaction: bool,            // FOCEI flag
    pub verbose: bool,                // Default: true
    pub optimizer: Optimizer,         // Default: Slsqp
    pub lbfgs_memory: usize,          // Default: 5
    pub global_search: bool,          // Default: false
    pub global_maxeval: usize,        // Default: 0 (auto)
    // SAEM-specific
    pub saem_n_exploration: usize,    // Default: 150
    pub saem_n_convergence: usize,    // Default: 250
    pub saem_n_mh_steps: usize,       // Default: 3
    pub saem_adapt_interval: usize,   // Default: 50
    pub saem_seed: Option<u64>,       // Default: None
}
```
Use FitOptions::default() for standard FOCE settings.
FitResult
```rust
pub struct FitResult {
    pub method: EstimationMethod,
    pub converged: bool,
    pub ofv: f64,
    pub aic: f64,
    pub bic: f64,
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub omega: DMatrix<f64>,
    pub sigma: Vec<f64>,
    pub covariance_matrix: Option<DMatrix<f64>>,
    pub se_theta: Option<Vec<f64>>,
    pub se_omega: Option<Vec<f64>>,
    pub se_sigma: Option<Vec<f64>>,
    pub subjects: Vec<SubjectResult>,
    pub n_obs: usize,
    pub n_subjects: usize,
    pub n_parameters: usize,
    pub n_iterations: usize,
    pub interaction: bool,
    pub warnings: Vec<String>,
}
```
SubjectResult
Per-subject diagnostics from the fit.
```rust
pub struct SubjectResult {
    pub id: String,
    pub eta: DVector<f64>,
    pub ipred: Vec<f64>,          // Individual predictions
    pub pred: Vec<f64>,           // Population predictions
    pub iwres: Vec<f64>,          // Individual weighted residuals
    pub cwres: Vec<f64>,          // Conditional weighted residuals
    pub ofv_contribution: f64,
}
```
Enums
```rust
pub enum EstimationMethod { Foce, FoceI, Saem }

pub enum PkModel {
    OneCptIvBolus, OneCptOral, OneCptInfusion,
    TwoCptIvBolus, TwoCptOral, TwoCptInfusion,
}

pub enum ErrorModel { Additive, Proportional, Combined }

pub enum Optimizer { Bfgs, Lbfgs, Slsqp, NloptLbfgs, Mma }
```
DoseEvent
```rust
pub struct DoseEvent {
    pub time: f64,
    pub amt: f64,
    pub cmt: usize,
    pub rate: f64,
    pub duration: f64,
    pub ss: bool,
    pub ii: f64,
}
```
Fitting Functions
fit()
The primary estimation entry point. Runs FOCE, FOCEI, or SAEM depending on options.method.
```rust
pub fn fit(
    model: &CompiledModel,
    population: &Population,
    init_params: &ModelParameters,
    options: &FitOptions,
) -> Result<FitResult, String>
```
Parameters:
- `model`: Compiled model from `parse_model_file()` or `parse_full_model_file()`
- `population`: Population data from `read_nonmem_csv()`
- `init_params`: Initial parameter values
- `options`: Estimation configuration
Returns: FitResult with parameter estimates, standard errors, and per-subject diagnostics.
Example:
```rust
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let options = FitOptions::default();

let result = fit(&model, &population, &model.default_params, &options)?;
println!("OFV: {:.4}", result.ofv);
```
fit_from_files()
Convenience wrapper that handles parsing and data reading.
```rust
pub fn fit_from_files(
    model_path: &str,
    data_path: &str,
    covariate_columns: Option<&[&str]>,
    options: Option<FitOptions>,
) -> Result<FitResult, String>
```
Example:
```rust
let result = fit_from_files(
    "model.ferx",
    "data.csv",
    None, // Auto-detect covariates
    None, // Default options
)?;
```
run_model_with_data()
Full pipeline: parse model file, read data, fit. Returns both the fit result and the population.
```rust
pub fn run_model_with_data(
    model_path: &str,
    data_path: &str,
) -> Result<(FitResult, Population), String>
```
Uses the [fit_options] from the model file.
run_model_simulate()
Simulation-estimation: parse model, generate data from [simulation] block, fit.
```rust
pub fn run_model_simulate(
    model_path: &str,
) -> Result<(FitResult, Population), String>
```
Requires a [simulation] block in the model file.
build_fit_inputs()
Extract initial parameters and fit options from a parsed model, separating parsing from estimation for timing purposes.
```rust
pub fn build_fit_inputs(
    parsed: &ParsedModel,
) -> Result<(ModelParameters, FitOptions), String>
```
Example:
```rust
let parsed = parse_full_model_file(Path::new("model.ferx"))?;
let (init_params, options) = build_fit_inputs(&parsed)?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let result = fit(&parsed.model, &population, &init_params, &options)?;
```
Simulation
simulate()
Generate simulated observations from a model with random effects and residual error.
```rust
pub fn simulate(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
) -> Vec<SimulationResult>
```
Parameters:
- `model`: Compiled model
- `population`: Template population (dose events and observation times are used; DV values are ignored)
- `params`: True parameter values for simulation
- `n_sim`: Number of simulation replicates
Returns: Vector of SimulationResult, one per observation per subject per replicate.
Example:
```rust
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;

// Simulate 1000 replicates
let sims = simulate(&model, &population, &model.default_params, 1000);
for sim in &sims[..5] {
    println!("Sim {}, ID {}, TIME {}, IPRED {:.3}, DV {:.3}",
        sim.sim, sim.id, sim.time, sim.ipred, sim.dv_sim);
}
```
simulate_with_seed()
Same as simulate() but with a fixed random seed for reproducibility.
```rust
pub fn simulate_with_seed(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
    seed: u64,
) -> Vec<SimulationResult>
```
predict()
Population predictions without random effects (eta = 0). No simulation noise is added.
```rust
pub fn predict(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
) -> Vec<PredictionResult>
```
Returns: Vector of PredictionResult with population-level predictions.
Example:
```rust
let preds = predict(&model, &population, &model.default_params);
for p in &preds {
    println!("ID {}, TIME {}, PRED {:.3}", p.id, p.time, p.pred);
}
```
Result Types
```rust
pub struct SimulationResult {
    pub sim: usize,     // Replicate number (1-indexed)
    pub id: String,     // Subject ID
    pub time: f64,      // Observation time
    pub ipred: f64,     // Individual prediction (no residual error)
    pub dv_sim: f64,    // Simulated observation (with residual error)
}

pub struct PredictionResult {
    pub id: String,
    pub time: f64,
    pub pred: f64,      // Population prediction (eta = 0)
}
```
Simulation Process
For each replicate and each subject:
1. Sample random effects: \( \eta_i \sim N(0, \Omega) \) using the Cholesky factor \( L \): \( \eta = L \cdot z \), where \( z \sim N(0, I) \)
2. Compute individual PK parameters via `pk_param_fn(theta, eta, covariates)`
3. Generate predictions using the structural model
4. Add residual error: \( DV = IPRED + \sqrt{V} \cdot \epsilon \), where \( \epsilon \sim N(0, 1) \) and \( V \) is the residual variance from the error model
Parsing
Model Parsing
parse_model_file()
Parse a .ferx file into a CompiledModel. Only the core model blocks are processed ([parameters], [individual_parameters], [structural_model], [error_model]).
```rust
pub fn parse_model_file(path: &Path) -> Result<CompiledModel, String>
```
parse_model_string()
Parse a model from a string instead of a file.
```rust
pub fn parse_model_string(content: &str) -> Result<CompiledModel, String>
```
Example:
```rust
let model_str = r#"
[parameters]
theta TVCL(0.1, 0.001, 10.0)
theta TVV(10.0, 0.1, 500.0)
omega ETA_CL ~ 0.09
sigma ADD_ERR ~ 1.0
[individual_parameters]
CL = TVCL * exp(ETA_CL)
V = TVV
[structural_model]
pk one_cpt_iv_bolus(cl=CL, v=V)
[error_model]
DV ~ additive(ADD_ERR)
"#;
let model = parse_model_string(model_str)?;
```
parse_full_model_file()
Parse a complete model file including [fit_options] and [simulation] blocks.
```rust
pub fn parse_full_model_file(path: &Path) -> Result<ParsedModel, String>
```
Returns a ParsedModel which contains:
```rust
pub struct ParsedModel {
    pub model: CompiledModel,
    pub simulation: Option<SimulationSpec>,
    pub fit_options: FitOptions,
}
```
Data Parsing
read_nonmem_csv()
Read a NONMEM-format CSV file into a Population.
```rust
pub fn read_nonmem_csv(
    path: &Path,
    covariate_columns: Option<&[&str]>,
) -> Result<Population, String>
```
Parameters:
- `path`: Path to the CSV file
- `covariate_columns`: Optional list of covariate column names. If `None`, all non-standard columns are auto-detected as covariates.
Example:
```rust
// Auto-detect covariates
let pop = read_nonmem_csv(Path::new("data.csv"), None)?;

// Explicit covariate list
let pop = read_nonmem_csv(
    Path::new("data.csv"),
    Some(&["WT", "CRCL", "AGE"]),
)?;

println!("{} subjects, {} observations", pop.subjects.len(), pop.n_obs());
```
Data Processing Details
- Column names are matched case-insensitively
- Standard NONMEM columns (`ID`, `TIME`, `DV`, `EVID`, `AMT`, `CMT`, `RATE`, `MDV`, `II`, `SS`) are recognized automatically
- Missing values (`.`, empty string) are handled appropriately
- Rows with `EVID=1` are treated as dose events
- Rows with `EVID=0` and `MDV=0` are treated as observations
- Time-constant covariates use the first non-missing value per subject
- Time-varying covariates use Last Observation Carried Forward (LOCF)
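A sketch of the LOCF rule for time-varying covariates (an illustrative helper, not the FeRx-internal lookup):

```rust
// Sketch: last-observation-carried-forward lookup for one subject's
// covariate records. Before the first record, the first value is carried
// backward; after that, the most recent value at or before t applies.
fn locf(times: &[f64], values: &[f64], t: f64) -> f64 {
    let mut current = values[0];
    for (&ti, &vi) in times.iter().zip(values) {
        if ti <= t {
            current = vi;
        } else {
            break;
        }
    }
    current
}

fn main() {
    // CRCL measured at t = 0, 24, 48 h
    let times = [0.0, 24.0, 48.0];
    let crcl = [105.0, 88.0, 92.0];
    println!("CRCL at t=30h: {}", locf(&times, &crcl, 30.0)); // the 24 h value
}
```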
Frequently Asked Questions
Do I need to use MU-referencing in my model definitions, like in NONMEM / nlmixr2?
No. FeRx does not require MU-referencing.
In NONMEM and nlmixr2, MU-referencing is a convention in which each random effect ETA(i) is linearly associated with a single MU_i term (typically MU_i = LOG(THETA(i))), so individual parameters look like:
MU_1 = LOG(THETA(1))
CL = EXP(MU_1 + ETA(1))
This structure is required by those tools' SAEM implementations because they rely on conjugate Gibbs updates that are only valid when the MU_i → ETA(i) relationship is strictly linear (and typically on a log scale). If you deviate — for example by writing CL = THETA(1) * EXP(ETA(1)) without going through an intermediate MU_1, or by mixing multiple etas into one parameter — NONMEM SAEM will either reject the model or silently produce biased estimates.
FeRx's SAEM implementation uses Metropolis-Hastings sampling for the E-step rather than Gibbs, which does not require MU-referencing. You can write individual parameters in any form you like:
# All of these work fine in ferx:
CL = TVCL * exp(ETA_CL)
CL = TVCL + ETA_CL # additive eta
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL) # with covariates
CL = TVCL * exp(ETA_CL + ETA_CL_OCC) # multiple etas
VMAX = TVVMAX * exp(ETA_VMAX)
KM = TVKM # no eta at all
The FOCE / FOCEI estimators have no MU-referencing requirement in any NLME tool — they use MAP optimization over etas regardless of parameterization. This is equally true in ferx.
Performance implication
The main tradeoff is that MH sampling is slightly less efficient per iteration than conjugate Gibbs for MU-referenced models. In practice:
- For models with a few random effects, the difference is negligible
- For models with many (>10) random effects, conjugate Gibbs would converge in fewer iterations — but in exchange you gain flexibility to write models that don't fit the MU-referenced mold
If you have a NONMEM model that uses MU-referencing and want to port it to ferx, you can drop the MU intermediate step and write the individual parameters directly — the results will be equivalent.
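The E-step building block that makes this flexibility possible is an ordinary random-walk Metropolis-Hastings update on each subject's eta. A toy, self-contained sketch (a tiny LCG stands in for a real RNG, and `log_post` is a stand-in for the individual joint log-density log p(y_i | eta) + log p(eta)):

```rust
// Sketch of one random-walk Metropolis-Hastings update for a scalar eta.
// Everything here is illustrative; FeRx's sampler and density code differ.
struct Lcg(u64);
impl Lcg {
    fn next_f64(&mut self) -> f64 {
        self.0 = self.0.wrapping_mul(6364136223846793005).wrapping_add(1442695040888963407);
        ((self.0 >> 11) as f64 + 1.0) / ((1u64 << 53) as f64 + 2.0)
    }
    fn std_normal(&mut self) -> f64 {
        let (u1, u2) = (self.next_f64(), self.next_f64());
        (-2.0 * u1.ln()).sqrt() * (std::f64::consts::TAU * u2).cos()
    }
}

// Toy stand-in for log p(y_i | eta) + log p(eta): a Gaussian centered at 0.5.
fn log_post(eta: f64) -> f64 {
    -0.5 * (eta - 0.5).powi(2) / 0.04
}

fn mh_step(eta: f64, step: f64, rng: &mut Lcg) -> f64 {
    let proposal = eta + step * rng.std_normal();
    // Symmetric proposal, so the acceptance ratio needs no Hastings correction.
    if rng.next_f64().ln() < log_post(proposal) - log_post(eta) {
        proposal
    } else {
        eta
    }
}

fn main() {
    let mut rng = Lcg(7);
    let mut eta = 0.0;
    for _ in 0..200 {
        eta = mh_step(eta, 0.2, &mut rng);
    }
    println!("eta after 200 steps: {eta:.3}"); // should wander near 0.5
}
```

Because the acceptance test only needs the log-density ratio, nothing about the parameterization has to be linear in eta, which is exactly why MU-referencing is unnecessary.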