Introduction

FeRx-NLME is a high-performance Nonlinear Mixed Effects (NLME) modeling engine written in Rust. It is designed for population pharmacokinetic (PopPK) analysis, implementing the same statistical methodology as established tools like NONMEM, Monolix, and Pumas.

Key Features

  • FOCE/FOCEI estimation -- First-Order Conditional Estimation with optional interaction, the gold standard for PopPK
  • SAEM estimation -- Stochastic Approximation EM for robust convergence on complex models
  • Analytical PK solutions -- Built-in one- and two-compartment models (IV bolus, oral, infusion) with numerical stability guarantees
  • ODE solver -- Dormand-Prince RK45 adaptive integrator for custom kinetic models (e.g. Michaelis-Menten)
  • Automatic differentiation -- Enzyme-based reverse-mode AD for fast, exact gradients
  • Simple model DSL -- Declarative .ferx model files that read like equations
  • NONMEM-compatible data -- Reads standard NONMEM CSV datasets directly
  • Covariate support -- Time-constant and time-varying covariates with automatic detection
  • Parallel estimation -- Per-subject computations parallelized via Rayon

How It Compares

| Feature | FeRx | NONMEM | Monolix | Pumas |
|---|---|---|---|---|
| Language | Rust | Fortran | C++ | Julia |
| FOCE/FOCEI | Yes | Yes | No | Yes |
| SAEM | Yes | Yes (NONMEM 7+) | Yes | Yes |
| Analytical PK | Yes | Yes | Yes | Yes |
| ODE models | Yes | Yes | Yes | Yes |
| Auto-diff | Yes (Enzyme) | No | No | Yes (ForwardDiff) |
| Open source | Yes | No | No | No |

Architecture Overview

FeRx uses a two-level optimization structure:

  • Outer loop: Optimizes population parameters (theta, omega, sigma) using NLopt or built-in BFGS
  • Inner loop: For each subject, finds empirical Bayes estimates (EBEs) of random effects by minimizing individual negative log-likelihood

Parameters are internally transformed for unconstrained optimization: theta and sigma are log-transformed, and omega uses Cholesky factorization to guarantee positive-definiteness.
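The two transforms can be sketched as follows. This is an illustrative example of the math described above, not the actual FeRx internals; the function names are made up.

```rust
/// Log-transform a positive parameter (theta or sigma) so the optimizer
/// can work on an unconstrained value `z`; the model sees `exp(z) > 0`.
fn to_unconstrained(theta: f64) -> f64 {
    theta.ln()
}

fn from_unconstrained(z: f64) -> f64 {
    z.exp()
}

/// Rebuild a 2x2 omega matrix from its Cholesky factor L (lower
/// triangular, stored as [l11, l21, l22]). omega = L * L^T is positive
/// semi-definite by construction for any real entries, which is why the
/// optimizer can vary the factor freely.
fn omega_from_cholesky(l: [f64; 3]) -> [[f64; 2]; 2] {
    let (l11, l21, l22) = (l[0], l[1], l[2]);
    [
        [l11 * l11, l11 * l21],
        [l11 * l21, l21 * l21 + l22 * l22],
    ]
}
```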

Project Structure

src/
  types.rs          -- Core data structures
  api.rs            -- Public API (fit, simulate, predict)
  parser/           -- .ferx model file parser
  pk/               -- Analytical PK solutions
  ode/              -- ODE solver and predictions
  estimation/       -- FOCE, FOCEI, SAEM algorithms
  stats/            -- Likelihood and residual computations
  io/               -- Data reading and output writing
  ad/               -- Automatic differentiation
  bin/run_model.rs  -- CLI binary

License

FeRx-NLME is released under the MIT License. Copyright © 2026 InsightRX, Inc.

Getting Started

This section covers installing FeRx and running your first model.

  • Installation -- Setting up the Rust toolchain and building FeRx
  • Quick Start -- Running a complete PopPK analysis in under 5 minutes

Installation

FeRx requires a nightly Rust toolchain with the Enzyme LLVM plugin for automatic differentiation. As of 2026, Enzyme is not yet distributed via rustup, so a one-time plugin build is required.

Pick your platform:

  • Linux — fully supported, one-time ~30 min plugin build
  • macOS — supported with caveats
  • Windows — not supported (see why below)

Linux

Tested on Ubuntu 22.04. Other distributions (Debian, Fedora, Arch) should work with adjustments to the package manager commands.

1. Install rustup + upstream nightly

Do not use snap's rustup — its filesystem confinement breaks on non-standard home directories (common on enterprise servers).

# Remove snap rustup if you had it:
sudo snap remove rustup

# Official installer
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"

rustup toolchain install nightly

2. Install system dependencies and matching LLVM

Check which LLVM major version nightly needs:

rustc +nightly --version --verbose | grep LLVM
# e.g. "LLVM version: 22.1.2" — major is 22

Use that major version (<MAJOR>) below.

sudo apt install -y cmake ninja-build clang libssl-dev pkg-config \
                    python3 build-essential curl git libzstd-dev

# Install matching LLVM from apt.llvm.org (Ubuntu's defaults lag behind):
wget https://apt.llvm.org/llvm.sh
chmod +x llvm.sh
sudo ./llvm.sh <MAJOR>

# Fix GPG keyring permissions if apt warns:
sudo chmod 644 /etc/apt/trusted.gpg.d/apt.llvm.org.asc
sudo apt update

sudo apt install -y llvm-<MAJOR>-dev clang-<MAJOR>

For other distros, install the matching llvm-dev + clang packages from your package manager. The LLVM major version must exactly match what rustc reports.

3. Build the Enzyme plugin

git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build

cmake -G Ninja .. \
  -DLLVM_DIR=/usr/lib/llvm-<MAJOR>/lib/cmake/llvm \
  -DENZYME_CLANG=OFF \
  -DENZYME_FLANG=OFF
ninja
# 15–30 min

4. Install the plugin into nightly's sysroot

This location is not obvious — rustc looks in lib/rustlib/<target>/lib/, not just lib/. Despite the error wording ("folder not found"), it's searching for a file:

SYSROOT=$(rustc +nightly --print sysroot)
TARGET=x86_64-unknown-linux-gnu   # or aarch64-unknown-linux-gnu on ARM

cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.so \
   $SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.so

Note the filename rewrite: LLVMEnzyme-<N>.so → libEnzyme-<N>.so (with the lib prefix).

5. Register the toolchain as enzyme

ferx's build system pins to a toolchain named enzyme:

rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme --version

6. Verify

rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head

Expected: error[E0601]: main function not found. That's the success signal. If you see autodiff backend not found in the sysroot, the .so is missing or in the wrong place (step 4).

Multi-user servers

For shared Linux servers, stage the built toolchain in /opt/rust-nightly so all users can share one build:

sudo mkdir -p /opt/rust-nightly
sudo cp -a ~/.rustup/toolchains/nightly-x86_64-unknown-linux-gnu/. /opt/rust-nightly/
sudo chown -R root:root /opt/rust-nightly
sudo chmod -R a+rX /opt/rust-nightly
sudo chmod a+rx /opt/rust-nightly/bin/*

Each user then links the shared toolchain into their own rustup:

# per user, once
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --default-toolchain none
rustup toolchain link enzyme /opt/rust-nightly

For R users, add to ~/.Renviron:

PATH=/opt/rust-nightly/bin:${HOME}/.cargo/bin:${PATH}
RUSTUP_TOOLCHAIN=enzyme

macOS

Supported on both Intel and Apple Silicon, with caveats around LLVM installation.

1. Install Xcode Command Line Tools

xcode-select --install

2. Install rustup + upstream nightly

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y
source "$HOME/.cargo/env"

rustup toolchain install nightly

3. Install matching LLVM via Homebrew

Check which LLVM major version nightly needs:

rustc +nightly --version --verbose | grep LLVM

Homebrew only ships a handful of LLVM versions at a time. If the version you need is on Homebrew:

brew install llvm@<MAJOR>
# Path will be /opt/homebrew/opt/llvm@<MAJOR> (Apple Silicon)
# or           /usr/local/opt/llvm@<MAJOR>     (Intel)

If Homebrew doesn't have your version, you'll need to build LLVM from source (much longer — consider sticking with a slightly older nightly that matches an available LLVM version).

4. Build the Enzyme plugin

git clone https://github.com/EnzymeAD/Enzyme /tmp/enzyme-build
cd /tmp/enzyme-build/enzyme
mkdir build && cd build

# Adjust paths for your brew prefix (/opt/homebrew or /usr/local)
cmake -G Ninja .. \
  -DLLVM_DIR=$(brew --prefix)/opt/llvm@<MAJOR>/lib/cmake/llvm \
  -DENZYME_CLANG=OFF \
  -DENZYME_FLANG=OFF
ninja

You may need to install ninja if not present: brew install ninja cmake.

5. Install the plugin into nightly's sysroot

SYSROOT=$(rustc +nightly --print sysroot)

# On Apple Silicon:
TARGET=aarch64-apple-darwin
# On Intel:
# TARGET=x86_64-apple-darwin

# Note: on macOS the shared library extension is .dylib, not .so
cp /tmp/enzyme-build/enzyme/build/Enzyme/LLVMEnzyme-<MAJOR>.dylib \
   $SYSROOT/lib/rustlib/$TARGET/lib/libEnzyme-<MAJOR>.dylib

6. Register and verify

rustup toolchain link enzyme "$(rustc +nightly --print sysroot)"
rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head
# Expect: error[E0601]: `main` function not found

macOS caveats

  • Apple Silicon (M1/M2/M3/M4): use aarch64-apple-darwin as TARGET. Intel Macs use x86_64-apple-darwin
  • System Integrity Protection (SIP): If you see code-signing errors loading the .dylib during rustc invocation, try sudo codesign --force --sign - <path-to-libEnzyme.dylib> — should be rare on dev machines
  • Nightly toolchain distribution mismatches: Apple Silicon nightlies occasionally lag x86_64 by a day or two; if LLVM version mismatches after rustup update, prefer installing a specific dated nightly

Windows

Windows is not currently supported. The blockers are:

  1. EnzymeAD plugin build on MSVC is not well-tested. The Enzyme project primarily targets Linux with LLVM clang/gcc. While Windows builds exist in theory, the toolchain integration (rustc sysroot path conventions, plugin discovery, MSVC vs MinGW linking) has known issues that haven't been worked through.

  2. Rust autodiff feature gate interactions with MSVC codegen. The -Z autodiff=Enable path has not been exercised on the x86_64-pc-windows-msvc target. Anecdotal reports from the Rust autodiff tracking issue show crashes or "backend not found" errors on Windows even when the plugin .dll is present.

  3. We haven't set up CI for Windows. Without a green CI signal we can't promise the build works.

Workarounds for Windows users

  • WSL2 (recommended): Install Ubuntu under WSL2 and follow the Linux instructions. Performance is near-native for CPU-bound workloads like model fitting.
  • Docker: Use the forthcoming ferx-nlme Docker image (coming soon) which ships with the toolchain pre-installed.
  • Remote dev: SSH into a Linux server or cloud VM.

Future Windows support

Once upstream Enzyme integration lands in rustup (tracked at rust-lang/rust autodiff tracking issue), Windows support should become straightforward since the plugin distribution problem goes away. No ETA; the feature is still experimental in rustc.

If you're a Windows developer who would like to help, a CI run against x86_64-pc-windows-msvc + contributed docs for that platform would be very welcome.


Building ferx-nlme from source

Once your Enzyme toolchain is set up:

git clone https://github.com/InsightRX/ferx-nlme
cd ferx-nlme

cargo build --release --features autodiff
# Binary at target/release/ferx

Build options

# Debug build (faster compile, slower runtime)
cargo build --features autodiff

# Quick type-check without building
cargo check --features autodiff

# Lints
cargo clippy --features autodiff

# CI build without autodiff (uses finite differences — no Enzyme needed)
cargo build --release --no-default-features --features ci

The ci feature is useful for development on machines without the full Enzyme toolchain — at the cost of much slower gradient computation.

Verify the build

cargo run --release --features autodiff --bin ferx -- examples/warfarin.ferx --simulate

Should print a successful model fit with parameter estimates.


Installing the ferx R package

Ensure the Enzyme toolchain is set up (above). Then:

devtools::install_github("InsightRX/ferx")

The R package's build is driven by its Makevars, which invokes cargo and resolves the enzyme toolchain via rustup. Both must be set up correctly in your shell/Renviron before calling install_github.

See the ferx R package README for API usage.

Cancelling a running fit (Ctrl-C)

ferx_fit() runs the estimator on a worker thread and polls for R interrupts on the main thread every ~100 ms, so Ctrl-C (or RStudio's red stop button) aborts the fit cleanly. The worker exits at the next safe checkpoint — typically within a second or two, but up to one inner-loop evaluation for heavy ODE models. The call then returns with an R error:

Error: ferx_fit: cancelled by user

Ctrl-Z (SIGTSTP) will not abort the fit — it suspends the whole R process to the shell. Use Ctrl-C instead.
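The cancellation mechanism can be sketched with a shared atomic flag that the worker polls at safe checkpoints. This is an illustrative pattern with made-up names (`run_fit`, `fit_with_cancel_after`), not the actual ferx worker code:

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::thread;
use std::time::Duration;

// Worker: checks the flag between "evaluations" and exits cleanly when
// cancellation is requested, returning an error the caller can surface
// (in ferx's case, as an R error).
fn run_fit(cancel: Arc<AtomicBool>) -> Result<u32, &'static str> {
    let mut iterations = 0;
    for _ in 0..1000 {
        if cancel.load(Ordering::Relaxed) {
            return Err("cancelled by user");
        }
        thread::sleep(Duration::from_millis(1)); // stand-in for one inner-loop evaluation
        iterations += 1;
    }
    Ok(iterations)
}

// Driver: spawns the worker, then sets the flag (standing in for the
// main thread detecting an R interrupt) and joins.
fn fit_with_cancel_after(ms: u64) -> Result<u32, &'static str> {
    let cancel = Arc::new(AtomicBool::new(false));
    let worker = {
        let cancel = cancel.clone();
        thread::spawn(move || run_fit(cancel))
    };
    thread::sleep(Duration::from_millis(ms)); // simulate the user pressing Ctrl-C
    cancel.store(true, Ordering::Relaxed);
    worker.join().unwrap()
}
```

Because the flag is only checked at checkpoints, a heavy ODE model can take up to one inner-loop evaluation to notice the request, matching the latency described above.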


Dependencies

FeRx depends on these crates (managed automatically by Cargo):

| Crate | Purpose |
|---|---|
| nalgebra | Linear algebra (matrices, Cholesky) |
| nlopt | Nonlinear optimization (SLSQP, L-BFGS, MMA) |
| rayon | Parallel computation |
| rand, rand_distr | Random number generation (SAEM, SIR) |
| csv | CSV data file reading |
| regex | Model file expression parsing |

The nlopt crate requires the NLopt C library. Most platforms handle this automatically; if the build fails on NLopt, install it via your system package manager:

# macOS
brew install nlopt

# Ubuntu/Debian
sudo apt install libnlopt-dev

# Fedora
sudo dnf install NLopt-devel

Troubleshooting

"error: the option 'Z' is only accepted on the nightly compiler"

Your shell (or R) is finding a non-nightly rustc. Check rustc --version and your PATH. For R, verify Sys.which("rustc") and ~/.Renviron.

"autodiff backend not found in the sysroot: failed to find a libEnzyme-<N> folder"

Despite the wording ("folder"), rustc is searching for a file. Causes:

  • Wrong directory: the .so/.dylib is in <sysroot>/lib/ instead of <sysroot>/lib/rustlib/<target>/lib/
  • LLVM version mismatch: rebuild Enzyme against the LLVM version rustc reports
  • Filename: must be libEnzyme-<MAJOR>.so (Linux) or libEnzyme-<MAJOR>.dylib (macOS), with lib prefix

"custom toolchain 'enzyme' specified in override file ... is not installed"

Run rustup toolchain link enzyme <path-to-nightly-sysroot>.

Alternatively, this can be a permission issue: the user running toolchain link can't read the target path. On shared installs, run sudo chmod -R a+rX /opt/rust-nightly.

"incorrect value 'X' for unstable option 'autodiff'"

Valid autodiff values change between nightly builds. Test with:

rustc +enzyme -Z autodiff=Enable - </dev/null 2>&1 | head

If Enable is rejected, try LooseTypes, Inline, or PrintTA.

"Enzyme: cannot handle (forward) unknown intrinsic llvm.maximumnum"

Recent rustc lowers f64::max()/f64::min() to intrinsics Enzyme can't yet differentiate. This is a ferx-nlme code-level issue — AD-instrumented functions must use manual if expressions. Should not happen on released ferx-nlme versions; if it does, please file an issue.

"cargo is unavailable for the active toolchain" (info, not error)

Cargo wasn't copied into your linked toolchain. Either copy it (cp ~/.cargo/bin/cargo /opt/rust-nightly/bin/) or ignore — rustup falls back to nightly's cargo, which works.

Refreshing when upstream nightly rolls forward

If a new ferx-nlme release references stdlib items your cached toolchain doesn't have (e.g. autodiff_forward not found), rebuild /opt/rust-nightly against the current nightly, and rebuild Enzyme if the LLVM major version changed.

Quick Start

This walkthrough fits a one-compartment oral PK model to warfarin data.

1. Create a model file

Save the following as warfarin.ferx:

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

2. Prepare your data

Data must be in NONMEM CSV format. The minimum required columns are ID, TIME, and DV. Dosing records use EVID=1 with AMT:

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,1,14.42,0,.,.,0
1,2,17.56,0,.,.,0
1,4,15.23,0,.,.,0
1,8,10.15,0,.,.,0
1,12,6.75,0,.,.,0
1,24,2.24,0,.,.,0
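A row from this format can be classified by EVID, with `.` marking missing values. The sketch below is illustrative only (it is not the ferx io module, and it handles just the minimal columns shown above):

```rust
// Hypothetical record type: EVID=1 rows are dose records; everything
// else is treated as an observation here.
#[derive(Debug, PartialEq)]
enum Record {
    Dose { id: u32, time: f64, amt: f64 },
    Obs { id: u32, time: f64, dv: f64 },
}

// Look up a column by header name and parse one CSV row. Missing values
// (".") simply fail to parse, which is fine: a dose row never reads DV,
// and an observation row never reads AMT.
fn parse_row(header: &[&str], row: &str) -> Option<Record> {
    let get = |name: &str| -> Option<&str> {
        let i = header.iter().position(|h| *h == name)?;
        row.split(',').nth(i)
    };
    let id: u32 = get("ID")?.parse().ok()?;
    let time: f64 = get("TIME")?.parse().ok()?;
    match get("EVID") {
        Some("1") => Some(Record::Dose { id, time, amt: get("AMT")?.parse().ok()? }),
        _ => Some(Record::Obs { id, time, dv: get("DV")?.parse().ok()? }),
    }
}
```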

3. Run the fit

ferx warfarin.ferx --data warfarin.csv

4. Interpret the output

The console output shows the estimation progress and final results:

Starting FOCE estimation...
  10 subjects, 110 observations
  3 thetas, 3 etas, 1 sigmas
...
Final OFV = -280.1838

============================================================
NONLINEAR MIXED EFFECTS MODEL ESTIMATION
============================================================

Converged: YES
Estimation method: FOCE

--- Objective Function ---
OFV:  -280.1838
AIC:  -266.1838
BIC:  -247.2804

--- THETA Estimates ---
Parameter            Estimate           SE       %RSE
----------------------------------------------------
TVCL                 0.132735     0.014549       11.0
TVV                  7.694842     0.293028        3.8
TVKA                 0.757498     0.034986        4.6

--- OMEGA Estimates (variances) ---
  OMEGA(1,1) = 0.028584  (CV% = 16.9)  SE = 0.006394
  OMEGA(2,2) = 0.009613  (CV% = 9.8)   SE = 0.002165
  OMEGA(3,3) = 0.340868  (CV% = 58.4)  SE = 0.076351

--- SIGMA Estimates ---
  SIGMA(1) = 0.010638  SE = 0.000788

5. Output files

Three files are generated:

| File | Contents |
|---|---|
| warfarin-sdtab.csv | Per-observation diagnostics (PRED, IPRED, CWRES, IWRES, ETAs) |
| warfarin-fit.yaml | Parameter estimates with standard errors in YAML format |
| warfarin-timing.txt | Wall-clock estimation time |


Model File Reference

FeRx models are defined in .ferx files using a declarative DSL. Each file is organized into blocks, each denoted by a [block_name] header.

Block Overview

| Block | Required | Purpose |
|---|---|---|
| [parameters] | Yes | Define theta, omega, and sigma parameters |
| [individual_parameters] | Yes | Map population parameters to individual PK parameters |
| [structural_model] | Yes | Specify the PK model (analytical or ODE) |
| [error_model] | Yes | Define the residual error model |
| [odes] | If ODE | ODE right-hand-side equations |
| [fit_options] | No | Configure estimation method and optimizer |
| [simulation] | No | Define a simulation trial design |

Minimal Example

[parameters]
  theta TVCL(0.1, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  sigma ADD_ERR ~ 1.0

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)

[structural_model]
  pk one_cpt_iv_bolus(cl=CL, v=V)

[error_model]
  DV ~ additive(ADD_ERR)

Lines beginning with # are treated as comments.

Parameters

The [parameters] block defines all model parameters: fixed effects (theta), between-subject variability (omega), and residual error (sigma).

Theta (Fixed Effects)

theta NAME(initial_value, lower_bound, upper_bound)
  • NAME: Parameter name (used in [individual_parameters] expressions)
  • initial_value: Starting value for estimation
  • lower_bound: Lower bound constraint (must be > 0; parameters are log-transformed internally)
  • upper_bound: Upper bound constraint

Example:

theta TVCL(0.134, 0.001, 10.0)
theta TVV(8.1, 0.1, 500.0)
theta TVKA(1.0, 0.01, 50.0)

Omega (Between-Subject Variability)

Diagonal omega

omega NAME ~ variance
  • NAME: Random effect name (used in [individual_parameters] as ETA_XXX)
  • variance: Initial variance estimate (diagonal element of the omega matrix)

Example:

omega ETA_CL ~ 0.07
omega ETA_V  ~ 0.02
omega ETA_KA ~ 0.40

Each variance represents the between-subject variability for that parameter. The coefficient of variation (CV%) is approximately sqrt(variance) * 100 for log-normally distributed parameters. For example, omega ETA_CL ~ 0.09 corresponds to ~30% CV.
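The approximation and the exact log-normal CV can be compared directly. A small sketch (hypothetical helper names):

```rust
/// Approximate CV% for a log-normally distributed parameter with
/// variance `omega2` on the log scale: sqrt(omega2) * 100.
fn cv_approx(omega2: f64) -> f64 {
    omega2.sqrt() * 100.0
}

/// Exact CV% of a log-normal distribution: sqrt(exp(omega2) - 1) * 100.
/// Agrees closely with the approximation for small variances.
fn cv_exact(omega2: f64) -> f64 {
    (omega2.exp() - 1.0).sqrt() * 100.0
}
```

For omega = 0.09, the approximation gives 30% while the exact value is about 30.7%, so the rule of thumb is adequate for typical between-subject variances.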

Block omega (correlated random effects)

To estimate correlations between random effects, use block_omega:

block_omega (NAME1, NAME2, ...) = [lower_triangle_values]

The values are the lower triangle of the covariance matrix, specified row-wise. For a 2x2 block:

block_omega (ETA_CL, ETA_V) = [var_CL, cov_CL_V, var_V]

For a 3x3 block:

block_omega (ETA_CL, ETA_V, ETA_KA) = [var_CL, cov_CL_V, var_V, cov_CL_KA, cov_V_KA, var_KA]

You can mix diagonal and block omega specifications. Diagonal omegas specify uncorrelated random effects, while block omegas estimate the full covariance sub-matrix:

block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40

This estimates a 3x3 omega where ETA_CL and ETA_V are correlated (2x2 block), but ETA_KA is uncorrelated with both.
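The row-wise lower-triangle layout can be expanded into a full symmetric matrix like this. An illustrative sketch, not the ferx parser:

```rust
// Expand row-wise lower-triangle values, as written in a block_omega
// line, into a full symmetric n x n matrix.
fn expand_lower_triangle(vals: &[f64], n: usize) -> Vec<Vec<f64>> {
    assert_eq!(vals.len(), n * (n + 1) / 2);
    let mut m = vec![vec![0.0; n]; n];
    let mut k = 0;
    for i in 0..n {
        for j in 0..=i {
            m[i][j] = vals[k]; // lower triangle
            m[j][i] = vals[k]; // mirrored upper triangle
            k += 1;
        }
    }
    m
}
```

With the 2x2 example above, `[0.09, 0.02, 0.04]` expands to var_CL = 0.09, cov_CL_V = 0.02 (both off-diagonal cells), and var_V = 0.04.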

Declaration order

The order of omega and block_omega lines in the [parameters] block determines the ETA indexing throughout the model: in the omega matrix and all output. For example:

block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
omega ETA_KA ~ 0.40

produces ETA order [ETA_CL, ETA_V, ETA_KA] (indices 1, 2, 3), while:

omega ETA_KA ~ 0.40
block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]

produces [ETA_KA, ETA_CL, ETA_V] (indices 1, 2, 3). The [individual_parameters] block should list assignments in the same order for clarity, though the parameter mapping is by name, not position.

Sigma (Residual Error)

sigma NAME ~ value
  • NAME: Residual error parameter name (referenced in [error_model])
  • value: Initial value

Example:

sigma PROP_ERR ~ 0.01
sigma ADD_ERR  ~ 1.0

The interpretation of sigma depends on the error model:

| Error Model | Sigma Meaning |
|---|---|
| Additive | Standard deviation of additive error |
| Proportional | Coefficient of proportional error |
| Combined | First sigma = proportional coefficient, second = additive SD |

Complete Examples

Diagonal omega (no correlations):

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta TVV(8.1, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_CL ~ 0.07
  omega ETA_V  ~ 0.02
  omega ETA_KA ~ 0.40

  sigma PROP_ERR ~ 0.01

Block omega (correlated CL and V):

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta TVV(8.1, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  block_omega (ETA_CL, ETA_V) = [0.09, 0.02, 0.04]
  omega ETA_KA ~ 0.40

  sigma PROP_ERR ~ 0.01

Individual Parameters

The [individual_parameters] block defines how population parameters (theta), random effects (eta), and covariates combine to produce individual PK parameters.

Syntax

PARAM = expression

Each line assigns a PK parameter using an arithmetic expression that can reference:

  • Theta parameters -- names defined in [parameters] (e.g., TVCL, TVV)
  • Eta random effects -- names defined as omega parameters (e.g., ETA_CL, ETA_V)
  • Covariates -- column names from the data file (e.g., WT, CRCL)
  • Constants -- numeric literals (e.g., 70, 0.75)

Supported Operators and Functions

| Operator/Function | Example |
|---|---|
| +, -, *, / | TVCL * WT / 70 |
| ^ (power) | (WT/70)^0.75 |
| exp() | exp(ETA_CL) |
| log(), ln() | log(TVCL) |
| sqrt() | sqrt(WT) |
| abs() | abs(ETA_CL) |
| Parentheses | TVCL * (WT/70)^0.75 |

Common Parameterizations

Exponential (log-normal) random effects

The standard approach for PK parameters that must be positive:

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

Allometric scaling with covariates

[individual_parameters]
  CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)
  V  = TVV  * (WT/70)^1.0  * exp(ETA_V)

Estimated covariate effects

Use additional theta parameters for covariate coefficients:

[parameters]
  theta TVCL(0.134, 0.001, 10.0)
  theta THETA_WT(0.75, 0.01, 2.0)
  theta THETA_CRCL(0.5, 0.01, 2.0)

[individual_parameters]
  CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)

Covariate Detection

Any uppercase identifier in the expression that does not match a theta name or eta name is automatically treated as a covariate. The covariate value is read from the corresponding column in the data file.

For example, in CL = TVCL * (WT/70)^0.75 * exp(ETA_CL):

  • TVCL matches a theta parameter
  • ETA_CL matches an omega parameter
  • WT matches neither, so it is treated as a covariate column
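The detection rule amounts to a name lookup against the declared parameter sets. A minimal sketch of that logic (illustrative, not the actual parser):

```rust
use std::collections::HashSet;

// Classify an identifier from an [individual_parameters] expression:
// anything that is neither a theta nor an eta is assumed to be a
// covariate column in the data file.
fn classify(ident: &str, thetas: &HashSet<&str>, etas: &HashSet<&str>) -> &'static str {
    if thetas.contains(ident) {
        "theta"
    } else if etas.contains(ident) {
        "eta"
    } else {
        "covariate"
    }
}
```

One consequence of this rule: a misspelled parameter name (e.g. `TVC` instead of `TVCL`) silently becomes a covariate lookup, so a missing-column error at data-loading time is often really a typo in the model file.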

PK Parameter Names

The parameter names on the left side of each assignment must map to recognized PK parameter names:

| Name | PK Parameter |
|---|---|
| CL | Clearance |
| V or V1 | Volume of distribution (central compartment) |
| Q | Intercompartmental clearance |
| V2 | Peripheral volume |
| KA | Absorption rate constant |
| F | Bioavailability (default 1.0 if omitted) |

For ODE models, the parameter names are user-defined and passed as a flat vector to the ODE right-hand side function.

Structural Model

The [structural_model] block specifies the pharmacokinetic model used to generate predictions.

Analytical PK Models

For standard compartmental models, use the pk keyword with a model function:

pk MODEL_NAME(param=VALUE, param=VALUE, ...)

Available Models

| Model Function | Compartments | Route | Required Parameters |
|---|---|---|---|
| one_cpt_iv_bolus | 1 | IV bolus | cl, v |
| one_cpt_oral | 1 | Oral | cl, v, ka |
| one_cpt_infusion | 1 | IV infusion | cl, v |
| two_cpt_iv_bolus | 2 | IV bolus | cl, v1, q, v2 |
| two_cpt_oral | 2 | Oral | cl, v1, q, v2, ka |
| two_cpt_infusion | 2 | IV infusion | cl, v1, q, v2 |

Examples

One-compartment oral:

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

Two-compartment IV bolus:

[structural_model]
  pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)

Two-compartment oral:

[structural_model]
  pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)

Bioavailability

For oral models, bioavailability (F) defaults to 1.0. To estimate it, define an F parameter in [individual_parameters] -- it will be automatically used by the oral PK functions.

Dose Handling

Analytical models support:

  • Bolus doses: Instantaneous input (default when RATE=0 in data)
  • Infusions: Zero-order input (when RATE>0 in data)
  • Steady-state: Pre-computed steady-state concentrations (when SS=1 and II>0 in data)
  • Dose superposition: Multiple doses are handled by summing contributions from each dose event

Numerical Stability

The analytical solutions include special handling for:

  • Near-equal rate constants: When absorption and elimination rates are similar (KA ~ k), L'Hopital's rule is used to avoid division by zero
  • Two-compartment eigenvalues: Vieta's formula is used for robust computation of alpha and beta
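The near-equal case can be seen in the one-compartment oral solution. The sketch below is illustrative of the guard described above, not the exact ferx code: the general formula divides by (ka - k), so when the two rates coincide the limiting form is used instead.

```rust
/// One-compartment oral concentration at time t after a single dose.
/// k = CL/V is the elimination rate constant.
fn one_cpt_oral_conc(dose: f64, f: f64, cl: f64, v: f64, ka: f64, t: f64) -> f64 {
    let k = cl / v;
    if (ka - k).abs() < 1e-8 {
        // Limit ka -> k (L'Hopital): C(t) = (F*D*k*t / V) * exp(-k*t)
        f * dose * k * t / v * (-k * t).exp()
    } else {
        // General Bateman form: blows up numerically as ka -> k
        f * dose * ka / (v * (ka - k)) * ((-k * t).exp() - (-ka * t).exp())
    }
}
```

The two branches agree to within a fraction of a percent near the crossover, so predictions stay smooth as the optimizer moves KA through the elimination rate.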

ODE Models

For non-standard kinetics (e.g., saturable elimination), use the ODE specification:

[structural_model]
  ode(obs_cmt=COMPARTMENT_NAME, states=[state1, state2, ...])

See ODE Models for full ODE syntax.

Error Model

The [error_model] block defines the residual error structure, specifying how observed data (DV) relates to model predictions.

Syntax

DV ~ ERROR_TYPE(SIGMA_PARAMS)

Available Error Models

Additive

DV ~ additive(SIGMA_NAME)

The residual variance is constant across all predictions:

\[ \text{Var}(DV) = \sigma^2 \]

Use when measurement error is independent of concentration (e.g., assay with fixed precision).

Proportional

DV ~ proportional(SIGMA_NAME)

The residual variance scales with the predicted value:

\[ \text{Var}(DV) = (\sigma \cdot f)^2 \]

where \( f \) is the model prediction. Use when measurement error increases with concentration (most common in PK).

Combined

DV ~ combined(SIGMA_PROP, SIGMA_ADD)

Combines proportional and additive components:

\[ \text{Var}(DV) = (\sigma_1 \cdot f)^2 + \sigma_2^2 \]

Use when both proportional and additive error sources are present. Requires two sigma parameters defined in [parameters].
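The three variance structures, and the weighted residual they feed into, can be written out directly. Hypothetical helper names, sketching the formulas above:

```rust
// Var(DV) = sigma^2
fn var_additive(sigma: f64) -> f64 {
    sigma * sigma
}

// Var(DV) = (sigma * f)^2, where f is the individual prediction (IPRED)
fn var_proportional(sigma: f64, f: f64) -> f64 {
    (sigma * f).powi(2)
}

// Var(DV) = (sigma1 * f)^2 + sigma2^2
fn var_combined(sigma_prop: f64, sigma_add: f64, f: f64) -> f64 {
    (sigma_prop * f).powi(2) + sigma_add * sigma_add
}

/// Individual weighted residual: (DV - IPRED) / sqrt(Var)
fn iwres(dv: f64, ipred: f64, var: f64) -> f64 {
    (dv - ipred) / var.sqrt()
}
```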

Examples

Proportional error (most common):

[parameters]
  sigma PROP_ERR ~ 0.01

[error_model]
  DV ~ proportional(PROP_ERR)

Additive error:

[parameters]
  sigma ADD_ERR ~ 1.0

[error_model]
  DV ~ additive(ADD_ERR)

Combined error:

[parameters]
  sigma PROP_ERR ~ 0.1
  sigma ADD_ERR  ~ 0.5

[error_model]
  DV ~ combined(PROP_ERR, ADD_ERR)

Impact on Estimation

The error model affects:

  • Individual weighted residuals (IWRES): (DV - IPRED) / sqrt(Var)
  • Conditional weighted residuals (CWRES): Accounts for uncertainty in random effect estimates
  • Objective function value (OFV): The likelihood includes log(Var) terms, so the error model structure directly influences parameter estimates

ODE Models

For pharmacokinetic models without analytical solutions (e.g., saturable elimination, target-mediated drug disposition), FeRx provides an ODE solver.

Structural Model Declaration

[structural_model]
  ode(obs_cmt=OBSERVABLE_COMPARTMENT, states=[state1, state2, ...])
  • obs_cmt: The compartment whose concentration is observed (matched to DV)
  • states: List of state variable names (compartments)

ODE Equations

The [odes] block defines the right-hand side of the ODE system:

[odes]
  d/dt(state_name) = expression

Expressions can reference:

  • State variables by name
  • Individual parameters defined in [individual_parameters]
  • Arithmetic operators and functions (exp, log, sqrt, etc.)

Example: Michaelis-Menten Elimination

A one-compartment oral model with saturable (Michaelis-Menten) elimination:

[parameters]
  theta TVVMAX(10.0, 0.1, 1000.0)
  theta TVKM(2.0, 0.01, 100.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_VMAX ~ 0.09
  omega ETA_V    ~ 0.04

  sigma PROP_ERR ~ 0.1

[individual_parameters]
  VMAX = TVVMAX * exp(ETA_VMAX)
  KM   = TVKM
  V    = TVV * exp(ETA_V)
  KA   = TVKA

[structural_model]
  ode(obs_cmt=central, states=[depot, central])

[odes]
  d/dt(depot)   = -KA * depot
  d/dt(central) = KA * depot / V - VMAX * central / (KM + central)

[error_model]
  DV ~ proportional(PROP_ERR)
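The [odes] block above corresponds to a right-hand-side function of the kind the solver integrates. The signature below is illustrative (the actual ferx API passes parameters as a flat vector):

```rust
// States are [depot, central]; returns [d(depot)/dt, d(central)/dt].
fn mm_rhs(y: &[f64; 2], ka: f64, v: f64, vmax: f64, km: f64) -> [f64; 2] {
    let (depot, central) = (y[0], y[1]);
    [
        -ka * depot,                                      // d/dt(depot)
        ka * depot / v - vmax * central / (km + central), // d/dt(central)
    ]
}
```

Note the Michaelis-Menten term: elimination approaches the constant rate VMAX as `central` grows far beyond KM, and behaves linearly (rate ≈ VMAX/KM × central) well below it.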

Solver Details

FeRx uses a Dormand-Prince RK45 adaptive solver:

| Setting | Value |
|---|---|
| Method | Explicit Runge-Kutta 4(5) |
| Absolute tolerance | 1e-6 |
| Relative tolerance | 1e-4 |
| Max steps | 10,000 |
| Initial step size | 0.1 |
| Minimum step size | 1e-12 |

The solver automatically adapts step sizes based on local error estimates.
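The step-size controller for an embedded 4(5) pair typically scales the step by (tolerance / error)^(1/5), damped by a safety factor and clamped. A sketch of that standard logic (not the exact ferx solver code):

```rust
/// Propose the next step size from the current step `h` and the local
/// error estimate `err` of the embedded pair.
fn next_step(h: f64, err: f64, tol: f64, h_min: f64) -> f64 {
    let factor = 0.9 * (tol / err).powf(0.2); // 0.9 = safety factor; 1/5 for a 4(5) pair
    let factor = factor.clamp(0.2, 5.0);      // limit growth/shrink per step
    (h * factor).max(h_min)                   // respect the minimum step size
}
```

When the error exceeds the tolerance, the step shrinks (and in a full solver the step is also rejected and retried); when the error is well under tolerance, the step grows.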

Dose Handling

  • Bolus doses: Applied as instantaneous state changes at dose times. The dose amount is added to the target compartment: state[cmt] += amt
  • Compartment indexing: Compartments are 1-indexed in the data file (CMT=1 corresponds to the first state in the states list)
  • Multiple doses: The ODE is integrated in segments between dose events, with state discontinuities at each dose
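The segment-wise scheme can be sketched as follows. For clarity a fixed-step Euler integrator stands in for RK45, and all names are illustrative; the point is the structure: integrate up to each dose time, apply the bolus as a state jump, then continue.

```rust
// A dose event; cmt is 0-indexed here for simplicity (the data file
// uses 1-indexed CMT, as noted above).
struct Dose {
    time: f64,
    cmt: usize,
    amt: f64,
}

// Integrate from t0 to t1, splitting at dose events and applying each
// bolus as an instantaneous state change: y[cmt] += amt.
fn integrate(
    mut y: Vec<f64>,
    t0: f64,
    t1: f64,
    doses: &[Dose],
    rhs: impl Fn(&[f64]) -> Vec<f64>,
) -> Vec<f64> {
    let mut t = t0;
    let mut events: Vec<&Dose> = doses.iter().filter(|d| d.time >= t0 && d.time <= t1).collect();
    events.sort_by(|a, b| a.time.partial_cmp(&b.time).unwrap());
    for d in events {
        y = euler(y, t, d.time, &rhs); // integrate the segment before the dose
        y[d.cmt] += d.amt;             // bolus: instantaneous state jump
        t = d.time;
    }
    euler(y, t, t1, &rhs) // final segment after the last dose
}

// Fixed-step Euler, standing in for the adaptive RK45 integrator.
fn euler(mut y: Vec<f64>, t0: f64, t1: f64, rhs: &impl Fn(&[f64]) -> Vec<f64>) -> Vec<f64> {
    let n = 1000;
    let h = (t1 - t0) / n as f64;
    for _ in 0..n {
        let dy = rhs(&y);
        for i in 0..y.len() {
            y[i] += h * dy[i];
        }
    }
    y
}
```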

Limitations

  • Infusion doses (RATE > 0) are not currently supported for ODE models. Use bolus approximations or analytical models for infusions
  • The observable compartment contains the amount (not concentration). Divide by volume in the ODE equations if needed
  • Steady-state (SS=1) is not directly supported for ODE models

Fit Options

The optional [fit_options] block configures the estimation method and optimizer settings.

Syntax

[fit_options]
  key = value

General Options

| Key | Values | Default | Description |
|---|---|---|---|
| method | foce, focei, saem | foce | Estimation method |
| maxiter | integer | 500 | Maximum outer-loop iterations |
| covariance | true, false | true | Compute covariance matrix and standard errors |
| optimizer | slsqp, lbfgs, nlopt_lbfgs, mma, bfgs | slsqp | Optimization algorithm |
| global_search | true, false | false | Run gradient-free pre-search before local optimization |
| global_maxeval | integer | auto | Max evaluations for global search |
| bloq_method | drop, m3 | drop | How to handle rows with CENS=1; m3 enables Beal's M3 likelihood (see BLOQ example) |

Estimation Methods

FOCE (default)

method = foce

First-Order Conditional Estimation. Linearizes the model around the empirical Bayes estimates. Fast and reliable for most models.

FOCEI

method = focei

FOCE with Interaction. Includes the dependence of the residual error on random effects. More accurate than FOCE when the error model depends on individual predictions, but slightly slower.

SAEM

method = saem

Stochastic Approximation EM. Uses Metropolis-Hastings sampling instead of MAP optimization for random effects. More robust to local minima; recommended for complex models with many random effects.

SAEM-Specific Options

| Key | Default | Description |
|---|---|---|
| n_exploration | 150 | Phase 1 iterations (step size = 1) |
| n_convergence | 250 | Phase 2 iterations (step size = 1/k) |
| n_mh_steps | 3 | Metropolis-Hastings steps per subject per iteration |
| adapt_interval | 50 | Iterations between MH step-size adaptation |
| seed | 12345 | RNG seed for reproducibility |

SIR (Sampling Importance Resampling)

SIR provides non-parametric parameter uncertainty estimates as an optional post-estimation step. Requires covariance = true.

| Key | Default | Description |
|---|---|---|
| sir | false | Enable SIR uncertainty estimation |
| sir_samples | 1000 | Number of proposal samples (M) |
| sir_resamples | 250 | Number of resampled vectors (m) |
| sir_seed | 12345 | RNG seed for reproducibility |

See SIR documentation for details.

Optimizer Choices

| Optimizer | Description | Recommended For |
|---|---|---|
| slsqp | Sequential Least Squares Programming (NLopt) | General use (default) |
| bfgs | Built-in BFGS quasi-Newton | When NLopt is unavailable |
| lbfgs | Limited-memory BFGS | Large parameter spaces |
| nlopt_lbfgs | NLopt L-BFGS | Alternative L-BFGS |
| mma | Method of Moving Asymptotes (NLopt) | Constrained problems |

Setting global_search = true runs a gradient-free pre-search (NLopt CRS2-LM) before the local optimizer. This helps escape local minima on challenging datasets.

The number of global evaluations is auto-scaled based on the number of parameters and observations, or can be set explicitly with global_maxeval.

Examples

Standard FOCE with defaults:

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

FOCEI with global search:

[fit_options]
  method        = focei
  maxiter       = 500
  covariance    = true
  global_search = true

SAEM with custom settings:

[fit_options]
  method        = saem
  n_exploration = 200
  n_convergence = 300
  n_mh_steps    = 5
  seed          = 42
  covariance    = true

FOCEI with SIR uncertainty:

[fit_options]
  method        = focei
  covariance    = true
  sir           = true
  sir_samples   = 1000
  sir_resamples = 250
  sir_seed      = 42

Simulation

The optional [simulation] block defines a virtual clinical trial design for generating simulated data. This is useful for model validation and testing.

Syntax

[simulation]
  subjects = N
  dose = AMOUNT
  cmt = COMPARTMENT
  seed = SEED
  times = [t1, t2, t3, ...]

KeyDescription
subjectsNumber of virtual subjects to simulate
doseDose amount administered to each subject
cmtDosing compartment (1-indexed)
seedRandom seed for reproducible simulations
timesObservation time points

Example

[simulation]
  subjects = 100
  dose = 100
  cmt = 1
  seed = 12345
  times = [0.5, 1, 2, 4, 8, 12, 24]

This simulates 100 subjects, each receiving a dose of 100 units into compartment 1, with observations at 0.5, 1, 2, 4, 8, 12, and 24 hours.

Usage

Run a simulation-estimation study from the CLI:

ferx model.ferx --simulate

This will:

  1. Parse the model and [simulation] block
  2. Generate simulated data using the model's default parameters and random effects
  3. Fit the model to the simulated data
  4. Report parameter estimates and diagnostics

Simulation Process

  1. For each subject, random effects are sampled from \( \eta_i \sim N(0, \Omega) \)
  2. Individual parameters are computed using the [individual_parameters] equations
  3. Predictions are generated using the structural model
  4. Residual error is added: \( DV = IPRED + \epsilon \), where \( \epsilon \sim N(0, V) \)
  5. Observations below 0.001 are clipped to 0.001
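Steps 4 and 5 above can be sketched as a single helper (illustrative names, not FeRx's internal API); `eps` is assumed to be a draw from \( N(0, V) \):

```rust
/// Add residual error to an individual prediction and clip to the
/// simulation floor, as in steps 4-5 of the simulation process.
fn observe(ipred: f64, eps: f64) -> f64 {
    const FLOOR: f64 = 0.001; // observations below 0.001 are clipped
    (ipred + eps).max(FLOOR)
}
```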

Covariates in Simulation

Covariates can be specified per subject in the [simulation] block. Currently, the simulated population uses default covariate values (covariates are not randomized).

Estimation Methods

FeRx implements the following methods for estimating nonlinear mixed effects models and quantifying parameter uncertainty:

  • FOCE / FOCEI -- First-Order Conditional Estimation (with or without interaction). The workhorse of population PK, using nested optimization to find maximum likelihood estimates.

  • Gauss-Newton (BHHH) -- A fast alternative that exploits the nonlinear-least-squares structure of the FOCE objective. Converges in 10-30 iterations using the outer product of per-subject gradients as an approximate Hessian. Available as pure GN or a GN+FOCEI hybrid.

  • SAEM -- Stochastic Approximation Expectation-Maximization. Uses MCMC sampling for random effects, providing more robust convergence on complex models.

  • SIR -- Sampling Importance Resampling. An optional post-estimation step that provides non-parametric parameter uncertainty estimates (95% CIs), more robust than the asymptotic covariance matrix.

Quick Comparison

FeatureFOCE/FOCEIGauss-NewtonSAEM
Random effect estimationMAP (optimization)MAP (optimization)MCMC (sampling)
Convergence speedMedium (~50-100 evals)Fast (~10-30 iterations)Slower (~400 iterations)
Local minima robustnessCan get stuckCan get stuckMore robust
Gradient requiredYes (AD or FD)Yes (FD, per-subject)No (for E-step)
StochasticNoNoYes
Best forGeneral useFast iteration, well-conditioned modelsComplex models, many random effects

Choosing a Method

Start with FOCE for standard 1- or 2-compartment models with 2-4 random effects. It is deterministic and well-understood.

Try Gauss-Newton (gn or gn_hybrid) when you want faster convergence during model development, or when FOCE is slow to converge.

Switch to SAEM when:

  • FOCE fails to converge or produces implausible estimates
  • The model has many random effects (>4)
  • You suspect the FOCE solution is a local minimum
  • The model has complex nonlinear random effect structure

All methods produce comparable results on well-behaved models. The final OFV from SAEM and GN is computed using the FOCE approximation, so AIC/BIC values are directly comparable.

FOCE / FOCEI

First-Order Conditional Estimation (FOCE) and its interaction variant (FOCEI) are the primary estimation methods in FeRx. They use a two-level nested optimization to find maximum likelihood estimates of population parameters.

Algorithm Overview

Outer Loop (Population Parameters)

The outer loop optimizes the population parameters \( \theta \), \( \Omega \), and \( \sigma \) by minimizing the population objective function value (OFV):

\[ \text{OFV} = -2 \log L = \sum_{i=1}^{N} \text{OFV}_i \]

Parameters are internally transformed for unconstrained optimization:

  • Theta: Log-transformed (ensures positivity)
  • Omega: Cholesky factorization (ensures positive-definiteness)
  • Sigma: Log-transformed (ensures positivity)
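These transforms can be sketched as a round trip for a 2x2 omega block (hypothetical helper names, not FeRx's internal API): theta and sigma are packed via log, and omega via its Cholesky factor with a log-diagonal, so any unconstrained vector unpacks to a positive-definite matrix.

```rust
/// Theta/sigma transforms: log-pack, exp-unpack (ensures positivity).
fn pack_theta(t: f64) -> f64 { t.ln() }
fn unpack_theta(x: f64) -> f64 { x.exp() }

/// Pack a 2x2 covariance [[o11, o21], [o21, o22]] as [ln l11, l21, ln l22],
/// where L = [[l11, 0], [l21, l22]] is its Cholesky factor.
fn pack_omega(o11: f64, o21: f64, o22: f64) -> [f64; 3] {
    let l11 = o11.sqrt();
    let l21 = o21 / l11;
    let l22 = (o22 - l21 * l21).sqrt();
    [l11.ln(), l21, l22.ln()]
}

/// Rebuild Omega = L L^T; positive-definite for any packed vector.
fn unpack_omega(p: [f64; 3]) -> (f64, f64, f64) {
    let (l11, l21, l22) = (p[0].exp(), p[1], p[2].exp());
    (l11 * l11, l11 * l21, l21 * l21 + l22 * l22)
}
```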

Inner Loop (Empirical Bayes Estimates)

For each subject, the inner loop finds the empirical Bayes estimate (EBE) of the random effects \( \hat{\eta}_i \) by minimizing the individual negative log-likelihood:

\[ -\log p(\eta_i | y_i, \theta, \Omega, \sigma) = \frac{1}{2} \left[ \eta_i^T \Omega^{-1} \eta_i + \log|\Omega| + \sum_j \left( \frac{(y_{ij} - f_{ij})^2}{V_{ij}} + \log V_{ij} \right) \right] \]

The inner loop uses BFGS with automatic differentiation gradients, falling back to Nelder-Mead simplex if BFGS fails.
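For a diagonal omega, the individual objective above reduces to a short sum. The sketch below mirrors the formula term by term (illustrative helper, not FeRx's internal API); `f` holds the predictions \( f_{ij} \) and `v` the residual variances \( V_{ij} \):

```rust
/// Per-subject negative log-likelihood (up to a constant) for diagonal Omega:
/// 0.5 * [ sum_k eta_k^2/omega_k + log omega_k ] + 0.5 * sum_j [ (y-f)^2/V + log V ]
fn individual_nll(eta: &[f64], omega_diag: &[f64], y: &[f64], f: &[f64], v: &[f64]) -> f64 {
    let prior: f64 = eta
        .iter()
        .zip(omega_diag)
        .map(|(e, w)| e * e / w + w.ln())
        .sum();
    let data: f64 = y
        .iter()
        .zip(f)
        .zip(v)
        .map(|((yi, fi), vi)| (yi - fi).powi(2) / vi + vi.ln())
        .sum();
    0.5 * (prior + data)
}
```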

FOCE vs FOCEI

Standard FOCE

Uses linearized predictions around the EBEs:

\[ f_0 = f(\hat{\eta}) - H \hat{\eta} \]

where \( H \) is the Jacobian matrix \( \partial f / \partial \eta \). The per-subject objective is:

\[ \text{OFV}_i = (y - f_0)^T \tilde{R}^{-1} (y - f_0) + \log|\tilde{R}| \]

where \( \tilde{R} = H \Omega H^T + R(f_0) \).

FOCEI (Interaction)

Uses individual predictions directly without linearization:

\[ \text{OFV}_i = (y - \hat{f})^T V^{-1} (y - \hat{f}) + \hat{\eta}^T \Omega^{-1} \hat{\eta} + \log|\tilde{R}| \]

FOCEI is more accurate when the residual variance depends on the predicted value (proportional or combined error models), because it captures the interaction between random effects and residual error.

Optimizer Options

AlgorithmKeyDescription
SLSQPslsqpSequential Least Squares Programming. Handles bounds well. Default and recommended.
L-BFGSnlopt_lbfgsLimited-memory BFGS. Good for large parameter spaces.
MMAmmaMethod of Moving Asymptotes. Alternative constrained optimizer.

Built-in Algorithms

AlgorithmKeyDescription
BFGSbfgsQuasi-Newton with backtracking line search.
L-BFGSlbfgsMemory-efficient BFGS variant.

Global Search

Enable gradient-free pre-search to help escape local minima:

[fit_options]
  method         = foce
  global_search  = true
  global_maxeval = 2000

The pre-search uses NLopt's CRS2-LM algorithm (Controlled Random Search) to explore the parameter space before handing off to the local optimizer. The number of evaluations auto-scales with model complexity if global_maxeval is not set.

Covariance Step

When covariance = true, FeRx computes the variance-covariance matrix of the parameter estimates using a finite-difference Hessian at the converged solution. This provides:

  • Standard errors (SE) for all parameters
  • Relative standard errors (%RSE) for assessing estimation precision
  • Omega SEs via delta method from the Cholesky parameterization

Convergence

The outer loop terminates when any of:

  • The gradient norm falls below outer_gtol (default 1e-6)
  • The maximum number of iterations (maxiter) is reached
  • The optimizer reports convergence (NLopt XtolReached or FtolReached)

The inner loop terminates when the gradient norm falls below inner_tol (default 1e-8) or inner_maxiter (default 200) iterations are reached.

Warm Starting

The inner loop warm-starts EBE estimation from the previous outer iteration's EBEs. This significantly reduces computation time, especially in later iterations when parameters change slowly.

Configuration Example

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true
  optimizer  = slsqp

Gauss-Newton (BHHH)

The Gauss-Newton optimizer is an alternative to the standard FOCE/FOCEI approach that exploits the nonlinear-least-squares structure of the FOCE objective. It uses the BHHH (Berndt-Hall-Hall-Hausman) approximation to the Hessian and typically converges in 10-30 iterations, compared to 100+ evaluations for gradient-based methods like SLSQP.

This approach mirrors NONMEM's modified Gauss-Newton algorithm.

Variants

MethodKeyDescription
Pure Gauss-NewtongnFast convergence for well-conditioned problems
GN + FOCEI hybridgn_hybridRuns GN first, then polishes with FOCEI for robustness

Algorithm Overview

BHHH Hessian Approximation

The FOCE objective has the form:

\[ \text{OFV} = 2 \sum_{i=1}^{N} \text{NLL}_i \]

The BHHH approximation uses the outer product of per-subject gradients as an approximate Hessian:

\[ H_{\text{BHHH}} = 4 \sum_{i=1}^{N} g_i \, g_i^T \]

where \( g_i = \nabla_x \text{NLL}_i \) is the gradient of the negative log-likelihood for subject \( i \) with respect to the packed parameter vector. Per-subject gradients are computed via central finite differences.

This approximation is equivalent to the Gauss-Newton Hessian for log-likelihood objectives and is guaranteed positive semi-definite.
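As a minimal sketch (dense matrices, hypothetical helper name), the BHHH Hessian is just the scaled sum of outer products of per-subject gradients:

```rust
/// Build H_BHHH = 4 * sum_i g_i g_i^T from per-subject gradients.
/// Each outer product is PSD, so the sum is guaranteed PSD.
fn bhhh_hessian(grads: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let n = grads[0].len();
    let mut h = vec![vec![0.0; n]; n];
    for g in grads {
        for r in 0..n {
            for c in 0..n {
                h[r][c] += 4.0 * g[r] * g[c];
            }
        }
    }
    h
}
```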

Levenberg-Marquardt Damping

To improve stability away from the minimum, the Hessian is regularized:

\[ H_{\text{LM}} = H_{\text{BHHH}} + \lambda \, \text{diag}(H_{\text{BHHH}}) \]

The damping factor \( \lambda \) is adapted during optimization:

  • On successful step (OFV decreases): \( \lambda \leftarrow 0.3\,\lambda \) (minimum \( 10^{-6} \))
  • On rejected step (OFV increases): \( \lambda \leftarrow 10\,\lambda \)
  • If \( \lambda > 10^6 \), the algorithm stops

The Newton step is obtained by solving:

\[ H_{\text{LM}} \, \delta = -\nabla \text{OFV} \]

via Cholesky decomposition. If the Cholesky fails (singular Hessian), \( \lambda \) is increased and the iteration is retried.

Each step uses backtracking line search (up to 15 halvings). At each trial point, EBEs are re-estimated via the inner loop (warm-started from the previous EBEs), ensuring the FOCE objective is evaluated consistently.
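The damped solve can be sketched for a 2x2 system (illustrative helper, not FeRx's internal code): apply the diagonal damping, factor with Cholesky, then forward- and back-substitute.

```rust
/// Solve (H + lambda * diag(H)) * delta = -g for a symmetric 2x2 H
/// via Cholesky factorization L L^T.
fn lm_step(h: [[f64; 2]; 2], g: [f64; 2], lambda: f64) -> [f64; 2] {
    // Levenberg-Marquardt damping on the diagonal
    let a = h[0][0] * (1.0 + lambda);
    let b = h[0][1];
    let d = h[1][1] * (1.0 + lambda);
    // Cholesky factor L = [[l11, 0], [l21, l22]]
    let l11 = a.sqrt();
    let l21 = b / l11;
    let l22 = (d - l21 * l21).sqrt();
    // Forward solve L z = -g, then back solve L^T delta = z
    let z0 = -g[0] / l11;
    let z1 = (-g[1] - l21 * z0) / l22;
    let d1 = z1 / l22;
    let d0 = (z0 - l21 * d1) / l11;
    [d0, d1]
}
```

In the real algorithm, a failed factorization (negative pivot) would trigger an increase of lambda and a retry, as described above.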

Convergence

The algorithm converges when:

  • Relative OFV change \( < 10^{-6} \) for at least 3 iterations, or
  • Maximum iterations are reached

Hybrid Mode (GN + FOCEI)

When method = gn_hybrid, the algorithm runs in two phases:

  1. GN phase: Run Gauss-Newton iterations to quickly find the basin of the minimum
  2. FOCEI polish: Run up to 100 iterations of standard FOCEI optimization (via NLopt SLSQP) warm-started from the GN result

The final result is whichever phase produced the lower OFV. This combines the fast convergence of Gauss-Newton with the refined accuracy of FOCEI.

Configuration

[fit_options]
  method     = gn          # or gn_hybrid
  maxiter    = 100         # max GN iterations
  covariance = true

The initial Levenberg-Marquardt damping factor defaults to 0.01 and adapts automatically.

When to Use

Use gn when:

  • You want fast convergence and the model is well-conditioned
  • You are iterating on model development and need quick feedback

Use gn_hybrid when:

  • You want robustness against local minima
  • The model is complex (many parameters or random effects)
  • You want the speed of GN with a safety net

Stick with foce/focei when:

  • The model has convergence issues with GN (rare)
  • You need exact compatibility with standard FOCE/FOCEI results

SAEM

Stochastic Approximation Expectation-Maximization (SAEM) is an alternative estimation method that uses MCMC sampling for random effects instead of MAP optimization. It is more robust to local minima and can handle complex random effect structures.

Algorithm Overview

SAEM replaces the deterministic inner loop of FOCE with stochastic sampling, following the Monolix convention with a two-phase step-size schedule.

References

  • Delyon, Lavielle, Moulines (1999). Convergence of a stochastic approximation version of the EM algorithm. Annals of Statistics, 94--128.
  • Kuhn & Lavielle (2004). Coupling a stochastic approximation version of EM with an MCMC procedure. ESAIM: Probability and Statistics 8:115--131.

Two-Phase Schedule

Phase 1: Exploration (iterations 1 to K1)

Step size \( \gamma_k = 1 \). The algorithm explores the parameter space rapidly, with the sufficient statistics being fully replaced each iteration. This allows fast movement toward the basin of the MLE.

Default: 150 iterations.

Phase 2: Convergence (iterations K1+1 to K1+K2)

Step size \( \gamma_k = 1/(k - K_1) \). The algorithm performs a decreasing-weight average, which guarantees almost-sure convergence to the MLE under regularity conditions.

Default: 250 iterations.
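The two-phase step-size schedule amounts to a one-line function (sketch with illustrative names). With the defaults K1 = 150, it reproduces the gamma values shown in the SAEM log output later in this chapter:

```rust
/// SAEM step size: gamma_k = 1 during exploration (k <= K1),
/// then 1/(k - K1) during the convergence phase.
fn gamma(k: usize, k1: usize) -> f64 {
    if k <= k1 { 1.0 } else { 1.0 / ((k - k1) as f64) }
}
```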

Per-Iteration Steps

Each SAEM iteration consists of:

1. E-Step: Metropolis-Hastings Sampling

For each subject, run n_mh_steps Metropolis-Hastings iterations to sample from the conditional distribution of random effects:

\[ p(\eta_i | y_i, \theta, \Omega, \sigma) \]

Proposal: \( \eta_{\text{prop}} = \eta_{\text{current}} + \delta_i \cdot L \cdot z \), where \( z \sim N(0, I) \) and \( L = \text{chol}(\Omega) \).

Acceptance: Accept with probability \( \min(1, \exp(\text{NLL}_{\text{current}} - \text{NLL}_{\text{prop}})) \).

The MH sampling is parallelized across subjects using Rayon.
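Because the random-walk proposal is symmetric, the acceptance rule reduces to a ratio of likelihoods; a minimal sketch (illustrative helper name):

```rust
/// Metropolis-Hastings acceptance probability for a symmetric proposal:
/// min(1, exp(NLL_current - NLL_proposal)). Downhill moves always accept.
fn accept_prob(nll_current: f64, nll_proposal: f64) -> f64 {
    (nll_current - nll_proposal).exp().min(1.0)
}
```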

2. Stochastic Approximation Update

Update the sufficient statistic for \( \Omega \):

\[ S_2 \leftarrow (1 - \gamma_k) \cdot S_2 + \gamma_k \cdot \frac{1}{N} \sum_{i=1}^{N} \eta_i \eta_i^T \]

3. M-Step for Omega (Closed Form)

\[ \Omega_k = S_2 \]

4. M-Step for Theta and Sigma (Optimization)

Minimize the conditional observation negative log-likelihood with ETAs held fixed:

\[ \sum_{i=1}^{N} \sum_{j=1}^{n_i} \left[ \frac{1}{2} \log V_{ij} + \frac{1}{2} \frac{(y_{ij} - f_{ij})^2}{V_{ij}} \right] \]

This is optimized over \( \theta \) and \( \sigma \) in log-space using NLopt SLSQP with finite-difference gradients.

5. Adaptive MH Step Sizes

Every adapt_interval iterations, the per-subject step sizes \( \delta_i \) are adjusted:

  • If acceptance rate > 40%: increase \( \delta_i \) by 10% (up to 5.0)
  • If acceptance rate < 40%: decrease \( \delta_i \) by 10% (down to 0.01)

This targets an acceptance rate around 40%, balancing exploration and mixing.
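The adaptation rule above can be sketched as (illustrative helper, using the documented 10% adjustment and the 0.01/5.0 bounds):

```rust
/// Adapt a per-subject MH step size toward a ~40% acceptance rate.
fn adapt_step(delta: f64, acceptance_rate: f64) -> f64 {
    if acceptance_rate > 0.40 {
        (delta * 1.10).min(5.0) // accepting too often: take bigger steps
    } else {
        (delta * 0.90).max(0.01) // rejecting too often: take smaller steps
    }
}
```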

Post-SAEM Finalization

After the SAEM iterations complete:

  1. EBE Refinement: Run the standard FOCE inner loop (BFGS optimization) warm-started from the SAEM ETAs to obtain final empirical Bayes estimates
  2. FOCE OFV: Compute the objective function using the FOCE/Laplace approximation, so AIC and BIC are directly comparable with FOCE results
  3. Covariance Step: Optionally compute standard errors via finite-difference Hessian (same method as FOCE)
  4. Diagnostics: Compute PRED, IPRED, CWRES, IWRES for each subject

Configuration

[fit_options]
  method        = saem
  n_exploration = 150      # Phase 1 iterations
  n_convergence = 250      # Phase 2 iterations
  n_mh_steps    = 3        # MH steps per subject per iteration
  adapt_interval = 50      # Step-size adaptation frequency
  seed          = 12345    # RNG seed for reproducibility
  covariance    = true     # Compute standard errors

Tuning Guide

Not Converging

  • Increase n_exploration (e.g., 300) to give more time for basin finding
  • Increase n_convergence (e.g., 500) for a longer averaging window
  • Increase n_mh_steps (e.g., 5-10) for better mixing in the E-step

Slow Convergence

  • Decrease n_exploration and n_convergence if parameters stabilize early
  • Use adapt_interval = 25 for faster step-size adaptation

Reproducibility

  • Always set seed for reproducible results
  • Different seeds will produce slightly different estimates due to the stochastic nature of the algorithm

Output

The SAEM iteration progress is printed to stderr:

SAEM: 10 subjects, 3 ETAs, 400 total iter (150 explore + 250 converge)
  SAEM iter    1/400 [explore] gamma=1.000  condNLL=95.244
  SAEM iter   50/400 [explore] gamma=1.000  condNLL=56.705
  SAEM iter  150/400 [explore] gamma=1.000  condNLL=46.071
  SAEM iter  200/400 [converge] gamma=0.020  condNLL=36.799
  SAEM iter  400/400 [converge] gamma=0.004  condNLL=38.096
SAEM iterations complete. Computing final EBEs and OFV...

The condNLL is the conditional observation negative log-likelihood (not the final OFV). It should generally decrease during the exploration phase and stabilize during convergence.

Sampling Importance Resampling (SIR)

SIR is an optional post-estimation step that provides non-parametric parameter uncertainty estimates. It produces 95% confidence intervals that are more robust than the asymptotic covariance matrix, particularly for models with:

  • Non-normal parameter distributions
  • Boundary estimates (parameters near constraints)
  • Small datasets where asymptotic assumptions may not hold

How It Works

SIR uses the maximum likelihood estimates and their covariance matrix as a proposal distribution, then reweights samples based on the actual likelihood:

  1. Sample: Draw M parameter vectors from a multivariate normal distribution centered on the ML estimates, using the estimation covariance matrix
  2. Importance weighting: For each sample, compute the objective function value (OFV) and calculate an importance weight based on the ratio of the true likelihood to the proposal density
  3. Resample: Draw m vectors (with replacement) proportional to the importance weights

The resampled vectors approximate the true parameter uncertainty distribution. Confidence intervals are derived from their empirical percentiles.
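The weighting step and the ESS diagnostic discussed below can be sketched as follows (illustrative helpers; the log weights are assumed to be log-likelihood minus log proposal density per sample):

```rust
/// Normalize unnormalized log importance weights, subtracting the max
/// before exponentiating for numerical stability.
fn normalize_weights(log_w: &[f64]) -> Vec<f64> {
    let m = log_w.iter().cloned().fold(f64::NEG_INFINITY, f64::max);
    let w: Vec<f64> = log_w.iter().map(|lw| (lw - m).exp()).collect();
    let s: f64 = w.iter().sum();
    w.into_iter().map(|wi| wi / s).collect()
}

/// Effective sample size: 1 / sum(w_i^2) for normalized weights.
/// Equals M for uniform weights; collapses toward 1 for degenerate ones.
fn effective_sample_size(w: &[f64]) -> f64 {
    1.0 / w.iter().map(|wi| wi * wi).sum::<f64>()
}
```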

Enabling SIR

Add sir = true to the [fit_options] block. The covariance step must also be enabled (it provides the proposal distribution):

[fit_options]
  method     = focei
  covariance = true
  sir        = true

Options

KeyDefaultDescription
sirfalseEnable/disable SIR
sir_samples1000Number of proposal samples (M). Higher values give more reliable weights but take longer
sir_resamples250Number of resampled vectors (m). Must be less than sir_samples
sir_seed12345RNG seed for reproducibility

Output

SIR adds the following to the estimation output:

  • 95% CI for each theta, omega, and sigma parameter (2.5th and 97.5th percentiles)
  • Effective sample size (ESS): a diagnostic indicating how well the proposal distribution matches the true uncertainty. ESS close to M indicates a good match; ESS much less than m suggests the proposal is a poor fit

Diagnostics

The effective sample size (ESS) is the primary diagnostic:

  • ESS > m (resamples): excellent — the proposal distribution is well-matched
  • ESS between 100 and m: adequate for most purposes
  • ESS < 100: the proposal may be a poor fit; consider a different estimation method or increasing sir_samples

Computational Cost

SIR evaluates the inner loop (EBE optimization) for each of the M proposal samples. With the default M=1000, this is roughly 3-10x the cost of the estimation step itself. The computation is parallelized across samples and warm-started from the ML EBEs to minimize runtime.

The resampling step itself is negligible.

Example

[fit_options]
  method        = focei
  covariance    = true
  sir           = true
  sir_samples   = 1000
  sir_resamples = 250
  sir_seed      = 42

Reference

Dosne A-G, Bergstrand M, Karlsson MO. "Improving the estimation of parameter uncertainty distributions in nonlinear mixed effects models using sampling importance resampling." J Pharmacokinet Pharmacodyn. 2017;44(6):539-562. doi:10.1007/s10928-017-9542-0

Data Format

FeRx reads data in NONMEM-compatible CSV format. This is the standard format used across population PK tools.

Required Columns

ColumnTypeDescription
IDstring/numericSubject identifier
TIMEnumericTime relative to first dose
DVnumericDependent variable (observed concentration)

Optional Standard Columns

ColumnTypeDefaultDescription
EVIDinteger0Event ID: 0 = observation, 1 = dose, 4 = reset + dose
AMTnumeric0Dose amount (only for EVID=1 or EVID=4)
CMTinteger1Compartment number (1-indexed)
RATEnumeric0Infusion rate. 0 = bolus dose
MDVinteger0Missing DV flag. 1 = DV should be ignored
IInumeric0Interdose interval for repeated dosing
SSinteger0Steady-state flag. 1 = assume steady state
CENSinteger0Censoring flag. 1 = observation is below LLOQ; DV carries the LLOQ value. Paired with bloq_method = m3 in [fit_options] to enable likelihood-based handling — see BLOQ example.

Covariate Columns

Any column not in the standard set above is automatically treated as a covariate. Covariate values are:

  • Time-constant: The first non-missing value for each subject is used
  • Time-varying: If values change over time for a subject, Last Observation Carried Forward (LOCF) is applied

Covariate names in the data file are matched case-insensitively to names used in [individual_parameters] expressions.
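The LOCF rule can be sketched as a small fill function (illustrative, not FeRx's internal reader; assumes at least one non-missing value per subject). Leading missing values take the first observed value, consistent with the first-non-missing rule above:

```rust
/// Last Observation Carried Forward over one subject's covariate column.
fn locf(values: &[Option<f64>]) -> Vec<f64> {
    // Seed with the first non-missing value for any leading gaps
    let first = values.iter().flatten().next().copied().unwrap();
    let mut last = first;
    values
        .iter()
        .map(|v| {
            if let Some(x) = v {
                last = *x; // carry this value forward
            }
            last
        })
        .collect()
}
```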

Event Types (EVID)

EVIDMeaning
0Observation record. DV is used for estimation.
1Dosing record. AMT is administered to compartment CMT.
4Reset and dose. All compartment amounts are reset to zero before dosing.

Example Dataset

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,WT,CRCL
1,0,.,1,100,1,0,1,70,95
1,0.5,9.49,0,.,.,.,0,70,95
1,1,14.42,0,.,.,.,0,70,95
1,2,17.56,0,.,.,.,0,70,95
1,4,15.23,0,.,.,.,0,70,95
1,8,10.15,0,.,.,.,0,70,95
2,0,.,1,150,1,0,1,85,110
2,0.5,14.2,0,.,.,.,0,85,110
2,1,21.3,0,.,.,.,0,85,110

Key points:

  • Dose records (EVID=1) have MDV=1 and DV=. (missing)
  • Observation records (EVID=0) have MDV=0 and a valid DV
  • Covariates (WT, CRCL) are included as extra columns
  • Missing values can be represented as . or left empty

Infusion Doses

For IV infusions, set RATE to the infusion rate (amount per time unit):

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV
1,0,.,1,500,1,50,1

This administers 500 units at a rate of 50 units/hour (duration = 10 hours).

Steady-State Dosing

For steady-state simulations, set SS=1 and II to the dosing interval:

ID,TIME,DV,EVID,AMT,CMT,SS,II,MDV
1,0,.,1,100,1,1,12,1
1,0.5,25.3,0,.,.,.,.,0

This assumes the subject has reached steady state with 100 units every 12 hours before the observation at TIME=0.5.

Multiple Doses

Multiple doses are supported as separate rows:

ID,TIME,DV,EVID,AMT,CMT,MDV
1,0,.,1,100,1,1
1,0.5,9.49,0,.,.,0
1,12,.,1,100,1,1
1,12.5,15.2,0,.,.,0
1,24,.,1,100,1,1
1,24.5,18.1,0,.,.,0

Column Name Case

Column names are case-insensitive. ID, Id, and id are all recognized. Covariate columns preserve their case as declared in the CSV header.

CLI Reference

The ferx command-line tool runs population PK estimation from model files and data.

Usage

ferx <model.ferx> --data <data.csv>
ferx <model.ferx> --simulate

Commands

Fit with Data

ferx model.ferx --data data.csv

Parses the model file, reads the data, and runs the estimation method specified in [fit_options] (defaults to FOCE).

Simulate and Fit

ferx model.ferx --simulate

Parses the model file, generates simulated data from the [simulation] block, and fits the model to the simulated data. Requires a [simulation] block in the model file.

Output Files

Three files are generated, named after the model file:

FileContents
{model}-sdtab.csvPer-observation diagnostics
{model}-fit.yamlParameter estimates and standard errors
{model}-timing.txtWall-clock estimation time

See Output Files for detailed format descriptions.

Console Output

Progress

The estimation progress is printed to stderr, including:

  • Model and data summary (subjects, observations, parameters)
  • Optimizer iterations with OFV values (FOCE) or condNLL values (SAEM)
  • Covariance step status
  • Final parameter table

Result Summary

A brief summary is printed to stdout:

Fit completed!
OFV: -280.1838
Elapsed: 0.496s
  TVCL = 0.132735
  TVV = 7.694842
  TVKA = 0.757498

Exit Codes

CodeMeaning
0Success
1Error (parse failure, data error, convergence failure)

Examples

# One-compartment oral warfarin model
ferx examples/warfarin.ferx --data data/warfarin.csv

# Two-compartment IV with FOCE
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv

# Simulation-estimation study
ferx examples/warfarin.ferx --simulate

Building

# Build the ferx binary
cargo build --release --features autodiff

# Run directly via cargo
cargo run --release --features autodiff --bin ferx -- model.ferx --data data.csv

Output Files

Each model run produces three output files.

sdtab CSV ({model}-sdtab.csv)

A CSV file with per-observation diagnostics, one row per observation per subject.

Columns

ColumnDescription
IDSubject identifier
TIMEObservation time
DVObserved value
PREDPopulation prediction (eta = 0)
IPREDIndividual prediction (eta = EBE)
CWRESConditional weighted residual
IWRESIndividual weighted residual
ETA1, ETA2, ...Empirical Bayes estimates of random effects

Residual Definitions

IWRES (Individual Weighted Residual): \[ \text{IWRES}_j = \frac{y_j - \text{IPRED}_j}{\sqrt{V_j}} \] where \( V_j \) is the residual variance evaluated at the individual prediction.

CWRES (Conditional Weighted Residual): \[ \text{CWRES}_j = \frac{y_j - f_{0,j}}{\sqrt{\tilde{R}_{jj}}} \] where \( f_0 = f(\hat{\eta}) - H\hat{\eta} \) is the linearized population prediction and \( \tilde{R} = H\Omega H^T + R \) is the conditional variance.
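Per observation, both residuals are simple standardizations once their variance terms are in hand (illustrative helpers mirroring the definitions above):

```rust
/// IWRES: residual from the individual prediction, scaled by the
/// residual standard deviation sqrt(V_j).
fn iwres(y: f64, ipred: f64, v: f64) -> f64 {
    (y - ipred) / v.sqrt()
}

/// CWRES: residual from the linearized population prediction f0,
/// scaled by the conditional standard deviation sqrt(R~_jj).
fn cwres(y: f64, f0: f64, r_tilde_jj: f64) -> f64 {
    (y - f0) / r_tilde_jj.sqrt()
}
```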

Example

ID,TIME,DV,PRED,IPRED,CWRES,IWRES,ETA1,ETA2,ETA3
1,0.5,9.49,10.12,9.55,-0.23,-0.06,0.15,-0.08,0.32
1,1.0,14.42,14.87,14.35,0.18,0.05,0.15,-0.08,0.32

Fit YAML ({model}-fit.yaml)

A YAML file containing parameter estimates, standard errors, and model diagnostics.

Structure

model:
  converged: true
  method: FOCE
objective_function:
  ofv: -280.1838
  aic: -266.1838
  bic: -247.2804
data:
  n_subjects: 10
  n_observations: 110
  n_parameters: 7
theta:
  TVCL:
    estimate: 0.132735
    se: 0.014549
    rse_pct: 11.0
  TVV:
    estimate: 7.694842
    se: 0.293028
    rse_pct: 3.8
omega:
  omega_11:
    variance: 0.028584
    cv_pct: 16.9
    se: 0.006394
  omega_22:
    variance: 0.009613
    cv_pct: 9.8
    se: 0.002165
sigma:
  sigma_1:
    estimate: 0.010638
    se: 0.000788

Key Fields

  • ofv: Objective Function Value (-2 log-likelihood)
  • aic: Akaike Information Criterion (OFV + 2p)
  • bic: Bayesian Information Criterion (OFV + p * ln(n))
  • se: Standard error from the covariance step
  • rse_pct: Relative standard error as percentage (SE/estimate * 100)
  • cv_pct: Coefficient of variation for omega (sqrt(variance) * 100)
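The information criteria follow directly from the OFV (sketch; the test below reproduces the ofv/aic/bic values in the YAML example, with p = 7 and n = 110):

```rust
/// AIC = OFV + 2p, where OFV is -2 log-likelihood and p the parameter count.
fn aic(ofv: f64, p: usize) -> f64 {
    ofv + 2.0 * p as f64
}

/// BIC = OFV + p * ln(n), with n the number of observations.
fn bic(ofv: f64, p: usize, n: usize) -> f64 {
    ofv + p as f64 * (n as f64).ln()
}
```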

Timing File ({model}-timing.txt)

A single-line text file with the wall-clock estimation time:

elapsed_seconds=0.496000

This measures only the estimation step (not parsing or data reading).

Examples

FeRx includes several example models in the examples/ directory with corresponding datasets in data/.

ExampleModelRouteFeatures
One-Compartment Oral1-cptOralBasic PopPK, FOCE and SAEM
Two-Compartment IV2-cptIV bolusMulti-compartment dynamics
Covariates2-cptOralWeight and renal function effects
ODE Model1-cptOralMichaelis-Menten elimination, ODE solver
BLOQ (M3 method)1-cptOralLikelihood-based handling of censored observations

Running Examples

# One-compartment oral (warfarin)
ferx examples/warfarin.ferx --data data/warfarin.csv

# Two-compartment IV
ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

# Covariates model
ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv

# ODE model (Michaelis-Menten)
ferx examples/mm_oral.ferx --data data/mm_oral.csv

# SAEM estimation
ferx examples/warfarin_saem.ferx --data data/warfarin.csv

# BLOQ (M3 method)
ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv

One-Compartment Oral (Warfarin)

This example fits a one-compartment oral pharmacokinetic model to warfarin concentration data.

Model File (warfarin.ferx)

# One-compartment oral PK model (warfarin)

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 300
  covariance = true

Model Description

  • Structure: One-compartment model with first-order absorption (KA) and first-order elimination (CL/V)
  • Random effects: Log-normal on CL, V, and KA (exponential eta model)
  • Error model: Proportional residual error
  • Parameters:
    • TVCL: Typical value of clearance (L/h)
    • TVV: Typical value of volume of distribution (L)
    • TVKA: Typical value of absorption rate constant (1/h)

Running

ferx examples/warfarin.ferx --data data/warfarin.csv

Expected Results

--- THETA Estimates ---
TVCL                 0.132735
TVV                  7.694842
TVKA                 0.757498

--- OMEGA Estimates (variances) ---
OMEGA(1,1) = 0.028584  (CV% = 16.9)
OMEGA(2,2) = 0.009613  (CV% = 9.8)
OMEGA(3,3) = 0.340868  (CV% = 58.4)

--- SIGMA Estimates ---
SIGMA(1) = 0.010638

SAEM Variant

The same model can be run with SAEM by changing the [fit_options]:

[fit_options]
  method        = saem
  n_exploration = 150
  n_convergence = 250
  n_mh_steps    = 3
  seed          = 12345
  covariance    = true

ferx examples/warfarin_saem.ferx --data data/warfarin.csv

Two-Compartment IV Bolus

This example fits a two-compartment IV bolus model with four random effects.

Model File (two_cpt_iv.ferx)

# Two-compartment IV bolus PK model

[parameters]
  theta TVCL(5.0, 0.01, 100.0)
  theta TVV1(50.0, 0.1, 1000.0)
  theta TVQ(10.0, 0.01, 200.0)
  theta TVV2(100.0, 0.1, 5000.0)

  omega ETA_CL ~ 0.09
  omega ETA_V1 ~ 0.04
  omega ETA_Q  ~ 0.04
  omega ETA_V2 ~ 0.09

  sigma PROP_ERR ~ 0.04

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V1 = TVV1 * exp(ETA_V1)
  Q  = TVQ  * exp(ETA_Q)
  V2 = TVV2 * exp(ETA_V2)

[structural_model]
  pk two_cpt_iv_bolus(cl=CL, v1=V1, q=Q, v2=V2)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = foce
  maxiter    = 500
  covariance = true

Model Description

  • Structure: Two-compartment model with central (V1) and peripheral (V2) compartments connected by intercompartmental clearance (Q)
  • Route: Intravenous bolus (dose goes directly into central compartment)
  • Random effects: Log-normal on all four PK parameters
  • Parameters:
    • CL: Systemic clearance (L/h)
    • V1: Central volume of distribution (L)
    • Q: Intercompartmental clearance (L/h)
    • V2: Peripheral volume of distribution (L)

The bi-exponential concentration profile is characterized by a rapid distribution phase (alpha) followed by a slower elimination phase (beta).

Running

ferx examples/two_cpt_iv.ferx --data data/two_cpt_iv.csv

Notes

  • Two-compartment models have more parameters and may need more iterations to converge
  • The global_search = true option can help if convergence is difficult
  • Consider reducing random effects (e.g., fixing Q or V2 variability) if the model is overparameterized

Covariate Model

This example demonstrates a two-compartment oral model with body weight (WT) and creatinine clearance (CRCL) as covariates on clearance.

Model File (two_cpt_oral_cov.ferx)

# Two-compartment oral PK model with covariates

[parameters]
  theta TVCL(5.0, 0.01, 100.0)
  theta TVV1(50.0, 0.1, 1000.0)
  theta TVQ(10.0, 0.01, 200.0)
  theta TVV2(100.0, 0.1, 5000.0)
  theta TVKA(1.0, 0.01, 50.0)
  theta THETA_WT(0.75, 0.01, 2.0)
  theta THETA_CRCL(0.5, 0.01, 2.0)

  omega ETA_CL ~ 0.09
  omega ETA_V1 ~ 0.04
  omega ETA_Q  ~ 0.04
  omega ETA_V2 ~ 0.09
  omega ETA_KA ~ 0.25

  sigma PROP_ERR ~ 0.04

[individual_parameters]
  CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
  V1 = TVV1 * exp(ETA_V1)
  Q  = TVQ  * exp(ETA_Q)
  V2 = TVV2 * exp(ETA_V2)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk two_cpt_oral(cl=CL, v1=V1, q=Q, v2=V2, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true

Covariate Effects

The clearance equation includes two covariate effects:

CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
  • (WT/70)^THETA_WT: Allometric scaling of clearance with body weight, centered at 70 kg. THETA_WT is estimated (expected ~0.75 for CL).
  • (CRCL/100)^THETA_CRCL: Renal function effect on clearance, centered at 100 mL/min. THETA_CRCL is estimated.
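The clearance equation reads directly as code. A plain-Rust sketch (illustrative only; FeRx generates the equivalent closure from [individual_parameters]):

```rust
/// CL = TVCL * (WT/70)^THETA_WT * (CRCL/100)^THETA_CRCL * exp(ETA_CL)
fn individual_cl(
    tvcl: f64, theta_wt: f64, theta_crcl: f64, // population parameters
    wt: f64, crcl: f64,                        // subject covariates
    eta_cl: f64,                               // subject random effect
) -> f64 {
    tvcl * (wt / 70.0).powf(theta_wt) * (crcl / 100.0).powf(theta_crcl) * eta_cl.exp()
}
```

At the reference covariates (WT = 70 kg, CRCL = 100 mL/min) with eta = 0, both covariate terms are exactly 1 and CL reduces to TVCL, which is what makes the centered typical value interpretable.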

Data Requirements

The dataset must include WT and CRCL columns:

ID,TIME,DV,EVID,AMT,CMT,MDV,WT,CRCL
1,0,.,1,100,1,1,72.5,105
1,0.5,12.3,0,.,.,0,72.5,105
1,1.0,18.7,0,.,.,0,72.5,105

Covariate columns are automatically detected -- any column not in the standard NONMEM set is treated as a covariate. The names in the data file must match those used in [individual_parameters] (case-insensitive).

Running

ferx examples/two_cpt_oral_cov.ferx --data data/two_cpt_oral_cov.csv

Notes

  • FOCEI is used here because the proportional error model creates an interaction between random effects and residual error
  • Covariate centering (dividing by 70 for weight, 100 for CRCL) improves numerical stability and makes the typical value (TVCL) interpretable as the clearance for a 70 kg patient with CRCL of 100 mL/min
  • The estimated covariate exponents (THETA_WT, THETA_CRCL) have standard errors that can be used to test significance
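As a concrete instance of the last point, a Wald test on a covariate exponent needs only the estimate and its standard error. A plain-Rust sketch (the 0.75 / 0.15 numbers in the test are made up, not fit output):

```rust
/// Wald z statistic and 95% confidence interval for one estimate.
fn wald(est: f64, se: f64) -> (f64, (f64, f64)) {
    let z = est / se;                       // compare |z| against 1.96
    (z, (est - 1.96 * se, est + 1.96 * se)) // normal-approximation CI
}
```

A CI that excludes 0 (or, for an allometric exponent, one that excludes a fixed reference such as 0.75) indicates a significant effect at the 5% level.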

ODE Model (Michaelis-Menten)

This example demonstrates a one-compartment oral model with saturable (Michaelis-Menten) elimination, solved using the ODE integrator.

Model File (mm_oral.ferx)

# One-compartment oral model with Michaelis-Menten elimination

[parameters]
  theta TVVMAX(10.0, 0.1, 1000.0)
  theta TVKM(2.0, 0.01, 100.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.0, 0.01, 50.0)

  omega ETA_VMAX ~ 0.09
  omega ETA_V    ~ 0.04

  sigma PROP_ERR ~ 0.1

[individual_parameters]
  VMAX = TVVMAX * exp(ETA_VMAX)
  KM   = TVKM
  V    = TVV * exp(ETA_V)
  KA   = TVKA

[structural_model]
  ode(obs_cmt=central, states=[depot, central])

[odes]
  d/dt(depot)   = -KA * depot
  d/dt(central) = KA * depot / V - VMAX * central / (KM + central)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method     = focei
  maxiter    = 500
  covariance = true

Model Description

Why Use ODEs

Michaelis-Menten (saturable) elimination cannot be described by a simple exponential function. At high concentrations, elimination approaches a maximum rate (VMAX), leading to nonlinear pharmacokinetics. This requires numerical ODE integration.

ODE System

d/dt(depot)   = -KA * depot
d/dt(central) = KA * depot / V - VMAX * central / (KM + central)
  • depot: Absorption compartment. First-order absorption at rate KA.
  • central: Central compartment. Drug enters from depot and is eliminated by Michaelis-Menten kinetics.
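Translated line-for-line into plain Rust (a sketch of what the [odes] block compiles to; the function name and signature are illustrative, not the FeRx internals):

```rust
/// dy/dt for the Michaelis-Menten system.
/// y[0] = depot amount; y[1] = central (in concentration units,
/// because the absorption input is divided by V).
fn mm_rhs(y: [f64; 2], ka: f64, v: f64, vmax: f64, km: f64) -> [f64; 2] {
    [
        -ka * y[0],                                // d/dt(depot)
        ka * y[0] / v - vmax * y[1] / (km + y[1]), // d/dt(central)
    ]
}
```

Note the saturation behavior: for y[1] much smaller than KM the elimination term is approximately (VMAX/KM) * y[1] (first-order), while for y[1] much larger than KM it approaches the constant rate VMAX (zero-order).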

Parameters

  • VMAX: Maximum elimination rate (amount/time)
  • KM: Michaelis-Menten constant (concentration at half-maximal rate)
  • V: Volume of distribution
  • KA: First-order absorption rate constant

Note that only VMAX and V carry random effects; KM and KA are fixed at their typical values, with no between-subject variability.

Structural Model Declaration

[structural_model]
  ode(obs_cmt=central, states=[depot, central])
  • obs_cmt=central: The central compartment is observed (matched to DV in the data)
  • states=[depot, central]: Two state variables (compartments)

Running

ferx examples/mm_oral.ferx --data data/mm_oral.csv

Notes

  • ODE models are slower than analytical models because each prediction requires numerical integration
  • The ODE solver uses adaptive step-size control (Dormand-Prince RK45) for accuracy
  • For standard first-order kinetics, use the analytical one_cpt_oral model instead -- it is much faster
  • The central state in this model is a concentration, not an amount: the absorption input is divided by V (KA * depot / V), so the value matched to DV via obs_cmt=central is already a concentration and KM is expressed in concentration units

BLOQ (M3 method)

This example shows how to fit a model to data that contains observations below the assay lower limit of quantification (BLOQ) using Beal's M3 method. Instead of dropping BLOQ rows — which biases terminal-phase parameter estimates — each censored observation contributes P(y < LLOQ | θ, η) = Φ((LLOQ − f)/√V) to the likelihood.
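The censored contribution needs nothing more than a normal CDF. A self-contained sketch (the erf approximation is the standard Abramowitz-Stegun formula 7.1.26, used here only because Rust's std has no erf; FeRx's internal implementation may differ):

```rust
/// Standard normal CDF via the Abramowitz-Stegun erf approximation
/// (absolute error below 1.5e-7).
fn phi(x: f64) -> f64 {
    let z = x / std::f64::consts::SQRT_2;
    let t = 1.0 / (1.0 + 0.3275911 * z.abs());
    let poly = t * (0.254829592
        + t * (-0.284496736
            + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429))));
    let erf_abs = 1.0 - poly * (-z * z).exp();
    let erf = if z < 0.0 { -erf_abs } else { erf_abs };
    0.5 * (1.0 + erf)
}

/// -2 log-likelihood contribution of one CENS=1 row:
/// -2 ln Phi((LLOQ - f) / sqrt(V)), with f the model prediction
/// and V the residual variance at f.
fn m3_contribution(lloq: f64, f: f64, v: f64) -> f64 {
    -2.0 * phi((lloq - f) / v.sqrt()).ln()
}
```

When the prediction sits exactly at the LLOQ, Phi = 0.5 and the row contributes 2 ln 2 ≈ 1.386 to the OFV; predictions well below the LLOQ contribute almost nothing, while predictions well above it are heavily penalized.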

Dataset (data/warfarin-bloq.csv)

This is the warfarin dataset with an added CENS column. Ten late-time observations that originally fell below an assay LLOQ of 2.0 µg/mL have been marked CENS=1 and their DV cell set to the LLOQ value:

ID,TIME,DV,EVID,AMT,CMT,RATE,MDV,CENS
1,0,.,1,100,1,0,1,0
1,0.5,5.3653,0,.,1,0,0,0
...
1,96,2.5019,0,.,1,0,0,0
1,120,2,0,.,1,0,0,1

NONMEM CENS convention: when CENS=1, the row is censored and DV carries the LLOQ value — not the true (unobserved) concentration. Rows with no CENS column, or CENS=0, are treated as ordinary quantified observations.

Model File (examples/warfarin-bloq.ferx)

# One-compartment oral PK model (warfarin) with M3 BLOQ likelihood.

[parameters]
  theta TVCL(0.2, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  theta TVKA(1.5, 0.01, 50.0)

  omega ETA_CL ~ 0.09
  omega ETA_V  ~ 0.04
  omega ETA_KA ~ 0.30

  sigma PROP_ERR ~ 0.02

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV  * exp(ETA_V)
  KA = TVKA * exp(ETA_KA)

[structural_model]
  pk one_cpt_oral(cl=CL, v=V, ka=KA)

[error_model]
  DV ~ proportional(PROP_ERR)

[fit_options]
  method       = focei
  maxiter      = 300
  covariance   = true
  bloq_method  = m3

The only changes relative to a plain fit are bloq_method = m3 in [fit_options] and the added CENS column in the CSV. Nothing else in the model changes.

Running

ferx examples/warfarin-bloq.ferx --data data/warfarin-bloq.csv

Expected Results

--- Objective Function ---
OFV:  -213.9104

--- THETA Estimates ---
TVCL                 0.126263
TVV                  7.629401
TVKA                 1.083302

--- OMEGA Estimates ---
  OMEGA(1,1) = 0.031381  (CV% = 17.7)
  OMEGA(2,2) = 0.009723  (CV% = 9.9)
  OMEGA(3,3) = 0.419327  (CV% = 64.8)

--- SIGMA Estimates ---
  SIGMA(1) = 0.010766

The diagnostic table (warfarin-bloq-sdtab.csv) gains a CENS column, and the IWRES / CWRES cells for censored rows are written as empty (a weighted Gaussian residual is undefined when the observed value is censored).

Method Notes

  • Activation requires both pieces: the CENS column in the data file and bloq_method = m3 in [fit_options]. Without the option, CENS=1 rows are treated as ordinary observations at the LLOQ value, which biases the fit.
  • FOCE is auto-promoted to FOCEI on affected subjects. Mixing linearized Gaussian residuals with non-linearized log Φ terms produces inconsistent OFVs near the LLOQ boundary, so when method = foce and a subject has any CENS=1 row, that subject is evaluated with η-interaction. A notice is written to FitResult.warnings; set method = focei explicitly to silence it.
  • Gauss-Newton caveat. With method = gn or gn_hybrid, the BHHH information-matrix approximation degrades as the BLOQ fraction grows (each censored row carries less Fisher information than its Gaussian counterpart). A warning is emitted; for >20% censoring prefer focei.
  • SAEM optimizes θ/σ with M3 in the M-step directly, so no special handling is required beyond setting bloq_method = m3.

Rust API

FeRx-NLME can be used as a Rust library for embedding population PK estimation in your own applications.

Adding as a Dependency

[dependencies]
ferx-nlme = { path = "../ferx-nlme", features = ["autodiff"] }

Quick Example

use ferx_nlme::*;
use std::path::Path;

fn main() -> Result<(), String> {
    // Parse model and data
    let parsed = parse_full_model_file(Path::new("model.ferx"))?;
    let population = read_nonmem_csv(Path::new("data.csv"), None)?;

    // Build initial parameters and options
    let (init_params, options) = build_fit_inputs(&parsed)?;

    // Run estimation
    let result = fit(&parsed.model, &population, &init_params, &options)?;

    // Access results
    println!("OFV: {}", result.ofv);
    for (name, val) in result.theta_names.iter().zip(result.theta.iter()) {
        println!("  {} = {:.6}", name, val);
    }

    Ok(())
}

API Sections

  • Core Types -- CompiledModel, Population, FitResult, FitOptions, etc.
  • Fitting Functions -- fit(), fit_from_files(), run_model_with_data()
  • Simulation -- simulate(), simulate_with_seed(), predict()
  • Parsing -- parse_model_file(), parse_full_model_file(), read_nonmem_csv()

Core Types

The main types exported by ferx_nlme. All types are available via use ferx_nlme::*.

CompiledModel

A parsed and compiled model ready for estimation. Created by parse_model_file() or parse_full_model_file().

#![allow(unused)]
fn main() {
pub struct CompiledModel {
    pub name: String,
    pub pk_model: PkModel,
    pub error_model: ErrorModel,
    pub pk_param_fn: PkParamFn,     // (theta, eta, covariates) -> PkParams
    pub n_theta: usize,
    pub n_eta: usize,
    pub n_epsilon: usize,
    pub theta_names: Vec<String>,
    pub eta_names: Vec<String>,
    pub default_params: ModelParameters,
    pub tv_fn: Option<...>,          // Typical values function (for AD)
    pub pk_indices: Vec<usize>,      // Maps eta index -> PK parameter index
    pub ode_spec: Option<OdeSpec>,   // ODE specification (if ODE model)
}
}

Key fields:

  • pk_param_fn: Closure that maps (theta, eta, covariates) to PK parameters. Generated by the parser from [individual_parameters].
  • default_params: Initial parameter values from [parameters] block.
  • ode_spec: Present only for ODE models; None for analytical models.

Population and Subject

#![allow(unused)]
fn main() {
pub struct Population {
    pub subjects: Vec<Subject>,
    pub covariate_names: Vec<String>,
    pub dv_column: String,
}

pub struct Subject {
    pub id: String,
    pub doses: Vec<DoseEvent>,
    pub obs_times: Vec<f64>,
    pub observations: Vec<f64>,
    pub obs_cmts: Vec<usize>,
    pub covariates: HashMap<String, f64>,      // Time-constant
    pub tvcov: HashMap<String, Vec<f64>>,       // Time-varying (LOCF)
}
}

ModelParameters

#![allow(unused)]
fn main() {
pub struct ModelParameters {
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub theta_lower: Vec<f64>,
    pub theta_upper: Vec<f64>,
    pub omega: OmegaMatrix,
    pub sigma: SigmaVector,
}
}

OmegaMatrix

Between-subject variability matrix with pre-computed Cholesky factor.

#![allow(unused)]
fn main() {
pub struct OmegaMatrix {
    pub matrix: DMatrix<f64>,       // Full omega matrix
    pub chol: DMatrix<f64>,         // Lower Cholesky factor L (Omega = L*L')
    pub eta_names: Vec<String>,
    pub diagonal: bool,             // True if omega is diagonal
}
}

Constructors:

  • OmegaMatrix::from_matrix(m, names, diagonal) -- From a full matrix (computes Cholesky, regularizes if needed)
  • OmegaMatrix::from_diagonal(variances, names) -- From diagonal variances

FitOptions

#![allow(unused)]
fn main() {
pub struct FitOptions {
    pub method: EstimationMethod,      // Foce, FoceI, or Saem
    pub outer_maxiter: usize,          // Default: 500
    pub outer_gtol: f64,               // Default: 1e-6
    pub inner_maxiter: usize,          // Default: 200
    pub inner_tol: f64,                // Default: 1e-8
    pub run_covariance_step: bool,     // Default: true
    pub interaction: bool,             // FOCEI flag
    pub verbose: bool,                 // Default: true
    pub optimizer: Optimizer,          // Default: Slsqp
    pub lbfgs_memory: usize,           // Default: 5
    pub global_search: bool,           // Default: false
    pub global_maxeval: usize,         // Default: 0 (auto)
    // SAEM-specific
    pub saem_n_exploration: usize,     // Default: 150
    pub saem_n_convergence: usize,     // Default: 250
    pub saem_n_mh_steps: usize,        // Default: 3
    pub saem_adapt_interval: usize,    // Default: 50
    pub saem_seed: Option<u64>,        // Default: None
}
}

Use FitOptions::default() for standard FOCE settings.

FitResult

#![allow(unused)]
fn main() {
pub struct FitResult {
    pub method: EstimationMethod,
    pub converged: bool,
    pub ofv: f64,
    pub aic: f64,
    pub bic: f64,
    pub theta: Vec<f64>,
    pub theta_names: Vec<String>,
    pub omega: DMatrix<f64>,
    pub sigma: Vec<f64>,
    pub covariance_matrix: Option<DMatrix<f64>>,
    pub se_theta: Option<Vec<f64>>,
    pub se_omega: Option<Vec<f64>>,
    pub se_sigma: Option<Vec<f64>>,
    pub subjects: Vec<SubjectResult>,
    pub n_obs: usize,
    pub n_subjects: usize,
    pub n_parameters: usize,
    pub n_iterations: usize,
    pub interaction: bool,
    pub warnings: Vec<String>,
}
}

SubjectResult

Per-subject diagnostics from the fit.

#![allow(unused)]
fn main() {
pub struct SubjectResult {
    pub id: String,
    pub eta: DVector<f64>,
    pub ipred: Vec<f64>,    // Individual predictions
    pub pred: Vec<f64>,     // Population predictions
    pub iwres: Vec<f64>,    // Individual weighted residuals
    pub cwres: Vec<f64>,    // Conditional weighted residuals
    pub ofv_contribution: f64,
}
}

Enums

#![allow(unused)]
fn main() {
pub enum EstimationMethod { Foce, FoceI, Saem }
pub enum PkModel { OneCptIvBolus, OneCptOral, OneCptInfusion,
                   TwoCptIvBolus, TwoCptOral, TwoCptInfusion }
pub enum ErrorModel { Additive, Proportional, Combined }
pub enum Optimizer { Bfgs, Lbfgs, Slsqp, NloptLbfgs, Mma }
}

DoseEvent

#![allow(unused)]
fn main() {
pub struct DoseEvent {
    pub time: f64,
    pub amt: f64,
    pub cmt: usize,
    pub rate: f64,
    pub duration: f64,
    pub ss: bool,
    pub ii: f64,
}
}

Fitting Functions

fit()

The primary estimation entry point. Runs FOCE, FOCEI, or SAEM depending on options.method.

#![allow(unused)]
fn main() {
pub fn fit(
    model: &CompiledModel,
    population: &Population,
    init_params: &ModelParameters,
    options: &FitOptions,
) -> Result<FitResult, String>
}

Parameters:

  • model: Compiled model from parse_model_file() or parse_full_model_file()
  • population: Population data from read_nonmem_csv()
  • init_params: Initial parameter values
  • options: Estimation configuration

Returns: FitResult with parameter estimates, standard errors, and per-subject diagnostics.

Example:

#![allow(unused)]
fn main() {
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let options = FitOptions::default();

let result = fit(&model, &population, &model.default_params, &options)?;
println!("OFV: {:.4}", result.ofv);
}

fit_from_files()

Convenience wrapper that handles parsing and data reading.

#![allow(unused)]
fn main() {
pub fn fit_from_files(
    model_path: &str,
    data_path: &str,
    covariate_columns: Option<&[&str]>,
    options: Option<FitOptions>,
) -> Result<FitResult, String>
}

Example:

#![allow(unused)]
fn main() {
let result = fit_from_files(
    "model.ferx",
    "data.csv",
    None,          // Auto-detect covariates
    None,          // Default options
)?;
}

run_model_with_data()

Full pipeline: parse model file, read data, fit. Returns both the fit result and the population.

#![allow(unused)]
fn main() {
pub fn run_model_with_data(
    model_path: &str,
    data_path: &str,
) -> Result<(FitResult, Population), String>
}

Uses the [fit_options] from the model file.

run_model_simulate()

Simulation-estimation: parse model, generate data from [simulation] block, fit.

#![allow(unused)]
fn main() {
pub fn run_model_simulate(
    model_path: &str,
) -> Result<(FitResult, Population), String>
}

Requires a [simulation] block in the model file.

build_fit_inputs()

Extracts initial parameters and fit options from a parsed model, so that parsing and estimation can be run (and timed) as separate steps.

#![allow(unused)]
fn main() {
pub fn build_fit_inputs(
    parsed: &ParsedModel,
) -> Result<(ModelParameters, FitOptions), String>
}

Example:

#![allow(unused)]
fn main() {
let parsed = parse_full_model_file(Path::new("model.ferx"))?;
let (init_params, options) = build_fit_inputs(&parsed)?;

let population = read_nonmem_csv(Path::new("data.csv"), None)?;
let result = fit(&parsed.model, &population, &init_params, &options)?;
}

Simulation

simulate()

Generate simulated observations from a model with random effects and residual error.

#![allow(unused)]
fn main() {
pub fn simulate(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
) -> Vec<SimulationResult>
}

Parameters:

  • model: Compiled model
  • population: Template population (dose events and observation times are used; DV values are ignored)
  • params: True parameter values for simulation
  • n_sim: Number of simulation replicates

Returns: Vector of SimulationResult, one per observation per subject per replicate.

Example:

#![allow(unused)]
fn main() {
let model = parse_model_file(Path::new("model.ferx"))?;
let population = read_nonmem_csv(Path::new("data.csv"), None)?;

// Simulate 1000 replicates
let sims = simulate(&model, &population, &model.default_params, 1000);

for sim in &sims[..5] {
    println!("Sim {}, ID {}, TIME {}, IPRED {:.3}, DV {:.3}",
             sim.sim, sim.id, sim.time, sim.ipred, sim.dv_sim);
}
}

simulate_with_seed()

Same as simulate() but with a fixed random seed for reproducibility.

#![allow(unused)]
fn main() {
pub fn simulate_with_seed(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
    n_sim: usize,
    seed: u64,
) -> Vec<SimulationResult>
}

predict()

Population predictions without random effects (eta = 0). No simulation noise is added.

#![allow(unused)]
fn main() {
pub fn predict(
    model: &CompiledModel,
    population: &Population,
    params: &ModelParameters,
) -> Vec<PredictionResult>
}

Returns: Vector of PredictionResult with population-level predictions.

Example:

#![allow(unused)]
fn main() {
let preds = predict(&model, &population, &model.default_params);
for p in &preds {
    println!("ID {}, TIME {}, PRED {:.3}", p.id, p.time, p.pred);
}
}

Result Types

#![allow(unused)]
fn main() {
pub struct SimulationResult {
    pub sim: usize,     // Replicate number (1-indexed)
    pub id: String,     // Subject ID
    pub time: f64,      // Observation time
    pub ipred: f64,     // Individual prediction (no residual error)
    pub dv_sim: f64,    // Simulated observation (with residual error)
}

pub struct PredictionResult {
    pub id: String,
    pub time: f64,
    pub pred: f64,      // Population prediction (eta = 0)
}
}

Simulation Process

For each replicate and each subject:

  1. Sample random effects: \( \eta_i \sim N(0, \Omega) \) using the Cholesky factor \( L \): \( \eta = L \cdot z \), where \( z \sim N(0, I) \)
  2. Compute individual PK parameters via pk_param_fn(theta, eta, covariates)
  3. Generate predictions using the structural model
  4. Add residual error: \( DV = IPRED + \sqrt{V} \cdot \epsilon \), where \( \epsilon \sim N(0, 1) \) and \( V \) is the residual variance from the error model
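Step 1 is just a matrix-vector product with the precomputed Cholesky factor. A 2x2 sketch in plain Rust (FeRx uses nalgebra's DMatrix; this hand-rolled version only illustrates the algebra):

```rust
/// Lower Cholesky factor L of a 2x2 covariance matrix (Omega = L * L').
fn cholesky_2x2(omega: [[f64; 2]; 2]) -> [[f64; 2]; 2] {
    let l11 = omega[0][0].sqrt();
    let l21 = omega[1][0] / l11;
    let l22 = (omega[1][1] - l21 * l21).sqrt();
    [[l11, 0.0], [l21, l22]]
}

/// eta = L * z turns independent standard normals z into
/// correlated random effects with covariance Omega.
fn sample_eta(l: [[f64; 2]; 2], z: [f64; 2]) -> [f64; 2] {
    [l[0][0] * z[0], l[1][0] * z[0] + l[1][1] * z[1]]
}
```

For a diagonal Omega with variances 0.09 and 0.04, L is diagonal with entries 0.3 and 0.2, so each eta is simply its standard deviation times an independent z.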

Parsing

Model Parsing

parse_model_file()

Parse a .ferx file into a CompiledModel. Only the core model blocks are processed ([parameters], [individual_parameters], [structural_model], [error_model]).

#![allow(unused)]
fn main() {
pub fn parse_model_file(path: &Path) -> Result<CompiledModel, String>
}

parse_model_string()

Parse a model from a string instead of a file.

#![allow(unused)]
fn main() {
pub fn parse_model_string(content: &str) -> Result<CompiledModel, String>
}

Example:

#![allow(unused)]
fn main() {
let model_str = r#"
[parameters]
  theta TVCL(0.1, 0.001, 10.0)
  theta TVV(10.0, 0.1, 500.0)
  omega ETA_CL ~ 0.09
  sigma ADD_ERR ~ 1.0

[individual_parameters]
  CL = TVCL * exp(ETA_CL)
  V  = TVV

[structural_model]
  pk one_cpt_iv_bolus(cl=CL, v=V)

[error_model]
  DV ~ additive(ADD_ERR)
"#;

let model = parse_model_string(model_str)?;
}

parse_full_model_file()

Parse a complete model file including [fit_options] and [simulation] blocks.

#![allow(unused)]
fn main() {
pub fn parse_full_model_file(path: &Path) -> Result<ParsedModel, String>
}

Returns a ParsedModel which contains:

#![allow(unused)]
fn main() {
pub struct ParsedModel {
    pub model: CompiledModel,
    pub simulation: Option<SimulationSpec>,
    pub fit_options: FitOptions,
}
}

Data Parsing

read_nonmem_csv()

Read a NONMEM-format CSV file into a Population.

#![allow(unused)]
fn main() {
pub fn read_nonmem_csv(
    path: &Path,
    covariate_columns: Option<&[&str]>,
) -> Result<Population, String>
}

Parameters:

  • path: Path to the CSV file
  • covariate_columns: Optional list of covariate column names. If None, all non-standard columns are auto-detected as covariates.

Example:

#![allow(unused)]
fn main() {
// Auto-detect covariates
let pop = read_nonmem_csv(Path::new("data.csv"), None)?;

// Explicit covariate list
let pop = read_nonmem_csv(
    Path::new("data.csv"),
    Some(&["WT", "CRCL", "AGE"]),
)?;

println!("{} subjects, {} observations",
         pop.subjects.len(), pop.n_obs());
}

Data Processing Details

  • Column names are matched case-insensitively
  • Standard NONMEM columns (ID, TIME, DV, EVID, AMT, CMT, RATE, MDV, II, SS) are recognized automatically
  • Missing values (. or an empty cell) are parsed as missing rather than zero, following the NONMEM convention
  • Rows with EVID=1 are treated as dose events
  • Rows with EVID=0 and MDV=0 are treated as observations
  • Time-constant covariates use the first non-missing value per subject
  • Time-varying covariates use Last Observation Carried Forward (LOCF)
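The LOCF rule can be sketched as a small lookup (illustrative; the real reader stores per-record values in Subject::tvcov rather than searching at prediction time):

```rust
/// Value of a time-varying covariate at time t under LOCF:
/// the value from the latest record with record-time <= t
/// (falling back to the first value before any record).
fn locf(times: &[f64], values: &[f64], t: f64) -> f64 {
    let mut current = values[0];
    for (&ti, &vi) in times.iter().zip(values) {
        if ti <= t {
            current = vi; // carry this value forward
        } else {
            break; // times are sorted; nothing later applies
        }
    }
    current
}
```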

Frequently Asked Questions

Do I need to use MU-referencing in my model definitions, like in NONMEM / nlmixr2?

No. FeRx does not require MU-referencing.

In NONMEM and nlmixr2, MU-referencing is a convention in which each random effect ETA(i) is linearly associated with a single MU_i term (typically MU_i = LOG(THETA(i))), so individual parameters look like:

MU_1 = LOG(THETA(1))
CL = EXP(MU_1 + ETA(1))

This structure is required by those tools' SAEM implementations because they rely on conjugate Gibbs updates that are only valid when the relationship between MU_i and ETA(i) is strictly linear (and typically on a log scale). If you deviate — for example by writing CL = THETA(1) * EXP(ETA(1)) without going through an intermediate MU_1, or by mixing multiple etas into one parameter — NONMEM SAEM will either reject the model or silently produce biased estimates.

FeRx's SAEM implementation uses Metropolis-Hastings sampling for the E-step rather than Gibbs, which does not require MU-referencing. You can write individual parameters in any form you like:

# All of these work fine in ferx:
CL = TVCL * exp(ETA_CL)
CL = TVCL + ETA_CL                              # additive eta
CL = TVCL * (WT/70)^0.75 * exp(ETA_CL)          # with covariates
CL = TVCL * exp(ETA_CL + ETA_CL_OCC)            # multiple etas
VMAX = TVVMAX * exp(ETA_VMAX)
KM   = TVKM                                      # no eta at all

The FOCE / FOCEI estimators have no MU-referencing requirement in any NLME tool — they use MAP optimization over etas regardless of parameterization. This is equally true in ferx.
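For reference, the MAP problem solved for each subject's etas takes the standard form (notation added here, consistent with the Simulation Process section: \( f_{ij} \) is the individual prediction and \( V_{ij} \) the residual variance):

```latex
\hat{\eta}_i \;=\; \arg\min_{\eta_i}\;
  \sum_j \left[ \frac{\bigl(y_{ij} - f_{ij}(\eta_i)\bigr)^2}{V_{ij}(\eta_i)}
              + \ln V_{ij}(\eta_i) \right]
  \;+\; \eta_i^{\top} \Omega^{-1} \eta_i
```

Nothing in this objective requires eta to enter \( f_{ij} \) linearly, which is why FOCE / FOCEI tolerates arbitrary parameterizations.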

Performance implication

The main tradeoff is that MH sampling is slightly less efficient per iteration than conjugate Gibbs for MU-referenced models. In practice:

  • For models with a few random effects, the difference is negligible
  • For models with many (>10) random effects, conjugate Gibbs would converge in fewer iterations — but in exchange you gain flexibility to write models that don't fit the MU-referenced mold

If you have a NONMEM model that uses MU-referencing and want to port it to ferx, you can drop the MU intermediate step and write the individual parameters directly — the results will be equivalent.