This document explains how to use C headers from installed R packages in a miniextendr Rust crate. The technique uses bindgen to generate Rust FFI bindings at development time, and R’s standard build system to compile the required C shim files.

Overview

Many R packages expose C APIs via inst/include/ headers. These headers typically use R_GetCCallable() to resolve function pointers at runtime, wrapped in static R_INLINE functions. Examples: cli (progress bars), nanoarrow (Arrow C data interface), vctrs (vector types), processx (process management).

The integration has three layers:

  1. bindgen (development time) parses the C headers and generates:

    • A Rust FFI module (*_ffi.rs) with extern "C" declarations
    • A C shim file (*_static_wrappers.c) that wraps static inline functions into normal linkable symbols
  2. R’s build system (install time) compiles the C shims into .o files alongside stub.c and any other C sources in src/

  3. Makevars passes all compiled .o files to cargo as link arguments so both the cdylib (for wrapper generation) and the staticlib (for the final .so) can resolve the shim symbols

Step-by-step: adding a native R package

1. Run bindgen to generate the FFI

R_INCLUDE="$(Rscript -e 'cat(R.home("include"))')"
PKG_INCLUDE="$(Rscript -e 'cat(system.file("include", package = "cli"))')"

# Create wrapper header
cat > src/cli_wrapper.h << 'EOF'
#include <Rinternals.h>
#include <cli/progress.h>
EOF

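# Run bindgen. The output directory must already exist, and the bindgen CLI
# only accepts the --wrap-static-fns* options together with --experimental.
mkdir -p src/rust/native
bindgen \
  --experimental \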
  --merge-extern-blocks \
  --no-layout-tests \
  --no-doc-comments \
  --wrap-static-fns \
  --wrap-static-fns-path src/cli_static_wrappers.c \
  --allowlist-file '.*/cli/progress\.h' \
  --blocklist-type 'SEXPREC' \
  --blocklist-type 'SEXP' \
  --raw-line 'use miniextendr_api::ffi::SEXP;' \
  src/cli_wrapper.h \
  -- \
  -I"$R_INCLUDE" \
  -I"$PKG_INCLUDE" \
  > src/rust/native/cli_ffi.rs

Key bindgen flags:

| Flag | Purpose |
| --- | --- |
| --wrap-static-fns | Generates C shim wrappers for static and static inline functions |
| --wrap-static-fns-path | Where to write the C shim file |
| --allowlist-file | Only emit bindings for declarations from matching files |
| --blocklist-type SEXPREC | Don’t re-define SEXPREC (already in miniextendr-api) |
| --blocklist-type SEXP | Don’t re-define SEXP |
| --raw-line 'use ...' | Import miniextendr’s SEXP type instead |
| --merge-extern-blocks | Combine all extern "C" declarations into one block |
| --no-layout-tests | Skip layout verification tests |
| --no-doc-comments | Omit C doc comments from output |

2. Fix the C shim include path

bindgen writes the C shim with an absolute include path. Change it to relative:

// Before (wrong):
#include "/absolute/path/to/src/cli_wrapper.h"

// After (correct):
#include "cli_wrapper.h"
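
If you regenerate bindings often, this edit can be scripted. The following is a minimal sketch, assuming the shim uses Unix-style absolute paths in quoted #include lines. The helper name relativize_includes is hypothetical and is not part of bindgen or miniextendr.

```rust
/// Rewrites `#include "/abs/path/x.h"` lines so they reference only the file
/// name (`#include "x.h"`). Angle-bracket includes and relative quoted
/// includes are left untouched.
/// NOTE: `relativize_includes` is a hypothetical helper for illustration;
/// it assumes Unix-style absolute paths (leading `/`).
pub fn relativize_includes(shim_source: &str) -> String {
    shim_source
        .lines()
        .map(|line| {
            // Only quoted includes can carry the absolute path bindgen writes.
            let quoted = line
                .trim()
                .strip_prefix("#include \"")
                .and_then(|rest| rest.strip_suffix('"'));
            match quoted {
                Some(path) if path.starts_with('/') => {
                    // Keep only the final path component.
                    let name = path.rsplit('/').next().unwrap_or(path);
                    format!("#include \"{name}\"")
                }
                _ => line.to_string(),
            }
        })
        .collect::<Vec<_>>()
        .join("\n")
}
```

Read src/cli_static_wrappers.c into a string, pass it through the function, and write the result back. Note that the sketch drops a trailing newline, so append one when writing the file if you want to keep it.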

3. Add #![allow(...)] to the generated Rust file

At the top of the generated cli_ffi.rs, add:

#![allow(unused, non_camel_case_types, non_upper_case_globals, clippy::all)]

4. What gets generated

src/cli_wrapper.h is the bridge header that includes the R and package headers:

#include <Rinternals.h>
#include <cli/progress.h>

src/cli_static_wrappers.c contains the C shims for static inline functions. Each function foo() becomes foo__extern():

#include "cli_wrapper.h"

void cli_progress_done__extern(SEXP bar) { cli_progress_done(bar); }
int cli_progress_num__extern(void) { return cli_progress_num(); }
// ... one per static inline function

src/rust/native/cli_ffi.rs contains the Rust FFI declarations with __extern link names:

use miniextendr_api::ffi::SEXP;

unsafe extern "C" {
    #[link_name = "cli_progress_num__extern"]
    pub fn cli_progress_num() -> ::std::os::raw::c_int;
    // ...
}

The #[link_name = "cli_progress_num__extern"] tells the linker to find the cli_progress_num__extern symbol (provided by the C shim), even though Rust code calls it as cli_progress_num().
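
The same renaming mechanism can be tried without R or bindgen. The sketch below is an illustration only: it binds the C standard library's abs under the Rust name c_abs, in the same way the generated file binds cli_progress_num__extern under the name cli_progress_num. The names c_abs and absolute_value are hypothetical, and the unsafe extern syntax assumes Rust 1.82 or newer.

```rust
use std::os::raw::c_int;

unsafe extern "C" {
    // The linker looks up the C library symbol `abs`, while Rust code
    // refers to the function as `c_abs`.
    #[link_name = "abs"]
    fn c_abs(value: c_int) -> c_int;
}

/// Safe wrapper around the renamed foreign function.
/// NOTE: hypothetical helper for illustration only.
pub fn absolute_value(value: c_int) -> c_int {
    // SAFETY: `abs` takes and returns a plain int and touches no memory.
    // Its result is undefined for the minimum int value, so callers
    // should not pass that.
    unsafe { c_abs(value) }
}
```

Calling absolute_value(-5) goes through the symbol abs at link time even though no Rust item has that name.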

5. Wire into the Rust crate

Create src/rust/native.rs:

pub mod cli_ffi;

Add to src/rust/lib.rs:

mod native;

Use the FFI:

use crate::native::cli_ffi;

#[miniextendr]
pub fn cli_active_progress_bars() -> i32 {
    unsafe { cli_ffi::cli_progress_num() }
}

6. Update DESCRIPTION

Two entries are needed:

LinkingTo: cli
Imports: cli

LinkingTo tells R to add -I<cli-include-path> when compiling C files. Imports ensures cli’s DLL is loaded at runtime (required for R_GetCCallable() to resolve symbols).

You also need an @importFrom in at least one roxygen block to trigger the NAMESPACE import:

/// @importFrom cli cli_progress_bar
#[miniextendr]
pub fn cli_active_progress_bars() -> i32 { ... }

This causes importFrom(cli,cli_progress_bar) in NAMESPACE, which forces R to load cli’s DLL when your package loads.

7. Update configure.ac

Add native package include discovery:

dnl ---- Native R package include paths ----
NATIVE_PKG_CPPFLAGS=""

CLI_INCLUDE=$("${R_HOME}/bin/Rscript" -e "cat(system.file('include', package='cli'))")
if test -n "$CLI_INCLUDE" && test -d "$CLI_INCLUDE"; then
  NATIVE_PKG_CPPFLAGS="$NATIVE_PKG_CPPFLAGS -I$CLI_INCLUDE"
  AC_MSG_NOTICE([cli include: $CLI_INCLUDE])
fi
AC_SUBST([NATIVE_PKG_CPPFLAGS])

8. Update Makevars.in: the OBJECTS pattern

This is the critical piece. R’s build system automatically compiles all .c files in src/ into .o files and collects them in $(OBJECTS). Because the C shim (cli_static_wrappers.c) and its bridge header (cli_wrapper.h) live in src/, the shim is compiled automatically.

The key change: pass all $(OBJECTS) to cargo as link arguments. This makes the shim symbols available to both the cdylib and staticlib Rust builds.

# Add include paths for native package headers
PKG_CPPFLAGS = $(NATIVE_PKG_CPPFLAGS)

# The cargo staticlib target now depends on $(OBJECTS) and passes them as link args
$(CARGO_AR): FORCE_CARGO $(WRAPPERS_R) $(OBJECTS)
    @set -e; \
    TARGET_OPT=""; \
    LINK_ARGS=""; \
    for obj in $(OBJECTS); do \
      LINK_ARGS="$$LINK_ARGS -C link-arg=$(ABS_RPKG_SRCDIR)/$$obj"; \
    done; \
    if [ -n "$(CARGO_BUILD_TARGET)" ]; then \
      TARGET_OPT="--target $(CARGO_BUILD_TARGET)"; \
    fi; \
    RUSTFLAGS="$(ENV_RUSTFLAGS) $$LINK_ARGS" \
    $(CARGO) $(RUST_TOOLCHAIN) build $(CARGO_OFFLINE_FLAG) \
      $(CARGO_FEATURES_FLAG) $$TARGET_OPT \
      --lib --profile $(CARGO_PROFILE) \
      --manifest-path $(CARGO_TOML) \
      --target-dir $(CARGO_TARGET_DIR); \
    test -f "$(CARGO_AR)"

# Same pattern for the cdylib (wrapper generation)
$(CARGO_CDYLIB): FORCE_CARGO $(OBJECTS)
    @set -e; \
    TARGET_OPT=""; \
    CDYLIB_LINK_ARGS=""; \
    for obj in $(OBJECTS); do \
      CDYLIB_LINK_ARGS="$$CDYLIB_LINK_ARGS -C link-arg=$(ABS_RPKG_SRCDIR)/$$obj"; \
    done; \
    # ... rest of cdylib build ...
    RUSTFLAGS="$(ENV_RUSTFLAGS)" \
    $(CARGO) $(RUST_TOOLCHAIN) rustc ... \
      -- $$CDYLIB_LINK_ARGS

How the OBJECTS pattern works:

  1. R’s build system compiles stub.c → stub.o and cli_static_wrappers.c → cli_static_wrappers.o
  2. These go into $(OBJECTS) automatically
  3. The for obj in $(OBJECTS) loop converts each .o file to a -C link-arg=/absolute/path/to/obj.o RUSTFLAG
  4. Cargo passes these to the linker, making the *__extern symbols available when linking the Rust crate
  5. Both the cdylib (temporary, for wrapper generation) and the staticlib (permanent, for the final .so) get the symbols

Why this works for both cdylib and staticlib:

  • The cdylib is a shared library that cargo builds for R wrapper generation. It needs the shim symbols to link successfully.
  • The staticlib is the archive that becomes part of the final R package .so. The *__extern symbols are resolved when R links $(OBJECTS) + $(CARGO_AR) into miniextendr.so.
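
The transformation performed by the shell loop is small enough to express as a pure function, which can help when reasoning about what cargo ends up receiving. This is a sketch only: the name object_link_args is hypothetical, and the real build does this in the Makevars recipe rather than in Rust.

```rust
/// Builds the rustc flags that the `for obj in $(OBJECTS)` loop produces:
/// one `-C link-arg=<src_dir>/<object>` per object file, space separated.
/// NOTE: `object_link_args` is a hypothetical name used for illustration.
pub fn object_link_args(src_dir: &str, objects: &[&str]) -> String {
    objects
        .iter()
        .map(|obj| format!("-C link-arg={src_dir}/{obj}"))
        .collect::<Vec<_>>()
        .join(" ")
}
```

With src_dir set to the value of $(ABS_RPKG_SRCDIR) and objects set to the entries of $(OBJECTS), the returned string matches what the recipe appends to RUSTFLAGS, apart from the leading space the shell loop adds.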

9. File layout

After setup, your src/ directory looks like:

src/
├── cli_wrapper.h              # Bridge header (Rinternals.h + cli/progress.h)
├── cli_static_wrappers.c      # C shims for static inline functions
├── stub.c                     # Minimal C stub for R's build system
├── Makevars.in                # Build rules (configure template)
└── rust/
    ├── lib.rs                 # Rust crate root (has: mod native;)
    ├── native.rs              # Module declarations (has: pub mod cli_ffi;)
    ├── native/
    │   └── cli_ffi.rs         # bindgen-generated Rust FFI
    ├── native_cli_test.rs     # Test/demo using the FFI
    └── Cargo.toml

Why static inline functions need shims

Most R packages that export C APIs use this pattern in their headers:

static R_INLINE int cli_progress_num(void) {
    static int (*ptr)(void) = NULL;
    if (ptr == NULL) {
        ptr = (int (*)(void)) R_GetCCallable("cli", "cli_progress_num");
    }
    return ptr();
}

These functions are static inline: they exist only in the header file, not in any compiled library. bindgen can’t just declare them as extern "C" because there’s no compiled symbol to link against.

The --wrap-static-fns flag solves this: bindgen generates a C file with non-inline wrapper functions:

int cli_progress_num__extern(void) {
    return cli_progress_num();  // calls the static inline version
}

The Rust FFI then links against cli_progress_num__extern instead of cli_progress_num.

Runtime resolution

At runtime, the call chain is:

Rust: cli_ffi::cli_progress_num()
  → linker resolves to: cli_progress_num__extern  (C shim in your package)
    → calls: cli_progress_num  (static inline from cli/progress.h)
      → first call: R_GetCCallable("cli", "cli_progress_num")  (resolves DLL symbol)
      → subsequent calls: cached function pointer (fast path)

The R_GetCCallable mechanism requires the cli package’s DLL to be loaded. This is why importFrom(cli, cli_progress_bar) in NAMESPACE is essential: it triggers library.dynam("cli", ...) during package loading.
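
The resolve-once-then-cache behaviour of the header's static pointer can be modelled in plain Rust. The sketch below is an analogy, not the real mechanism: lookup_callable is a hypothetical stand-in for R_GetCCallable("cli", "cli_progress_num"), registered_progress_num stands in for the function inside cli's DLL, and the value 3 is arbitrary.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

type ProgressNumFn = fn() -> i32;

/// Counts how often the simulated symbol lookup runs.
static LOOKUPS: AtomicUsize = AtomicUsize::new(0);

/// Stand-in for the function that lives in cli's DLL (arbitrary result).
fn registered_progress_num() -> i32 {
    3
}

/// Hypothetical stand-in for `R_GetCCallable("cli", "cli_progress_num")`.
fn lookup_callable() -> ProgressNumFn {
    LOOKUPS.fetch_add(1, Ordering::SeqCst);
    registered_progress_num
}

/// Mirrors the header's `static` pointer: resolve on first call, then reuse.
pub fn progress_num() -> i32 {
    static PTR: OnceLock<ProgressNumFn> = OnceLock::new();
    let resolved = PTR.get_or_init(lookup_callable);
    resolved()
}

/// Number of lookups performed so far.
pub fn lookup_count() -> usize {
    LOOKUPS.load(Ordering::SeqCst)
}
```

Calling progress_num() repeatedly performs the lookup only once, which is the fast path described in the call chain above. In the real header the lookup fails if cli's DLL is not loaded, which this model does not simulate.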

Corpus: which packages work with bindgen

308 of 594 tested CRAN packages (52%) work with bindgen when using C++ mode. See dev/bindgen-compatible-packages-v3.csv for the full list.

Progression

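| Version | Flags added | Successes |
| --- | --- | --- |
| v1 | C-only, no special flags | 69 / 594 (12%) |
| v2 | + R_NO_REMAP + -x c++17 + --enable-cxx-namespaces | 204 / 594 (34%) |
| v3 | + -isysroot + LinkingTo resolution + c++14 fallback | 308 / 594 (52%) |

The v3 configuration corresponds roughly to an invocation like the following (the -isysroot line is macOS-specific):
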
bindgen \
  --experimental \
  --enable-cxx-namespaces \
  --merge-extern-blocks \
  --no-layout-tests \
  --no-doc-comments \
  --wrap-static-fns \
  --wrap-static-fns-path "$STATIC_C" \
  --blocklist-type 'SEXPREC' \
  --blocklist-type 'SEXP' \
  --raw-line 'use miniextendr_api::ffi::SEXP;' \
  "$WRAPPER" \
  -- \
  -x c++ -std=c++17 \
  -isysroot "$(xcrun --show-sdk-path)" \
  -I"$R_INCLUDE" \
  -I"$PKG_INCLUDE" \
  -I"$TRANSITIVE_DEP_INCLUDES"

The wrapper header must define R_NO_REMAP before including Rinternals.h:

#define R_NO_REMAP
#include <Rinternals.h>
#include <pkg/header.h>

Why R_NO_REMAP is essential

R’s Rinternals.h defines macros like #define length Rf_length, #define error Rf_error, #define allocVector Rf_allocVector. These collide with C++ identifiers β€” rapidjson::Document has a length member, for instance. R_NO_REMAP suppresses these macros, keeping only the Rf_ prefixed versions.

Why always use -x c++

Many .h files in R packages contain C++ code (#include <string>, #include <cmath>, templates, namespaces). Using -x c++ for all headers avoids misclassification. bindgen handles pure C code fine in C++ mode.

Workaround: bindgen panics on Boost anonymous types

bindgen 0.72.1 panics with "/*<unnamed>*/" is not a valid Ident when processing anonymous struct types inside Boost headers (e.g., through wdm → boost transitive includes). The workaround:

--blocklist-file '.*/boost/.*'
--blocklist-file '.*/wdm/.*'

This prevents bindgen from constructing IR for Boost internals while still allowing the package’s own headers to reference Boost types opaquely. The package’s public API bindings generate correctly.

Tested: svines (25k lines) and vinereg (31k lines). Both produce valid bindings with the blocklist.

Remaining failure categories (286 packages)

| Category | Count | Cause | Fixable? |
| --- | --- | --- | --- |
| cxx_stdlib | 122 | Deep Rcpp/RcppArmadillo dependency chains | Partially (needs recursive LinkingTo resolution) |
| compile_error | 80 | C++ template errors, deprecated APIs | No (package-specific issues) |
| missing_header | 59 | System libs (HDF5, GL, petscsnes) | Yes (install system deps) |
| rcpp_dep | 9 | Direct #include <Rcpp.h> | No (Rcpp ecosystem) |
| bindgen_panic | 2 | Anonymous types in Boost/wdm headers | Yes (--blocklist-file '.*/boost/.*') |

Notable working packages

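| Package | Mode | Std | Lines | Static fns |
| --- | --- | --- | --- | --- |
| cli | c | - | 969 | yes |
| nanoarrow | c | - | 1,257 | yes |
| vctrs | c | - | 959 | no |
| processx | c | - | 1,280 | yes |
| wk | c | - | 981 | no |
| checkmate | c | - | 1,246 | yes |
| nloptr | cpp | c++17 | 12,394 | yes |
| BH (Boost subset) | cpp | c++17 | 18,767 | yes |
| AsioHeaders | cpp | c++17 | 24,664 | yes |
| piton (PEGTL) | cpp | c++17 | 15,661 | no |
| rjsoncons | cpp | c++17 | 13,374 | no |
| openxlsx2 | cpp | c++17 | 16,507 | no |
| dqrng | cpp | c++17 | 13,134 | no |
| ipaddress | cpp | c++17 | 22,743 | no |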

Applying to mirai/NNG

The NNG integration in the mirai worktree currently bundles NNG and mbedtls source code in src/nng/ and src/mbedtls/, compiling them via explicit Makevars source lists and pattern rules. This produces static archives (libnng.a, libmbedtls_all.a) that are passed to cargo via -C link-arg=.

The OBJECTS pattern is a cleaner alternative when the C sources are simple enough to live directly in src/. For NNG this may not apply directly (NNG needs platform-specific defines and has deep subdirectories), but the OBJECTS link-arg loop should still be used to pass stub.o and any other src/*.c objects to cargo. This ensures any future C shim files (e.g., for inline functions from other R packages) get linked automatically.

The key change for mirai:

# Instead of hardcoding PKG_LIBS:
PKG_LIBS = $(CARGO_AR) $(LIBNNG) $(LIBMBEDTLS) $(NNG_LIBS)

# Pass OBJECTS to cargo too:
$(CARGO_AR): FORCE_CARGO $(WRAPPERS_R) $(OBJECTS) $(LIBNNG) $(LIBMBEDTLS)
    @set -e; \
    LINK_ARGS=""; \
    for obj in $(OBJECTS); do \
      LINK_ARGS="$$LINK_ARGS -C link-arg=$(ABS_RPKG_SRCDIR)/$$obj"; \
    done; \
    LINK_ARGS="$$LINK_ARGS -C link-arg=$(ABS_RPKG_SRCDIR)/$(LIBNNG)"; \
    LINK_ARGS="$$LINK_ARGS -C link-arg=$(ABS_RPKG_SRCDIR)/$(LIBMBEDTLS)"; \
    RUSTFLAGS="$(ENV_RUSTFLAGS) $$LINK_ARGS" \
    $(CARGO) ... build ...

This way, adding a new R package header binding (e.g., for the later or processx packages) only requires dropping the wrapper .h and _static_wrappers.c into src/ and adding LinkingTo:. The shim then compiles and links automatically.

Known limitations

--wrap-static-fns only works in C mode

bindgen’s --wrap-static-fns flag generates C shim wrappers for static and static inline functions. This only works when parsing headers in C mode (-x c). In C++ mode (-x c++), the flag is silently ignored and no *_static_wrappers.c file is generated.

This matters for R packages that use the R_GetCCallable() pattern via static R_INLINE functions (e.g., cli, nanoarrow). For these packages, use_native_package() detects them as pure C and uses C mode, preserving static wrapper generation. For C++ packages that also have static inline functions, users would need to write the C shim manually or invoke bindgen separately in C mode for those functions.

Windows

The -isysroot flag is macOS-specific. On Windows (MSYS2/MinGW), the C++ stdlib is provided differently. The Makevars.win template does not yet include NATIVE_PKG_CPPFLAGS or the OBJECTS link-arg pattern. Windows support requires:

  • Detecting the MinGW C++ include path
  • Updating Makevars.win / configure.win templates
  • Testing with R CMD INSTALL on Windows

LinkingTo resolution

resolve_include_paths() walks the LinkingTo dependency tree breadth-first. However, some packages have LinkingTo deps that aren’t installed (e.g., Bioconductor packages). Missing deps are silently skipped: the include path just won’t be added, and bindgen will fail with “file not found” for headers from those deps.
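
To make the traversal concrete, here is a sketch of a breadth-first walk that also collects the missing dependencies so they can be reported. It is not the actual resolve_include_paths() implementation: the function resolve_includes, its two map inputs, and the package names in the usage note are all hypothetical.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Breadth-first walk over LinkingTo dependencies, starting at `root`.
/// `linking_to` maps a package to its LinkingTo entries; `installed` maps an
/// installed package to its include directory. Both are hypothetical inputs
/// standing in for what R would report.
/// Returns (include paths in visit order, dependencies that are not installed).
pub fn resolve_includes(
    root: &str,
    linking_to: &HashMap<&str, Vec<&str>>,
    installed: &HashMap<&str, &str>,
) -> (Vec<String>, Vec<String>) {
    let mut paths = Vec::new();
    let mut missing = Vec::new();
    let mut seen: HashSet<&str> = HashSet::new();
    let mut queue: VecDeque<&str> = VecDeque::new();
    seen.insert(root);
    queue.push_back(root);

    while let Some(pkg) = queue.pop_front() {
        match installed.get(pkg) {
            Some(dir) => paths.push(dir.to_string()),
            None => {
                // Record the gap instead of skipping it silently.
                missing.push(pkg.to_string());
                continue;
            }
        }
        // Enqueue each dependency once; `seen` also guards against cycles.
        for &dep in linking_to.get(pkg).into_iter().flatten() {
            if seen.insert(dep) {
                queue.push_back(dep);
            }
        }
    }
    (paths, missing)
}
```

For a package pkgA that links to pkgB and an uninstalled bioc_pkg, the second element of the result names bioc_pkg, which could be surfaced as a warning before bindgen runs.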