This guide documents how miniextendr converts between R and Rust types, including NA handling, coercion rules, and edge cases.

Basic Type Mappings

Scalar Types

| R Type | Rust Type | Notes |
| --- | --- | --- |
| integer (length 1) | i32 | NA → panic |
| numeric (length 1) | f64 | NA preserved as NA_REAL |
| logical (length 1) | bool | NA → panic |
| character (length 1) | String, &str | NA → panic |
| raw (length 1) | u8 | No NA in raw |
| complex (length 1) | Rcomplex | Has real/imag NA |

Vector Types

| R Type | Rust Type | Notes |
| --- | --- | --- |
| integer | Vec<i32>, &[i32] | NA = i32::MIN |
| numeric | Vec<f64>, &[f64] | NA = special bit pattern |
| logical | Vec<i32> | TRUE=1, FALSE=0, NA=i32::MIN |
| character | Vec<String> | NA → panic |
| raw | Vec<u8>, &[u8] | No NA |
| list | Various | See Lists and Collections sections |

Nested Collection Types

miniextendr supports converting nested collections to R lists:

| Rust Type | R Type | Notes |
| --- | --- | --- |
| Vec<Vec<T>> | list of vectors | For T: RNativeType or T = String |
| Vec<Box<[T]>> | list of vectors | Boxed slices → vectors |
| Vec<[T; N]> | list of vectors | Fixed arrays → vectors |
| Vec<HashSet<T>> | list of vectors | Sets → unordered vectors |
| Vec<BTreeSet<T>> | list of vectors | Sets → sorted vectors |

These are particularly useful with #[derive(DataFrameRow)] where row fields can contain collections.
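To make the "sets → sorted vectors" row concrete, here is a minimal plain-Rust sketch of the shape of that conversion. It does not use miniextendr; `sets_to_columns` is a hypothetical helper, and the real conversion to an R list happens inside the library.

```rust
use std::collections::BTreeSet;

/// Hypothetical helper: turn each set into a plain vector.
/// A BTreeSet iterates in ascending order, so every inner vector is sorted.
pub fn sets_to_columns(sets: Vec<BTreeSet<i32>>) -> Vec<Vec<i32>> {
    sets.into_iter()
        .map(|set| set.into_iter().collect())
        .collect()
}
```

A HashSet would go through the same steps, but its iteration order is unspecified, which is why the table calls those vectors unordered.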

Option Types (NA-Safe)

| R Type | Rust Type | NA Handling |
| --- | --- | --- |
| integer | Option<i32> | NA → None |
| numeric | Option<f64> | NA → None |
| logical | Option<bool> | NA → None |
| character | Option<String> | NA → None |

ALTREP-Aware Types

R frequently passes ALTREP vectors (e.g., 1:10, seq_len(N)) to Rust. All parameter types handle this transparently:

| Rust Type | ALTREP Handling |
| --- | --- |
| Vec<i32>, &[f64], etc. | Auto-materialized during conversion |
| SEXP | Auto-materialized via ensure_materialized |
| AltrepSexp | Accepted only if ALTREP, !Send + !Sync |

See Receiving ALTREP from R for details.


NA Value Representation

Integer NA

pub const NA_INTEGER: i32 = i32::MIN;  // -2147483648

In R, NA_integer_ is represented as i32::MIN. This means:

  • Valid integers: -2147483647 to 2147483647
  • i32::MIN is reserved for NA

Implication: You cannot represent i32::MIN as a valid value in R integers.
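The sentinel scheme can be sketched in plain Rust without miniextendr. The helpers `decode_int` and `encode_int` are hypothetical names used only for illustration; they are not part of the library's API.

```rust
/// R's integer NA sentinel, as described above.
pub const NA_INTEGER: i32 = i32::MIN;

/// Hypothetical helper: read a raw R integer as an NA-aware Option.
pub fn decode_int(raw: i32) -> Option<i32> {
    if raw == NA_INTEGER {
        None
    } else {
        Some(raw)
    }
}

/// Hypothetical helper: write an Option back in R's representation.
/// Some(i32::MIN) cannot round-trip because it is indistinguishable from NA.
pub fn encode_int(value: Option<i32>) -> i32 {
    value.unwrap_or(NA_INTEGER)
}
```

The last comment is the practical consequence of the reserved value: encoding Some(i32::MIN) and decoding it again yields None.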

Logical NA

pub const NA_LOGICAL: i32 = i32::MIN;  // Same as integer

R logicals are stored as integers internally:

  • TRUE = 1
  • FALSE = 0
  • NA = i32::MIN
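A minimal sketch of decoding this three-state encoding in plain Rust follows. `decode_logical` is a hypothetical helper, and treating every other non-zero value as TRUE is an assumption of this sketch, not documented library behavior.

```rust
/// R's logical NA sentinel, as described above.
pub const NA_LOGICAL: i32 = i32::MIN;

/// Hypothetical helper: map R's integer-backed logical to Option<bool>.
pub fn decode_logical(raw: i32) -> Option<bool> {
    match raw {
        NA_LOGICAL => None,
        0 => Some(false),
        // Assumption: any other non-zero value is treated as TRUE.
        _ => Some(true),
    }
}
```

This is why a logical vector maps to Vec<i32> rather than Vec<bool>: a bool has no spare state for NA.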

Real (Double) NA

pub const NA_REAL: f64 = f64::from_bits(0x7FF0_0000_0000_07A2);

R's NA_real_ is a specific IEEE 754 NaN with a particular bit pattern.

Critical: This is different from regular f64::NAN:

// These are DIFFERENT values
let na = NA_REAL;           // R's NA
let nan = f64::NAN;         // Regular IEEE NaN

// Detection requires bit comparison
fn is_na_real(value: f64) -> bool {
    value.to_bits() == NA_REAL.to_bits()
}

// A regular NaN check cannot tell NA apart from NaN
value.is_nan()  // Returns true for both NA and NaN

Implication: When working with f64 vectors, regular NaN values pass through unchanged. Only NA_REAL is treated as NA.
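The following plain-Rust sketch shows the difference in practice: a sum that skips NA but still lets an ordinary NaN propagate. `sum_skip_na` is a hypothetical helper, and the exact bit comparison assumes the NA payload has not been altered on its way through floating-point operations.

```rust
/// Bit pattern of R's NA_real_, taken from the constant above.
pub const NA_REAL_BITS: u64 = 0x7FF0_0000_0000_07A2;

/// True only for the exact NA bit pattern, not for other NaNs.
pub fn is_na_real(value: f64) -> bool {
    value.to_bits() == NA_REAL_BITS
}

/// Hypothetical helper: sum the values, skipping NA.
/// An ordinary NaN is not NA, so it still poisons the result.
pub fn sum_skip_na(values: &[f64]) -> f64 {
    values.iter().copied().filter(|v| !is_na_real(*v)).sum()
}
```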

String NA

R's NA_character_ is a special CHARSXP pointer (R_NaString).

By default, miniextendr panics when it receives a string NA. Use Option<String> for NA-safe access:

#[miniextendr]
pub fn handle_string(s: Option<String>) -> String {
    s.unwrap_or_else(|| "was NA".to_string())
}

Coercion System

miniextendr provides automatic type coercion for numeric types.

Coercion Precedence

Two traits control coercion:

  1. Coerce<R> - Infallible (always succeeds)
  2. TryCoerce<R> - Fallible (can fail)

When both exist for a type pair, Coerce takes precedence:

// Blanket impl ensures Coerce always wins
impl<T, R> TryCoerce<R> for T where T: Coerce<R> {
    fn try_coerce(self) -> Result<R, Infallible> {
        Ok(self.coerce())
    }
}
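The blanket impl can be tried out in a self-contained sketch. The trait definitions below are assumptions reconstructed from the snippet above (in particular the associated `Error` type), and `to_f64` is a hypothetical helper; the library's real definitions may differ.

```rust
use std::convert::Infallible;

/// Assumed shape of the infallible trait.
pub trait Coerce<R> {
    fn coerce(self) -> R;
}

/// Assumed shape of the fallible trait, with an associated error type.
pub trait TryCoerce<R> {
    type Error;
    fn try_coerce(self) -> Result<R, Self::Error>;
}

/// Every infallible coercion is also available as a fallible one
/// whose error type can never be constructed.
impl<T, R> TryCoerce<R> for T
where
    T: Coerce<R>,
{
    type Error = Infallible;
    fn try_coerce(self) -> Result<R, Infallible> {
        Ok(self.coerce())
    }
}

impl Coerce<f64> for i32 {
    fn coerce(self) -> f64 {
        f64::from(self)
    }
}

/// Hypothetical helper: generic code only needs to ask for TryCoerce.
pub fn to_f64<T: TryCoerce<f64>>(x: T) -> Option<f64> {
    x.try_coerce().ok()
}
```

Because i32 implements Coerce<f64>, it picks up TryCoerce<f64> automatically, so callers written against the fallible trait accept it without a separate impl.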

Infallible Coercions (Coerce)

| From | To | Notes |
| --- | --- | --- |
| i32 | f64 | Widening (no precision loss) |
| i32 | i32 | Identity |
| f64 | f64 | Identity |
| Option<T> | T | None → NA value |

Fallible Coercions (TryCoerce)

| From | To | Fails When |
| --- | --- | --- |
| f64 | i32 | NaN, infinity, fractional, overflow |
| i32 | u32 | Negative value |
| i32 | NonZeroI32 | Zero value |
| f64 | u64 | Negative, NaN, overflow |
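The failure conditions in the first two rows can be sketched as plain checked conversions. The `CoerceError` enum and both function names are hypothetical, and rejecting i32::MIN as out of range (because R reserves it for NA) is an assumption of this sketch rather than documented behavior.

```rust
/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum CoerceError {
    NotFinite,
    Fractional,
    OutOfRange,
}

/// Checked f64 -> i32 conversion covering the listed failure cases.
pub fn f64_to_i32(x: f64) -> Result<i32, CoerceError> {
    if !x.is_finite() {
        return Err(CoerceError::NotFinite); // NaN or +/- infinity
    }
    if x.fract() != 0.0 {
        return Err(CoerceError::Fractional);
    }
    // Assumption: i32::MIN is excluded because R uses it as NA_integer_.
    if x <= f64::from(i32::MIN) || x > f64::from(i32::MAX) {
        return Err(CoerceError::OutOfRange);
    }
    Ok(x as i32)
}

/// Checked i32 -> u32 conversion: fails on negative input.
pub fn i32_to_u32(x: i32) -> Result<u32, CoerceError> {
    if x < 0 {
        Err(CoerceError::OutOfRange)
    } else {
        Ok(x as u32)
    }
}
```

Checking finiteness first matters: NaN compares false against every bound, so a range check alone would let it through to the cast.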

Enabling Coercion

Use #[miniextendr(coerce)] to enable automatic coercion:

// Without coerce: f64 parameter requires numeric input
#[miniextendr]
pub fn square(x: f64) -> f64 { x * x }

// With coerce: accepts integer, coerces to f64
#[miniextendr(coerce)]
pub fn square_coerce(x: f64) -> f64 { x * x }

Called from R:

square(2L)        # Error: expected numeric
square_coerce(2L) # 4.0 (integer coerced to double)

Per-Parameter Coercion

#[miniextendr]
pub fn mixed(
    #[miniextendr(coerce)] x: f64,  // Coerce this one
    y: i32,                          // No coercion
) -> f64 {
    x + y as f64
}

Option-to-NA Conversion

When returning Option<T>, None converts to R's NA:

#[miniextendr]
pub fn maybe_value(x: i32) -> Option<i32> {
    if x > 0 { Some(x) } else { None }
}

Called from R:

maybe_value(5)   # 5
maybe_value(-1)  # NA

Coercion for Options

Option<T> coerces to T with None → NA:

// This works with coercion enabled:
// R's NA_integer_ → None → coerced to NA_real_
#[miniextendr(coerce)]
pub fn option_coerce(x: f64) -> f64 { x }

Vector NA Handling

Reading Vectors with NA

For vectors with potential NA values, use Option element type:

#[miniextendr]
pub fn count_na(x: Vec<Option<i32>>) -> i32 {
    x.iter().filter(|v| v.is_none()).count() as i32
}
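Once NA values arrive as None, ordinary iterator code can skip them. The sketch below is plain Rust with the `#[miniextendr]` attribute left off so it compiles on its own; `mean_skip_na` is a hypothetical function name.

```rust
/// Mean of the non-NA elements, or None when every element is NA
/// (or the input is empty).
pub fn mean_skip_na(x: &[Option<i32>]) -> Option<f64> {
    // flatten() drops the None entries and yields references to the values.
    let (sum, n) = x
        .iter()
        .flatten()
        .fold((0_i64, 0_u32), |(sum, n), v| (sum + i64::from(*v), n + 1));
    if n == 0 {
        None
    } else {
        Some(sum as f64 / f64::from(n))
    }
}
```

Returning Option<f64> here means an all-NA input produces None, which the Option-to-NA rule described earlier would turn into NA on the R side.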

Writing Vectors with NA

Return Vec<Option<T>> to include NA values:

#[miniextendr]
pub fn add_na_at_end(x: Vec<i32>) -> Vec<Option<i32>> {
    let mut result: Vec<Option<i32>> = x.into_iter().map(Some).collect();
    result.push(None);  // Adds NA
    result
}
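A slightly more realistic case where the output contains NA is a lag operation, where the first positions have no predecessor. This is a plain-Rust sketch with the `#[miniextendr]` attribute omitted; `lag` is a hypothetical function name.

```rust
/// Shift the input right by k positions, filling the gap with None.
/// The output has the same length as the input.
pub fn lag(x: &[i32], k: usize) -> Vec<Option<i32>> {
    (0..x.len())
        .map(|i| if i >= k { Some(x[i - k]) } else { None })
        .collect()
}
```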

Slice Lifetimes

When using slice parameters (&[T]), be aware of lifetime implications:

// SAFE: Slice is only used during function execution
#[miniextendr]
pub fn sum(x: &[f64]) -> f64 {
    x.iter().sum()
}

The slice carries a 'static lifetime annotation, but that annotation is an API convenience rather than a guarantee. The actual lifetime is tied to R's GC protection of the SEXP.

Safe patterns:

  • Use slice within the function
  • Copy data if you need to store it

Unsafe patterns:

  • Storing the slice in a struct that outlives the function
  • Returning the slice (won't compile anyway)
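The "copy data if you need to store it" pattern can be sketched in plain Rust. `Summary` is a hypothetical type: it owns a copy of the values, so it stays valid however long the original buffer lives.

```rust
/// Hypothetical type that owns its data instead of borrowing the slice.
pub struct Summary {
    values: Vec<f64>,
}

impl Summary {
    /// Copy the borrowed slice into owned storage.
    pub fn from_slice(x: &[f64]) -> Self {
        Summary { values: x.to_vec() }
    }

    /// Largest stored value, or None when empty.
    pub fn max(&self) -> Option<f64> {
        self.values.iter().copied().reduce(f64::max)
    }

    pub fn len(&self) -> usize {
        self.values.len()
    }
}
```

The copy costs one allocation, which is the price of decoupling the struct from R's GC protection scope.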

String Lifetimes

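R caches strings as CHARSXP objects in a global table. When you get a &str from R:

#[miniextendr]
pub fn process_string(s: &str) -> String {
    // s borrows from R's string cache; copy it if it must outlive the call
    s.to_uppercase()
}

As with slices, treat the &'static str lifetime as an API convenience rather than a guarantee: R's string cache keeps an entry only while something still references it, so an unreferenced CHARSXP can be garbage collected.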


ExternalPtr Semantics

When using #[derive(ExternalPtr)]:

#[derive(ExternalPtr)]
pub struct MyData {
    values: Vec<f64>,
}

The Rust data is heap-allocated and owned by R:

  1. new() allocates Rust data on heap
  2. Pointer stored in R's external pointer SEXP
  3. R's GC tracks the SEXP
  4. When SEXP is collected, Rust Drop runs
  5. Heap memory freed
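The ownership hand-off in these steps can be illustrated with plain Box and raw-pointer code. This is not miniextendr's implementation: `into_raw_handle` and `finalize` are hypothetical stand-ins for what the derive and R's finalizer machinery do, and the drop counter exists only to make the effect observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts how many MyData values have been dropped (for demonstration).
static DROPS: AtomicUsize = AtomicUsize::new(0);

pub struct MyData {
    pub values: Vec<f64>,
}

impl Drop for MyData {
    fn drop(&mut self) {
        DROPS.fetch_add(1, Ordering::SeqCst);
    }
}

/// Steps 1-2: move the value to the heap and give up ownership,
/// leaving only a raw pointer (what the external pointer would store).
pub fn into_raw_handle(data: MyData) -> *mut MyData {
    Box::into_raw(Box::new(data))
}

/// Steps 4-5: what a finalizer would do when the SEXP is collected.
///
/// # Safety
/// `ptr` must come from `into_raw_handle` and must not be used afterwards.
pub unsafe fn finalize(ptr: *mut MyData) {
    if !ptr.is_null() {
        // Rebuilding the Box runs Drop and frees the heap allocation.
        unsafe { drop(Box::from_raw(ptr)) };
    }
}

pub fn drop_count() -> usize {
    DROPS.load(Ordering::SeqCst)
}
```

Between the two calls nothing on the Rust side owns the value, which is the sense in which the data is "owned by R".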

Thread safety: The pointer can be safely accessed from any thread, but R API calls must happen on the main thread.


Complex Types

Lists

Lists convert to various Rust types:

// Named list → HashMap
#[miniextendr]
pub fn process_map(x: HashMap<String, i32>) -> i32 {
    x.values().sum()
}

// List → Vec of heterogeneous items (requires SEXP)
#[miniextendr]
pub fn list_length(x: List) -> i32 {
    x.len() as i32
}

Data Frames

Data frames are lists with special attributes. Access columns:

#[miniextendr]
pub fn get_column(df: List, name: &str) -> Vec<f64> {
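    // Look up the column called `name` in `df` and convert it to Vec<f64>.
    // The lookup itself is left unimplemented in this outline.
    todo!("extract column `name` from `df`")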
}

Matrices

With the ndarray feature:

use ndarray::Array2;

#[miniextendr]
pub fn matrix_sum(x: Array2<f64>) -> f64 {
    x.sum()
}

Error Cases

Type Mismatch

When the R type doesn't match the expected Rust type:

#[miniextendr]
pub fn needs_integer(x: i32) -> i32 { x }

Called from R:

needs_integer(1.5)
# Error: failed to convert parameter 'x' to i32: wrong type

NA in Non-Option

When NA is passed to a non-Option parameter:

#[miniextendr]
pub fn needs_value(x: i32) -> i32 { x }

Called from R:

needs_value(NA_integer_)
# Error: failed to convert parameter 'x' to i32: contains NA

Coercion Failure

When coercion fails:

#[miniextendr(coerce)]
pub fn needs_int(x: i32) -> i32 { x }

Called from R:

needs_int(1.5)
# Error: failed to coerce parameter 'x' to i32: fractional value

Feature-Gated Types

Many additional types are available via Cargo features:

| Feature | Types |
| --- | --- |
| num-bigint | BigInt, BigUint |
| rust_decimal | Decimal |
| uuid | Uuid |
| time | Date, Time, OffsetDateTime |
| ndarray | Array1, Array2, etc. |
| nalgebra | Matrix, Vector, etc. |
| indexmap | IndexMap, IndexSet |
| serde | JSON conversion |
| serde_r | Native R serialization |

Enable in Cargo.toml:

[dependencies]
miniextendr-api = { version = "0.1", features = ["uuid", "time"] }

Best Practices

  1. Use Option<T> for NA-safe parameters

    pub fn safe(x: Option<i32>) -> i32 { x.unwrap_or(0) }

  2. Use slices for read-only vector access (zero-copy)

    pub fn sum(x: &[f64]) -> f64 { x.iter().sum() }

  3. Use Vec<T> when you need to modify

    pub fn double(x: Vec<i32>) -> Vec<i32> { x.into_iter().map(|v| v*2).collect() }

  4. Enable coercion for flexible numeric APIs

    #[miniextendr(coerce)]
    pub fn flexible(x: f64) -> f64 { x }

  5. Return Option<T> to produce NA values

    pub fn maybe(x: i32) -> Option<i32> { if x > 0 { Some(x) } else { None } }

Named Lists

R lists with names can be accessed via NamedList, which builds a HashMap index for O(1) lookup:

use miniextendr_api::NamedList;

#[miniextendr]
pub fn get_option(config: NamedList) -> Option<String> {
    config.get::<String>("name")
}

| Method | Description |
| --- | --- |
| get::<T>(name) | O(1) lookup by name, converting to type T |
| get_raw(name) | O(1) lookup returning raw SEXP |
| contains(name) | Check if a name exists |
| get_index::<T>(i) | Positional access (no name lookup) |
| len() / is_empty() | Size queries |

When to use: List::get_named() is fine for a single lookup. Use NamedList when you need multiple lookups on the same list (O(n) build + O(1) per lookup vs O(n) per lookup).

NamedList implements TryFromSexp, so it can be used directly as a function parameter. NA and empty-string names are excluded from the index; duplicate names resolve to the last occurrence.
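The indexing rules just described (skip NA and empty names, last duplicate wins) can be sketched with a plain HashMap. `build_name_index` is a hypothetical function, NA names are modeled as None, and the real NamedList may build its index differently.

```rust
use std::collections::HashMap;

/// Hypothetical helper: map each usable name to its position in the list.
/// One O(n) pass here buys O(1) lookups afterwards.
pub fn build_name_index(names: &[Option<&str>]) -> HashMap<String, usize> {
    let mut index = HashMap::new();
    for (i, name) in names.iter().enumerate() {
        match name {
            // insert() overwrites, so a later duplicate replaces an earlier one.
            Some(n) if !n.is_empty() => {
                index.insert((*n).to_owned(), i);
            }
            // NA (None) and empty-string names are left out of the index.
            _ => {}
        }
    }
    index
}
```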


Safe Mutable Input

R vectors are copy-on-write, so &mut [T] is not supported in #[miniextendr] functions (rejected at compile time with a helpful error). Use Vec<T> for copy-in/copy-out mutation:

#[miniextendr]
pub fn double_in_place(mut x: Vec<f64>) -> Vec<f64> {
    for v in x.iter_mut() {
        *v *= 2.0;
    }
    x // copies out to a new R vector on return
}

Vec<T> copies the R vector on input (TryFromSexp), allows mutation, and copies out to a new R vector on return (IntoR).


Known Limitations

  • Mutable slice parameters (&mut [T]) are rejected at compile time. Accept &[T] and return a new Vec<T>, or accept Vec<T> directly.
  • String matrices (ndarray::Array<String, Ix2>) are not directly convertible because R's STRSXP stores pointers to CHARSXP objects rather than contiguous string data. Use Vec<Vec<String>> as an intermediary.
  • SEXP slice lifetimes use 'static for convenience, but actual lifetime is tied to GC protection scope.

See GAPS.md for the full catalog of known limitations and workarounds.


See Also