MRTKLIB Test Accuracy Methodology

Overview

MRTKLIB has three complementary test tiers:

| Tier | What is measured | Pass criterion |
| --- | --- | --- |
| Tier 1: Relative (porting correctness) | MRTKLIB output vs upstream output (same input) | 3D RMS < tolerance; fix-rate delta ≥ threshold |
| Tier 2: Absolute (geodetic accuracy) | MRTKLIB output vs surveyed ground truth (SINEX / GSI F5) | 1σ and 95% < tolerance or < ref precision |
| Tier 3: Precision (position scatter) | Spread of solutions around the session centroid | 1σ and 95% < tolerance (no external reference) |
Text Only
Tier 1: MRTKLIB output  vs  upstream output       (porting correctness)
Tier 2: MRTKLIB output  vs  known station coord   (geodetic accuracy)
Tier 3: MRTKLIB output  vs  session centroid      (precision / scatter)

Reference File Generation

All reference files are pre-computed from upstream tools and committed to the repository. They are not regenerated at test time.

| Test group | Reference files | Generated by |
| --- | --- | --- |
| SPP / PPP / PPP-AR | tests/data/madocalib/*.pos | upstream MADOCALIB (LAPACK build) |
| PPP-RTK / VRS-RTK | tests/data/claslib/ref_*.nmea | upstream claslib |
| SPP (MALIB) | tests/data/malib/*.pos | upstream MALIB |

MADOCALIB PPP-AR reference

The PPP-AR and PPP-AR+iono reference .pos files (pppar.pos, pppar_ion.pos) are generated from upstream MADOCALIB built with -DLAPACK -framework Accelerate (macOS Accelerate framework), matching the solver used by MRTKLIB. See release-notes-v0.3.1.md for background.

claslib PPP-RTK / VRS-RTK reference

Reference files contain NMEA GGA sentences only. Regenerate with:

Bash
bash tests/data/claslib/generate_reference.sh


Comparison Method

Step 1 — Time-key matching

Both files (reference and MRTKLIB output) are parsed into dictionaries keyed by GPS time string. Only common epochs (epochs present in both files) are used.

Text Only
ref_data  = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
test_data = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
common    = sorted(ref_data.keys() & test_data.keys())
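The matching step can be sketched in Python as follows. This is a minimal illustration, not the code of compare_pos.py: the helper name `parse_pos_lines` is hypothetical, the column order (date, time, lat, lon, height, Q) assumes RTKLIB's default lat/lon/height solution output, and the sample records are made up.

```python
def parse_pos_lines(lines):
    """Map 'YYYY/MM/DD hh:mm:ss.sss' -> (lat_deg, lon_deg, h_m, Q)."""
    data = {}
    for line in lines:
        if not line.strip() or line.startswith("%"):
            continue  # skip blank lines and '%' header comments
        f = line.split()
        data[f"{f[0]} {f[1]}"] = (float(f[2]), float(f[3]), float(f[4]), int(f[5]))
    return data


# Made-up sample records, for illustration only.
ref_lines = [
    "% GPST latitude(deg) longitude(deg) height(m) Q",
    "2025/04/01 00:00:00.000 36.000000000 140.000000000 50.0000 1",
    "2025/04/01 00:00:01.000 36.000000000 140.000000000 50.0000 1",
]
test_lines = [
    "2025/04/01 00:00:01.000 36.000000100 140.000000000 50.0100 2",
    "2025/04/01 00:00:02.000 36.000000100 140.000000000 50.0100 1",
]

ref_data = parse_pos_lines(ref_lines)
test_data = parse_pos_lines(test_lines)

# Only epochs present in both files take part in the comparison.
common = sorted(ref_data.keys() & test_data.keys())
print(common)
```

Because the keys are compared as strings, both files must format the time stamp identically for an epoch to count as common.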

Step 2 — Per-epoch ENU error

For each common epoch, the two coordinates are differenced in ECEF and projected into the local ENU frame at the reference position:

Text Only
ref_xyz  = blh2xyz(ref_lat, ref_lon, ref_h)    # WGS84 → ECEF
test_xyz = blh2xyz(test_lat, test_lon, test_h)
dx       = test_xyz − ref_xyz                   # ECEF difference
enu      = xyz2enu(dx, ref_lat, ref_lon)        # → local ENU [m]

The ENU origin shifts epoch-by-epoch with the reference position.
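A self-contained sketch of the two helpers named above, assuming the standard WGS84 ellipsoid constants and the usual ECEF-to-ENU rotation. The function names follow the pseudocode; the bodies are an illustration and may differ from the scripts' actual implementation.

```python
import math

WGS84_A = 6378137.0              # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563    # flattening


def blh2xyz(lat_deg, lon_deg, h):
    """Geodetic latitude/longitude [deg] and ellipsoidal height [m] -> ECEF [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    return (
        (n + h) * math.cos(lat) * math.cos(lon),
        (n + h) * math.cos(lat) * math.sin(lon),
        (n * (1.0 - e2) + h) * math.sin(lat),
    )


def xyz2enu(dx, lat_deg, lon_deg):
    """Rotate an ECEF difference vector into local East/North/Up at (lat, lon)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sl, cl = math.sin(lat), math.cos(lat)
    so, co = math.sin(lon), math.cos(lon)
    e = -so * dx[0] + co * dx[1]
    n = -sl * co * dx[0] - sl * so * dx[1] + cl * dx[2]
    u = cl * co * dx[0] + cl * so * dx[1] + sl * dx[2]
    return (e, n, u)


# Made-up epoch: the test solution sits 1 m above the reference solution.
ref = (36.0, 140.0, 50.0)
test = (36.0, 140.0, 51.0)
ref_xyz = blh2xyz(*ref)
test_xyz = blh2xyz(*test)
dx = tuple(t - r for t, r in zip(test_xyz, ref_xyz))
enu = xyz2enu(dx, ref[0], ref[1])
print(enu)
```

A pure height offset should appear almost entirely in the Up component, which makes this a convenient sanity check for the rotation.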

Step 3 — 3D RMS and fix-rate delta

Text Only
3D error per epoch = ||enu||
3D RMS             = sqrt( mean( 3D_error² ) )
fix rate           = fraction of epochs with Q = 1 (fix) or Q = 6 (PPP)
fix_delta          = test_fix_rate − ref_fix_rate
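The two statistics can be sketched as below. The function names and the default `fix_qs=(1, 6)` are illustrative assumptions taken from the definitions above, and the input values are made up.

```python
import math


def rms_3d(enu_errors):
    """Root mean square of the per-epoch 3D error norm ||enu||."""
    sq = [e * e + n * n + u * u for e, n, u in enu_errors]
    return math.sqrt(sum(sq) / len(sq))


def fix_rate(q_flags, fix_qs=(1, 6)):
    """Fraction of epochs whose quality flag counts as fixed."""
    return sum(1 for q in q_flags if q in fix_qs) / len(q_flags)


# Made-up inputs: two epochs, each with a 5 cm 3D error.
enu_errors = [(0.03, 0.04, 0.0), (0.0, 0.0, 0.05)]
rms = rms_3d(enu_errors)

# Reference is fixed at every epoch; the test run has one float epoch (Q = 2).
fix_delta = fix_rate([1, 1, 2, 6]) - fix_rate([1, 1, 1, 1])
print(rms, fix_delta)
```

A negative `fix_delta` means the test run fixed fewer epochs than the reference.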

Step 4 — Pass/Fail criteria

A test passes when both of the following hold:

| Criterion | Threshold |
| --- | --- |
| 3D RMS < tolerance | see table below |
| fix_delta ≥ −X% | −1.0% for PPP/PPP-AR, −5.0% for PPP-RTK/VRS-RTK |
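As a sketch, the combined check reduces to a single boolean expression. The function name and the choice to express `fix_delta` in percentage points are assumptions made for this example; the fix-rate deltas used below are hypothetical.

```python
def tier1_passes(rms_3d_m, tolerance_m, fix_delta_pct, min_fix_delta_pct):
    """Both criteria must hold: RMS below tolerance, fix-rate drop within limit."""
    return rms_3d_m < tolerance_m and fix_delta_pct >= min_fix_delta_pct


# 5.9 cm RMS against a 0.10 m tolerance, with a PPP-RTK limit of -5.0.
ok = tier1_passes(0.059, 0.10, fix_delta_pct=-2.0, min_fix_delta_pct=-5.0)

# Same RMS, but the fix rate dropped by 6 points: the test fails.
bad = tier1_passes(0.059, 0.10, fix_delta_pct=-6.0, min_fix_delta_pct=-5.0)
print(ok, bad)
```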

Tolerance Values

Tolerances encode the expected residual difference between MRTKLIB and upstream after accounting for known numerical divergence sources.

| Test | Tolerance | Actual RMS | Margin | Notes |
| --- | --- | --- | --- | --- |
| madocalib_ppp_check | 0.005 m | < 0.5 cm | ~50% | Deterministic |
| madocalib_pppar_check (LAPACK) | 0.008 m | 0.41 cm | ~50% | pppiono_t heap vs embed |
| madocalib_pppar_check (no LAPACK) | 0.020 m | ~1.5 cm | ~25% | LU vs LAPACK divergence |
| madocalib_pppar_ion_check (LAPACK) | 0.005 m | 0.25 cm | ~50% | Same root cause |
| madocalib_pppar_ion_check (no LAPACK) | 0.040 m | ~3.8 cm | ~5% | LU vs LAPACK divergence |
| claslib_ppp_rtk_check | 0.10 m | 5.9 cm | ~40% | RTK convergence variability |
| claslib_vrs_rtk_check | 0.10 m | 3.3 cm | ~67% | RTK convergence variability |
| claslib_ppp_rtk_st12_check | 0.15 m | 10.8 cm | ~28% | Fewer ST12 messages |
| dual-channel tests | 0.20 m | | | Float-only, fix rate skipped |

LAPACK conditional: when CMake does not find LAPACK (LAPACK_FOUND=FALSE), the madocalib_pppar_*_check tests automatically fall back to the wider (no LAPACK) tolerances. See the lines near _PPPAR_TOL in CMakeLists.txt.


Comparison Scripts

Tier 1 — Relative scripts

| Script | Input format | Used for |
| --- | --- | --- |
| scripts/tests/compare_pos.py | RTKLIB .pos (lat/lon/h/Q per epoch) | PPP, PPP-AR |
| scripts/tests/compare_nmea.py | NMEA GGA sentences | PPP-RTK, VRS-RTK |

Both scripts implement the same algorithm (time-key matching → ENU error → 3D RMS + fix-rate delta) and share the same pass/fail logic. The only difference is the input parser.

Tier 2 — Absolute scripts

| Script | Input format | Used for |
| --- | --- | --- |
| scripts/tests/compare_pos_abs.py | RTKLIB .pos vs SINEX or GSI F5 | PPP-AR absolute check |
| scripts/tests/compare_nmea_abs.py | NMEA GGA vs SINEX or GSI F5 | PPP-RTK absolute check |

Both scripts share the same reference-parsing helpers and pass/fail logic (imported from compare_pos_abs).


Tier 2 — Absolute Accuracy Tests

Reference coordinate sources

| Source | Format | Precision | When to use |
| --- | --- | --- | --- |
| IGS SINEX | .SNX or .SNX.gz | ~0.5–2 mm formal σ | IGS network stations (e.g., MIZU) |
| GSI F5 | Daily ECEF + geodetic | ~5–10 mm scatter | GEONET stations in Japan |

IGS SINEX

Parsed from the +SOLUTION/ESTIMATE block:

- STAX / STAY / STAZ: position in metres at the reference epoch
- VELX / VELY / VELZ: velocity in m/yr (if present; used for propagation)
- Reference epoch encoded as YY:DOY:SOD

Formal reference precision = σ₃D = √(σ_X² + σ_Y² + σ_Z²).
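A minimal sketch of reading the estimates and forming σ₃D, assuming whitespace-separated records in the usual SINEX column order (index, type, site code, point, solution, reference epoch, unit, constraint, value, standard deviation). The function names are hypothetical and the numeric values are placeholders, not real MIZU coordinates.

```python
import math


def parse_estimates(lines, station):
    """Return {parameter_type: (value, sigma)} for one station."""
    est, inside = {}, False
    for line in lines:
        if line.startswith("+SOLUTION/ESTIMATE"):
            inside = True
        elif line.startswith("-SOLUTION/ESTIMATE"):
            inside = False
        elif inside and not line.startswith("*"):
            f = line.split()
            if len(f) >= 10 and f[2] == station:
                est[f[1]] = (float(f[8]), float(f[9]))
    return est


def sigma_3d(est):
    """Formal 3D precision from the STAX/STAY/STAZ standard deviations."""
    return math.sqrt(sum(est[k][1] ** 2 for k in ("STAX", "STAY", "STAZ")))


# Placeholder block with made-up values, for illustration only.
snx = """+SOLUTION/ESTIMATE
*INDEX TYPE__ CODE PT SOLN _REF_EPOCH__ UNIT S __ESTIMATED VALUE____ _STD_DEV___
     1 STAX   MIZU  A    1 25:253:43200 m    2 -3.00000000000000E+06 3.00000E-04
     2 STAY   MIZU  A    1 25:253:43200 m    2  3.00000000000000E+06 4.00000E-04
     3 STAZ   MIZU  A    1 25:253:43200 m    2  4.00000000000000E+06 1.20000E-03
-SOLUTION/ESTIMATE""".splitlines()

est = parse_estimates(snx, "MIZU")
precision = sigma_3d(est)
print(est["STAX"][0], precision)
```

Whitespace splitting is a simplification: SINEX is a fixed-column format, so a stricter parser would slice by column position.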

Epoch propagation (optional --epoch YYYY/MM/DD):

Text Only
pos(t) = pos(t₀) + vel · (t − t₀)      [t in years]
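The propagation can be sketched as below. Two assumptions are made for this example: two-digit years of 50 or less are read as 20YY (the usual SINEX convention), and a year is taken as 365.25 days. The function names and input values are illustrative.

```python
from datetime import datetime, timedelta


def sinex_epoch(text):
    """Convert 'YY:DOY:SOD' into a datetime."""
    yy, doy, sod = (int(p) for p in text.split(":"))
    if yy > 99:
        year = yy                      # already a four-digit year
    else:
        year = 2000 + yy if yy <= 50 else 1900 + yy
    return datetime(year, 1, 1) + timedelta(days=doy - 1, seconds=sod)


def propagate(pos, vel, t0, t):
    """pos(t) = pos(t0) + vel * (t - t0), with vel in m/yr."""
    dt_years = (t - t0).total_seconds() / (365.25 * 86400.0)
    return tuple(p + v * dt_years for p, v in zip(pos, vel))


t0 = sinex_epoch("25:253:43200")

# Made-up position and velocity, propagated forward by exactly two years.
pos = propagate((1.0, 2.0, 3.0), (0.01, 0.0, -0.02), t0, t0 + timedelta(days=730.5))
print(t0, pos)
```

For stations moving at a few centimetres per year, skipping propagation over a multi-year gap would introduce an error comparable to the Tier 2 tolerances.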

GSI F5

Daily coordinate file in ITRF2014/GRS80 with noon UTC positions.

15-day median for evaluation date d:

Text Only
window = rows with |date − d| ≤ 7 days   (up to 15 rows)
true_X = median(window_X)
true_Y = median(window_Y)
true_Z = median(window_Z)

Reference precision = 68th-percentile of the daily 3D scatter within the window relative to the median.
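A sketch of the windowed median and its scatter-based precision, using synthetic daily rows rather than real F5 data. The helper names are hypothetical, and the percentile uses linear interpolation between order statistics, which is an assumption about the convention.

```python
import math
from datetime import date, timedelta
from statistics import median


def percentile(values, pct):
    """Percentile by linear interpolation between sorted samples."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    rank = pct / 100.0 * (len(s) - 1)
    lo = int(math.floor(rank))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (rank - lo)


def f5_reference(rows, d, half_window_days=7):
    """rows: list of (date, (X, Y, Z)). Returns (median_xyz, precision, n_rows)."""
    window = [xyz for day, xyz in rows if abs((day - d).days) <= half_window_days]
    true_xyz = tuple(median(p[i] for p in window) for i in range(3))
    scatter = [math.dist(p, true_xyz) for p in window]
    return true_xyz, percentile(scatter, 68.0), len(window)


# Synthetic rows: X drifts by 1 mm per day around the evaluation date.
d = date(2025, 6, 6)
rows = [(d + timedelta(days=k), (100.0 + 0.001 * k, 200.0, 300.0)) for k in range(-8, 9)]

true_xyz, ref_precision, n_rows = f5_reference(rows, d)
print(true_xyz, ref_precision, n_rows)
```

The per-axis median keeps a single outlier day from shifting the reference coordinate, which a mean would not.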

Algorithm

Step 1 — Parse reference coordinate

Text Only
true_xyz = SINEX(station, epoch)   or   F5_median(date ± 7 days)

Step 2 — Per-epoch absolute error

The test file is parsed epoch-by-epoch. For each epoch:

Text Only
test_xyz = blh2xyz(test_lat, test_lon, test_h)   # WGS84 → ECEF
dx       = test_xyz − true_xyz
enu      = xyz2enu(dx, true_lat, true_lon)        # → local ENU [m]

Unlike Tier 1, the ENU origin is fixed at the single true coordinate.

Step 3 — Error distribution

Text Only
2D horizontal error = sqrt(E² + N²)   per epoch
3D error            = sqrt(E² + N² + U²)

1σ  (68th percentile)
95% (95th percentile)
RMS, mean, max
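These statistics can be sketched with the standard library alone. The function names and dictionary keys are illustrative, the percentile convention (linear interpolation) is an assumption, and the error series is synthetic.

```python
import math


def percentile(values, pct):
    """Percentile by linear interpolation between sorted samples."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    rank = pct / 100.0 * (len(s) - 1)
    lo = int(math.floor(rank))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (rank - lo)


def error_stats(enu_errors, use_3d=False):
    """Summarise per-epoch errors: 2D horizontal by default, 3D on request."""
    if use_3d:
        err = [math.sqrt(e * e + n * n + u * u) for e, n, u in enu_errors]
    else:
        err = [math.hypot(e, n) for e, n, _ in enu_errors]
    return {
        "sigma1": percentile(err, 68.0),
        "p95": percentile(err, 95.0),
        "rms": math.sqrt(sum(x * x for x in err) / len(err)),
        "mean": sum(err) / len(err),
        "max": max(err),
    }


# Synthetic series: East error grows from 0 to 10 cm in 1 mm steps.
enu_errors = [(i / 1000.0, 0.0, 0.0) for i in range(101)]
stats = error_stats(enu_errors)
print(stats)
```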

Step 4 — Pass/Fail

Each metric is evaluated independently:

Text Only
PASS if: metric < tolerance   OR   metric < ref_precision

A test passes when both 1σ and 95% criteria pass.
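A sketch of the rule, with hypothetical function names and made-up numbers. It shows the case where a metric exceeds the tolerance but still passes because it lies below the precision of the reference itself.

```python
def metric_passes(metric, tolerance, ref_precision=0.0):
    """A metric passes if it beats the tolerance OR the reference precision."""
    return metric < tolerance or metric < ref_precision


def tier2_passes(sigma1, p95, tolerance, ref_precision=0.0):
    """Both the 1-sigma and the 95% metric must pass independently."""
    return (metric_passes(sigma1, tolerance, ref_precision)
            and metric_passes(p95, tolerance, ref_precision))


# 95% error of 8 mm misses a 5 mm tolerance but is below a 10 mm reference precision.
with_ref = tier2_passes(0.004, 0.008, tolerance=0.005, ref_precision=0.010)

# Without a usable reference precision the same result fails.
without_ref = tier2_passes(0.004, 0.008, tolerance=0.005)
print(with_ref, without_ref)
```

The OR clause prevents a test from failing over errors that the reference coordinate cannot resolve.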

NMEA height recovery

NMEA GGA contains two height fields:

| Field | Index | Content |
| --- | --- | --- |
| MSL altitude | 9 | Orthometric height above geoid [m] |
| Geoid separation | 11 | Undulation N from embedded geoid model [m] |

Ellipsoidal height is recovered as h_ell = field[9] + field[11]. MRTKLIB's outnmea_gga() always populates both fields via geoidh(), so 3D comparison is fully valid for MRTKLIB-generated GGA files.

If field[11] is absent or zero (some third-party receivers omit it), the script emits a warning. The ellipsoidal height cannot be recovered in that case, so the 2D horizontal comparison is the more reliable one. Pass/fail is evaluated on the 2D horizontal error by default; use --use-3d to evaluate it on the 3D error instead.
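The height recovery can be sketched as below, with field indices counted from the `$..GGA` talker field at index 0. The function name is hypothetical, the sample sentences are made up, and the `*00` checksum is a placeholder that the sketch does not validate.

```python
def gga_ellipsoidal_height(sentence):
    """Return (h_ell, has_separation) from one GGA sentence."""
    f = sentence.split("*")[0].split(",")
    msl = float(f[9])
    # Treat a missing, empty, or zero separation as "not available".
    has_sep = len(f) > 11 and f[11] != "" and float(f[11]) != 0.0
    sep = float(f[11]) if has_sep else 0.0
    return msl + sep, has_sep


full = "$GPGGA,000000.00,3600.0000000,N,14000.0000000,E,4,12,0.8,25.100,M,40.200,M,1.0,0000*00"
bare = "$GPGGA,000000.00,3600.0000000,N,14000.0000000,E,4,12,0.8,25.100,M,,M,,*00"

h_full, ok_full = gga_ellipsoidal_height(full)
h_bare, ok_bare = gga_ellipsoidal_height(bare)
print(h_full, ok_full, h_bare, ok_bare)
```

When the separation is unavailable, the returned value is only the MSL altitude, so a caller would want to warn and fall back to the horizontal comparison.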

Tier 2 CTest entries

| Test | Reference | Tolerance | Metric | Notes |
| --- | --- | --- | --- | --- |
| madocalib_pppar_abs_check | IGS SINEX MIZU (week 2383) | 0.100 m | 2D horiz | skip-epochs=60, --use-2d |
| claslib_ppp_rtk_2ch_abs_check | GSI F5 TSUKUBA3 2025/06/06 | 0.300 m | 2D horiz | ±7-day median; 88% fix rate |

Reference data files:

- tests/data/madocalib/IGS0OPSSNX_20252500000_07D_07D_SOL.SNX.gz
- tests/data/claslib/960627.25.pos


Tier 3 — Precision (Position Scatter)

Algorithm

No external reference coordinate is required. The session centroid is computed from all accepted epochs, and each epoch's deviation from the centroid is projected into local ENU.

Step 1 — Parse and filter

Text Only
rows = parse_pos(file) | parse_nmea(file)
rows = rows[skip_epochs:]          # discard convergence transient
if fix_only:
    rows = [r for r in rows if r.Q in valid_fix_qs]

Step 2 — ECEF centroid and ENU deviations

Text Only
ecef      = [blh2xyz(lat, lon, h) for each row]
centroid  = mean(ecef)
enu_devs  = [xyz2enu(e − centroid, c_lat, c_lon) for each e]   # c_lat, c_lon: geodetic lat/lon of the centroid
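A self-contained sketch of this step, assuming WGS84 constants and a simple iterative ECEF-to-geodetic conversion to obtain the centroid's latitude and longitude. `xyz2blh` is a hypothetical helper name, the iteration is not robust near the poles, and the two input rows are made up.

```python
import math

A = 6378137.0                    # WGS84 semi-major axis [m]
F = 1.0 / 298.257223563          # WGS84 flattening
E2 = F * (2.0 - F)               # first eccentricity squared


def blh2xyz(lat_deg, lon_deg, h):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    return ((n + h) * math.cos(lat) * math.cos(lon),
            (n + h) * math.cos(lat) * math.sin(lon),
            (n * (1.0 - E2) + h) * math.sin(lat))


def xyz2blh(x, y, z):
    """Iterative ECEF -> geodetic (degrees, metres); not valid at the poles."""
    lon, p = math.atan2(y, x), math.hypot(x, y)
    lat, h = math.atan2(z, p * (1.0 - E2)), 0.0
    for _ in range(10):
        n = A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - E2 * n / (n + h)))
    return math.degrees(lat), math.degrees(lon), h


def xyz2enu(dx, lat_deg, lon_deg):
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sl, cl, so, co = math.sin(lat), math.cos(lat), math.sin(lon), math.cos(lon)
    return (-so * dx[0] + co * dx[1],
            -sl * co * dx[0] - sl * so * dx[1] + cl * dx[2],
            cl * co * dx[0] + cl * so * dx[1] + sl * dx[2])


# Two made-up epochs that differ only in height by 2 m.
rows = [(36.0, 140.0, 50.0), (36.0, 140.0, 52.0)]
ecef = [blh2xyz(*r) for r in rows]
centroid = tuple(sum(p[i] for p in ecef) / len(ecef) for i in range(3))
c_lat, c_lon, _ = xyz2blh(*centroid)
enu_devs = [xyz2enu(tuple(a - c for a, c in zip(p, centroid)), c_lat, c_lon) for p in ecef]
print(enu_devs)
```

With these inputs the centroid lies midway between the two points, so the deviations come out as roughly -1 m and +1 m in Up.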

Step 3 — Scatter statistics

Text Only
horiz = sqrt(E² + N²)    per epoch
1σ    = 68th percentile of horiz
95%   = 95th percentile of horiz

Step 4 — Pass/Fail

Text Only
PASS if: 1σ < tolerance  AND  95% < tolerance
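The filter and the final check can be sketched together as follows. Function names are hypothetical, rows are assumed to be (lat, lon, h, Q) tuples, the percentile convention is an assumption, and all values are synthetic.

```python
import math


def filter_rows(rows, skip_epochs=0, fix_only=False, valid_fix_qs=(1, 6)):
    """Drop the convergence transient, then optionally keep fixed epochs only."""
    rows = rows[skip_epochs:]
    if fix_only:
        rows = [r for r in rows if r[3] in valid_fix_qs]
    return rows


def percentile(values, pct):
    """Percentile by linear interpolation between sorted samples."""
    s = sorted(values)
    if len(s) == 1:
        return s[0]
    rank = pct / 100.0 * (len(s) - 1)
    lo = int(math.floor(rank))
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (rank - lo)


def scatter_passes(enu_devs, tolerance):
    """Return (passed, sigma1, p95) for the horizontal scatter."""
    horiz = [math.hypot(e, n) for e, n, _ in enu_devs]
    sigma1, p95 = percentile(horiz, 68.0), percentile(horiz, 95.0)
    return sigma1 < tolerance and p95 < tolerance, sigma1, p95


# Five synthetic epochs; the first two are skipped, then float epochs (Q = 2) are dropped.
rows = [(36.0, 140.0, 50.0, q) for q in (2, 2, 1, 2, 1)]
kept = filter_rows(rows, skip_epochs=2, fix_only=True)

# Synthetic deviations from 0 to 10 cm, checked against two tolerances.
enu_devs = [(i / 1000.0, 0.0, 0.0) for i in range(101)]
loose = scatter_passes(enu_devs, 0.150)
tight = scatter_passes(enu_devs, 0.080)
print(len(kept), loose, tight)
```

Because one tolerance applies to both percentiles, the 95% value is in practice the binding criterion.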

Tier 3 CTest entries

| Test | Input | Tolerance | skip-epochs | Notes |
| --- | --- | --- | --- | --- |
| madocalib_ppp_scatter | out_madocalib_ppp.pos | 0.150 m | 30 | MIZU, MADOCA-PPP; 1σ=7.4 cm 95%=11.2 cm |
| claslib_ppp_rtk_2ch_scatter | out_claslib_ppp_rtk_2ch.nmea | 0.400 m | 20 | 0627, CLAS 2CH; 1σ=5.8 cm 95%=25.2 cm |

The claslib_ppp_rtk_2ch_scatter tolerance is generous because the 2CH dataset achieves ~88% fix rate; the remaining float epochs produce ~25 cm 95th-percentile scatter even after filtering with --fix-only.

Script: scripts/tests/check_pos_scatter.py

Supported formats: .pos (RTKLIB, Q=1/6 for fix-only) and .nmea (GGA, Q=1/4 for fix-only).


What This Does NOT Measure

| Aspect | Status | How to measure instead |
| --- | --- | --- |
| Absolute position accuracy vs surveyed truth | Tier 2 (partial) | madocalib_pppar_abs_check, claslib_ppp_rtk_2ch_abs_check |
| Real-time latency and throughput | Not tested | rtkrcv_rt checks line count only |
| Receiver hardware diversity | Not tested | Run against data from different receiver types |
| Long-term stability (days/weeks) | Not tested | Extend test data time span |

Tier 1 validates algorithmic equivalence to upstream as a proxy for correctness. Tier 2 validates geodetic accuracy against independent ground-truth coordinates for selected test sites. Tier 3 validates precision (repeatability) by measuring solution scatter around the session centroid, without requiring any external reference.