MRTKLIB Test Accuracy Methodology¶
Overview¶
MRTKLIB has three complementary test tiers:
| Tier | What is measured | Pass criterion |
|---|---|---|
| Tier 1 — Relative (porting correctness) | MRTKLIB output vs upstream output (same input) | 3D RMS < tolerance; fix-rate delta ≥ threshold |
| Tier 2 — Absolute (geodetic accuracy) | MRTKLIB output vs surveyed ground truth (SINEX / GSI F5) | 1σ and 95% < tolerance or < ref precision |
| Tier 3 — Precision (position scatter) | Spread of solutions around the session centroid | 1σ and 95% < tolerance (no external reference) |
Tier 1: MRTKLIB output vs upstream output (porting correctness)
Tier 2: MRTKLIB output vs known station coord (geodetic accuracy)
Tier 3: MRTKLIB output vs session centroid (precision / scatter)
Reference File Generation¶
All reference files are pre-computed from upstream tools and committed to the repository. They are not regenerated at test time.
| Test group | Reference files | Generated by |
|---|---|---|
| SPP / PPP / PPP-AR | tests/data/madocalib/*.pos | upstream MADOCALIB (LAPACK build) |
| PPP-RTK / VRS-RTK | tests/data/claslib/ref_*.nmea | upstream claslib |
| SPP (MALIB) | tests/data/malib/*.pos | upstream MALIB |
MADOCALIB PPP-AR reference¶
The PPP-AR and PPP-AR+iono reference .pos files (pppar.pos, pppar_ion.pos) are generated from upstream MADOCALIB built with -DLAPACK -framework Accelerate (macOS Accelerate framework), matching the solver used by MRTKLIB. See release-notes-v0.3.1.md for background.
claslib PPP-RTK / VRS-RTK reference¶
Reference files contain NMEA GGA sentences only. Regenerate with:
Comparison Method¶
Step 1 — Time-key matching¶
Both files (reference and MRTKLIB output) are parsed into dictionaries keyed by GPS time string. Only common epochs (epochs present in both files) are used.
ref_data = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
test_data = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
common = sorted(ref_data.keys() & test_data.keys())
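The matching step above can be sketched in a few lines of Python. This is a minimal illustration, not the code in the comparison scripts: `parse_pos_lines` is a hypothetical helper, and it assumes the default RTKLIB `.pos` column order (date, time, lat, lon, height, Q) with `%` header lines.

```python
def parse_pos_lines(lines):
    """Parse RTKLIB-style .pos lines into {time_key: (lat, lon, h, Q)}.

    Assumed layout: date, time, lat, lon, height, Q, ...
    Header lines start with '%' and are skipped.
    """
    data = {}
    for line in lines:
        if not line.strip() or line.startswith("%"):
            continue
        parts = line.split()
        key = f"{parts[0]} {parts[1]}"  # e.g. "2025/04/01 00:00:00.000"
        data[key] = (float(parts[2]), float(parts[3]),
                     float(parts[4]), int(parts[5]))
    return data


ref_lines = [
    "%  GPST  latitude(deg) longitude(deg) height(m) Q ns",
    "2025/04/01 00:00:00.000 36.000000000 140.000000000 50.0000 1 12",
    "2025/04/01 00:00:01.000 36.000000000 140.000000000 50.0000 1 12",
]
test_lines = [
    "2025/04/01 00:00:01.000 36.000000001 140.000000000 50.0100 1 12",
    "2025/04/01 00:00:02.000 36.000000001 140.000000000 50.0100 2 12",
]

ref_data = parse_pos_lines(ref_lines)
test_data = parse_pos_lines(test_lines)

# Only epochs present in both files survive the intersection.
common = sorted(ref_data.keys() & test_data.keys())
print(common)
```

Because the keys are zero-padded `YYYY/MM/DD HH:MM:SS.sss` strings, a plain string sort is also a chronological sort.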
Step 2 — Per-epoch ENU error¶
For each common epoch, the two coordinates are differenced in ECEF and projected into the local ENU frame at the reference position:
ref_xyz = blh2xyz(ref_lat, ref_lon, ref_h) # WGS84 → ECEF
test_xyz = blh2xyz(test_lat, test_lon, test_h)
dx = test_xyz − ref_xyz # ECEF difference
enu = xyz2enu(dx, ref_lat, ref_lon) # → local ENU [m]
The ENU origin shifts epoch-by-epoch with the reference position.
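For readers who want to reproduce the projection, the following is a self-contained sketch of the two helpers named above. The function names follow the pseudocode, but the bodies are an assumption: they use the standard WGS84 constants and the usual ECEF-to-ENU rotation, and may differ in detail from the scripts.

```python
import math

WGS84_A = 6378137.0              # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563    # flattening


def blh2xyz(lat_deg, lon_deg, h):
    """Geodetic latitude/longitude [deg] and ellipsoidal height [m] to ECEF [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - e2) + h) * math.sin(lat)
    return (x, y, z)


def xyz2enu(dx, lat_deg, lon_deg):
    """Rotate an ECEF difference vector into local ENU at (lat, lon)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sp, cp = math.sin(lat), math.cos(lat)
    sl, cl = math.sin(lon), math.cos(lon)
    e = -sl * dx[0] + cl * dx[1]
    n = -sp * cl * dx[0] - sp * sl * dx[1] + cp * dx[2]
    u = cp * cl * dx[0] + cp * sl * dx[1] + sp * dx[2]
    return (e, n, u)


def enu_error(ref_blh, test_blh):
    """ENU error of test relative to ref, with the origin at the ref position."""
    ref_xyz = blh2xyz(*ref_blh)
    test_xyz = blh2xyz(*test_blh)
    dx = tuple(t - r for t, r in zip(test_xyz, ref_xyz))
    return xyz2enu(dx, ref_blh[0], ref_blh[1])


# Same lat/lon, test solution 1 m higher: the error should be purely "up".
enu = enu_error((36.0, 140.0, 50.0), (36.0, 140.0, 51.0))
print(enu)
```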
Step 3 — 3D RMS and fix-rate delta¶
3D error per epoch = ||enu||
3D RMS = sqrt( mean( 3D_error² ) )
fix rate = fraction of epochs with Q = 1 (fix) or Q = 6 (PPP)
fix_delta = test_fix_rate − ref_fix_rate
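As a rough sketch, the statistics above and the resulting pass/fail decision could be computed as follows. The helper names, the sample values, and the choice to express the fix-rate threshold in percentage points are assumptions made for illustration.

```python
import math

FIX_QS = {1, 6}  # Q = 1 (fix) or Q = 6 (PPP), as stated above


def rms_3d(enu_errors):
    """3D RMS of a list of (E, N, U) error tuples [m]."""
    sq = [e * e + n * n + u * u for e, n, u in enu_errors]
    return math.sqrt(sum(sq) / len(sq))


def fix_rate(qs, fix_qs=FIX_QS):
    """Fraction of epochs whose quality flag counts as fixed."""
    return sum(1 for q in qs if q in fix_qs) / len(qs)


# Two epochs, each with a 5 mm 3D error.
enu_errors = [(0.003, 0.004, 0.0), (0.0, 0.0, 0.005)]
rms = rms_3d(enu_errors)

ref_rate = fix_rate([1, 1, 1, 2])    # 75% fixed
test_rate = fix_rate([1, 1, 2, 2])   # 50% fixed
fix_delta = test_rate - ref_rate     # -0.25, i.e. -25 percentage points

# Both criteria must hold (tolerance and threshold are example values).
tolerance_m = 0.008
min_fix_delta_pct = -1.0
passed = rms < tolerance_m and fix_delta * 100.0 >= min_fix_delta_pct
print(rms, fix_delta, passed)
```

In this example the RMS is within tolerance, but the fix rate dropped by 25 percentage points, so the check fails.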
Step 4 — Pass/Fail criteria¶
A test passes when both of the following hold:
| Criterion | Threshold |
|---|---|
| 3D RMS < tolerance | see table below |
| fix_delta ≥ −X% | −1.0% for PPP/PPP-AR, −5.0% for PPP-RTK/VRS-RTK |
Tolerance Values¶
Tolerances encode the expected residual difference between MRTKLIB and upstream after accounting for known numerical divergence sources.
| Test | Tolerance | Actual RMS | Margin | Notes |
|---|---|---|---|---|
| madocalib_ppp_check | 0.005 m | < 0.5 cm | ~50% | Deterministic |
| madocalib_pppar_check (LAPACK) | 0.008 m | 0.41 cm | ~50% | pppiono_t heap vs embed |
| madocalib_pppar_check (no LAPACK) | 0.020 m | ~1.5 cm | ~25% | LU vs LAPACK divergence |
| madocalib_pppar_ion_check (LAPACK) | 0.005 m | 0.25 cm | ~50% | Same root cause |
| madocalib_pppar_ion_check (no LAPACK) | 0.040 m | ~3.8 cm | ~5% | LU vs LAPACK divergence |
| claslib_ppp_rtk_check | 0.10 m | 5.9 cm | ~40% | RTK convergence variability |
| claslib_vrs_rtk_check | 0.10 m | 3.3 cm | ~67% | RTK convergence variability |
| claslib_ppp_rtk_st12_check | 0.15 m | 10.8 cm | ~28% | Fewer ST12 messages |
| dual-channel tests | 0.20 m | — | — | Float-only, fix rate skipped |
LAPACK conditional: when CMake detects LAPACK_FOUND=FALSE, the madocalib_pppar_*_check tests fall back to the wider (no-LAPACK) tolerances automatically. See CMakeLists.txt lines near _PPPAR_TOL.
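The Margin column in the tolerance table is not defined in the text. Most tabulated values are consistent with (tolerance − actual RMS) / tolerance, so the sketch below assumes that definition; `margin` is a hypothetical helper, not part of the test scripts.

```python
def margin(tolerance_m, actual_rms_m):
    """Fraction of the tolerance left unused (assumed definition)."""
    return (tolerance_m - actual_rms_m) / tolerance_m


# claslib_vrs_rtk_check: 0.10 m tolerance, 3.3 cm actual RMS
m_vrs = margin(0.10, 0.033)

# madocalib_pppar_ion_check (no LAPACK): 0.040 m tolerance, ~3.8 cm actual RMS
m_ion = margin(0.040, 0.038)

print(f"{m_vrs:.0%} {m_ion:.0%}")
```

Under this reading, a small margin such as the no-LAPACK iono case means the test is close to its limit, and a modest numerical change could push it over.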
Comparison Scripts¶
Tier 1 — Relative scripts¶
| Script | Input format | Used for |
|---|---|---|
| scripts/tests/compare_pos.py | RTKLIB .pos (lat/lon/h/Q per epoch) | PPP, PPP-AR |
| scripts/tests/compare_nmea.py | NMEA GGA sentences | PPP-RTK, VRS-RTK |
Both scripts implement the same algorithm (time-key matching → ENU error → 3D RMS + fix-rate delta) and share the same pass/fail logic. The only difference is the input parser.
Tier 2 — Absolute scripts¶
| Script | Input format | Used for |
|---|---|---|
| scripts/tests/compare_pos_abs.py | RTKLIB .pos vs SINEX or GSI F5 | PPP-AR absolute check |
| scripts/tests/compare_nmea_abs.py | NMEA GGA vs SINEX or GSI F5 | PPP-RTK absolute check |
Both scripts share the same reference-parsing helpers and pass/fail logic (imported from compare_pos_abs).
Tier 2 — Absolute Accuracy Tests¶
Reference coordinate sources¶
| Source | Format | Precision | When to use |
|---|---|---|---|
| IGS SINEX | .SNX or .SNX.gz | ~0.5–2 mm formal σ | IGS network stations (e.g., MIZU) |
| GSI F5 | Daily ECEF + geodetic | ~5–10 mm scatter | GEONET stations in Japan |
IGS SINEX¶
Parsed from the +SOLUTION/ESTIMATE block:
- STAX / STAY / STAZ: position in metres at the reference epoch
- VELX / VELY / VELZ: velocity in m/yr (if present; used for propagation)
- Reference epoch encoded as YY:DOY:SOD
Formal reference precision = σ₃D = √(σ_X² + σ_Y² + σ_Z²).
Epoch propagation (optional --epoch YYYY/MM/DD):
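A plausible form of this propagation is a linear model, X(t) = X(t0) + V · (t − t0), using the velocity from the estimate block. The sketch below is an assumption about how this could be done, not a copy of the script: the helper names are hypothetical, a year is taken as 365.25 days, and two-digit years are mapped with the SINEX convention (00 to 50 means 20xx, otherwise 19xx).

```python
from datetime import datetime, timedelta


def parse_sinex_epoch(text):
    """Parse 'YY:DOY:SOD' (or 'YYYY:DOY:SOD') into a datetime."""
    yy, doy, sod = (int(p) for p in text.split(":"))
    if yy > 99:
        year = yy                       # already a four-digit year
    else:
        year = 2000 + yy if yy <= 50 else 1900 + yy
    return datetime(year, 1, 1) + timedelta(days=doy - 1, seconds=sod)


def propagate(pos, vel, ref_epoch, target_epoch):
    """Linearly propagate ECEF position [m] with velocity [m/yr]."""
    dt_years = (target_epoch - ref_epoch).total_seconds() / (365.25 * 86400.0)
    return tuple(p + v * dt_years for p, v in zip(pos, vel))


ref_epoch = parse_sinex_epoch("25:001:00000")        # 2025-01-01 00:00:00
target_epoch = ref_epoch + timedelta(days=365.25)    # exactly one assumed year later

# Placeholder coordinates and velocities, chosen only to make the arithmetic visible.
new_pos = propagate((1.0, 2.0, 3.0), (0.01, -0.02, 0.0), ref_epoch, target_epoch)
print(new_pos)
```

When no velocity rows are present, the reference coordinate would simply be used unchanged, which is equivalent to passing a zero velocity here.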
GSI F5¶
Daily coordinate file in ITRF2014/GRS80 with noon UTC positions.
15-day median for evaluation date d:
window = rows with |date − d| ≤ 7 days (up to 15 rows)
true_X = median(window_X)
true_Y = median(window_Y)
true_Z = median(window_Z)
Reference precision = 68th-percentile of the daily 3D scatter within the window relative to the median.
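The windowed median and its precision estimate can be sketched as below. The F5 file parsing is omitted; `rows` is assumed to be a list of `(date, (X, Y, Z))` tuples already read from the file, and the percentile uses linear interpolation, which is an assumption about the exact definition.

```python
import math
from datetime import date
from statistics import median


def percentile(values, pct):
    """Percentile with linear interpolation between sorted samples."""
    s = sorted(values)
    k = (len(s) - 1) * pct / 100.0
    lo = math.floor(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def f5_reference(rows, eval_date, half_window_days=7):
    """Return (median ECEF, 68th-percentile 3D scatter) for the window."""
    window = [xyz for d, xyz in rows
              if abs((d - eval_date).days) <= half_window_days]
    if not window:
        raise ValueError("no F5 rows within the window")
    # Per-axis median, as in the definition above.
    true_xyz = tuple(median(axis) for axis in zip(*window))
    scatter = [math.dist(xyz, true_xyz) for xyz in window]
    return true_xyz, percentile(scatter, 68.0)


# Synthetic daily series drifting 1 mm/day in X (illustrative values only).
rows = [(date(2025, 6, d), (100.0 + 0.001 * d, 200.0, 300.0))
        for d in range(1, 31)]
true_xyz, precision = f5_reference(rows, date(2025, 6, 15))
print(true_xyz, precision)
```

With the evaluation date on 15 June, the window covers 8 to 22 June (15 rows), so the median X sits at the value for 15 June.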
Algorithm¶
Step 1 — Parse reference coordinate¶
Step 2 — Per-epoch absolute error¶
The test file is parsed epoch-by-epoch. For each epoch:
test_xyz = blh2xyz(test_lat, test_lon, test_h) # WGS84 → ECEF
dx = test_xyz − true_xyz
enu = xyz2enu(dx, true_lat, true_lon) # → local ENU [m]
Unlike Tier 1, the ENU origin is fixed at the single true coordinate.
Step 3 — Error distribution¶
2D horizontal error = sqrt(E² + N²) per epoch
3D error = sqrt(E² + N² + U²)
1σ (68th percentile)
95% (95th percentile)
RMS, mean, max
Step 4 — Pass/Fail¶
Each metric is evaluated independently:
A test passes when both 1σ and 95% criteria pass.
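Steps 3 and 4 can be sketched together as follows. This is a simplified illustration with hypothetical helper names: it compares both percentiles against a single tolerance and does not model the alternative comparison against the reference precision mentioned in the overview table. The percentile interpolation rule is also an assumption.

```python
import math


def percentile(values, pct):
    """Percentile with linear interpolation between sorted samples."""
    s = sorted(values)
    k = (len(s) - 1) * pct / 100.0
    lo = math.floor(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def error_stats(enu_errors, use_3d=False):
    """Summarise per-epoch errors from (E, N, U) tuples [m]."""
    if use_3d:
        errs = [math.sqrt(e * e + n * n + u * u) for e, n, u in enu_errors]
    else:
        errs = [math.sqrt(e * e + n * n) for e, n, _ in enu_errors]
    return {
        "sigma1": percentile(errs, 68.0),
        "p95": percentile(errs, 95.0),
        "rms": math.sqrt(sum(x * x for x in errs) / len(errs)),
        "mean": sum(errs) / len(errs),
        "max": max(errs),
    }


def passes(stats, tolerance_m):
    """Both the 1-sigma and the 95% metric must be below the tolerance."""
    return stats["sigma1"] < tolerance_m and stats["p95"] < tolerance_m


# Four epochs with a 5 cm horizontal error and one with only a vertical error.
enu_errors = [(0.03, 0.04, 0.0)] * 4 + [(0.0, 0.0, 0.12)]
stats_2d = error_stats(enu_errors)
stats_3d = error_stats(enu_errors, use_3d=True)
print(passes(stats_2d, 0.100), passes(stats_3d, 0.100))
```

The example shows why the metric choice matters: the 12 cm vertical outlier is invisible to the 2D check but pushes the 3D 95th percentile over a 0.100 m tolerance.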
NMEA height recovery¶
NMEA GGA contains two height fields:
| Field | Index | Content |
|---|---|---|
| MSL altitude | 9 | Orthometric height above geoid [m] |
| Geoid separation | 11 | Undulation N from embedded geoid model [m] |
Ellipsoidal height is recovered as h_ell = field[9] + field[11]. MRTKLIB's outnmea_gga() always populates both fields via geoidh(), so 3D comparison is fully valid for MRTKLIB-generated GGA files.
If field[11] is absent or zero (some third-party receivers omit it), the script emits a warning; in that case the 2D horizontal comparison is the more reliable one. Pass/fail is evaluated on 2D horizontal error by default; use --use-3d to evaluate it on 3D error instead.
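The height recovery described above amounts to adding two comma-separated fields. The sketch below is illustrative only: `gga_ellipsoidal_height` is a hypothetical helper, the sample sentence is made up, and its checksum is a placeholder that the code does not validate.

```python
def gga_ellipsoidal_height(sentence):
    """Return (h_ell [m], has_geoid_sep) from one NMEA GGA sentence."""
    body = sentence.strip().split("*")[0]   # drop the checksum part
    fields = body.split(",")
    if not fields[0].endswith("GGA"):
        raise ValueError("not a GGA sentence")
    msl = float(fields[9])                  # field 9: MSL altitude
    sep_text = fields[11] if len(fields) > 11 else ""
    geoid_sep = float(sep_text) if sep_text else 0.0   # field 11: geoid separation
    # An absent or zero separation means h_ell is only the MSL altitude.
    has_geoid_sep = geoid_sep != 0.0
    return msl + geoid_sep, has_geoid_sep


# Made-up sentence; "*00" is a placeholder checksum, not a computed one.
gga = ("$GPGGA,000000.00,3600.0000000,N,14000.0000000,E,"
       "4,12,0.8,25.300,M,39.500,M,1.0,0000*00")
h_ell, has_sep = gga_ellipsoidal_height(gga)

# Same sentence with the separation field left empty.
gga_no_sep = ("$GPGGA,000000.00,3600.0000000,N,14000.0000000,E,"
              "4,12,0.8,25.300,M,,M,1.0,0000*00")
h_msl_only, has_sep2 = gga_ellipsoidal_height(gga_no_sep)
print(h_ell, has_sep, h_msl_only, has_sep2)
```

A caller could use the returned flag to decide whether to warn and fall back to the 2D horizontal metric.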
Tier 2 CTest entries¶
| Test | Reference | Tolerance | Metric | Notes |
|---|---|---|---|---|
| madocalib_pppar_abs_check | IGS SINEX MIZU (week 2383) | 0.100 m | 2D horiz | skip-epochs=60, --use-2d |
| claslib_ppp_rtk_2ch_abs_check | GSI F5 TSUKUBA3 2025/06/06 | 0.300 m | 2D horiz | ±7-day median; 88% fix rate |
Reference data files:
- tests/data/madocalib/IGS0OPSSNX_20252500000_07D_07D_SOL.SNX.gz
- tests/data/claslib/960627.25.pos
Tier 3 — Precision (Position Scatter)¶
Algorithm¶
No external reference coordinate is required. The session centroid is computed from all accepted epochs, and each epoch's deviation from the centroid is projected into local ENU.
Step 1 — Parse and filter¶
rows = parse_pos(file) | parse_nmea(file)
rows = rows[skip_epochs:] # discard convergence transient
if fix_only:
    rows = [r for r in rows if r.Q in valid_fix_qs]
Step 2 — ECEF centroid and ENU deviations¶
ecef = [blh2xyz(lat, lon, h) for each row]
centroid = mean(ecef)
c_lat, c_lon = geodetic latitude/longitude of centroid
enu_devs = [xyz2enu(e − centroid, c_lat, c_lon) for each e]
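A self-contained sketch of the centroid step is shown below. The function names follow the pseudocode, but the bodies are assumptions: standard WGS84 constants, a simple fixed-point iteration to get the centroid's geodetic latitude (`xyz2latlon` is a hypothetical helper), and the usual ECEF-to-ENU rotation.

```python
import math

WGS84_A = 6378137.0
WGS84_F = 1.0 / 298.257223563
E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def blh2xyz(lat_deg, lon_deg, h):
    """Geodetic coordinates to ECEF [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
    return ((n + h) * math.cos(lat) * math.cos(lon),
            (n + h) * math.cos(lat) * math.sin(lon),
            (n * (1.0 - E2) + h) * math.sin(lat))


def xyz2latlon(xyz):
    """ECEF to geodetic latitude/longitude [deg] by fixed-point iteration."""
    x, y, z = xyz
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - E2))          # exact for h = 0
    for _ in range(10):
        n = WGS84_A / math.sqrt(1.0 - E2 * math.sin(lat) ** 2)
        lat = math.atan2(z + E2 * n * math.sin(lat), p)
    return math.degrees(lat), math.degrees(math.atan2(y, x))


def xyz2enu(dx, lat_deg, lon_deg):
    """Rotate an ECEF difference vector into local ENU at (lat, lon)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sp, cp, sl, cl = math.sin(lat), math.cos(lat), math.sin(lon), math.cos(lon)
    return (-sl * dx[0] + cl * dx[1],
            -sp * cl * dx[0] - sp * sl * dx[1] + cp * dx[2],
            cp * cl * dx[0] + cp * sl * dx[1] + sp * dx[2])


def enu_deviations(blh_rows):
    """ENU deviation of each epoch from the ECEF centroid of all epochs."""
    ecef = [blh2xyz(*row) for row in blh_rows]
    centroid = tuple(sum(axis) / len(ecef) for axis in zip(*ecef))
    c_lat, c_lon = xyz2latlon(centroid)
    return [xyz2enu(tuple(a - c for a, c in zip(e, centroid)), c_lat, c_lon)
            for e in ecef]


# Two epochs at the same lat/lon, 2 m apart in height: deviations are -1 m and +1 m up.
devs = enu_deviations([(36.0, 140.0, 50.0), (36.0, 140.0, 52.0)])
print(devs)
```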
Step 3 — Scatter statistics¶
horiz = sqrt(E² + N²) per epoch
1σ = 68th percentile of horiz
95% = 95th percentile of horiz
Step 4 — Pass/Fail¶
Tier 3 CTest entries¶
| Test | Input | Tolerance | skip-epochs | Notes |
|---|---|---|---|---|
| madocalib_ppp_scatter | out_madocalib_ppp.pos | 0.150 m | 30 | MIZU, MADOCA-PPP; 1σ=7.4 cm 95%=11.2 cm |
| claslib_ppp_rtk_2ch_scatter | out_claslib_ppp_rtk_2ch.nmea | 0.400 m | 20 | 0627, CLAS 2CH; 1σ=5.8 cm 95%=25.2 cm |
The claslib_ppp_rtk_2ch_scatter tolerance is generous because the 2CH dataset achieves ~88% fix rate; the remaining float epochs produce ~25 cm 95th-percentile scatter even after filtering with --fix-only.
Script: scripts/tests/check_pos_scatter.py
Supported formats: .pos (RTKLIB, Q=1/6 for fix-only) and .nmea (GGA, Q=1/4 for fix-only).
What This Does NOT Measure¶
| Aspect | Status | How to measure instead |
|---|---|---|
| Absolute position accuracy vs surveyed truth | Tier 2 (partial) | madocalib_pppar_abs_check, claslib_ppp_rtk_2ch_abs_check |
| Real-time latency and throughput | Not tested | rtkrcv_rt checks line count only |
| Receiver hardware diversity | Not tested | Run against data from different receiver types |
| Long-term stability (days/weeks) | Not tested | Extend test data time span |
Tier 1 validates algorithmic equivalence to upstream as a proxy for correctness. Tier 2 validates geodetic accuracy against independent ground-truth coordinates for selected test sites. Tier 3 validates precision (repeatability) by measuring solution scatter around the session centroid, without requiring any external reference.