MRTKLIB Test Accuracy Methodology

Overview

MRTKLIB has three complementary test tiers:

Tier	What is measured	Pass criterion
Tier 1 — Relative (porting correctness)	MRTKLIB output vs upstream output (same input)	3D RMS < tolerance; fix-rate delta ≥ threshold
Tier 2 — Absolute (geodetic accuracy)	MRTKLIB output vs surveyed ground truth (SINEX / GSI F5)	1σ and 95% < tolerance or < ref precision
Tier 3 — Precision (position scatter)	Spread of solutions around the session centroid	1σ and 95% < tolerance (no external reference)

```text
Tier 1: MRTKLIB output  vs  upstream output       (porting correctness)
Tier 2: MRTKLIB output  vs  known station coord   (geodetic accuracy)
Tier 3: MRTKLIB output  vs  session centroid      (precision / scatter)
```

Reference File Generation

All reference files are pre-computed from upstream tools and committed to the repository. They are not regenerated at test time.

Test group	Reference files	Generated by
SPP / PPP / PPP-AR	`tests/data/madocalib/*.pos`	upstream MADOCALIB (LAPACK build)
PPP-RTK / VRS-RTK	`tests/data/claslib/ref_*.nmea`	upstream claslib
SPP (MALIB)	`tests/data/malib/*.pos`	upstream MALIB

MADOCALIB PPP-AR reference

The PPP-AR and PPP-AR+iono reference `.pos` files (`pppar.pos`, `pppar_ion.pos`) are generated from upstream MADOCALIB built with `-DLAPACK -framework Accelerate` (the macOS Accelerate framework), matching the solver used by MRTKLIB. See `release-notes-v0.3.1.md` for background.

claslib PPP-RTK / VRS-RTK reference

Reference files contain NMEA GGA sentences only. Regenerate with:

```bash
bash tests/data/claslib/generate_reference.sh
```

Comparison Method

Step 1: Time-key matching

Both files (reference and MRTKLIB output) are parsed into dictionaries keyed by GPS time string. Only common epochs (epochs present in both files) are used.

```text
ref_data  = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
test_data = { "2025/04/01 00:00:00.000" : (lat, lon, h, Q), ... }
common    = sorted(ref_data.keys() & test_data.keys())
```
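The matching step can be sketched in Python as follows. This is a minimal illustration, not the code in `compare_pos.py`: it assumes whitespace-separated `.pos` columns in the order date, time, latitude, longitude, height, Q, and `%`-prefixed header lines. The names `parse_pos_lines` and `common_epochs` are hypothetical.

```python
def parse_pos_lines(lines):
    """Parse .pos-style lines into {time_string: (lat, lon, h, Q)}.

    Assumed layout (hypothetical helper): date time lat[deg] lon[deg] h[m] Q ...
    Lines starting with '%' are treated as header comments and skipped.
    """
    data = {}
    for line in lines:
        if not line.strip() or line.startswith("%"):
            continue
        f = line.split()
        key = f"{f[0]} {f[1]}"  # e.g. "2025/04/01 00:00:00.000"
        data[key] = (float(f[2]), float(f[3]), float(f[4]), int(f[5]))
    return data


def common_epochs(ref_data, test_data):
    """Return the sorted time keys present in both dictionaries."""
    return sorted(ref_data.keys() & test_data.keys())


ref_data = parse_pos_lines([
    "% header line",
    "2025/04/01 00:00:00.000 36.0 140.0 50.0 1",
    "2025/04/01 00:00:01.000 36.0 140.0 50.0 1",
])
test_data = parse_pos_lines([
    "2025/04/01 00:00:01.000 36.0 140.0 50.1 6",
    "2025/04/01 00:00:02.000 36.0 140.0 50.1 6",
])
common = common_epochs(ref_data, test_data)
```

Epochs that appear in only one file are silently dropped, so a large mismatch in epoch counts does not by itself fail the comparison.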

Step 2: Per-epoch ENU error

For each common epoch, the two coordinates are differenced in ECEF and projected into the local ENU frame at the reference position:

```text
ref_xyz  = blh2xyz(ref_lat, ref_lon, ref_h)    # WGS84 → ECEF
test_xyz = blh2xyz(test_lat, test_lon, test_h)
dx       = test_xyz − ref_xyz                   # ECEF difference
enu      = xyz2enu(dx, ref_lat, ref_lon)        # → local ENU [m]
```

The ENU origin shifts epoch-by-epoch with the reference position.
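A self-contained sketch of the two conversions, using the standard WGS84 closed-form formulas. The function names follow the pseudocode above, but the bodies are an assumption about what the helpers do, and `enu_error` is a hypothetical wrapper. Angles are taken in degrees.

```python
import math

WGS84_A = 6378137.0              # semi-major axis [m]
WGS84_F = 1.0 / 298.257223563    # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)


def blh2xyz(lat_deg, lon_deg, h):
    """Geodetic latitude/longitude [deg] and ellipsoidal height [m] to ECEF [m]."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return (x, y, z)


def xyz2enu(dx, lat_deg, lon_deg):
    """Rotate an ECEF difference vector into local East/North/Up at (lat, lon)."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    sl, cl = math.sin(lat), math.cos(lat)
    so, co = math.sin(lon), math.cos(lon)
    e = -so * dx[0] + co * dx[1]
    n = -sl * co * dx[0] - sl * so * dx[1] + cl * dx[2]
    u = cl * co * dx[0] + cl * so * dx[1] + sl * dx[2]
    return (e, n, u)


def enu_error(ref, test):
    """ENU error of test (lat, lon, h) relative to ref (lat, lon, h)."""
    ref_xyz = blh2xyz(*ref)
    test_xyz = blh2xyz(*test)
    dx = tuple(t - r for t, r in zip(test_xyz, ref_xyz))
    return xyz2enu(dx, ref[0], ref[1])


# A pure 1 m height offset should map to Up only.
e, n, u = enu_error((36.0, 140.0, 50.0), (36.0, 140.0, 51.0))
```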

Step 3: 3D RMS and fix-rate delta

```text
3D error per epoch = ||enu||
3D RMS             = sqrt( mean( 3D_error² ) )
fix rate           = fraction of epochs with Q = 1 (fix) or Q = 6 (PPP)
fix_delta          = test_fix_rate − ref_fix_rate
```
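The statistics above reduce to a few lines of Python. This sketch assumes the fix rate is taken over whatever list of Q flags it is given (the source does not say whether all epochs or only common epochs are counted); `rms_3d` and `fix_rate` are illustrative names.

```python
import math


def rms_3d(enu_errors):
    """Root-mean-square of the per-epoch 3D error norms [m]."""
    squared = [e * e + n * n + u * u for e, n, u in enu_errors]
    return math.sqrt(sum(squared) / len(squared))


def fix_rate(q_flags, fix_qs=(1, 6)):
    """Fraction of epochs whose Q flag counts as a fix (Q = 1 or Q = 6)."""
    return sum(q in fix_qs for q in q_flags) / len(q_flags)


# Two epochs with 3D errors of 5 m each give an RMS of 5 m.
rms = rms_3d([(3.0, 4.0, 0.0), (0.0, 0.0, 5.0)])

ref_rate = fix_rate([1, 1, 1, 2])    # 0.75
test_rate = fix_rate([1, 6, 2, 5])   # 0.50
fix_delta = test_rate - ref_rate     # negative: test fixed fewer epochs
```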

Step 4: Pass/Fail criteria

A test passes when both of the following hold:

Criterion	Threshold
`3D RMS < tolerance`	see table below
`fix_delta ≥ −X%`	−1.0% for PPP/PPP-AR, −5.0% for PPP-RTK/VRS-RTK
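Expressed as code, the two criteria combine with a logical AND. The sketch below assumes `fix_delta` is expressed in percentage points; `tier1_pass` and `MIN_FIX_DELTA_PCT` are hypothetical names, with the thresholds copied from the table above.

```python
# Minimum allowed fix-rate delta per test group [percentage points].
MIN_FIX_DELTA_PCT = {
    "PPP": -1.0,
    "PPP-AR": -1.0,
    "PPP-RTK": -5.0,
    "VRS-RTK": -5.0,
}


def tier1_pass(rms_3d_m, fix_delta_pct, tolerance_m, min_fix_delta_pct):
    """True only when both the RMS and the fix-rate criteria hold."""
    return rms_3d_m < tolerance_m and fix_delta_pct >= min_fix_delta_pct


ok = tier1_pass(0.0041, -0.5, 0.008, MIN_FIX_DELTA_PCT["PPP-AR"])
fix_regressed = tier1_pass(0.0041, -1.5, 0.008, MIN_FIX_DELTA_PCT["PPP-AR"])
rms_too_large = tier1_pass(0.0090, 0.0, 0.008, MIN_FIX_DELTA_PCT["PPP-AR"])
```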

Tolerance Values

Tolerances encode the expected residual difference between MRTKLIB and upstream after accounting for known numerical divergence sources.

Test	Tolerance	Actual RMS	Margin	Notes
`madocalib_ppp_check`	0.005 m	< 0.5 cm	~50%	Deterministic
`madocalib_pppar_check` (LAPACK)	0.008 m	0.41 cm	~50%	`pppiono_t` heap vs embed
`madocalib_pppar_check` (no LAPACK)	0.020 m	~1.5 cm	~25%	LU vs LAPACK divergence
`madocalib_pppar_ion_check` (LAPACK)	0.005 m	0.25 cm	~50%	Same root cause
`madocalib_pppar_ion_check` (no LAPACK)	0.040 m	~3.8 cm	~5%	LU vs LAPACK divergence
`claslib_ppp_rtk_check`	0.10 m	5.9 cm	~40%	RTK convergence variability
`claslib_vrs_rtk_check`	0.10 m	3.3 cm	~67%	RTK convergence variability
`claslib_ppp_rtk_st12_check`	0.15 m	10.8 cm	~28%	Fewer ST12 messages
dual-channel tests	0.20 m	—	—	Float-only, fix rate skipped
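The Margin column is consistent with (tolerance − actual RMS) / tolerance. That relation is inferred from the listed values rather than stated by the source, so the following check is only a sketch; `margin_pct` is a hypothetical helper.

```python
def margin_pct(tolerance_m, actual_rms_m):
    """Headroom between the observed RMS and the tolerance, in percent."""
    return 100.0 * (tolerance_m - actual_rms_m) / tolerance_m


# (tolerance [m], actual RMS [m]) pairs taken from the table rows.
rows = {
    "madocalib_pppar_check (LAPACK)": (0.008, 0.0041),
    "madocalib_pppar_ion_check (no LAPACK)": (0.040, 0.038),
    "claslib_vrs_rtk_check": (0.10, 0.033),
    "claslib_ppp_rtk_st12_check": (0.15, 0.108),
}
margins = {name: round(margin_pct(tol, rms)) for name, (tol, rms) in rows.items()}
```

A margin near 5%, as in the no-LAPACK ionosphere case, leaves very little room for further numerical drift before the test fails.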

LAPACK conditional: when CMake detects `LAPACK_FOUND=FALSE`, the `madocalib_pppar_*_check` tests automatically fall back to the wider (no-LAPACK) tolerances. See the lines near `_PPPAR_TOL` in `CMakeLists.txt`.

Comparison Scripts

Tier 1: Relative scripts

Script	Input format	Used for
`scripts/tests/compare_pos.py`	RTKLIB `.pos` (lat/lon/h/Q per epoch)	PPP, PPP-AR
`scripts/tests/compare_nmea.py`	NMEA GGA sentences	PPP-RTK, VRS-RTK

Both scripts implement the same algorithm (time-key matching → ENU error → 3D RMS + fix-rate delta) and share the same pass/fail logic. The only difference is the input parser.

Tier 2: Absolute scripts

Script	Input format	Used for
`scripts/tests/compare_pos_abs.py`	RTKLIB `.pos` vs SINEX or GSI F5	PPP-AR absolute check
`scripts/tests/compare_nmea_abs.py`	NMEA GGA vs SINEX or GSI F5	PPP-RTK absolute check

Both scripts share the same reference-parsing helpers and pass/fail logic (imported from `compare_pos_abs`).

Tier 2: Absolute Accuracy Tests

Reference coordinate sources

Source	Format	Precision	When to use
IGS SINEX	`.SNX` or `.SNX.gz`	~0.5–2 mm formal σ	IGS network stations (e.g., MIZU)
GSI F5	Daily ECEF + geodetic	~5–10 mm scatter	GEONET stations in Japan

IGS SINEX

Parsed from the `+SOLUTION/ESTIMATE` block:

- STAX / STAY / STAZ: position in metres at the reference epoch
- VELX / VELY / VELZ: velocity in m/yr (if present; used for propagation)
- Reference epoch encoded as YY:DOY:SOD

Formal reference precision = σ₃D = √(σ_X² + σ_Y² + σ_Z²).

Epoch propagation (optional --epoch YYYY/MM/DD):

```text
pos(t) = pos(t₀) + vel · (t − t₀)      [t in years]
```
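The formal precision and the linear propagation can be sketched as below. Several details are assumptions not confirmed by the source: the two-digit year pivot (00 to 50 mapped to 2000 to 2050, the usual SINEX convention), a year length of 365.25 days, and all function names.

```python
import math
from datetime import datetime, timedelta

SECONDS_PER_YEAR = 365.25 * 86400.0  # assumed Julian year


def sinex_epoch_to_datetime(epoch):
    """Convert a 'YY:DOY:SOD' epoch string to a datetime."""
    yy, doy, sod = (int(part) for part in epoch.split(":"))
    year = 2000 + yy if yy <= 50 else 1900 + yy
    return datetime(year, 1, 1) + timedelta(days=doy - 1, seconds=sod)


def sigma_3d(sx, sy, sz):
    """Formal 3D precision from the per-axis standard deviations."""
    return math.sqrt(sx * sx + sy * sy + sz * sz)


def propagate(pos0, vel, t0, t):
    """pos(t) = pos(t0) + vel * (t - t0), with vel in m/yr."""
    dt_years = (t - t0).total_seconds() / SECONDS_PER_YEAR
    return tuple(p + v * dt_years for p, v in zip(pos0, vel))


t0 = sinex_epoch_to_datetime("25:250:43200")
# Made-up coordinates and velocities, propagated forward by exactly one year.
pos = propagate((1.0, 2.0, 3.0), (0.01, 0.0, -0.01), t0, t0 + timedelta(days=365.25))
```

When the velocity records are absent, the propagation step is simply skipped and the position at the reference epoch is used as is.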

GSI F5

Daily coordinate file in ITRF2014/GRS80 with noon UTC positions.

15-day median for evaluation date d:

```text
window = rows with |date − d| ≤ 7 days   (up to 15 rows)
true_X = median(window_X)
true_Y = median(window_Y)
true_Z = median(window_Z)
```

Reference precision = 68th-percentile of the daily 3D scatter within the window relative to the median.
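A sketch of the windowed median and its scatter-based precision, run on synthetic data. The linear-interpolation percentile is an assumed convention (the source does not say which one the scripts use), and `f5_reference` is a hypothetical name that takes already-parsed `(date, X, Y, Z)` rows.

```python
import math
from datetime import date, timedelta
from statistics import median


def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def f5_reference(rows, d, half_window_days=7):
    """Return (true_xyz, precision) for evaluation date d.

    rows: iterable of (date, X, Y, Z) in metres.
    precision: 68th percentile of the daily 3D distance to the median.
    """
    window = [r for r in rows if abs((r[0] - d).days) <= half_window_days]
    true_xyz = tuple(median(r[i] for r in window) for i in (1, 2, 3))
    scatter = [math.dist(r[1:4], true_xyz) for r in window]
    return true_xyz, percentile(scatter, 68)


# Synthetic example: 21 daily rows drifting 1 mm/day in X only.
start = date(2025, 5, 27)
rows = [(start + timedelta(days=i), 0.001 * i, 0.0, 0.0) for i in range(21)]
true_xyz, precision = f5_reference(rows, date(2025, 6, 6))
```

Using the median rather than the mean keeps a single outlying daily solution from shifting the reference coordinate.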

Algorithm

Step 1: Parse reference coordinate

```text
true_xyz = SINEX(station, epoch)   or   F5_median(date ± 7 days)
```

Step 2: Per-epoch absolute error

The test file is parsed epoch-by-epoch. For each epoch:

```text
test_xyz = blh2xyz(test_lat, test_lon, test_h)   # WGS84 → ECEF
dx       = test_xyz − true_xyz
enu      = xyz2enu(dx, true_lat, true_lon)        # → local ENU [m]
```

Unlike Tier 1, the ENU origin is fixed at the single true coordinate.

Step 3: Error distribution

```text
2D horizontal error = sqrt(E² + N²)   per epoch
3D error            = sqrt(E² + N² + U²)

1σ  (68th percentile)
95% (95th percentile)
RMS, mean, max
```

Step 4: Pass/Fail

Each metric is evaluated independently:

```text
PASS if: metric < tolerance   OR   metric < ref_precision
```

A test passes when both 1σ and 95% criteria pass.
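The distribution and pass/fail steps can be sketched together. The percentile convention (linear interpolation) and the names `error_norms`, `metric_ok`, and `tier2_pass` are assumptions for illustration only.

```python
import math


def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def error_norms(enu_errors):
    """Per-epoch 2D horizontal and 3D error magnitudes [m]."""
    horiz = [math.hypot(e, n) for e, n, _ in enu_errors]
    full = [math.sqrt(e * e + n * n + u * u) for e, n, u in enu_errors]
    return horiz, full


def metric_ok(metric, tolerance, ref_precision):
    """A metric passes if it beats the tolerance OR the reference precision."""
    return metric < tolerance or metric < ref_precision


def tier2_pass(errors, tolerance, ref_precision):
    """Both the 68th and the 95th percentile must pass."""
    return all(
        metric_ok(percentile(errors, p), tolerance, ref_precision)
        for p in (68, 95)
    )


horiz, full = error_norms([(0.03, 0.04, 0.12)])
errors = [0.01 * i for i in range(1, 101)]  # synthetic errors, 0.01 m to 1.00 m
```

The `OR ref_precision` branch means a test cannot fail merely because the solution is scattered less than the reference coordinate itself is known.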

NMEA height recovery

NMEA GGA contains two height fields:

Field	Index	Content
MSL altitude	9	Orthometric height above geoid [m]
Geoid separation	11	Undulation N from embedded geoid model [m]

Ellipsoidal height is recovered as `h_ell = field[9] + field[11]`. MRTKLIB's `outnmea_gga()` always populates both fields via `geoidh()`, so 3D comparison is fully valid for MRTKLIB-generated GGA files.

If `field[11]` is absent or zero (some third-party receivers omit it), the script emits a warning; in that case the 2D horizontal comparison is the more reliable one. Pass/fail is evaluated on 2D horizontal error by default; use `--use-3d` to evaluate it on 3D error instead.
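A minimal sketch of the height recovery, assuming comma-separated GGA fields indexed from the talker ID at position 0. The checksum is not validated here, and returning the MSL altitude unchanged when the separation is missing is an illustrative choice, not necessarily what the scripts do. `gga_ellipsoidal_height` is a hypothetical name.

```python
def gga_ellipsoidal_height(sentence):
    """Return (height_m, has_geoid_sep) for one GGA sentence.

    height_m is field[9] + field[11] when the geoid separation is present
    and non-zero; otherwise it is field[9] alone and the flag is False.
    """
    fields = sentence.split("*")[0].split(",")
    msl = float(fields[9])
    sep = fields[11] if len(fields) > 11 else ""
    if sep == "" or float(sep) == 0.0:
        return msl, False  # caller may warn and prefer the 2D comparison
    return msl + float(sep), True


# Synthetic sentences; checksums are placeholders and are not checked.
with_sep = "$GPGGA,000000.00,3606.1234,N,14005.1234,E,4,12,0.8,25.000,M,40.000,M,1.0,0000*5A"
without_sep = "$GPGGA,000000.00,3606.1234,N,14005.1234,E,1,08,1.2,25.000,M,,M,,*00"

h_full, ok_full = gga_ellipsoidal_height(with_sep)
h_msl, ok_msl = gga_ellipsoidal_height(without_sep)
```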

Tier 2 CTest entries

Test	Reference	Tolerance	Metric	Notes
`madocalib_pppar_abs_check`	IGS SINEX MIZU (week 2383)	0.100 m	2D horiz	skip-epochs=60, --use-2d
`claslib_ppp_rtk_2ch_abs_check`	GSI F5 TSUKUBA3 2025/06/06	0.300 m	2D horiz	±7-day median; 88% fix rate

Reference data files:

- `tests/data/madocalib/IGS0OPSSNX_20252500000_07D_07D_SOL.SNX.gz`
- `tests/data/claslib/960627.25.pos`

Tier 3: Precision (Position Scatter)

Algorithm

No external reference coordinate is required. The session centroid is computed from all accepted epochs, and each epoch's deviation from the centroid is projected into local ENU.

Step 1: Parse and filter

```text
rows = parse_pos(file) | parse_nmea(file)
rows = rows[skip_epochs:]          # discard convergence transient
if fix_only:
    rows = [r for r in rows if r.Q in valid_fix_qs]
```

Step 2: ECEF centroid and ENU deviations

```text
ecef      = [blh2xyz(lat, lon, h) for each row]
centroid  = mean(ecef)
# c_lat, c_lon: geodetic latitude/longitude of the centroid
enu_devs  = [xyz2enu(e − centroid, c_lat, c_lon) for each e]
```

Step 3: Scatter statistics

```text
horiz = sqrt(E² + N²)    per epoch
1σ    = 68th percentile of horiz
95%   = 95th percentile of horiz
```

Step 4: Pass/Fail

```text
PASS if: 1σ < tolerance  AND  95% < tolerance
```
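The four Tier 3 steps can be sketched end to end on ECEF input. The iterative ECEF-to-geodetic conversion, the linear-interpolation percentile, and the names `xyz2blh`, `scatter_stats`, and `tier3_pass` are assumptions for illustration and are not taken from `check_pos_scatter.py`.

```python
import math

WGS84_A = 6378137.0
WGS84_F = 1.0 / 298.257223563
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)


def percentile(values, p):
    """Percentile with linear interpolation between order statistics."""
    s = sorted(values)
    k = (len(s) - 1) * p / 100.0
    lo, hi = math.floor(k), math.ceil(k)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)


def xyz2blh(x, y, z):
    """ECEF to geodetic latitude/longitude [rad], by fixed-point iteration."""
    lon = math.atan2(y, x)
    p = math.hypot(x, y)
    lat = math.atan2(z, p * (1.0 - WGS84_E2))
    for _ in range(10):
        n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
        h = p / math.cos(lat) - n
        lat = math.atan2(z, p * (1.0 - WGS84_E2 * n / (n + h)))
    return lat, lon


def scatter_stats(ecef, skip_epochs=0):
    """Return (p68, p95) of horizontal deviation from the session centroid."""
    pts = ecef[skip_epochs:]  # discard the convergence transient
    c = tuple(sum(p[i] for p in pts) / len(pts) for i in range(3))
    lat, lon = xyz2blh(*c)
    sl, cl, so, co = math.sin(lat), math.cos(lat), math.sin(lon), math.cos(lon)
    horiz = []
    for p in pts:
        dx = [p[i] - c[i] for i in range(3)]
        east = -so * dx[0] + co * dx[1]
        north = -sl * co * dx[0] - sl * so * dx[1] + cl * dx[2]
        horiz.append(math.hypot(east, north))
    return percentile(horiz, 68), percentile(horiz, 95)


def tier3_pass(p68, p95, tolerance):
    return p68 < tolerance and p95 < tolerance


# Four synthetic points 1 m east, west, north and south of (lat 0, lon 0).
points = [(WGS84_A, 1.0, 0.0), (WGS84_A, -1.0, 0.0),
          (WGS84_A, 0.0, 1.0), (WGS84_A, 0.0, -1.0)]
p68, p95 = scatter_stats(points)
```

Because the centroid comes from the same session, a constant bias in the solution is invisible to this check; only the spread is measured.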

Tier 3 CTest entries

Test	Input	Tolerance	skip-epochs	Notes
`madocalib_ppp_scatter`	`out_madocalib_ppp.pos`	0.150 m	30	MIZU, MADOCA-PPP; 1σ=7.4 cm 95%=11.2 cm
`claslib_ppp_rtk_2ch_scatter`	`out_claslib_ppp_rtk_2ch.nmea`	0.400 m	20	0627, CLAS 2CH; 1σ=5.8 cm 95%=25.2 cm

The claslib_ppp_rtk_2ch_scatter tolerance is generous because the 2CH dataset achieves ~88% fix rate; the remaining float epochs produce ~25 cm 95th-percentile scatter even after filtering with --fix-only.

Script: scripts/tests/check_pos_scatter.py

Supported formats: .pos (RTKLIB, Q=1/6 for fix-only) and .nmea (GGA, Q=1/4 for fix-only).

What This Does NOT Measure

Aspect	Status	How to measure instead
Absolute position accuracy vs surveyed truth	Tier 2 (partial)	`madocalib_pppar_abs_check`, `claslib_ppp_rtk_2ch_abs_check`
Real-time latency and throughput	Not tested	`rtkrcv_rt` checks line count only
Receiver hardware diversity	Not tested	Run against data from different receiver types
Long-term stability (days/weeks)	Not tested	Extend test data time span

Tier 1 validates algorithmic equivalence to upstream as a proxy for correctness. Tier 2 validates geodetic accuracy against independent ground-truth coordinates for selected test sites. Tier 3 validates precision (repeatability) by measuring solution scatter around the session centroid, without requiring any external reference.