Smartphone SPP Benchmark (GSDC-2023)¶

This benchmark evaluates MRTKLIB's single-point positioning (SPP) on low-cost smartphone GNSS data using the open Google Smartphone Decimeter Challenge 2023 (GSDC-2023) dataset.

It is the smartphone counterpart of the geodetic-grade PPC kinematic benchmark. Smartphone data is where the #116 SPP accuracy work's last two stages are validated — the geodetic PPC set cannot exercise them (see docs/design/spp-accuracy.md §4.6):

P5 — common-mode clock-jump correction. Geodetic clocks don't step; smartphones do (the GSDC device_gnss.csv carries HardwareClockDiscontinuityCount).
P6 — position EKF (smoothing / coasting). After P1–P4 the geodetic per-epoch WLS is already smooth, so there is nothing to smooth; smartphone data has large jitter and a real smoothing/coasting payoff.

Note: Like the PPC benchmark, this is intentionally excluded from the regular CTest suite — it needs a large external dataset and network access for broadcast ephemeris. Run it on demand.

Disclaimer: The SPP configuration here is not tuned for the smartphone regime. These are baseline/reference numbers, not best-achievable accuracy.

Dataset¶


Source	Google Smartphone Decimeter Challenge 2023 (Kaggle)
Receivers	Android phones (Pixel 4/4XL/5/6Pro/7Pro, Xiaomi Mi8, Samsung S20/A32, …)
Rate	1 Hz
Reference	NovAtel SPAN (post-processed), `ground_truth.csv`
Train cases	156 `trip/device` combinations with ground truth
Raw format	`device_gnss.csv` (Google-derived) — not RINEX

The raw measurements are not RINEX. GSDC ships a derived CSV in which Google has already reconstructed the pseudorange, classified the signal and provided carrier phase / Doppler / C/N0, so a direct CSV → RINEX mapping (gsdc_to_rinex.py) is simpler and less error-prone than re-deriving the pseudorange from the raw Android clock fields.

Manual download¶

The archive (~3.5 GB zip → ~16 GB extracted) must be downloaded manually from Kaggle (account + competition rules acceptance required):

Visit the GSDC-2023 data page and download the competition data (or use the Kaggle CLI: kaggle competitions download -c smartphone-decimeter-2023).
Place the zip at tests/data/sdc2023.zip (git-ignored), or extract it so the layout matches:

Text Only

data/gsdc/
  train/
    2020-06-25-00-34-us-ca-mtv-sb-101/
      pixel4/
        device_gnss.csv      # raw (derived) measurements
        ground_truth.csv     # reference track (lat/lon/alt + speed/bearing)
    ...
  brdc/                      # broadcast nav cache (auto-downloaded)
  results/                  # OBS + NMEA output

Only device_gnss.csv and ground_truth.csv are needed per case; the device_imu.csv and supplemental/ files are not used. Both data/gsdc/ and tests/data/sdc2023.zip are git-ignored.

Extract just the curated subset (≈250 MB) without unpacking the whole archive:

Bash

for tp in \
  train/2020-06-25-00-34-us-ca-mtv-sb-101/pixel4 \
  train/2020-07-17-23-13-us-ca-sf-mtv-280/pixel4 \
  train/2020-12-10-22-52-us-ca-sjc-c/mi8 \
  train/2021-04-02-20-43-us-ca-mtv-f/sm-g988b \
  train/2021-12-07-19-22-us-ca-lax-d/pixel5 \
  train/2023-09-05-23-07-us-ca-routen/pixel7pro ; do
  unzip -o -q tests/data/sdc2023.zip "$tp/device_gnss.csv" "$tp/ground_truth.csv" -d data/gsdc/
done

Conversion: `device_gnss.csv` → RINEX 3.04¶

gsdc_to_rinex.py maps each measurement row to a RINEX 3.04 observation:

RINEX	Source column	Notes
C (pseudorange)	`RawPseudorangeMeters`	already reconstructed by Google
L (carrier)	`AccumulatedDeltaRangeMeters` / λ	only when ADR state = VALID; LLI set on RESET/CYCLE_SLIP
D (Doppler)	−`PseudorangeRateMetersPerSecond` / λ
S (C/N0)	`Cn0DbHz`	full dB-Hz in the S obs (e.g. `S1C` = `40.200`); a 1–9 SSI digit is also appended to every field per RINEX

with λ = c / CarrierFrequencyHz. Satellite id is (ConstellationType, Svid) → G/R/E/C/J (QZSS PRN = Svid − 192). Epoch GPS time = utcTimeMillis/1000 − 315964800 + LeapSecond (cross-checked against the CSV's own ArrivalTimeNanosSinceGpsEpoch).

Signal → RINEX 3.04 observation code:

Android `SignalType`	RINEX code	Android `SignalType`	RINEX code
`GPS_L1_CA`	`1C`	`GAL_E5A_Q`	`5Q`
`GPS_L5_Q`	`5Q`	`GLO_G1_CA`	`1C`
`GAL_E1_C_P`	`1C`	`QZS_L1_CA`	`1C`
`BDS_B1_I`	`2I`	`QZS_L5_Q`	`5Q`

GLONASS is converted but excluded from positioning by the SPP config's systems list (FDMA, no broadcast iono model fit for single-frequency SPP).

Converter self-check. --self-check reproduces Google's per-epoch WLS from the CSV's own SvPosition*/clock/iono/tropo columns and compares it to the WlsPosition* column. Agreement to a few metres confirms the pseudorange / satellite-id / epoch-grouping reading (a reading bug diverges by orders of magnitude, not metres):

Bash

python scripts/benchmark/gsdc_to_rinex.py \
    data/gsdc/train/.../pixel4/device_gnss.csv --self-check

Broadcast ephemeris¶

device_gnss.csv carries no ephemeris, so download_brdc.py fetches the daily merged multi-GNSS broadcast nav (RINEX 3, GPS+GLO+GAL+BDS+QZS) for each UTC day a trip spans. The source is BKG (BRDM00DLR_S, falling back to BRDC00WRD_S / BRDC00IGS_R), which serves without authentication (CDDIS requires an Earthdata login). Files are cached under data/gsdc/brdc/.

Quick start¶

Bash

# 1. Build mrtk and extract the test ATX/ISB/ERP data (as for any benchmark).
cmake --build build
cd build && ctest -R setup --output-on-failure && cd ..

# 2. Run the curated subset (downloads broadcast nav automatically).
python scripts/benchmark/run_gsdc_benchmark.py --case curated

# 3. Single case, or every discovered train case:
python scripts/benchmark/run_gsdc_benchmark.py \
    --case 2021-12-07-19-22-us-ca-lax-d/pixel5
python scripts/benchmark/run_gsdc_benchmark.py --case all

Option	Default	Description
`--dataset-dir DIR`	`data/gsdc`	GSDC root (contains `train/`)
`--brdc-dir DIR`	`data/gsdc/brdc`	broadcast-nav cache
`--out-dir DIR`	`data/gsdc/results`	OBS + NMEA output
`--case curated\\|all\\|<id,…>`	`curated`	which cases to run
`--conf PATH` (repeatable)	`conf/benchmark/single.toml`	layered config
`--skip-download`	off	use cached nav only
`--threshold M`	`2.0`	2D `<N m` rate threshold
`--force`	off	re-convert even if cached

Scoring (official GSDC metric)¶

The Google Smartphone Decimeter Challenge scores each run by the mean of the 50th and 95th percentile horizontal positioning error (metres); the per-run scores are averaged to give the leaderboard number. The official competition computes this over the 36 test runs (public/private split), whose ground truth is withheld.

This benchmark reports the same metric (score column = (p50 + p95) / 2) but on the train split, which has public ground truth, so the numbers are directly interpretable on the leaderboard's scale. Top GSDC entries reach ~1–2 m using carrier-phase PPP/RTK + IMU + map-matching; the SPP-only figures here are a code-pseudorange baseline, not a leaderboard-competitive solution.

The runner also prints <2 m rate, RMS and matched-epoch count N (availability) for engineering insight. Per-epoch ENU error is taken against the matched ground-truth coordinate (moving-base projection), exactly as in the PPC benchmark; reference and NMEA GGA epochs are matched by UTC seconds-of-day within 0.15 s.

Baseline results (v0.6.10, curated subset)¶

GNSS-only SPP, broadcast ephemeris, all matched epochs (no skip). Two configurations bracket the #116 work and form the before/after reference for the P5/P6 effort:

P0 — classic SPP: elevation-only weighting, Klobuchar iono, Saastamoinen tropo, RAIM-FDE (single.toml + gsdc_p0.toml).
P1–P4 — single.toml as shipped: + C/N0 weighting, IGG-III robust + pre-robust gate, TDCP jump rejection.

score is the official GSDC metric; lower is better.

P0 (classic SPP)¶

Case	N	nSV	<2 m	RMS 2D	p50	p95	score
mtv-sb-101 / pixel4	1289	14.3	21.5%	4.30 m	3.34 m	7.61 m	5.47 m
sf-mtv-280 / pixel4	1680	11.6	20.7%	4.98 m	3.68 m	8.94 m	6.31 m
sjc-c / mi8	1266	15.0	24.7%	4.71 m	3.16 m	8.72 m	5.94 m
mtv-f / sm-g988b	2284	19.9	37.0%	3.35 m	2.58 m	6.03 m	4.30 m
lax-d / pixel5	1716	12.4	22.1%	4.30 m	3.37 m	7.56 m	5.47 m
routen / pixel7pro	1545	14.7	22.7%	4.58 m	3.41 m	8.68 m	6.05 m
mean	1630	14.6	24.8%	4.37 m	3.26 m	7.92 m	5.59 m

P1–P4 (`single.toml`)¶

Case	N	nSV	<2 m	RMS 2D	p50	p95	score
mtv-sb-101 / pixel4	685	14.4	36.8%	3.00 m	2.41 m	5.45 m	3.93 m
sf-mtv-280 / pixel4	370	11.8	34.3%	3.12 m	2.63 m	5.47 m	4.05 m
sjc-c / mi8	815	15.1	36.0%	3.62 m	2.51 m	6.88 m	4.69 m
mtv-f / sm-g988b	2072	19.9	42.2%	2.93 m	2.33 m	5.26 m	3.80 m
lax-d / pixel5	735	12.5	37.6%	3.13 m	2.42 m	5.25 m	3.84 m
routen / pixel7pro	768	14.8	28.6%	3.86 m	3.03 m	6.64 m	4.84 m
mean	907	14.7	35.9%	3.28 m	2.56 m	5.82 m	4.19 m

Reading the two tables¶

P1–P4 improves the GSDC score from 5.59 m → 4.19 m (−25 %) (p50 3.26 → 2.56 m, p95 7.92 → 5.82 m, <2 m +11 pp) but the matched-epoch count drops ~44 % (mean 1630 → 907): the QC rejects the noisy/blunder epochs rather than solving them poorly. This availability/accuracy trade-off is exactly the gap the remaining work targets:

P6 (position EKF) should coast through the rejected epochs, recovering availability without giving back accuracy — and smartphone jitter (3–4 m, vs the geodetic set's sub-decimetre) is where smoothing actually pays (spp-accuracy.md §4.6).
P5 (clock-jump correction) should recover epochs lost to receiver clock steps and stabilise TDCP on low-cost hardware.

Google's own WLS (the WlsPosition* column) lands at ~2.3 m 2D RMS on the open trips, so MRTKLIB's untuned SPP is already in the same regime; the headroom is in availability and the urban tail.

Known limitations¶

Untuned SPP config. single.toml was tuned on the geodetic PPC set; the smartphone regime (clock jumps, larger noise, NLOS) is not yet tuned — that is the P5/P6 work.
No antenna / ISB calibration. Phone antennas have no PCV; Google's satellite ISB is not applied (the converter keeps RawPseudorangeMeters).
GLONASS excluded from positioning (converted but dropped by the config).
Train split only. The test/ split has no public ground truth.
IMU not used. GNSS-only.

Acknowledgements¶

The dataset is the Google Smartphone Decimeter Challenge 2023, hosted on Kaggle and released by Google's Android GNSS team in collaboration with the ION GNSS+ conference. Please observe the Kaggle competition rules and cite the GSDC-2023 materials when publishing results derived from this data.

Smartphone SPP Benchmark (GSDC-2023)¶

Dataset¶

Manual download¶

Conversion: device_gnss.csv → RINEX 3.04¶

Broadcast ephemeris¶

Quick start¶

Scoring (official GSDC metric)¶

Baseline results (v0.6.10, curated subset)¶

P0 (classic SPP)¶

P1–P4 (single.toml)¶

Reading the two tables¶

Known limitations¶

Acknowledgements¶

References¶

Conversion: `device_gnss.csv` → RINEX 3.04¶

P1–P4 (`single.toml`)¶