Skip to content

Smartphone SPP Benchmark (GSDC-2023)

This benchmark evaluates MRTKLIB's single-point positioning (SPP) on low-cost smartphone GNSS data using the open Google Smartphone Decimeter Challenge 2023 (GSDC-2023) dataset.

It is the smartphone counterpart of the geodetic-grade PPC kinematic benchmark. Smartphone data is where the #116 SPP accuracy work's last two stages are validated — the geodetic PPC set cannot exercise them (see docs/design/spp-accuracy.md §4.6):

  • P5 — common-mode clock-jump correction. Geodetic clocks don't step; smartphones do (the GSDC device_gnss.csv carries HardwareClockDiscontinuityCount).
  • P6 — position EKF (smoothing / coasting). After P1–P4 the geodetic per-epoch WLS is already smooth, so there is nothing to smooth; smartphone data has large jitter and a real smoothing/coasting payoff.

Note: Like the PPC benchmark, this is intentionally excluded from the regular CTest suite — it needs a large external dataset and network access for broadcast ephemeris. Run it on demand.

Disclaimer: The SPP configuration here is not tuned for the smartphone regime. These are baseline/reference numbers, not best-achievable accuracy.


Dataset

Source Google Smartphone Decimeter Challenge 2023 (Kaggle)
Receivers Android phones (Pixel 4/4XL/5/6Pro/7Pro, Xiaomi Mi8, Samsung S20/A32, …)
Rate 1 Hz
Reference NovAtel SPAN (post-processed), ground_truth.csv
Train cases 156 trip/device combinations with ground truth
Raw format device_gnss.csv (Google-derived) — not RINEX

The raw measurements are not RINEX. GSDC ships a derived CSV in which Google has already reconstructed the pseudorange, classified the signal and provided carrier phase / Doppler / C/N0, so a direct CSV → RINEX mapping (gsdc_to_rinex.py) is simpler and less error-prone than re-deriving the pseudorange from the raw Android clock fields.

Manual download

The archive (~3.5 GB zip → ~16 GB extracted) must be downloaded manually from Kaggle (account + competition rules acceptance required):

  1. Visit the GSDC-2023 data page and download the competition data (or use the Kaggle CLI: kaggle competitions download -c smartphone-decimeter-2023).
  2. Place the zip at tests/data/sdc2023.zip (git-ignored), or extract it so the layout matches:
Text Only
data/gsdc/
  train/
    2020-06-25-00-34-us-ca-mtv-sb-101/
      pixel4/
        device_gnss.csv      # raw (derived) measurements
        ground_truth.csv     # reference track (lat/lon/alt + speed/bearing)
    ...
  brdc/                      # broadcast nav cache (auto-downloaded)
  results/                  # OBS + NMEA output

Only device_gnss.csv and ground_truth.csv are needed per case; the device_imu.csv and supplemental/ files are not used. Both data/gsdc/ and tests/data/sdc2023.zip are git-ignored.

Extract just the curated subset (≈250 MB) without unpacking the whole archive:

Bash
for tp in \
  train/2020-06-25-00-34-us-ca-mtv-sb-101/pixel4 \
  train/2020-07-17-23-13-us-ca-sf-mtv-280/pixel4 \
  train/2020-12-10-22-52-us-ca-sjc-c/mi8 \
  train/2021-04-02-20-43-us-ca-mtv-f/sm-g988b \
  train/2021-12-07-19-22-us-ca-lax-d/pixel5 \
  train/2023-09-05-23-07-us-ca-routen/pixel7pro ; do
  unzip -o -q tests/data/sdc2023.zip "$tp/device_gnss.csv" "$tp/ground_truth.csv" -d data/gsdc/
done

Conversion: device_gnss.csv → RINEX 3.04

gsdc_to_rinex.py maps each measurement row to a RINEX 3.04 observation:

RINEX Source column Notes
C (pseudorange) RawPseudorangeMeters already reconstructed by Google
L (carrier) AccumulatedDeltaRangeMeters / λ only when ADR state = VALID; LLI set on RESET/CYCLE_SLIP
D (Doppler) PseudorangeRateMetersPerSecond / λ
S (C/N0) Cn0DbHz full dB-Hz in the S obs (e.g. S1C = 40.200); a 1–9 SSI digit is also appended to every field per RINEX

with λ = c / CarrierFrequencyHz. Satellite id is (ConstellationType, Svid)G/R/E/C/J (QZSS PRN = Svid − 192). Epoch GPS time = utcTimeMillis/1000 − 315964800 + LeapSecond (cross-checked against the CSV's own ArrivalTimeNanosSinceGpsEpoch).

Signal → RINEX 3.04 observation code:

Android SignalType RINEX code Android SignalType RINEX code
GPS_L1_CA 1C GAL_E5A_Q 5Q
GPS_L5_Q 5Q GLO_G1_CA 1C
GAL_E1_C_P 1C QZS_L1_CA 1C
BDS_B1_I 2I QZS_L5_Q 5Q

GLONASS is converted but excluded from positioning by the SPP config's systems list (FDMA, no broadcast iono model fit for single-frequency SPP).

Converter self-check. --self-check reproduces Google's per-epoch WLS from the CSV's own SvPosition*/clock/iono/tropo columns and compares it to the WlsPosition* column. Agreement to a few metres confirms the pseudorange / satellite-id / epoch-grouping reading (a reading bug diverges by orders of magnitude, not metres):

Bash
python scripts/benchmark/gsdc_to_rinex.py \
    data/gsdc/train/.../pixel4/device_gnss.csv --self-check

Broadcast ephemeris

device_gnss.csv carries no ephemeris, so download_brdc.py fetches the daily merged multi-GNSS broadcast nav (RINEX 3, GPS+GLO+GAL+BDS+QZS) for each UTC day a trip spans. The source is BKG (BRDM00DLR_S, falling back to BRDC00WRD_S / BRDC00IGS_R), which serves without authentication (CDDIS requires an Earthdata login). Files are cached under data/gsdc/brdc/.


Quick start

Bash
# 1. Build mrtk and extract the test ATX/ISB/ERP data (as for any benchmark).
cmake --build build
cd build && ctest -R setup --output-on-failure && cd ..

# 2. Run the curated subset (downloads broadcast nav automatically).
python scripts/benchmark/run_gsdc_benchmark.py --case curated

# 3. Single case, or every discovered train case:
python scripts/benchmark/run_gsdc_benchmark.py \
    --case 2021-12-07-19-22-us-ca-lax-d/pixel5
python scripts/benchmark/run_gsdc_benchmark.py --case all
Option Default Description
--dataset-dir DIR data/gsdc GSDC root (contains train/)
--brdc-dir DIR data/gsdc/brdc broadcast-nav cache
--out-dir DIR data/gsdc/results OBS + NMEA output
--case curated\|all\|<id,…> curated which cases to run
--conf PATH (repeatable) conf/benchmark/single.toml layered config
--skip-download off use cached nav only
--threshold M 2.0 2D <N m rate threshold
--force off re-convert even if cached

Scoring (official GSDC metric)

The Google Smartphone Decimeter Challenge scores each run by the mean of the 50th and 95th percentile horizontal positioning error (metres); the per-run scores are averaged to give the leaderboard number. The official competition computes this over the 36 test runs (public/private split), whose ground truth is withheld.

This benchmark reports the same metric (score column = (p50 + p95) / 2) but on the train split, which has public ground truth, so the numbers are directly interpretable on the leaderboard's scale. Top GSDC entries reach ~1–2 m using carrier-phase PPP/RTK + IMU + map-matching; the SPP-only figures here are a code-pseudorange baseline, not a leaderboard-competitive solution.

The runner also prints <2 m rate, RMS and matched-epoch count N (availability) for engineering insight. Per-epoch ENU error is taken against the matched ground-truth coordinate (moving-base projection), exactly as in the PPC benchmark; reference and NMEA GGA epochs are matched by UTC seconds-of-day within 0.15 s.


Baseline results (v0.6.10, curated subset)

GNSS-only SPP, broadcast ephemeris, all matched epochs (no skip). Two configurations bracket the #116 work and form the before/after reference for the P5/P6 effort:

  • P0 — classic SPP: elevation-only weighting, Klobuchar iono, Saastamoinen tropo, RAIM-FDE (single.toml + gsdc_p0.toml).
  • P1–P4single.toml as shipped: + C/N0 weighting, IGG-III robust + pre-robust gate, TDCP jump rejection.

score is the official GSDC metric; lower is better.

P0 (classic SPP)

Case N nSV <2 m RMS 2D p50 p95 score
mtv-sb-101 / pixel4 1289 14.3 21.5% 4.30 m 3.34 m 7.61 m 5.47 m
sf-mtv-280 / pixel4 1680 11.6 20.7% 4.98 m 3.68 m 8.94 m 6.31 m
sjc-c / mi8 1266 15.0 24.7% 4.71 m 3.16 m 8.72 m 5.94 m
mtv-f / sm-g988b 2284 19.9 37.0% 3.35 m 2.58 m 6.03 m 4.30 m
lax-d / pixel5 1716 12.4 22.1% 4.30 m 3.37 m 7.56 m 5.47 m
routen / pixel7pro 1545 14.7 22.7% 4.58 m 3.41 m 8.68 m 6.05 m
mean 1630 14.6 24.8% 4.37 m 3.26 m 7.92 m 5.59 m

P1–P4 (single.toml)

Case N nSV <2 m RMS 2D p50 p95 score
mtv-sb-101 / pixel4 685 14.4 36.8% 3.00 m 2.41 m 5.45 m 3.93 m
sf-mtv-280 / pixel4 370 11.8 34.3% 3.12 m 2.63 m 5.47 m 4.05 m
sjc-c / mi8 815 15.1 36.0% 3.62 m 2.51 m 6.88 m 4.69 m
mtv-f / sm-g988b 2072 19.9 42.2% 2.93 m 2.33 m 5.26 m 3.80 m
lax-d / pixel5 735 12.5 37.6% 3.13 m 2.42 m 5.25 m 3.84 m
routen / pixel7pro 768 14.8 28.6% 3.86 m 3.03 m 6.64 m 4.84 m
mean 907 14.7 35.9% 3.28 m 2.56 m 5.82 m 4.19 m

Reading the two tables

P1–P4 improves the GSDC score from 5.59 m → 4.19 m (−25 %) (p50 3.26 → 2.56 m, p95 7.92 → 5.82 m, <2 m +11 pp) but the matched-epoch count drops ~44 % (mean 1630 → 907): the QC rejects the noisy/blunder epochs rather than solving them poorly. This availability/accuracy trade-off is exactly the gap the remaining work targets:

  • P6 (position EKF) should coast through the rejected epochs, recovering availability without giving back accuracy — and smartphone jitter (3–4 m, vs the geodetic set's sub-decimetre) is where smoothing actually pays (spp-accuracy.md §4.6).
  • P5 (clock-jump correction) should recover epochs lost to receiver clock steps and stabilise TDCP on low-cost hardware.

Google's own WLS (the WlsPosition* column) lands at ~2.3 m 2D RMS on the open trips, so MRTKLIB's untuned SPP is already in the same regime; the headroom is in availability and the urban tail.


Known limitations

  • Untuned SPP config. single.toml was tuned on the geodetic PPC set; the smartphone regime (clock jumps, larger noise, NLOS) is not yet tuned — that is the P5/P6 work.
  • No antenna / ISB calibration. Phone antennas have no PCV; Google's satellite ISB is not applied (the converter keeps RawPseudorangeMeters).
  • GLONASS excluded from positioning (converted but dropped by the config).
  • Train split only. The test/ split has no public ground truth.
  • IMU not used. GNSS-only.

Acknowledgements

The dataset is the Google Smartphone Decimeter Challenge 2023, hosted on Kaggle and released by Google's Android GNSS team in collaboration with the ION GNSS+ conference. Please observe the Kaggle competition rules and cite the GSDC-2023 materials when publishing results derived from this data.


References