Trust-aware routing for distributed generative AI inference at the edge
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Autonomous Distributed Systems Lab) ORCID iD: 0000-0002-9156-3364
Umeå University, Faculty of Science and Technology, Department of Computing Science. (Autonomous Distributed Systems Lab) ORCID iD: 0000-0002-2633-6798
2026 (English) Conference paper, Oral presentation only (Refereed)
Abstract [en]

Emerging deployments of Generative AI increasingly execute inference across decentralized and heterogeneous edge devices rather than on a single trusted server. In such environments, a single device failure or misbehavior can disrupt the entire inference process, making traditional best-effort peer-to-peer routing insufficient. Coordinating distributed generative inference therefore requires mechanisms that explicitly account for reliability, performance variability, and trust among participating peers.

In this paper, we present G-TRAC, a trust-aware coordination framework that integrates algorithmic path selection with system-level protocol design to ensure robust distributed inference. First, we formulate the routing problem as a Risk-Bounded Shortest Path computation and introduce a polynomial-time solution that combines trust-floor pruning with Dijkstra's search, achieving sub-millisecond median routing latency at practical edge scales and remaining below 10 ms at larger scales. Second, to operationally support the routing logic in dynamic environments, the framework employs a Hybrid Trust Architecture that maintains global reputation state at stable anchors while disseminating lightweight updates to edge peers via background synchronization.
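
The combination of trust-floor pruning and Dijkstra's search described above can be illustrated with a minimal sketch. This is not the G-TRAC implementation, and the record does not give the paper's exact Risk-Bounded Shortest Path formulation. The sketch assumes one simple reading: every peer has a scalar trust score, peers below a given floor are excluded, and the lowest-latency path among the remaining peers is chosen. The names `trust_floor_route`, `graph`, `trust`, and `trust_floor`, as well as the adjacency-list format, are hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

# Hypothetical graph format: graph[u] is a list of (neighbor, latency_ms) edges.
Graph = Dict[str, List[Tuple[str, float]]]


def trust_floor_route(
    graph: Graph,
    trust: Dict[str, float],
    source: str,
    target: str,
    trust_floor: float,
) -> Optional[Tuple[float, List[str]]]:
    """Return (total_latency, path), or None if no admissible path exists.

    Assumption: a peer is admissible when its trust score is at least
    trust_floor; peers with no recorded score are treated as untrusted.
    """

    def admissible(node: str) -> bool:
        return trust.get(node, 0.0) >= trust_floor

    # In this sketch the endpoints must satisfy the floor as well.
    if not (admissible(source) and admissible(target)):
        return None

    dist: Dict[str, float] = {source: 0.0}
    prev: Dict[str, str] = {}
    heap: List[Tuple[float, str]] = [(0.0, source)]
    done = set()

    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue  # stale heap entry
        done.add(u)
        if u == target:
            break
        for v, latency in graph.get(u, []):
            # Pruning step: never relax edges into peers below the floor.
            if v in done or not admissible(v):
                continue
            candidate = d + latency
            if candidate < dist.get(v, float("inf")):
                dist[v] = candidate
                prev[v] = u
                heapq.heappush(heap, (candidate, v))

    if target not in dist:
        return None

    # Walk predecessors back from the target to rebuild the path.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path


# Toy topology: the fast route goes through "b", which has low trust.
demo_graph: Graph = {
    "a": [("b", 1.0), ("c", 5.0)],
    "b": [("d", 1.0)],
    "c": [("d", 1.0)],
    "d": [],
}
demo_trust = {"a": 0.9, "b": 0.2, "c": 0.8, "d": 0.9}
```

With a floor of 0.5 the low-trust peer `b` is pruned and the route falls back to the slower path through `c`; with a floor of 0.1 the faster path through `b` is admissible again. Because pruning only removes nodes before an ordinary Dijkstra run, the overall cost stays polynomial, which is consistent with the complexity claim in the abstract.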

Experimental evaluation on a heterogeneous testbed of commodity devices demonstrates that G-TRAC significantly improves inference completion rates, effectively isolates unreliable peers, and sustains robust execution even under node failures and network partitions.

Place, publisher, year, edition, pages
Reykjavik, Iceland, 2026.
Keywords [en]
Edge Computing, Edge Intelligence, Distributed LLM Inference, Risk-Bounded Systems, Trust-Aware Routing, Pipeline Parallelism
National Category
Computer Sciences; Networked, Parallel and Distributed Computing
Research subject
Computer Science
Identifiers
URN: urn:nbn:se:umu:diva-251612
OAI: oai:DiVA.org:umu-251612
DiVA, id: diva2:2050104
Conference
The 22nd Annual International Conference on Distributed Computing in Smart Systems and the Internet of Things (DCOSS-IoT 2026), Reykjavik, Iceland, June 22-24, 2026
Available from: 2026-03-31 Created: 2026-03-31 Last updated: 2026-04-01

Open Access in DiVA

No full text in DiVA

Authority records

Nguyen, Chanh Le Tan; Elmroth, Erik
