Improving the fault resilience of overlay multicast for media streaming

Guang Tan*, Stephen A. Jarvis

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

38 Citations (Scopus)


A key technical challenge for overlay multicast is that the highly dynamic multicast members can make data delivery unreliable. In this paper, we address this issue in the context of live media streaming by exploring 1) how to construct a stable multicast tree that minimizes the negative impact of frequent member departures on an existing overlay and 2) how to efficiently recover from packet errors caused by end-system or network failures. For the first problem, we identify two layout schemes for the tree nodes, namely, the bandwidth-ordered tree and the time-ordered tree, which represent two typical approaches to improving tree reliability, and conduct a stochastic analysis on their properties regarding reliability and tree depth. Based on the findings, we propose a distributed Reliability-Oriented Switching Tree (ROST) algorithm that minimizes the failure correlation among tree nodes. Compared with some commonly used distributed algorithms, the ROST algorithm significantly improves tree reliability and reduces average service delay, while incurring only a small protocol overhead; furthermore, it features a mechanism that prevents cheating or malicious behaviors in the exchange of bandwidth/time information. For the second problem, we develop a simple Cooperative Error Recovery (CER) protocol that helps recover from packet errors efficiently. Recognizing that a single recovery source is usually incapable of providing the timely delivery of the lost data, the protocol recovers from data outages using the residual bandwidths from multiple sources, which are identified using a minimum-loss-correlation algorithm. Extensive simulations demonstrate the effectiveness of the proposed schemes.
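The two layout schemes named in the abstract can be illustrated with a small sketch. This is not the ROST algorithm, which is distributed and whose details are not given here. It is a centralized toy construction, assuming each peer can serve a fixed number of children (`out_degree`, a stand-in for bandwidth) and that peers fill free child slots breadth-first. All names (`Peer`, `build_ordered_tree`) and the sample values are hypothetical.

```python
from collections import deque
from typing import NamedTuple


class Peer(NamedTuple):
    name: str
    out_degree: int    # assumed: number of children the peer's bandwidth can serve
    join_time: float   # assumed: smaller means the peer joined earlier


def build_ordered_tree(root, peers, key):
    """Attach peers breadth-first in the order given by `key`.

    Returns (parent, depth) dictionaries keyed by peer name.
    """
    parent = {root.name: None}
    depth = {root.name: 0}
    # Free child slots, kept in breadth-first order.
    open_slots = deque([root.name] * root.out_degree)
    for peer in sorted(peers, key=key):
        if not open_slots:
            raise ValueError("not enough upload capacity for all peers")
        chosen = open_slots.popleft()
        parent[peer.name] = chosen
        depth[peer.name] = depth[chosen] + 1
        open_slots.extend([peer.name] * peer.out_degree)
    return parent, depth


source = Peer("S", out_degree=2, join_time=0.0)
peers = [
    Peer("a", 0, 1.0), Peer("b", 3, 2.0), Peer("c", 0, 3.0),
    Peer("d", 2, 4.0), Peer("e", 0, 5.0), Peer("f", 0, 6.0),
]

# Bandwidth-ordered layout: high-capacity peers sit closer to the source.
bw_parent, bw_depth = build_ordered_tree(source, peers, key=lambda p: -p.out_degree)

# Time-ordered layout: older peers sit closer to the source.
t_parent, t_depth = build_ordered_tree(source, peers, key=lambda p: p.join_time)

print("bandwidth-ordered max depth:", max(bw_depth.values()))
print("time-ordered max depth:", max(t_depth.values()))
```

In this toy input the bandwidth-ordered layout yields a shallower tree, while the time-ordered layout keeps the earliest joiners near the source regardless of capacity. This trade-off between tree depth and reliability is what the abstract's stochastic analysis examines.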

Original language: English
Pages (from-to): 721-734
Number of pages: 14
Journal: IEEE Transactions on Parallel and Distributed Systems
Issue number: 6
Publication status: Published - Jun 2007

Bibliographical note

Funding Information:
The authors are grateful to the anonymous reviewers for their excellent feedback. This research was sponsored in part by grants from the NASA Ames Research Center (administrated by US Army Research and Development Standardization Group (USARDSG), under contract no. N68171-01-C-9012), the Engineering and Physical Sciences Research Council (EPSRC) (contract no. GR/R47424/01), and the EPSRC e-Science Core Programme (contract no. GR/S03058/01).


Keywords

  • Fault resilience
  • Media streaming
  • Multicast
  • Overlay
  • Peer-to-peer
  • Reliability

ASJC Scopus subject areas

  • Signal Processing
  • Hardware and Architecture
  • Computational Theory and Mathematics

