Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids

Ligang He; Stephen A. Jarvis; Daniel P. Spooner; Xinuo Chen; Graham R. Nudd

doi:10.1109/GRID.2004.27

Ligang He^*, Stephen A. Jarvis, Daniel P. Spooner, Xinuo Chen, Graham R. Nudd

^*Corresponding author for this work

Engineering and Physical Sciences

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

47 Citations (Scopus)

Abstract

This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to evaluate the scheduling performance. These three metrics measure, respectively, the extent of the jobs' QoS demand compliance, the resource throughput and the resource utilization. Two levels of performance optimisation are applied in the multicluster. At the multicluster level, a scheduler (which we call MUSCLE) allocates parallel jobs with high packing potential to the same cluster; it also takes the jobs' QoS requirements into account and employs a heuristic to allocate suitable workloads to each cluster to balance the overall system performance. At the single cluster level, an existing workload manager, called TITAN, utilizes a genetic algorithm to further improve the scheduling performance of the jobs previously allocated by MUSCLE. Extensive experimental studies are conducted to verify the effectiveness of the scheduling mechanism as well as the effect of the prediction accuracy on the scheduling performance. The results show that, compared with traditional distributed workload allocation policies, the comprehensive scheduling performance of parallel jobs is significantly improved across the multicluster, and the presence of prediction errors does not dramatically weaken the performance advantage.
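
As a rough illustration of how three such metrics might be folded into a single score, the Python sketch below assumes a plain weighted sum. The abstract does not state the paper's actual formula, so the `Job` fields, the metric definitions and the `combined_cost` function are hypothetical names and simplifications chosen for this example, not the authors' method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    """A scheduled parallel job (hypothetical fields for illustration)."""
    start: float     # time at which the job starts running
    runtime: float   # predicted execution time
    deadline: float  # soft deadline
    nodes: int       # number of nodes the job occupies


def over_deadline(jobs):
    """Total time by which jobs finish after their soft deadlines."""
    return sum(max(0.0, j.start + j.runtime - j.deadline) for j in jobs)


def makespan(jobs):
    """Latest finish time over all jobs (0.0 for an empty schedule)."""
    return max((j.start + j.runtime for j in jobs), default=0.0)


def idle_time(jobs, total_nodes):
    """Node-time left unused between time 0 and the makespan.

    Assumption: idle time is total capacity minus busy node-time.
    """
    busy = sum(j.runtime * j.nodes for j in jobs)
    return total_nodes * makespan(jobs) - busy


def combined_cost(jobs, total_nodes, w_over, w_make, w_idle):
    """Assumed weighted sum of the three metrics; lower is better."""
    return (w_over * over_deadline(jobs)
            + w_make * makespan(jobs)
            + w_idle * idle_time(jobs, total_nodes))


# Two jobs on a 4-node cluster; the second finishes past its deadline.
schedule = [Job(start=0, runtime=4, deadline=5, nodes=2),
            Job(start=4, runtime=4, deadline=6, nodes=4)]
cost = combined_cost(schedule, total_nodes=4,
                     w_over=0.5, w_make=0.25, w_idle=0.25)
```

Raising `w_over` relative to the other weights would make such a score favour deadline compliance over throughput and utilization, which is the kind of trade-off that variable weights allow.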

Original language	English
Title of host publication	Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing
Editors	R. Buyya
Pages	402-409
Number of pages	8
DOIs	https://doi.org/10.1109/GRID.2004.27
Publication status	Published - 2004
Event	Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing - Pittsburgh, PA, United States Duration: 8 Nov 2004 → 8 Nov 2004

Publication series

Name	Proceedings - IEEE/ACM International Workshop on Grid Computing
ISSN (Print)	1550-5510

Conference

Conference	Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing
Country/Territory	United States
City	Pittsburgh, PA
Period	8 Nov 2004 → 8 Nov 2004

ASJC Scopus subject areas

Engineering(all)

Access to Document

10.1109/GRID.2004.27

Cite this

@inproceedings{8bf16888afcc4f29a449266d720508e0,

title = "Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids",

abstract = "This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft-deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to evaluate the scheduling performance. These three metrics are used to measure the extent of the jobs' QoS demand compliance, the resource throughput and the resource utilization. Two levels of performance optimisation are applied in the multicluster. At the multicluster level, a scheduler (which we call MUSCLE) allocates parallel jobs with high packing potential to the same cluster; it also takes the jobs' QoS requirements into account and employs a heuristic to allocate suitable workloads to each cluster to balance the overall system performance. At the single cluster level, an existing workload manager, called TITAN, utilizes a genetic algorithm to further improve the scheduling performance of the jobs previously allocated by MUSCLE. Extensive experimental studies are conducted to verify the effectiveness of the scheduling mechanism as well as the effect of the prediction accuracy on the scheduling performance. The results show that compared with traditional distributed workload allocation policies, the comprehensive scheduling performance of parallel jobs is significantly improved across the multicluster, and the presence of prediction errors does not dramatically weaken the performance advantage.",

author = "Ligang He and Jarvis, {Stephen A.} and Spooner, {Daniel P.} and Xinuo Chen and Nudd, {Graham R.}",

year = "2004",

doi = "10.1109/GRID.2004.27",

language = "English",

isbn = "0769522564",

series = "Proceedings - IEEE/ACM International Workshop on Grid Computing",

pages = "402--409",

editor = "R. Buyya",

booktitle = "Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing",

note = "Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing ; Conference date: 08-11-2004 Through 08-11-2004",

}

He, L, Jarvis, SA, Spooner, DP, Chen, X & Nudd, GR 2004, Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids. in R Buyya (ed.), Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing. Proceedings - IEEE/ACM International Workshop on Grid Computing, pp. 402-409, Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing, Pittsburgh, PA, United States, 8/11/04. https://doi.org/10.1109/GRID.2004.27

Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids. / He, Ligang; Jarvis, Stephen A.; Spooner, Daniel P. et al.
Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing. ed. / R. Buyya. 2004. p. 402-409 (Proceedings - IEEE/ACM International Workshop on Grid Computing).

TY - GEN

T1 - Dynamic scheduling of parallel jobs with QoS demands in multiclusters and grids

AU - He, Ligang

AU - Jarvis, Stephen A.

AU - Spooner, Daniel P.

AU - Chen, Xinuo

AU - Nudd, Graham R.

PY - 2004

Y1 - 2004

N2 - This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft-deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to evaluate the scheduling performance. These three metrics are used to measure the extent of the jobs' QoS demand compliance, the resource throughput and the resource utilization. Two levels of performance optimisation are applied in the multicluster. At the multicluster level, a scheduler (which we call MUSCLE) allocates parallel jobs with high packing potential to the same cluster; it also takes the jobs' QoS requirements into account and employs a heuristic to allocate suitable workloads to each cluster to balance the overall system performance. At the single cluster level, an existing workload manager, called TITAN, utilizes a genetic algorithm to further improve the scheduling performance of the jobs previously allocated by MUSCLE. Extensive experimental studies are conducted to verify the effectiveness of the scheduling mechanism as well as the effect of the prediction accuracy on the scheduling performance. The results show that compared with traditional distributed workload allocation policies, the comprehensive scheduling performance of parallel jobs is significantly improved across the multicluster, and the presence of prediction errors does not dramatically weaken the performance advantage.

AB - This paper addresses the dynamic scheduling of parallel jobs with QoS demands (soft-deadlines) in multiclusters and grids. Three metrics (over-deadline, makespan and idle-time) are combined with variable weights to evaluate the scheduling performance. These three metrics are used to measure the extent of the jobs' QoS demand compliance, the resource throughput and the resource utilization. Two levels of performance optimisation are applied in the multicluster. At the multicluster level, a scheduler (which we call MUSCLE) allocates parallel jobs with high packing potential to the same cluster; it also takes the jobs' QoS requirements into account and employs a heuristic to allocate suitable workloads to each cluster to balance the overall system performance. At the single cluster level, an existing workload manager, called TITAN, utilizes a genetic algorithm to further improve the scheduling performance of the jobs previously allocated by MUSCLE. Extensive experimental studies are conducted to verify the effectiveness of the scheduling mechanism as well as the effect of the prediction accuracy on the scheduling performance. The results show that compared with traditional distributed workload allocation policies, the comprehensive scheduling performance of parallel jobs is significantly improved across the multicluster, and the presence of prediction errors does not dramatically weaken the performance advantage.

UR - http://www.scopus.com/inward/record.url?scp=19944379474&partnerID=8YFLogxK

U2 - 10.1109/GRID.2004.27

DO - 10.1109/GRID.2004.27

M3 - Conference contribution

AN - SCOPUS:19944379474

SN - 0769522564

T3 - Proceedings - IEEE/ACM International Workshop on Grid Computing

SP - 402

EP - 409

BT - Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing

A2 - Buyya, R.

T2 - Proceedings - Fifth IEEE/ACM International Workshop on Grid Computing

Y2 - 8 November 2004 through 8 November 2004

ER -
