CRUCIBLE: Towards unified secure on- and off-line analytics at scale

Peter Coetzee, Stephen Jarvis

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

The burgeoning field of data science benefits from the application of a variety of analytic models and techniques to the oft-cited problems of large volume, high velocity data rates, and significant variety in data structure and semantics. Many approaches make use of common analytic techniques in either a streaming or batch processing paradigm. This paper presents progress in developing a framework for the analysis of large-scale datasets using both of these pools of techniques in a unified manner. This includes: (1) a Domain Specific Language (DSL) for describing analyses as a set of Communicating Sequential Processes, fully integrated with the Java type system, including an Integrated Development Environment (IDE) and a compiler which builds idiomatic Java; (2) a runtime model for execution of an analytic in both streaming and batch environments; and (3) a novel approach to automated management of cell-level security labels, applied uniformly across all runtimes. The paper concludes with a demonstration of the successful use of this system with a sample workload developed in (1), and an analysis of the performance characteristics of each of the runtimes described in (2).

Original language: English
Title of host publication: Proceedings of DISCS 2013
Subtitle of host publication: The 2013 International Workshop on Data-Intensive Scalable Computing Systems, Held in conjunction with SC 2013: The International Conference for High Performance Computing, Networking, Storage and Analysis
Publisher: Association for Computing Machinery
Pages: 43-48
Number of pages: 6
ISBN (Electronic): 9781450325066
Publication status: Published - 18 Nov 2013
Event: 2013 International Workshop on Data-Intensive Scalable Computing Systems, DISCS 2013 - Denver, United States
Duration: 18 Nov 2013 → …

Publication series

Name: Proceedings of DISCS 2013: The 2013 International Workshop on Data-Intensive Scalable Computing Systems, Held in conjunction with SC 2013: The International Conference for High Performance Computing, Networking, Storage and Analysis

Conference

Conference: 2013 International Workshop on Data-Intensive Scalable Computing Systems, DISCS 2013
Country/Territory: United States
City: Denver
Period: 18/11/13 → …

Keywords

  • Algorithms
  • Languages
  • Security

ASJC Scopus subject areas

  • Software
  • Computer Networks and Communications
  • Information Systems
