Modeling polarization for Hyper-NA lithography tools and masks
Kafai Lai, Alan E. Rosenbluth, et al.
SPIE Advanced Lithography 2007
SystemML aims at declarative, large-scale machine learning (ML) on top of MapReduce, where high-level ML scripts with R-like syntax are compiled to programs of MR jobs. The declarative specication of ML algorithms enables-in contrast to existing large-scale machine learning libraries-automatic optimization. SystemML's primary focus is on data parallelism but many ML algorithms inherently exhibit opportunities for task parallelism as well. A major challenge is how to eficiently combine both types of parallelism for arbitrary ML scripts and workloads. In this paper, we present a systematic approach for combining task and data parallelism for large-scale machine learning on top of MapReduce. We employ a generic Parallel FOR construct (ParFOR) as known from high performance computing (HPC). Our core contributions are (1) complementary parallelization strategies for exploiting multi-core and cluster parallelism, as well as (2) a novel cost-based optimization framework for automatically creating optimal parallel execution plans. Experiments on a variety of use cases showed that this achieves both eficiency and scalability due to automatic adaptation to ad-hoc workloads and unknown data characteristics. © 2014 VLDB Endowment.
Kafai Lai, Alan E. Rosenbluth, et al.
SPIE Advanced Lithography 2007
Elliot Linzer, M. Vetterli
Computing
Lerong Cheng, Jinjun Xiong, et al.
ASP-DAC 2008
Thomas M. Cover
IEEE Trans. Inf. Theory