Conference Program

Wednesday, August 27th, 2014

08:15 - 08:45, Registration

08:45 - 09:30, Conference Opening
Room: Auditorium 1

09:30 - 10:30, Keynote 1
Chair: Fernando Silva | Room: Auditorium 1

Paul Watson, Newcastle University, UK
Cloud Computing for Healthcare

11:00 - 12:30, Session A1, T6 - Grid, Cluster and Cloud Computing
Chair: Hervé Paulino | Room: Auditorium 1

Virtual Machine Consolidation in Cloud Data Centers using ACO Metaheuristic
Md Hasanul Ferdaus, Manzur Murshed, Rodrigo N. Calheiros and Rajkumar Buyya

Scientific Workflow Scheduling on Federated Clouds
Juan J. Durillo and Radu Prodan

Locality-aware Cooperation in Distributed IaaS Infrastructures
Jonathan Pastor, Marin Bertier, Frederic Desprez, Flavien Quesnel, Adrien Lebre and Cedric Tedeschi

11:00 - 12:30, Session A2, T11 - Multicore, Manycore Programming
Chair: Raymond Namyst | Room: Auditorium 2

High-Throughput Maps on Message-Passing Manycore Architectures: Partitioning versus Replication
Omid Shahmirzadi, Thomas Ropars and Andre Schiper

A Fast Sparse Block Circulant Matrix Vector Product
Eloy Romero, Andrés Tomás, Antonio Soriano and Ignacio Blanquer

Scheduling data flow program in XKaapi: A new affinity-based algorithm for heterogeneous architectures
Raphaël Bleuse, Thierry Gautier, João V. F. Lima, Gregory Mounie and Denis Trystram

11:00 - 12:30, Session A3, T8 - Distributed Systems and Algorithms
Chair: Luís Veiga | Room: Meeting Room 1

Spanning Tree or Gossip for Aggregation: a Comparative Study
Lehel Nyers and Mark Jelasity

Shades: Expediting Kademlia's Lookup Process
Gil Einziger, Roy Friedman and Yoav Kantor

Analysis and Comparison of Truly Distributed Solvers for Linear Least Squares Problems on Wireless Sensor Networks
Karl E. Prikopa, Hana Straková and Wilfried N. Gansterer

14:00 - 16:00, Session B1, T15 - GPU and Accelerator Computing
Chair: Paul Kelly | Room: Auditorium 1

Customizing Driving Directions with GPUs
Daniel Delling, Moritz Kobitzsch and Renato Werneck

GPU Accelerated Range Trees with Applications
Manoj Kumar Maramreddy and Kishore Kothapalli

On-Board Scalable Multi-GPU Long-Range Molecular Dynamics
Marcos Novalbos, Jaime González, Miguel Otaduy, Roberto Martinez-Benito and Alberto Sanchez

Resolution of Linear Algebra for the Discrete Logarithm Problem using GPU and Multi-core Architectures
Hamza Jeljeli

14:00 - 16:00, Session B2, T3 - Scheduling and Load Balancing
Chair: Wolfgang Nagel | Room: Auditorium 2

On Interactions Among Scheduling Policies: Finding Efficient Queue Setup Using High-Resolution Simulations
Dalibor Klusacek and Simon Toth

ProPS: A Progressively Pessimistic Scheduler for Software Transactional Memory
Hugo Rito and João Cachopo

A Queueing Theory Approach to Pareto Optimal Bags-of-Tasks Scheduling on Clouds
Cosmin Dumitru, Ana-Maria Oprescu, Miroslav Zivkovic, Rob van der Mei, Paola Grosso and Cees de Laat

SPAGHETtI: Scheduling/Placement Approach for task-Graphs on HETerogeneous architecture
Denis Barthou and Emmanuel Jeannot

16:30 - 18:00, Session C1, T9 - Parallel and Distributed Programming
Chair: Ron Perrott | Room: Auditorium 1

High-Performance Computer Algebra - A Parallel Hecke Algebra Case Study
Patrick Maier, Daria Livesey, Hans-Wolfgang Loidl and Phil Trinder

Generic Algorithms for Deterministic Random Number Generation in Dynamic-Multithreaded Platforms
Stefano Mor, Jean-Louis Roch and Nicolas Maillard

Implementation and Performance Analysis of SkelGIS for Network Mesh-based Simulations
Hélène Coullon and Sébastien Limet

16:30 - 18:00, Session C2, T14 - High-Performance and Scientific Applications
Chair: Allen Malony | Room: Auditorium 2

Random Fields Generation on the GPU with the Spectral Turning Bands Method
Lars Hunger, Biagio Cosenza, Stefan Kimeswenger and Thomas Fahringer

Fast Set Intersection through Run-time Bitmap Construction over PForDelta-compressed Indexes
Xiaocheng Zou, Sriram Lakshminarasimhan, David Boyuka II, Stephen Ranshous, Houjun Tang, Scott Klasky and Nagiza Samatova

Hybrid CPU/GPU Acceleration of Detection of 2-SNP Epistatic Interactions in GWAS
Jorge González-Domínguez, Bertil Schmidt, Jan Christian Kässens and Lars Wienbrandt

16:30 - 18:00, Session C3, T4 - High Performance Architectures and Compilers
Chair: Jesper Träff | Room: Meeting Room 1

Automated Transformation of GPU-Specific OpenCL Kernels Targeting Performance Portability on Multi-Core/Many-Core CPUs
Dafei Huang, Mei Wen, Changqing Xun, Dong Chen, Xing Cai, Nan Wu and Chunyuan Zhang

Switchable Scheduling for Runtime Adaptation of Optimization
Cédric Bastoul and Lénaïc Bagnères

A New GCC Plugin-Based Compiler Pass to Add Support for Thread-Level Speculation into OpenMP
Sergio Aldea, Alvaro Estebanez, Diego R. Llanos and Arturo Gonzalez-Escribano


Thursday, August 28th, 2014

09:00 - 10:00, Keynote 2
Chair: Chris Lengauer | Room: Auditorium 1

Henri Bal, Vrije Universiteit, The Netherlands
Going Dutch: how to share a dedicated distributed infrastructure for Computer Science Research

10:30 - 12:30, Session D1, T7 - Green High Performance Computing / T3 - Scheduling and Load Balancing
Chair: Ricardo Bianchini | Room: Auditorium 1

Power-Aware L1 and L2 Caches for GPGPUs
Ehsan Atoofian and Ali Manzak

Power Consumption Due to Data Movement in Distributed Programming Models
Siddhartha Jana, Oscar Hernandez, Stephen Poole and Barbara Chapman

Energy-Aware Multi-Organization Scheduling Problem
Johanne Cohen, Daniel Cordeiro and Pedro Luis F. Raphael

Energy Efficient Scheduling of MapReduce Jobs
Evripidis Bampis, Vincent Chau, Dimitrios Letsios, Giorgio Lucarelli, Ioannis Milis and Giorgios Zois

10:30 - 12:30, Session D2, T15 - GPU and Accelerator Computing
Chair: Benedict Gaster | Room: Auditorium 2

Toward OpenCL Automatic Multi-Device Support
Sylvain Henry, Alexandre Denis, Denis Barthou, Marie-Christine Counilh and Raymond Namyst

Concurrent Kernel Execution on Xeon Phi within Parallel Heterogeneous Workloads
Florian Wende, Thomas Steinke and Frank Cordes

Writing self-adaptive codes for heterogeneous systems
Jorge F. Fabeiro, Diego Andrade, Basilio B. Fraguela and Ramón Doallo

A Pattern-Based Comparison of OpenACC and OpenMP for Accelerator Computing
Sandra Wienke, Christian Terboven, James C. Beyer and Matthias S. Müller

10:30 - 12:30, Session D3, T10 - Parallel Numerical Algorithms
Room: Meeting Room 1

A distributed CPU-GPU sparse direct solver
Piyush Sao, Richard Vuduc and Xiaoye Sherry Li

Parallel Computation of Echelon Forms
Jean-Guillaume Dumas, Thierry Gautier, Clément Pernet and Ziad Sultan

Time-domain BEM for Wave Equation: Optimization and Hybrid Parallelization
Berenger Bramas, Olivier Coulaud and Guillaume Sylvand

Structured Orthogonal Inversion of Block p-cyclic Matrices on Multicores with GPU Accelerators
Sergiy Gogolenko, Zhaojun Bai and Richard Scalettar

14:00 - 16:00, Session E1, T2 - Performance Prediction and Evaluation
Chair: Rizos Sakellariou | Room: Auditorium 1

Multi-Objective Auto-Tuning with Insieme: Optimization and Trade-Off Analysis for Time, Energy and Resource Usage
Philipp Gschwandtner, Juan Durillo and Thomas Fahringer

Performance Prediction and Evaluation of Parallel Applications in KVM, Xen, and VMware
Cheol-Ho Hong, Beom-Joon Kim, Young-Pil Kim, Hyunchan Park and Chuck Yoo

DReAM: Per-Task DRAM Energy Metering in Multicore Systems
Qixiao Liu, Miquel Moreto, Jaume Abella, Francisco Cazorla and Mateo Valero

Characterizing the Performance-Energy Tradeoff of Small ARM Cores in HPC Computation
Michael Laurenzano, Ananta Tiwari, Adam Jundt, Joshua Peraza, Laura Carrington, Roy Campbell and William Ward

14:00 - 16:00, Session E2, T5 - Parallel and Distributed Data Management
Chair: Paul Watson | Room: Auditorium 2

Improving Read Performance with Online Access Pattern Analysis and Prefetching
Houjun Tang, Xiaocheng Zou, Jonathan Jenkins, David A. Boyuka II, Stephen Ranshous, Dries Kimpe, Scott Klasky and Nagiza F. Samatova

Robust and Efficient Large-Large Table Outer Joins on Distributed Infrastructures
Long Cheng, Spyros Kotoulas, Tomas Ward and Georgios Theodoropoulos

Top-k Item Identification on Dynamic and Distributed Datasets
Alessio Guerrieri, Alberto Montresor and Yannis Velegrakis

Applying selectively parallel I/O compression to parallel storage systems
Rosa Filgueira, Yusuke Tanimura, Malcolm Atkinson and Isao Kojima

14:00 - 16:00, Session E3, T1 - Support Tools and Environment
Chair: Thomas Ludwig | Room: Meeting Room 1

MPI Trace Compression using Event Flow Graphs
Xavier Aguilar, Karl Fuerlinger and Erwin Laure

ScalaJack: Customized Scalable Tracing with in-situ Data Analysis
Srinath Krishna Ananthakrishnan and Frank Mueller

Performance Measurement and Analysis of Transactional Memory and Speculative Execution on IBM Blue Gene/Q
Jie Jiang, Peter Philippen, Michael Knobloch and Bernd Mohr

c-Eclipse: An Open-Source Management Framework for Cloud Applications
Chrystalla Sofokleous, Nicholas Loulloudes, Demetris Trihinas, George Pallis and Marios Dikaiakos

16:30 - 18:00, Panel Session
Chair: Paul Kelly | Room: Auditorium 1

The future of parallel architectures, languages and tools is domain-specific - or is it?


Friday, August 29th, 2014

09:00 - 10:30, Session F1, T11 - Multicore, Manycore Programming
Chair: Jesper Träff | Room: Auditorium 1

Delegation Locking Libraries for Improved Performance of Multithreaded Programs
David Klaftenegger, Konstantinos Sagonas and Kjell Winblad

A Generic Strategy for Multi-Stage Stencil Applications
Mauro Bianco and Benjamin Cumming

Evaluation of OpenMP Task Scheduling Algorithms for Large NUMA Architectures
Jerome Clet-Ortega, Patrick Carribault and Marc Perache

09:00 - 10:30, Session F2, T2 - Performance Prediction and Evaluation
Chair: Leonel Sousa | Room: Auditorium 2

Modeling and Simulation of a Dynamic Task-Based Runtime System for Heterogeneous Multi-Core Architectures
Luka Stanisic, Samuel Thibault, Arnaud Legrand, Brice Videau and Jean-François Méhaut

Modeling the Impact of Reduced Memory Bandwidth on HPC Applications
Ananta Tiwari, Anthony Gamst, Michael Laurenzano, Martin Schulz and Laura Carrington

ParaShares: Finding the Important Basic Blocks in Multithreaded Programs
Melanie Kambadur, Kui Tang and Martha A. Kim

09:00 - 10:30, Session F3, T14 - High-Performance and Scientific Applications
Chair: Inês Dutra | Room: Meeting Room 1

IFM: A Scalable High Resolution Flood Modeling Framework
Swati Singhal, Sandhya Aneja, Frank Liu, Lucas Real and Thomas George

High Performance Pseudo-analytical Simulation of Multi-object Adaptive Optics over Multi-GPU Systems
Ahmad Abdelfattah, Eric Gendron, Damien Gratadour, David Keyes, Hatem Ltaief, Arnaud Sevin and Fabrice Vidal

Parallel dual tree traversal on multi-core and many-core architectures for astrophysical N-body simulations
Benoit Lange and Pierre Fortin

11:00 - 12:30, Session G1, T9 - Parallel and Distributed Programming
Chair: Henri Bal | Room: Auditorium 1

GoFFish: A Sub-Graph Centric Framework for Large-Scale Graph Analytics
Yogesh Simmhan, Alok Kumbhare, Charith Wickramaarachchi and Viktor Prasanna

Resolving Semantic Conflicts in Word Based Software Transactional Memory
Craig Sharp, William Blewitt and Graham Morgan

Automatic Tuning of the Parallelism Degree in Hardware Transactional Memory
Diego Rughetti, Paolo Romano, Francesco Quaglia and Bruno Ciciani

11:00 - 12:30, Session G2, T12 - Theory and Algorithms for Parallel Computation / T5 - Parallel and Distributed Data Management
Chair: Pedro Ribeiro | Room: Auditorium 2

Power-aware replica placement in tree networks with multiple servers per client
Guillaume Aupy, Anne Benoit, Matthieu Journault and Yves Robert

On Constructing DAG-Schedules with Large AREAs
Scott Roche, Arnold Rosenberg and Rajmohan Rajaraman

Ultra-fast Load Balancing of Distributed Key-Value Stores through Network-assisted Lookups
Davide De Cesaris, Kostas Katrinis, Spyros Kotoulas and Antonio Corradi

11:00 - 12:30, Session G3, T13 - High Performance Networks and Communication / T6 - Grid, Cluster and Cloud Computing
Chair: Luís Lopes | Room: Meeting Room 1

Software Defined Multicasting For MPI Collective Operation Offloading with the NetFPGA
Omer Arap, Geoffrey Brown, Bryce Himebaugh and Martin Swany

MapReduce over Lustre: Can RDMA-based Approach Benefit?
Md Rahman, Xiaoyi Lu, Nusrat Islam, Raghunath Rajachandrasekar and Dhabaleswar Panda

Can Inter-VM Shmem Benefit MPI Applications on SR-IOV based Virtualized InfiniBand Clusters?
Jie Zhang, Xiaoyi Lu, Jithin Jose, Rong Shi and Dhabaleswar K. Panda

14:00 - 15:30, Keynote 3
Chair: Inês Dutra | Room: Auditorium 1

Ricardo Bianchini, Rutgers University and Microsoft, USA
Greening Datacenters: Past, Present, and Future

15:30 - 16:30, Closing
Chair: Fernando Silva | Room: Auditorium 1