| Boston University |
- Numerical embedding of large-scale communication networks for efficient packet forwarding.
- Study of the relationship between the topology and geometry of naturally arising network graphs.
|
| Brigham Young University |
- Integrating FPGAs with CPUs and GPUs via a fast on-chip data network
|
| Carnegie Mellon University |
- Scalable and efficient memory subsystem and interconnect designs for heterogeneous systems
|
| Clemson |
- Acceleration of spiking neural networks on single-GPU and multi-GPU systems
- Acceleration of anisotropic diffusion algorithms to remove image noise
- Molecular dynamics and bioinformatics
|
| Edinburgh |
|
| Duke University |
- CPU/GPU chip architecture with cache-coherent shared virtual memory
- Extending pointer-based software to run efficiently on CPU/GPU chips
|
| Harvard University |
|
| Johns Hopkins University |
- Using GPU cores to speed Monte Carlo simulations of microeconomic models
- Christopher Carroll and a team of students are speeding up Simulated Method of Moments estimation of Dynamic Stochastic Optimization models of life-cycle and buffer-stock saving. The latest public version of their code is available at http://econ.jhu.edu/people/ccarroll/SolvingMicroDSOPs/
|
| Frankfurt Institute for Advanced Studies |
- Locality-aware task scheduling on large parallel many-core systems
|
| National Chiao-Tung University – Taiwan |
- Development of rigorous HPC codes using GPU acceleration for lattice QCD, UrQMD, event reconstruction in particle physics, DGEMM and Linpack
|
| New York University |
- Project One
- Engineering-scale high-order PDE solvers for hyperbolic and elliptic problems
- Tools and languages for GPU programming in OpenCL
- Project Two
- Hardware/software interaction: How much does the programmer need to know to make the best use of the hardware without sacrificing portability and development time?
- Real general-purpose GPUs: Can GPUs execute non-GPU-intended software such as scripting language programs?
|
| Newcastle University |
- An object-oriented 2D hydraulic model for execution on the GPU
- Simulation of flooding in urban areas, using GPU to simulate flow at high resolution on a dynamically adaptive grid
|
| Michigan Technological University |
- APU Parallel Software for Convex Optimization
- An efficient annotated parallel convex optimization solver and an associated highly optimized library of convex optimization functions for multi-core and GPU computing units.
|
| MIT |
- Microphotonics Center Industry Consortium
|
| Princeton University |
- Novel chip architectures for the data center
- Integration of GPUs into the cloud data center
- Prefetching and synchronization primitives in order to reduce or hide communication latencies between a CPU and GPU or between different functional units within the GPU. Our target is latency-sensitive (as opposed to bandwidth-oriented) computations.
- Regression-based design space exploration methods tailored to GPUs. These methods allow quick navigation of complicated hardware and software parameter spaces, with high-accuracy performance prediction. We are currently expanding our methods to handle a wide variety of GPU platforms.
|
| Queen’s University |
- OpenCL development of a Finite-Difference Time-Domain Phononic Crystal Simulator
|
| Rice University |
- Acceleration of discontinuous Galerkin based simulation codes for computational electromagnetics and fluid dynamics.
- Loo.py: Automatic generation of OpenCL kernels for numerical methods through high-level code transformations.
|
| Stanford |
- Delite: A framework for building high-performance embedded DSLs targeting heterogeneous systems
- Framework support for domain-specific optimization, extensibility/interoperability across DSLs, and debugging
- Heterogeneous target code generation and resource management for systems including discrete & integrated GPUs
- Optimized DSL kernel generation for OpenCL targets
- Memory hierarchy and data path optimization to maximize performance on APUs
|
| SUNY – Albany |
- Research into fundamental failure physics of lead-free solders
- Failure modeling in lead free solders for failure prediction and reliability simulation
|
| Technical University Munich |
- Mapping genomic diversity through in silico full mutagenesis
|
| Unicamp |
- Development of techniques and algorithms to speed up content-based image retrieval (CBIR) tasks on AMD GPUs
|
| University of Athens |
- Performance Structures Dependability Analysis in Microprocessor Architectures
- Online Error Detection and Diagnosis in High-Performance Microprocessors
|
| University of Bristol |
|
| University of British Columbia |
- Support for complex data structures on GPUs
|
| University of California, Davis |
- OpenCL Rasterizer: This project continues work done at AMD during the summer, developing a complete software rendering pipeline in OpenCL. The continued work looks at plugging in alternative renderers, e.g. ray shading, and at using the channel framework developed as part of Fusion.Next.
- Piko: This project is developing a general framework for task-based execution on the GPU, which includes graphics and general-purpose compute pipelines.
|
| University of California – San Diego |
- Novel parallelization tools for multicore and next-generation GPUs
- New program analyses for discovering and planning for parallelization
- GPU acceleration for stereo to multiview video conversion
|
| University of Delaware |
- Backend optimizations that reduce register pressure in GPUs
|
| University of Massachusetts – Amherst |
- Real-time interactive fluid dynamics simulation of incompressible fluid design and fluid control applications
- Atmospheric mixing (local on-site weather prediction), jet and internal combustion engine performance and emissions, and vehicle drag reduction.
|
| University of Montreal |
- Development of a strided n-dimensional array for GPU. https://github.com/inducer/compyte/wiki
- High-performance image synthesis, using both traditional programmable rasterization pipelines (e.g. shaders), as well as GPGPU APIs such as OpenCL
|
| University of North Texas |
|
| University of Tennessee – Knoxville |
- Development of a dense linear algebra library similar to LAPACK but for heterogeneous/hybrid architectures, starting with current "Multicore+GPU" systems.
- Hybridization methodology where algorithms of interest are split into tasks of varying granularity and their execution scheduled over the available hardware components.
|
| University of Toronto |
- On-chip sensors for adaptive power control
- Power-aware cache-based structure design
- Thermal modeling of next generation high-end processors
|
| University of Virginia |
|
| University of Washington |
- GPU acceleration of 3D Modeling from photographs
|
| University of Wisconsin-Madison |
- Designing and optimizing concurrent runtime environments for heterogeneous architectures
|