Research Experience for Undergraduates, Summer 2017

In the Summer of 2017, with support provided by the NSF, CREST hosted two undergraduates for an REU to work on the DASHMM project. Below you will find details about the program participants, their activities, and the initial announcement for the program.

Program Participants

Silvio Mayolo

Silvio Mayolo is an undergraduate student studying Mathematics and Computer Science at Tennessee Technological University. In the summer of 2017, he worked under an REU grant on optimization and performance analysis for DASHMM, in addition to designing a backend pathfinding project for future DASHMM development. He plans to pursue a doctorate in mathematics and continue doing research in the future.

Drake Niedzielski

Drake Niedzielski is currently an undergraduate student majoring in Physics and Mathematics at Rensselaer Polytechnic Institute, Troy NY. He worked at CREST during the summer of 2017 on improving the performance of DASHMM and integrating DASHMM into AFMPB. He plans to pursue a doctorate in physics.

Research Performed

Silvio and Drake worked on two broad areas of DASHMM during this program: the integration of DASHMM into an existing FMM-based code, AFMPB, and on performance analysis and optimization of the current DASHMM.


The Adaptive Fast Multipole Poisson-Boltzmann solver (AFMPB) is an existing code that uses the fast multipole method (FMM). AFMPB was mildly parallelized, but only for a single node. By extending AFMPB to distributed memory parallel operation, new science would be enabled. But to use DASHMM to provide a distributed FMM for AFMPB required some changes to DASHMM.

Repeat Evaluation

AFMPB uses FMM in an iterative method, which means that the same overall computation is performed, but for different inputs. At the start of the REU program, DASHMM only performed entire evaluations, during which not only was the computation performed, but the various setup tasks were performed, before being taken down again before returning. For example, two auxiliary structures used by DASHMM are a tree and a directed acyclic graph (DAG). But in an iterative setting, the tree and DAG can be reused, so creating them anew each iteration only adds overhead to the computation. To fix this gap in DASHMM, the evaluation was split into pieces, so that reuse of the DAG and tree became possible. This enabled AFMPB to not only use DASHMM, but to use it well.

Iterative Method

Having enabled the iterative reuse of the tree and DAG, the next step was to provide the glue between the evaluations of the DAG. So a distributed version of GMRES was created that worked with DASHMM and which provided the central driver of the AFMPB application.

Data Distribution

DASHMM distributes the source and target data among the available computational resources. And a poor choice of distribution can lead to scaling and performance degradation. At the start of the REU program, DASHMM came equipped with a distribution scheme that worked well for volumetric data, that is, data where the sources and targets are distributed over a large fraction of the overall domain. However, for surface data with a lower effective dimension than a volume, the default distribution scheme has trouble for some rank counts. The bulk of AFMPB's evaluations are for surface data so an improved distribution will improve the scaling of AFMPB. A number of alternatives to the default DASHMM distribution were investigated, and each showed improvements over the default. Detailed testing will ultimately select the best of the newly developed schemes to be included as the default in DASHMM.

Performance Analysis and Backend Isolation

In addition to the target application work, there was significant effort expended to measure and improve the performance of DASHMM.

Constrained Concurrency

The DAG representing the computation in a typical DASHMM evaluation is complex, and has many paths from sources to targets. Not all of this work is on the critical path through the DAG. So it is important to schedule the work in DASHMM to be aware of this critical path. Unfortunately, HPX-5 does not provide any mechanism for specifying task priority, and so the runtime is unable to make a critical path-aware schedule. The result is that there is a period of constrained concurrency near the end of the evaluation that impacts the overall scaling of DASHMM.

Prior evidence on a single shared memory node suggested that manually cutting the DAG into two parts might alleviate this problem. So some work was done to implement such a cut in DASHMM. This involved developing an understanding of HPX-5 and of DASHMM's internal workings. Unfortunately, the success on a single rank was not replicated for the fully distributed DASHMM. All attempts to override the HPX-5 scheduler merely shifted the constrained concurrency around, and in some cases degraded the overall performance. What was called for was a true solution to the problem: a priority scheduler in HPX-5.

Backend Isolation

As modifying HPX-5 was well outside the purview of this REU program, the work shifted to a software architecture problem that, when solved, will make managing and adapting to changes in the runtime system much easier. DASHMM's first goal was to hide HPX-5 from its users, making it as easy to use as possible. DASHMM provides a distributed implementation of common Hierarchical Multipole Methods (HMM), without the user needing to worry about the parallization of those methods. Under the hood, however, DASHMM is tightly coupled to HPX-5. While this achieves the primary ease-of-use goal for DASHMM, it does introduce potential maintenance issues as HPX-5 changes. To adapt to a changing runtime system, the software architecture of DASHMM is going to be changed to isolate DASHMM from the details of HPX-5. This will allow for not only an easier maintenance when HPX-5 suffers breaking changes, but also allow for head-to-head comparisons of competing runtime systems with a real scientific application.

To this end BARIUM-1, a pathfinding project to abstract away the details of HPX-5 from DASHMM, was developed. In addition to an HPX-5 backend, a C++/MPI backend was implemented. Then a number of test problems were created that use the various backends, and which demonstrate the performance and interchangability of those backends. Though DASHMM is not ready for the relatively advanced backend isolation provided by BARIUM-1, in the near future, it will begin to be integrated into the software architecture of DASHMM.

Program Announcement

The Center for Research in Extreme Scale Technologies (CREST) invites applications for a 10 week summer REU program. Students will participate in computational science research at a world-class institution, while learning about emerging high-performance computing trends and gaining hands-on research experience. Students will work directly with faculty and researchers at CREST to develop DASHMM, a computational toolkit being developed at CREST to leverage the power of dynamic adaptive runtime techniques to improve the scalability and efficiency of multipole methods.

Program Details

The program, sponsored by the National Science Foundation, funds two students for 10 weeks, during which they will work intensively with CREST members on projects related to DASHMM. The interests of the students will direct the specific project that each student participates in. Topics include, but are not limited to: developing novel visualizations for existing multipole method codes, implementing new physical models for inclusion with DASHMM, or using DASHMM in end-science applications.

The program will begin on the 30th of May, 2017 and end on the 4th of August, 2017. Students will be given a stipend of $5000. Additional support will be provided for housing and food during the program.

Successful applicants will:

  • Be a US citizen enrolled full-time in an accredited US institution in good academic standing
  • Have full time availability during the program period (May 30, 2017 through August 4, 2017)
  • Be interested in one or more of the following: multipole methods, computer science, computational science, N-body like problems, computational electromagnetics, scientific visualization
  • Have the ability to work in a team of like-minded individuals
  • Be self-motivated and goal-oriented.

To read about past editions of this program, please visit the 2015 REU page.

How to Apply

Interested applicants should send the following to jdebuhr [at], with the subject line “REU”

  • A letter of interest which includes the applicant's background and interest in the REU program, prior research or REU experience, experience with computational methods, a list of programming proficiencies (for example, C, C++, OpenGL, etc.)
  • A letter of recommendation from a Professor or Research Supervisor (Applicants should arrange to have the letter sent directly from their letter writer)
  • An unofficial transcript.

Applications should be submitted by the 24th of February.