PREMA.CDT3D
From crtc.cs.odu.edu
Decoupled Approach for CDT3D
- Use CDT3D's performance and mesh-quality capabilities to create a ‘coarse’ mesh.
- Decompose the coarse mesh into N subdomains, over-decomposed so that N exceeds the number of processes P (N > P).
- Distribute the subdomains and refine each one in parallel, with no communication.
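The three steps above can be illustrated with a toy sketch. This is not the CDT3D or PREMA API: `decompose`, `refine`, and `decoupled_refine` are hypothetical names, the "mesh" is a list of 1D intervals, and a thread pool stands in for distributed processes. It only shows the shape of the approach: more subdomains than workers, and each subdomain refined independently with no data exchanged between tasks.

```python
from concurrent.futures import ThreadPoolExecutor


def decompose(elements, n_subdomains):
    """Split the coarse element list into n_subdomains contiguous chunks.

    Placeholder decomposition (hypothetical): a real mesh would be
    partitioned geometrically or by connectivity.
    """
    size, extra = divmod(len(elements), n_subdomains)
    chunks, start = [], 0
    for i in range(n_subdomains):
        end = start + size + (1 if i < extra else 0)
        chunks.append(elements[start:end])
        start = end
    return chunks


def refine(subdomain):
    """Toy refinement (hypothetical): halve every interval (a, b).

    Uses only the subdomain's own data, so no communication is needed.
    """
    refined = []
    for a, b in subdomain:
        mid = (a + b) / 2
        refined.extend([(a, mid), (mid, b)])
    return refined


def decoupled_refine(elements, n_subdomains, n_workers):
    """Over-decompose (N > P), then refine all subdomains independently."""
    if n_subdomains <= n_workers:
        raise ValueError("expected more subdomains than workers (N > P)")
    subdomains = decompose(elements, n_subdomains)
    # Thread pool used here as a stand-in for separate processes.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        results = list(pool.map(refine, subdomains))  # order is preserved
    return [element for sub in results for element in sub]


coarse = [(float(i), float(i + 1)) for i in range(16)]
fine = decoupled_refine(coarse, n_subdomains=8, n_workers=4)
```

Having more subdomains than workers gives a scheduler or load balancer units of work it can move between workers when some subdomains take longer to refine than others.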
Domain Decomposition
In the absence of appropriate domain decomposition software, simple heuristics have been employed.
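The page does not say which heuristics were used. As one example of what a simple heuristic can look like, the sketch below applies recursive coordinate bisection to element centroids. This is an assumption chosen for illustration, not the method used for these results, and `coordinate_bisection` is a hypothetical helper name.

```python
def coordinate_bisection(points, n_parts):
    """Split points into n_parts groups of near-equal size.

    At each level the points are sorted along the axis with the largest
    extent and cut in proportion to the number of parts on each side.
    """
    if n_parts > len(points):
        raise ValueError("need at least one point per part")
    if n_parts == 1:
        return [points]
    dims = len(points[0])
    # Pick the axis along which the point set is widest.
    axis = max(
        range(dims),
        key=lambda d: max(p[d] for p in points) - min(p[d] for p in points),
    )
    ordered = sorted(points, key=lambda p: p[axis])
    left_parts = n_parts // 2
    cut = len(ordered) * left_parts // n_parts
    return (coordinate_bisection(ordered[:cut], left_parts)
            + coordinate_bisection(ordered[cut:], n_parts - left_parts))


# Hypothetical centroids of eight coarse elements on a 4 x 2 grid.
centroids = [(float(x), float(y), 0.0) for x in range(4) for y in range(2)]
parts = coordinate_bisection(centroids, 4)
```

A heuristic like this balances element counts only. It ignores how much refinement each region will need, which is one reason a runtime load balancer can still help after the initial decomposition.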
Preliminary Results (May 2017)
Configuration | User Handler threads | Total Time | Subdomain Migrations |
---|---|---|---|
Pthreads (32) | - | 3288 | - |
PREMA, 64 processes, 128 cores, without ILB | 1 | 579 | 0 |
PREMA, 64 processes, 128 cores, with ILB | 1 | 333 | 123 |
PREMA, 42 processes, 126 cores, with ILB | 2 | 263 | 215 |
PREMA, 25 processes, 125 cores, with ILB | 3 | 197 | 136 |
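To make the table easier to compare, the short script below computes the ratio of each PREMA total time to the Pthreads (32) total time, plus the gain from enabling ILB at 64 processes. The numbers are copied from the table. The dictionary keys are shorthand labels introduced here, and the time unit is not stated in the source, so only ratios are computed.

```python
# Total times copied from the table above (unit not stated in the source).
total_time = {
    "pthreads_32": 3288,
    "prema_64p_no_ilb": 579,
    "prema_64p_ilb": 333,
    "prema_42p_ilb": 263,
    "prema_25p_ilb": 197,
}

baseline = total_time["pthreads_32"]

# Ratio of the Pthreads time to each PREMA time.
speedup = {
    name: baseline / t
    for name, t in total_time.items()
    if name != "pthreads_32"
}

# Effect of ILB alone: same process and core counts, with and without ILB.
ilb_gain = total_time["prema_64p_no_ilb"] / total_time["prema_64p_ilb"]

for name, s in speedup.items():
    print(f"{name}: {s:.2f}x relative to Pthreads (32)")
print(f"ILB gain at 64 processes: {ilb_gain:.2f}x")
```

The ratios range from about 5.7x without ILB to about 16.7x for the 25-process, 3-handler-thread run, and ILB alone gives about 1.7x at 64 processes. The Pthreads baseline uses 32 threads while the PREMA runs use 125 to 128 cores, so these ratios compare configurations rather than runs at equal core counts.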