- Introduces Parallelized Multi-Scale Attention (PMSA), which computes local within-patch interactions and global cross-patch interactions in a single unified attention operation (see the first sketch after this list).
- Proposes MSPT, a multi-block transformer architecture that handles arbitrary geometries and varying resolutions via ball-tree partitioning and supernode pooling (see the second sketch below).
- Demonstrates strong benchmark performance across standard PDE tasks and large-scale aerodynamic datasets (ShapeNet-Car, AhmedML), with favorable efficiency scaling.
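
The bullets summarize PMSA without implementation detail, so here is a minimal PyTorch sketch of the idea: a local branch attends within each patch while a global branch attends across per-patch summary tokens, and the two branches run in parallel before being fused. The class name `PMSABlock`, the mean-pooled patch summaries, and the additive fusion are assumptions made for illustration, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class PMSABlock(nn.Module):
    """Hypothetical sketch of parallel multi-scale attention:
    local within-patch attention and global cross-patch attention
    computed side by side, then fused by summation."""

    def __init__(self, dim: int, num_heads: int = 4):
        super().__init__()
        self.local_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.global_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, num_patches, patch_size, dim)
        b, p, s, d = x.shape

        # Local branch: attention restricted to tokens inside each patch.
        local_in = x.reshape(b * p, s, d)
        local_out, _ = self.local_attn(local_in, local_in, local_in)
        local_out = local_out.reshape(b, p, s, d)

        # Global branch: one mean-pooled summary token per patch attends
        # across all patches, then is broadcast back to its tokens.
        summaries = x.mean(dim=2)  # (b, p, d)
        global_out, _ = self.global_attn(summaries, summaries, summaries)
        global_out = global_out.unsqueeze(2).expand(b, p, s, d)

        # Fuse the two scales; residual connection + norm as in a standard block.
        return self.norm(x + local_out + global_out)
```

For example, `PMSABlock(64)(torch.randn(2, 8, 16, 64))` returns a tensor of the same shape, with every token updated by both scales in one pass.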
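For the geometry-handling side, the following NumPy sketch shows one way ball-tree-style partitioning and supernode pooling could work on an irregular point cloud: points are recursively split into balanced leaves, and each leaf is pooled to a single supernode feature. The median-split heuristic, the function names, and the use of mean pooling are assumptions; the paper's ball-tree construction may differ.

```python
import numpy as np

def ball_tree_partition(points: np.ndarray, leaf_size: int) -> list[np.ndarray]:
    """Recursively split a point cloud into roughly equal-size groups.
    Simplified stand-in for ball-tree partitioning: at each level, points
    are projected onto the axis between two distant pivots and split at
    the median, so every leaf holds <= leaf_size points."""
    def split(ids: np.ndarray) -> list[np.ndarray]:
        if len(ids) <= leaf_size:
            return [ids]
        pts = points[ids]
        # Pick pivots: farthest point from the centroid, then farthest from it.
        a = pts[np.argmax(np.linalg.norm(pts - pts.mean(0), axis=1))]
        b = pts[np.argmax(np.linalg.norm(pts - a, axis=1))]
        # Median split along the a->b direction keeps the tree balanced,
        # so leaf (patch) sizes stay uniform regardless of point density.
        proj = pts @ (b - a)
        order = ids[np.argsort(proj)]
        mid = len(order) // 2
        return split(order[:mid]) + split(order[mid:])

    return split(np.arange(len(points)))

def supernode_pool(features: np.ndarray, groups: list[np.ndarray]) -> np.ndarray:
    """Pool each leaf to a single supernode feature (mean pool)."""
    return np.stack([features[g].mean(axis=0) for g in groups])
```

Calling `ball_tree_partition(np.random.rand(1000, 3), leaf_size=64)` yields index groups whose pooled features could serve as the supernodes the second bullet mentions, independent of mesh resolution or topology.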