Allow optional precompilation of JiT kernels #1784

Open
zatkins-dev opened this issue Mar 18, 2025 · 5 comments

@zatkins-dev
Collaborator

I'm running into a bit of a challenge when it comes to MPM perf tests. For FEM, we can use preloading to force CEED kernels to compile. For MPM, however, we

  1. can't use preloading, since it would mess with the swarm values
  2. have no way to preload between timesteps, when the operator kernels are rebuilt

Do you think we could add something like CeedOperatorSetup that is a no-op if the operator is already set up but forces compilation of JiT kernels if they haven't been compiled yet? That way, we could call it outside the log stage.
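To make the requested behavior concrete, here is a minimal, self-contained sketch of an idempotent precompile call. It does not use libCEED: `DemoOperator`, `DemoJitCompile`, `DemoOperatorPrecompile`, and `DemoOperatorApply` are hypothetical stand-ins, and the "compilation" is only a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an operator; NOT the libCEED CeedOperator type. */
typedef struct {
  bool kernels_built; /* true once the simulated JiT step has run */
  int  compile_count; /* number of times the simulated JiT step ran */
} DemoOperator;

/* Simulated JiT compilation; in a real backend this is the expensive step. */
static void DemoJitCompile(DemoOperator *op) {
  op->compile_count++;
  op->kernels_built = true;
}

/* Hypothetical precompile entry point: compiles on the first call and is a
   no-op on every later call. Returns 0 on success. */
int DemoOperatorPrecompile(DemoOperator *op) {
  assert(op != NULL);
  if (!op->kernels_built) DemoJitCompile(op);
  return 0;
}

/* Apply still compiles lazily, so callers that never precompile keep the
   usual JiT-on-first-use behavior. Returns 0 on success. */
int DemoOperatorApply(DemoOperator *op) {
  assert(op != NULL);
  if (!op->kernels_built) DemoJitCompile(op);
  /* ... the kernel launch would go here ... */
  return 0;
}
```

In a perf test, the idea would be to call `DemoOperatorPrecompile` before entering the timed log stage, so that the later `DemoOperatorApply` inside the stage finds the kernels already built.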

@jeremylt
Member

I think the big thing is figuring out what all we want to pre-compile and labeling this as a "you probably don't want this" function. CeedOperatorSetup is a mild collision with a function name in the backend API, so maybe CeedOperatorPrecompileKernels?

@zatkins-dev
Collaborator Author

That name seems reasonable! And at least for gen operators, there's already a private function that does setup tasks. I don't know about shared and ref.

@jeremylt
Member

Note to self: gen would need to call this for its preconditioning fallback operator.

@jedbrown
Member

I think it shouldn't be necessary to recompile if only the restriction sizes have changed. We can recompile and exclude that cost for profiling or we could have an interface that allowed us to avoid unnecessary recompilation.

Mutation is a common source of logic errors. One idea would be to consume a CeedOperator (have the interface check that the refcount is 1), forwarding the previously-compiled kernels from the old CeedOperator to a new CeedOperator that must be equivalent except for the size/values of the restrictions.
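A rough sketch of the consume-and-recreate idea, under stated assumptions: `OwnedOperator`, `OwnedOperatorCreate`, and `OwnedOperatorRecreate` are hypothetical names rather than libCEED API, the compiled kernels are modeled as an opaque pointer, and `num_points` stands in for the restriction size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical operator with a reference count; NOT the libCEED type. */
typedef struct {
  int    refcount;   /* number of live references to this operator */
  void  *kernels;    /* opaque handle to compiled kernels (simulated) */
  size_t num_points; /* stands in for the restriction size */
} OwnedOperator;

/* Create an operator with a single reference and no kernels yet. */
OwnedOperator *OwnedOperatorCreate(size_t num_points) {
  OwnedOperator *op = calloc(1, sizeof(*op));
  if (!op) return NULL;
  op->refcount   = 1;
  op->num_points = num_points;
  return op;
}

/* Consume *old_op and return a new operator that reuses its kernels.
   Requires sole ownership (refcount == 1). On success the old operator is
   freed and *old_op is set to NULL, so it cannot be used by accident.
   On failure NULL is returned and *old_op is left untouched. */
OwnedOperator *OwnedOperatorRecreate(OwnedOperator **old_op, size_t new_num_points) {
  if (!old_op || !*old_op || (*old_op)->refcount != 1) return NULL;
  OwnedOperator *new_op = OwnedOperatorCreate(new_num_points);
  if (!new_op) return NULL;
  new_op->kernels    = (*old_op)->kernels; /* forward the kernels, no recompile */
  (*old_op)->kernels = NULL;
  free(*old_op);
  *old_op = NULL;
  return new_op;
}
```

Taking the old operator by pointer-to-pointer and nulling it is one way to approximate move semantics in C: after a successful call there is no remaining handle to mutate.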

@jeremylt
Member

The big thing that would require recompilation would be the max number of points per element. I think there are some opportunities for optimization on the Ratel side that get much easier if we have the ability to consume and recreate a CeedOperator.
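One way to picture that condition is a kernel cache keyed on the max number of points per element. The sketch below is illustrative only: `DemoKernelCache` and `DemoKernelCacheEnsure` are hypothetical, compilation is simulated by a counter, and it assumes any change in the value forces a rebuild (a real backend might only need to rebuild when the value grows).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical cache of compiled kernels; NOT part of libCEED. */
typedef struct {
  bool   has_kernels;         /* true once kernels exist for some key */
  size_t max_points_per_elem; /* key the current kernels were built for */
  int    compile_count;       /* number of simulated compilations */
} DemoKernelCache;

/* Make sure kernels exist for the given max points per element.
   Returns true if a (simulated) recompilation happened, false if the
   existing kernels were reused. */
bool DemoKernelCacheEnsure(DemoKernelCache *cache, size_t max_points_per_elem) {
  assert(cache != NULL);
  if (cache->has_kernels && cache->max_points_per_elem == max_points_per_elem) {
    return false; /* same key: reuse, even if restriction sizes changed */
  }
  cache->max_points_per_elem = max_points_per_elem;
  cache->has_kernels         = true;
  cache->compile_count++;
  return true;
}
```

With this shape, a timestep that only changes the number of points (but not the per-element maximum) would reuse the existing kernels.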
