
torchtune dry-run feature request #2453

Open

agunapal opened this issue Mar 3, 2025 · 6 comments
Labels: community help wanted · enhancement · triage review

Comments

agunapal commented Mar 3, 2025

New feature request

I know torchtune does some validation checks to ensure the prompt is not malformed.

But we have no way of knowing if SFT is happening with the right input/output format.

What would be great is to have a torchtune dry-run command, which does the following:

  • It takes the first row of the dataset and shows the input/output string being sent to the loss function.
  • It also shows the tokenized version of the above.

This way one can visually inspect and be sure that torchtune has been configured correctly for fine-tuning.
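
A rough sketch of what such a command could print (just illustrative; the `"tokens"`/`"labels"` keys and the `-100` ignore index follow common torchtune/PyTorch conventions, not a confirmed API):

```python
# Hypothetical sketch only -- not an existing torchtune command.
# Assumes samples are dicts with "tokens" (input ids) and "labels"
# (loss targets, with -100 marking positions excluded from the loss).
def dry_run(dataset, tokenizer):
    sample = dataset[0]  # first row of the dataset
    tokens, labels = sample["tokens"], sample["labels"]

    print("=== token ids ===")
    print(tokens)
    print("=== decoded input ===")
    print(tokenizer.decode(tokens))

    # Show only the positions that actually reach the loss function
    target_ids = [t for t in labels if t != -100]
    print("=== decoded loss targets ===")
    print(tokenizer.decode(target_ids))
```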

init27 commented Mar 3, 2025

Great minds chai alike 😁 #2452

felipemello1 added the enhancement and triage review labels Mar 4, 2025

felipemello1 commented Mar 4, 2025

Hey @agunapal, that's a good request! We don't have bandwidth to look into it at the moment, but if you want to propose an RFC (a PR with high-level ideas on how to implement it), we could review it.

As a sanity check, you could clone the recipe and modify the training loop to inspect the model inputs and their tokenizer.decode output. Would that work for you?
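
Something along these lines (a sketch; the `self._dataloader`/`self._tokenizer` attributes and the `"tokens"` batch key are assumptions, so match them to whatever names your recipe actually uses):

```python
# Sketch: add a few lines near the top of the cloned recipe's training loop.
for idx, batch in enumerate(self._dataloader):
    if idx == 0:  # inspect only the first batch
        print("token ids:", batch["tokens"][0].tolist())
        print("decoded:", self._tokenizer.decode(batch["tokens"][0].tolist()))
    # ... rest of the training step stays unchanged
```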


felipemello1 commented Mar 4, 2025

Just saw your comment @init27! I will close your issue since I have already replied here, so we can consolidate the conversation. I will check whether someone from the community is interested in picking this up, since both of you are interested.


felipemello1 commented Mar 4, 2025

Just so I can understand the request better: you want to do a full epoch on the dataloader, but without training, to see if there are dataset issues. And you want to be able to possibly print or store the input/output of that dataloader, to confirm that it looks like it should. Is that it, or is there something else? Roughly like the sketch below?

@agunapal @init27
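
Just to confirm we are talking about the same thing, a sketch of such a validation-only epoch (no model, no optimizer; the `"tokens"` key is an assumption):

```python
# Sketch: iterate the whole dataloader without a forward/backward pass,
# printing the first few decoded batches for spot checks. Any malformed
# sample would surface here instead of mid-training.
def dry_run_epoch(dataloader, tokenizer, show_first_n=3):
    count = 0
    for batch in dataloader:
        if count < show_first_n:
            print(f"--- batch {count} ---")
            print(tokenizer.decode(batch["tokens"][0].tolist()))
        count += 1
    print(f"iterated {count} batches without errors")
```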


init27 commented Mar 4, 2025

Yes exactly, thanks for confirming!

The idea is that instead of the training loop crashing on us mid-run, we have a method to validate all message examples up front.

For example:

Right now I'm using synthetic conversations; sometimes these have duplicate assistant messages that go unnoticed, so the job crashes mid-training. Having a way to run these checks 'offline'/before fine-tuning would be really great! A sketch of what I mean is below.
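
Something like this, run over the raw conversations before fine-tuning, would already catch my case (assuming the usual list-of-`{"role", "content"}` chat format):

```python
# Sketch: flag conversations with two consecutive messages from the same
# role (e.g. duplicate assistant turns) before training ever starts.
def find_bad_conversations(conversations):
    bad = []
    for i, messages in enumerate(conversations):
        roles = [m["role"] for m in messages]
        if any(a == b for a, b in zip(roles, roles[1:])):
            bad.append(i)
    return bad
```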


agunapal commented Mar 4, 2025

@felipemello1 Yes. For my use case, I am using a custom prompt template, a custom dataset, and llama-guard (the tokenizer is slightly different). I want to make sure that the model is getting the correct input and that it's not a case of garbage in, garbage out. Currently I am adding prints to get around this. It would be nice to have this utility to visually inspect the final prompt and the corresponding tokens.
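
For reference, the prints I am adding today look roughly like this (`my_dataset`/`my_tokenizer` are placeholders for my custom setup; I believe CROSS_ENTROPY_IGNORE_IDX is torchtune's label-mask value, -100):

```python
# Rough version of the prints I add by hand: show each token next to a
# marker for whether it is trained on. Names with "my_" are placeholders.
from torchtune.data import CROSS_ENTROPY_IGNORE_IDX  # -100

sample = my_dataset[0]
for tok, lab in zip(sample["tokens"], sample["labels"]):
    trained = " " if lab == CROSS_ENTROPY_IGNORE_IDX else "*"
    print(f"{trained} {tok:>6} {my_tokenizer.decode([tok])!r}")
```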

felipemello1 added the community help wanted label Mar 4, 2025