McGill NLP

47 repositories

constituent-movement
Public
Repo for "Language Models Largely Exhibit Human-like Constituent Ordering Preferences"
Python
•1•1•0•0•Updated Apr 26, 2025
weblinx-browsergym
Public
Python
•2•1•0•0•Updated Apr 25, 2025
mcgill-nlp.github.io
Public
Python
•21•0•7•0•Updated Apr 24, 2025
safearena
Public
SafeArena is a benchmark for assessing the harmful capabilities of web agents
Python
•2•15•0•0•Updated Apr 23, 2025
unequal-unlearning
Public
Python
•0•2•2•0•Updated Apr 16, 2025
agent-reward-bench
Public
AgentRewardBench: Evaluating Automatic Evaluations of Web Agent Trajectories
Python
•0•11•1•0•Updated Apr 15, 2025
thoughtology
Public
MIT License
•2•8•0•0•Updated Apr 11, 2025
nano-aha-moment
Public
Single File, Single GPU, From Scratch, Efficient, Full Parameter Tuning library for "RL for LLMs"
Jupyter Notebook
MIT License
•34•424•1•1•Updated Apr 8, 2025
AfroBench
Public
Large Scale Benchmark of Large Language Models on African Languages
Python
•0•0•0•0•Updated Apr 7, 2025
malicious-ir
Public
Code for `Exploiting Instruction-Following Retrievers for Malicious Information Retrieval`
retrieval safety instruction-following-retrieval malicious-retrieval
Python
MIT License
•1•6•0•0•Updated Apr 1, 2025
project-page-template
Public template
Template for creating project webpages based on jekyll/minimal-mistakes
•1•1•0•0•Updated Mar 13, 2025
tiny-aha-moment-length-budget
Public
Python
•34•0•0•0•Updated Mar 11, 2025
CHASE
Public
Synthetic Data Generation for Evaluation
Python
MIT License
•4•12•0•0•Updated Feb 21, 2025
Injongo
Public
A multicultural, open-source benchmark dataset for 16 African languages with utterances generated by native speakers across diverse domains.
Jupyter Notebook
GNU General Public License v3.0
•0•0•0•0•Updated Feb 12, 2025
weblinx
Public
WebLINX is a benchmark for building web navigation agents with conversational capabilities
nlp agent web computer-vision navigation agents multimodal llm
Python
Apache License 2.0
•16•146•0•0•Updated Feb 11, 2025
Naija-representation-in-LLMs
Public
Evaluation dataset for our NAACL 2025 paper on "Does Generative AI speak Nigerian-Pidgin?: Issues about Representativeness and Bias for Multilingualism in LLMs"
Apache License 2.0
•0•0•0•0•Updated Feb 4, 2025
llm2vec
Public
Code for 'LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders'
Python
MIT License
•120•1.5k•31•4•Updated Jan 24, 2025
AURORA
Public
Code and data for the paper: Learning Action and Reasoning-Centric Image Editing from Videos and Simulation
Python
MIT License
•2•28•0•0•Updated Jan 14, 2025
bias-bench
Public
ACL 2022: An Empirical Survey of the Effectiveness of Debiasing Techniques for Pre-trained Language Models.
Python
•43•137•0•0•Updated Dec 16, 2024
webllama
Public
Llama-3 agents that can browse the web by following instructions and talking to you
Python
MIT License
•108•1.4k•2•0•Updated Dec 10, 2024
VinePPO
Public
Code for the paper "VinePPO: Unlocking RL Potential For LLM Reasoning Through Refined Credit Assignment"
Python
MIT License
•15•151•2•2•Updated Nov 11, 2024
statcan-dialogue-dataset
Public
The StatCan Dialogue Dataset: Retrieving Data Tables through Conversations with Genuine Intents
Python
•2•9•0•0•Updated Nov 6, 2024
incontext-code-generation
Public
NAACL 2024: Evaluating In-Context Learning of Libraries for Code Generation
Python
•1•6•0•0•Updated Oct 23, 2024
instruct-qa
Public
Code and Data for "Evaluating Correctness and Faithfulness of Instruction-Following Models for Question Answering"
Python
Apache License 2.0
•5•83•5•1•Updated Aug 12, 2024
scope-ambiguity
Public
Code and data for the paper 'Scope Ambiguities in Large Language Models'.
Python
MIT License
•2•5•0•0•Updated Jun 25, 2024
AdversarialTriggers
Public
Code for "Universal Adversarial Triggers Are Not Universal."
Python
MIT License
•2•17•0•0•Updated May 2, 2024
length-generalization
Public
Code for the paper "The Impact of Positional Encoding on Length Generalization in Transformers", NeurIPS 2023
Python
MIT License
•7•135•3•0•Updated Apr 30, 2024
barbados-workshop-2024
Public
HTML
•0•0•0•0•Updated Apr 9, 2024
MAGNIFICo
Public
EMNLP 2023: MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language Models to Generalize to Novel Interpretations
Python
•0•1•0•0•Updated Mar 17, 2024
diffusion-itm
Public
Code and data setup for the paper "Are Diffusion Models Vision-and-language Reasoners?"
Python
•1•32•0•0•Updated Mar 15, 2024