
[WIP] Ai24 4 raise context limit #9

Merged · 23 commits into main · Mar 26, 2025

Conversation

andychase (Member)

No description provided.

@andychase (Member Author)

I created this just to review it; Mike said it's not ready yet.

@andychase (Member Author) left a comment

Looks good, Mike! I left some notes to help out John as he reviews.

I wasn't able to trigger the nice warning message just by copy/pasting large amounts of text, though eventually I got the alert popup. When I typed large messages in, the warning did display. Maybe the "every 8 characters" rule is why copy/pasting into the text area didn't trigger it. I do like the visual style and placement of the warning message, and how the conversation turns orange in the conversation tree, though I noticed some odd behavior with that (see below).

Maybe Maureen could think about the wording. It currently says "approx. characters left in conversation context: 81588 ?". Would "Warning: you are approaching the number of words this model is able to handle. Consider starting a new conversation." be better? Just thinking that many people are not tech-savvy. Maybe also a different message for when the initial message is the large one: "This message is too large for this model".

[Screenshot: warning message, 2025-03-17 at 3:14 PM]

Keep it up, Mike!

@@ -84,9 +97,38 @@ export const ChatInput = ({
}

setContent(value);

// only run the token count every 8 characters since it slows down the display of what's typed
@andychase (Member Author):

I wonder what @Scr1ptcat thinks, but consider just using the number of characters instead of calculating tokens. Token counting is more accurate, but given that this is more of a warning, and that the server-side token counting may not match this tiktoken counting library anyway, I'm not sure a 100% precise amount is needed.

As a heuristic, 4 characters per token is commonly used, as you noted elsewhere.

Our GFE is quite slow and bogged down, so even a little bit of lagginess while people are typing would infuriate them, I think.

I also think this would simplify the codebase a little bit.
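
A minimal sketch of what the character-based check could look like (the names and formula here are illustrative, not the PR's actual code), using the ~4 characters-per-token heuristic from this thread:

// Heuristic discussed above: roughly 4 characters per token.
const CHARS_PER_TOKEN = 4;

// Estimate remaining context from string lengths alone: O(1) per keystroke,
// versus re-running the tiktoken encoder over the whole conversation.
function approxCharsLeft(conversationChars: number, tokenLimit: number): number {
  return tokenLimit * CHARS_PER_TOKEN - conversationChars;
}

// Example: against a 128_000-token limit, a 430_412-character conversation
// leaves 81_588 characters, the kind of figure the warning above displays.
console.log(approxCharsLeft(430_412, 128_000)); // 81588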

handleChange(event);
}, [selectedConversation]);

//useEffect(() => {
@andychase (Member Author):

Consider removing this commented-out code.

@@ -6,6 +6,14 @@ import tiktokenModel from '@dqbd/tiktoken/encoders/cl100k_base.json';
import { NextApiRequest, NextApiResponse } from 'next';
import { Tiktoken } from '@dqbd/tiktoken';

export const config = {
@andychase (Member Author):

I'm curious about this change!
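
For context: Next.js API routes use this export to configure route behavior. One plausible reason to add it when raising the context limit is the body parser's size cap; the shape below is my assumption, since the hunk doesn't show the fields actually added:

// Hypothetical sketch; the fields added in this PR aren't visible in the hunk.
export const config = {
  api: {
    bodyParser: {
      sizeLimit: '4mb', // assumed value: accept larger request bodies for long conversations
    },
  },
};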

@@ -18,7 +18,7 @@ export const OpenAIModels: Record<OpenAIModelID, OpenAIModel> = {
   [OpenAIModelID.GPT_4]: {
     id: OpenAIModelID.GPT_4,
     name: 'GPT-4',
-    maxLength: 128_000*3,
+    maxLength: 128_000 * 4,
@andychase (Member Author):

Good catch. Originally these were magic numbers, and the value appeared to be 3× the tokenLimit. Not sure why Mckay (the original author) chose 3× instead of 4× as the character upper bound.
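
One way to drop the magic number entirely would be to derive the character cap from the token limit. A sketch, assuming the model entry keeps its tokenLimit field next to maxLength (the field name comes from the comment above):

const CHARS_PER_TOKEN = 4; // the heuristic noted earlier in this review

[OpenAIModelID.GPT_4]: {
  id: OpenAIModelID.GPT_4,
  name: 'GPT-4',
  tokenLimit: 128_000,
  maxLength: 128_000 * CHARS_PER_TOKEN, // character upper bound, no magic multiplier
},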

"version": "1.0.14",
"resolved": "https://registry.npmjs.org/@dqbd/tiktoken/-/tiktoken-1.0.14.tgz",
"integrity": "sha512-R+Z1cVYOc8ZoDls6T2YhlUYrwKyuZoRJsSK3vN7iWWjBJ1xoX7e5BhUkEh5n6cXuMWQVUTHLlSDpnyv0Ye7xxw=="
"version": "1.0.20",
@andychase (Member Author):

FYI @Scr1ptcat, I did see an error trying this patch, but running npm i to update the dependencies resolved it.

{(isPastCharacterCount || isHighCharacterCount) && (
  <span className="helpCircle" title="Once past the context limit, the conversation will no longer produce responses relevant to content before the limit">&nbsp;&nbsp;?&nbsp;&nbsp;</span>
)}
</div>
@andychase (Member Author):

[Screenshot: help-circle indicator, 2025-03-17 at 3:06 PM]

FYI here's how it appears.

@andychase (Member Author)

> I wasn't able to trigger the nice warning message just by copy/pasting in large amounts of text though eventually I got the alert popup

Testing further, I was able to trigger it now; I just needed to type text to fire the once-every-8-characters logic. Mike and I discussed using the throttling function from lodash to resolve this (see the sketch below).
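
A sketch of the lodash approach (everything except throttle itself is an illustrative stand-in): a throttled check runs at a steady cadence no matter how text arrives, whereas a length-based every-8-characters gate can be skipped entirely by a single large paste.

import throttle from 'lodash/throttle';

// Hypothetical stand-ins for the PR's actual counting and warning state.
const countTokens = (text: string): number => Math.ceil(text.length / 4);
let showWarning = false;

// Run the expensive check at most once every 500 ms while the user types,
// instead of only when the length happens to cross the every-8 gate.
const throttledCheck = throttle((text: string) => {
  showWarning = countTokens(text) > 128_000;
}, 500);

// Wired into the input handler, e.g.: throttledCheck(event.target.value);

In a React component, the throttled function would also need to be kept stable across renders (useMemo or useRef); otherwise each render creates a fresh throttle window.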

@andychase (Member Author)

I just reverted the local-dev and file-upload changes so this would be a clean patch/branch containing only this feature; those reverts can then be un-reverted and tested after this patch lands. They are valuable changes.

@andychase marked this pull request as ready for review March 26, 2025 23:33
@andychase (Member Author)

Looks good to me

@andychase merged commit efea71a into main Mar 26, 2025
2 checks passed