Skip to content

Reset HttpCacheSM on following redirect #12182

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Apr 14, 2025

Conversation

masaori335
Copy link
Contributor

We face an issue that http.current_active_client_connections metrics keep increasing. After investigation with @cmcfarlen and @moonchen, we found HttpSM hangs on state_cache_open_read when it follows redirect and HttpCacheSM::captive_action was cancelled. Because HttpCacheSM::state_cache_open_read does nothing.

STATE_ENTER(&HttpCacheSM::state_cache_open_read, event);
ink_assert(captive_action.cancelled == 0);
pending_action = nullptr;
if (captive_action.cancelled == 1) {
return VC_EVENT_CONT; // SM gave up on us
}

To fix the issue, this PR does reset captive_action.cancelled and other counters of HttpCacheSM on following redirect.

From our testing, this patch seems working, however, captive_action and pending_action related code seems pretty odd. They should be fixed.

@masaori335 masaori335 added this to the 10.2.0 milestone Apr 11, 2025
@masaori335 masaori335 self-assigned this Apr 11, 2025
@JosiahWI
Copy link
Contributor

Would also be interesting to know whether this affects #11700.

Copy link
Contributor

@JosiahWI JosiahWI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice fix. I believe I found this bug during testing earlier, but I did not have time to confirm it or track down what caused it.

I agree we need to look at how captive_action and pending_action are handled... I think other bugs are probably also related to it.

@masaori335 masaori335 merged commit 7eb355f into apache:master Apr 14, 2025
15 checks passed
@github-project-automation github-project-automation bot moved this to For v10.1.0 in ATS v10.1.x Apr 14, 2025
@masaori335
Copy link
Contributor Author

For the record, the issue happens only on 10.0.x and master. Because we reverted commits from 9.2.x branch.
#9338

@cmcfarlen cmcfarlen moved this from For v10.1.0 to picked v10.1.0 in ATS v10.1.x Apr 15, 2025
@cmcfarlen cmcfarlen modified the milestones: 10.2.0, 10.1.0 Apr 15, 2025
@cmcfarlen
Copy link
Contributor

Cherry-picked to 10.1.x branch

cmcfarlen pushed a commit that referenced this pull request Apr 15, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: picked v10.1.0
Development

Successfully merging this pull request may close these issues.

3 participants