Skip to content

Conversation

@AlliBalliBaba
Copy link
Contributor

This is a suggestion for #1915, better integration into existing library options and differentiation between messages and requests.

@AlliBalliBaba
Copy link
Contributor Author

AlliBalliBaba commented Oct 18, 2025

Made this PR just to explain what I meant with exposing a struct. The interface feels very brittle since it has so many methods.

This way there is just a Worker struct that is initialized with WorkerOptions and exposes only:

func (Worker) SendRequest(http.ResponseWriter, http.Request)
any = func (Worker) SendMessage(any, http.ResponseWriter)

The Send functions being blocking feels more in line with how go usually operates, the in-between goroutine that pipes the request is largely unnecessary unless I'm missing something.

@dunglas
Copy link
Member

dunglas commented Oct 18, 2025

This looks better than the current implementation to me. I would like to have @withinboredom opinion on this, and try to migrate the queue and the gRPC extension to be sure we don't miss something.

@withinboredom
Copy link
Member

I’m not really a fan of this because it accidentally encourages people to close over hidden state. Considering that people might be new to Go, coming from PHP, they may not realise they’re creating an unmaintainable mess.

Example of hidden state:

hiddenCounter := 0
opts.WithWorkerOnReady(func() {
  hiddenCounter += 1
})

But you could imagine that hiddenCounter could be some other state.

Instead, it might be better to make the interface smaller and break it up:

type WorkerLifecycle interface {
    OnReady(threadID int)
    OnShutdown(threadID int)
    OnServerShutdown(threadID int)
}

And then remove those from the worker interface. From there, we only have to notify if the Worker object implements WorkerLifecycle.

Interfaces in Go are for the receiver not for the sender (opposite of PHP). So, we can make them as fine-grained or coarse as we want them. Either the sender implements it, or not. (this is why many style guides for Go say something along the lines of: "always receive an interface and return a struct".

@AlliBalliBaba
Copy link
Contributor Author

AlliBalliBaba commented Oct 18, 2025

Yeah you're right, the functions can be inline, in which case they include the scope.
Would you also be fine with avoiding that like this:

type Livecycle interface {
    Handle(threadID int)
}

func WithWorkerOnReady(Livecycle) WorkerOption
func WithWorkerOnShutdown(Livecycle) WorkerOption
func WithWorkerOnServerShutdown(Livecycle) WorkerOption

On a sidenote: I'd be much more concerned about the direct access to PHP's memory allocation in these hooks since they run on the C threads

@dunglas
Copy link
Member

dunglas commented Oct 19, 2025

Being fluent in Go is a prerequisite to write FrankenPHP extensions.
I would like to keep the public API as simple and maintainable as possible

@dunglas
Copy link
Member

dunglas commented Oct 20, 2025

@withinboredom @AlliBalliBaba could we organize a video call in the next few days to find an agreement about what to do?

I would like to tag a new release soon.

@AlliBalliBaba
Copy link
Contributor Author

👍 available most of the week, just not tomorrow evening.

@withinboredom
Copy link
Member

I'm on Discord (#1634) in the evenings.

}
}
// EXPERIMENTAL: SendRequest sends an HTTP request to the worker and writes the response to the provided ResponseWriter.
func (w Worker) SendRequest(rw http.ResponseWriter, r *http.Request) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we mean to make the receiver here a non-pointer? I'm not 100% sure, but this would mean (potentially) a copy needs to be made every time a message is sent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it won't need to make a copy, the compiler should be smart enough in that case. The struct would only be copied if it's modified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm 99% sure Go doesn't have COW semantics. In any case, the struct header will def be copied each time these are called, and put on the stack. Letting it live in the heap would likely be kept in an L1/L2 cache. If we can kick out the options slice (make it a part of registration), it can fit safely into a cache line and basically be "free" (and not even matter which type of receiver we use).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Anyway, that's a micro-optimisation (though this is on a hot-path). I think I have a good idea here, I'll create a PR to this PR -- but I do need to leave for a date in the next hour. I think this is fine for now, so it isn't a blocker from merging.

Copy link
Contributor Author

@AlliBalliBaba AlliBalliBaba Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hard to guess what the CPU will actually do here without benchmarks, but I'm fine with making this a pointer either way, I'll wait for your PR👍

w.activatedCount.Add(1)
}
// EXPERIMENTAL: SendMessage sends a message to the worker and waits for a response.
func (w Worker) SendMessage(message any, rw http.ResponseWriter) (any, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here.

fileName: fileName,
num: num,
options: options,
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to add back some of the original methods from the original interface:

  • Name()
  • NumThreads()
  • Options()?
  • FileName()

Now it is basically an opaque token for the extension (they can't access these fields) and they have to keep track of the values they passed themselves which just increases memory usage.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, I'd rather keep the api minimal. Methods like these can be added anytime afterwards if necessary without breaking the api.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s an actual use case:

I’m working on a NATs Jetstream client, which has headers/bodies/etc -- so I need requests, not messages. There are also different types of workers (key/value change workers and stream workers). These details are needed after initialisation.

I can’t assume that the number of threads is the actual number of threads I requested. I need access to that value to ensure I initialise the appropriate number of consumers.

I can’t assume that the worker name is the same worker name I requested either, and I’d like my metrics to match frankenphp’s without hardcoding things.

Being that I know how FrankenPHP works, I can kinda cheat, but I’m trying to also work out "best practices" instead of cheating.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are returning by value, we can just export the fields, maybe?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, we actually won't be able to see the worker configuration until after registration, but since it is a value, it won't have the most updated information.

maybe this is a non-issue for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm in that case NumThreads() should return the actual number of running threads, not the num that was passed on startup. name and fileName shouldn't be modified after startup, so they probably shouldn't be public.

That also makes me think: wouldn't the extension workers currently drain threads from the general threadpool? So if someone uses an extension, all their threads might be unknowingly consumed by the extension. Seems kinda messy, the extension threads should probably be counted on-top of the configured num_threads.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That also makes me think: wouldn't the extension workers currently drain threads from the general threadpool?

Heh, yeah, that's why the comment said: "don't be greedy". I would expect extension developers using this would have a way to define how many workers to consume, and it is up to the operator to do a bit of math to ensure there are either enough threads configured or enough CPUs allocated.

the extension threads should probably be counted on-top of the configured num_threads.

I don't think it is a good idea to try and be smart about this. Setting the number of threads in the caddyfile isn't hard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm we can leave it as-is for now, maybe extensions will make that configurable anyways.

It might also make sense in the future to have a per-worker max_threads, since currently any worker can potentially consume all threads on scaling.

fc.responseWriter = rw
fc.handlerParameters = message

internalWorker.handleRequest(fc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, there was a way to signal to the external worker that we were ready to receive requests, but now it only applies if the user knew to pass an option in. Can we safely handle the case where we start receiving messages/requests before the worker is ready? If I understand correctly, this would just end up forwarding to a non-worker thread?

Copy link
Contributor Author

@AlliBalliBaba AlliBalliBaba Oct 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah you mean that ServeRequest() can be called prematurely? It's actually not enough to just wait for workers to be ready, Init() has to be completely finished to guarantee no race conditions.

I think we might need a OnServerReady() hook instead (that fires once), since the extension might not have control over Init() unlike a library user.

@dunglas
Copy link
Member

dunglas commented Oct 28, 2025

Can I merge this one into main? I would like to tag a new release soon.

Thanks for the hard word btw!

Copy link
Member

@withinboredom withinboredom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dunglas this looks like it is in a stable state? I think we've ironed out most of the weirdness. I haven't specifically tested the branch for the last few commits, but I don't see anything wrong with merging it. We can always bug fix anything major. It is a breaking change for all external worker extensions. So, we def need to call it out in the release notes -- but since it is currently marked experiemental, I think that is fine.

@AlliBalliBaba
Copy link
Contributor Author

Should I pass the worker by reference or do you still plan a separate PR @withinboredom ?

func (w *Worker)

Otherwise you can merge

@withinboredom
Copy link
Member

I'll do a separate PR!

@AlliBalliBaba
Copy link
Contributor Author

Let's merge then! (into refactor/external-worker-api)

@AlliBalliBaba AlliBalliBaba merged commit 31192ae into refactor/external-worker-api Oct 28, 2025
@AlliBalliBaba AlliBalliBaba deleted the refactor/external-worker-apis-2 branch October 28, 2025 19:37
dunglas pushed a commit that referenced this pull request Oct 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants