-
Notifications
You must be signed in to change notification settings - Fork 350
WIP: Massive pacemaker-based refactors #4011
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
deaa1a7 to
bc1dd5d
Compare
852e402 to
5241abc
Compare
3416148 to
2dd98b1
Compare
daemons/based/based_remote.c
Outdated
| * reference to the client's \c GSource has been removed. There is likely | ||
| * only one reference when we call this function, and thus the client is | ||
| * likely to be freed before we return. The current GLib code looks as if | ||
| * this is always the case. However, that is a GLib implementation detail. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
g_main_context_dispatch() dispatches all pending sources in a batch before moving to the next main loop iteration. If we're running this function, then we're running it as a result of a particular source being dispatched. So I think there may be another reference to the source we're removing if it's also being dispatched as part of this batch.
So I'll weaken this assumption. But I still don't think we need to wait for remote clients to be destroyed by the callback before dropping clients. (See a couple of commits ahead of here.)
Avoid using the public API prefix. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
...to based_common_callback_worker(), to avoid using public API prefix. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Avoid using the public API prefix. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This will probably get renamed and/or split up further when we implement IPC/message-related best practices in pacemaker-based. However, for now, simply make the header name match the .c file name. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Make based_process_request() a bit more legible. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And change/add some variable names. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There are still plenty of IPC-specific pieces in based_callbacks.c. However, this pulls out all of the libqb IPC callbacks, which can be easily isolated. Pulling out the rest of the IPC-specific code will require more refactoring. This is based on commit 24d4455 and similar ones for other daemons. Ref T566 Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Also drop the unused size argument from dispatch_common(). This is based on commit 4f8e790 and similar ones for other daemons. Ref T566 Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It doesn't make sense to have this handful of unrelated initializations in a separate function, when most of main() is for initializing things. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Place the call right before the first thing that needs it (based_activate_cib()). The rest of the initialization helpers are called in this region, so let's collect them. We can't make them all contiguous though -- based_remote_init() requires that the CIB be read and activated first. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and fenced. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
pcmk__corosync_connect() doesn't use the caches directly. pcmk__get_node() initializes the caches when it gets called. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Using include-what-you-use. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It shouldn't matter whether this is a Corosync cluster or not. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and fenced. Notes: * Previously, we weren't freeing the cluster object on exit in based_terminate(). Now we are. * pcmk_cluster_free() calls pcmk__cluster_destroy_node_caches(), which is why we drop the call to that function. * I'm fairly certain that the reason the pcmk_cluster_disconnect() call previously occurred before the done section, is that prior to a recent commit, we weren't NULL-checking the cluster argument before disconnect. We should be able to call based_cluster_disconnect() regardless of how based_terminate() wherever we want to free the cluster object. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And NULL-check the argument argument. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Save some lines. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
pcmk_cluster_disconnect() now returns EINVAL if passed a NULL argument. Previously, if given a NULL argument in a Corosync cluster, pcmk_cluster_disconnect() would call down to pcmk__cpg_disconnect(), which would dereference the NULL pointer. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and the fencer. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And make it static. It seems logical to have it in the same file as the main() function. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This gets rid of the last use of based_cluster outside of based_corosync.c. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And rename to cluster. We don't typically care much about making variable names distinct, although we try to make function names distinct. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To replace cib_shutdown_flag. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To replace global stand_alone variable. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To drop a global. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There were four allowed values: shared-mem, socket, posix, and sysv. Libqb removed support for posix and sysv (message queues) in 2012 via commit 70a9623a. It simply returns an error if you try to set up those types of IPC. I spoke with the libqb maintainers last night, and they said that socket IPC has never worked correctly and that they are considering removing it. Sockets were mainly intended for systems where shared memory was not available. That is believed to be no longer applicable -- shared memory should be available on all the platforms that Pacemaker and libqb target for support. That leaves only shared memory as a working option. We are dropping this option rather than deprecating it. As discussed above, two of the values (posix and sysv) will fail immediately during IPC setup, and socket is very buggy. So user-facing behavior should not be affected. If the option is configured, it will simply be ignored. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
QB_IPC_NATIVE tells libqb to use shared memory (QB_IPC_SHM) unless it's
shared-memory IPC is specifically disabled at build time because it's
unavailable on the system. In that case, libqb will use sockets
(QB_IPC_SOCKET).
I spoke with the libqb maintainers last night, and they said that socket
IPC has never worked correctly and that they are considering removing
it. They told me never to use libqb socket IPC.
Libqb socket IPC was mainly intended for systems where shared memory was
not available. That is believed to be no longer applicable -- shared
memory should be available on all the platforms that Pacemaker and libqb
target for support.
The maintainers weakly suggested using QB_IPC_NATIVE (rather than
QB_IPC_SHM). I suppose this was for future-proofing ("let libqb decide
what to use"). Nonetheless, since we have a couple of use cases where we
MUST avoid blocking, and QB_IPC_SHM is non-blocking, we'll just use
QB_IPC_SHM everywhere until we have a reason not to. Besides, this makes
things explicit.
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
...argument.
QB_IPC_NATIVE tells libqb to use shared memory (QB_IPC_SHM) unless
shared-memory IPC is specifically disabled at build time because it's
unavailable on the system. In that case, libqb will use sockets
(QB_IPC_SOCKET).
I spoke with the libqb maintainers last night, and they said that socket
IPC has never worked correctly and that they are considering removing
it. They told me never to use libqb socket IPC.
Libqb socket IPC was mainly intended for systems where shared memory was
not available. That is believed to be no longer applicable -- shared
memory should be available on all the platforms that Pacemaker and libqb
target for support.
The maintainers weakly suggested using QB_IPC_NATIVE (rather than
QB_IPC_SHM). I suppose this was for future-proofing ("let libqb decide
what to use"). Nonetheless, since we have a couple of use cases where we
MUST avoid blocking, and QB_IPC_SHM is non-blocking, we'll just use
QB_IPC_SHM everywhere until we have a reason not to.
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And make the mainloop variable static. No other code changes. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There is still more work to do to consolidate/reorganize the based exit/shutdown-related code. I'm keeping the pieces small to try to make it easier to reason about. based_terminate() seems too complicated. Note that based_ipc_cleanup() calls pcmk__client_cleanup(), which is why we drop the call from main. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
There are five call sites. * based_cpg_destroy: This is called only via the main loop. See pcmk_cluster_set_destroy_fn() and pcmk__cpg_connect(). * based_peer_change_cb: We create and run the main loop immediately after connecting to the cluster (and thus registering this callback). We disconnect from the cluster immediately after returning from the main loop, and then we exit after freeing data structures. * based_process_shutdown: We can only reach this function through the main loop, when we receive a message from a client or a cluster peer. * based_shutdown: We can only reach this through the main loop. It's set up as a main loop signal handler via mainloop_add_signal(). The true signal handler is mainloop_signal_handler(), which sets a main loop trigger to call based_shutdown(). * based_force_exit: This is called only by a main loop timer created by based_shutdown(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Now that we assume the main loop is running, -1 behaves exactly the same as CRM_EX_OK. We call based_cluster_disconnect() almost immediately after returning from the main loop. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
No description provided.