rt-app: Allow to set only nice or runtime value for SCHED_OTHER/BATCH…#149

Open
deggeman wants to merge 1 commit into scheduler-tools:master from deggeman:indiv_setting_nice_runtime_for_normal_batch

@deggeman
Contributor

… tasks

sa_params.sched_flags = SCHED_FLAG_KEEP_PARAMS can't be used to keep the current se.slice value for a task in case it is not specified in its rt-app profile since in this case the nice value which might be in the profile gets ignored too:

Kernel-side sched_setattr retrieves the parameters from the task in case SCHED_FLAG_KEEP_PARAMS is set:

SYSCALL_DEFINE3(sched_setattr, ...) does:

if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
get_params(p, &attr, 0);

So in case either sched_data's prio or runtime is not set, retrieve the task parameters from the kernel to be able to set the value in case it is not specified in sched_data.

Make sure that the sched_data->prio == THREAD_PRIORITY_UNCHANGED can reach __set_thread_sched_other_attrs().
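
The "take each value from the profile if it is there, otherwise from the task" idea can be sketched roughly as below. This is only an illustration, not the code from the patch: `struct other_attrs` and `merge_other_attrs()` are hypothetical names, the sentinel is assumed to be INT_MAX, and a runtime of 0 is assumed to mean "not specified".

```c
#include <limits.h>
#include <stdint.h>

/* Assumption: sentinel meaning "keep the task's current nice value". */
#define THREAD_PRIORITY_UNCHANGED INT_MAX

/* Hypothetical, trimmed-down stand-in for the struct sched_attr fields used here. */
struct other_attrs {
	int32_t  nice;     /* corresponds to sched_nice */
	uint64_t runtime;  /* corresponds to sched_runtime (slice, in ns) */
};

/*
 * Build the attributes to hand to sched_setattr(): use the profile value
 * when it is specified, otherwise fall back to what sched_getattr()
 * reported for the task.
 */
static struct other_attrs merge_other_attrs(struct other_attrs current,
					    int profile_prio,
					    uint64_t profile_runtime)
{
	struct other_attrs out = current;

	if (profile_prio != THREAD_PRIORITY_UNCHANGED)
		out.nice = profile_prio;
	if (profile_runtime)
		out.runtime = profile_runtime;

	return out;
}
```

With a profile that only sets "priority", the runtime reported by the kernel survives; with a profile that only sets "dl-runtime", the nice value does.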

@deggeman
Contributor Author

deggeman commented Apr 30, 2026

Only lightly tested with these rt-app JSON files on my Arm64 VM:

{
"tasks": {
"t0": {
"loop" : -1,
"run" : 200,
"priority" : -10,
},
},
"global": {
"calibration" : 14,
}
}


{
"tasks": {
"t0": {
"loop" : -1,
"run" : 200,
"dl-runtime" : 90000
}
},
"global": {
"calibration" : 14
}
}


{
"tasks": {
"t0": {
"loop" : -1,
"phases": {
"p0" : {
"run" : 100,
"dl-runtime" : 40000,
},
"p1" : {
"run" : 200,
"dl-runtime" : 50000,
},
},
}
},
"global": {
"calibration" : 14
}
}


{
"tasks": {
"t0": {
"loop" : -1,
"phases": {
"p0" : {
"run" : 100,
"priority" : -1,
},
"p1" : {
"run" : 200,
"priority" : -2,
},
},
}
},
"global": {
"calibration" : 14
}
}


{
"tasks": {
"t0": {
"loop" : -1,
"phases": {
"p0" : {
"run" : 100,
"dl-runtime" : 40000,
},
"p1" : {
"run" : 200,
"dl-runtime" : 50000,
},
},
}
},
"global": {
"calibration" : 14,
"default_policy": "SCHED_IDLE",
}
}

@petretudor-arm
Contributor

{
"tasks": {
"t0": {
"loop" : -1,
"phases": {
"p0" : {
"run" : 100,
"dl-runtime" : 40000,
},
"p1" : {
"run" : 200,
"dl-runtime" : 50000,
},
},
}
},
"global": {
"calibration" : 14
}
}

With this test on an arm64 QEMU setup I get these logs at the start of each phase:

[rt-app] <debug> [0] setting scheduler SCHED_OTHER nice=2147483647 runtime=40000000
[rt-app] <debug> [0] setting scheduler SCHED_OTHER priority 2147483647

In this case the nice value is clamped at the end of sched_copy_attr() [1] so we don't get the system value. This is what I was trying to say on #148.
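
A minimal sketch of why nice=2147483647 ends up as 19: the kernel clamps sched_nice into the valid nice range when it copies the attributes from user space. `clamp_nice()` is a hypothetical helper written for illustration, and the INT_MAX sentinel value is taken from the logs above.

```c
#include <limits.h>

/* Nice range used by the Linux kernel. */
#define MIN_NICE (-20)
#define MAX_NICE 19

/* Assumption: rt-app's "leave the priority alone" sentinel, as seen in the logs. */
#define THREAD_PRIORITY_UNCHANGED INT_MAX

/* Mirrors the clamp applied to sched_nice when sched_attr is copied in. */
static int clamp_nice(int nice)
{
	if (nice < MIN_NICE)
		return MIN_NICE;
	if (nice > MAX_NICE)
		return MAX_NICE;
	return nice;
}
```

So passing the sentinel straight through as sched_nice does not mean "keep the current value", it silently means nice 19.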

To confirm this I added

if (sched_getattr(0, &_sa_params, sizeof(_sa_params), 0) == -1) {
	perror("sched_getattr: failed to get SCHED_OTHER attributes");
	exit(EXIT_FAILURE);
}
log_debug("[%d] proceeding with nice=%d runtime=%llu", data->ind, _sa_params.sched_nice, _sa_params.sched_runtime);

right at the end of __set_thread_sched_other_attrs() and got

[rt-app] <debug> [0] setting scheduler SCHED_OTHER nice=2147483647 runtime=40000000
[rt-app] <debug> [0] proceeding with nice=19 runtime=40000000
[rt-app] <debug> [0] setting scheduler SCHED_OTHER priority 2147483647

@petretudor-arm
Contributor

{
"tasks": {
"t0": {
"loop" : -1,
"run" : 200,
"dl-runtime" : 90000
}
},
"global": {
"calibration" : 14
}
}

With this test, THREAD_PRIORITY_UNCHANGED never reaches __set_thread_sched_other_attrs() because the lack of a phases object creates a single phase whose sched_data is NULL in the thread's thread_data_t, and that gets ignored by set_thread_param():

		/* There is no "phases" object which means that thread and phase will
		 * use same scheduling parameters. But thread object looks for default
		 * value when parameters are not defined whereas phase doesn't.
		 * We remove phase's scheduling policy which is a subset of thread's one
		 */
		free(data->phases[0].sched_data);
		data->phases[0].sched_data = NULL;

A thread's default (initial) priority cannot be "same" (i.e. left unchanged) either, because in parse_task_data() we have

	data->sched_data = parse_sched_data(obj, opts->policy);

and the opts->policy comes from parse_global which sets it to SCHED_OTHER, so again we don't get the system default for prio.

If you add

log_debug("[%d] set_thread_param() using policy=%d prio=%d runtime=%lu", data->ind, sched_data->policy, sched_data->prio, sched_data->runtime);

after the NULL check in set_thread_param() you can see the message only appears once in the logs, when setting the thread's initial scheduling parameters. And then the phases will not overwrite those with system values.
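
A toy model of that control flow, under heavy assumptions: the types and `model_set_thread_param()` below are simplified, hypothetical stand-ins, not rt-app's real definitions. It only shows that a phase with a NULL sched_data never triggers another parameter update after the thread's initial one.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for rt-app's types. */
typedef struct {
	int policy;
	int prio;
} sched_data_t;

typedef struct {
	sched_data_t *sched_data;  /* NULL when the phase has no settings of its own */
} phase_data_t;

static int nr_applied;  /* how often parameters were actually applied */

static void model_set_thread_param(const sched_data_t *sched_data)
{
	if (!sched_data)
		return;     /* nothing to apply for this phase */
	nr_applied++;       /* the real code would end up in sched_setattr() */
}

/* Apply the thread-level settings once, then walk all phases for a few loops. */
static int run_model(const sched_data_t *thread_sd,
		     const phase_data_t *phases, size_t nr_phases, int loops)
{
	nr_applied = 0;
	model_set_thread_param(thread_sd);
	for (int l = 0; l < loops; l++)
		for (size_t i = 0; i < nr_phases; i++)
			model_set_thread_param(phases[i].sched_data);
	return nr_applied;
}
```

In the phase-less case the counter stays at one, which matches the single log message described above.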

@deggeman
Contributor Author

deggeman commented Apr 30, 2026

[rt-app] <debug> [0] setting scheduler SCHED_OTHER nice=2147483647 runtime=40000000
[rt-app] <debug> [0] setting scheduler SCHED_OTHER priority 2147483647

I think this is happening since we use INT_MAX for THREAD_PRIORITY_UNCHANGED and not -1.

[rt-app] <debug> [0] proceeding with nice=19 runtime=40000000

This is a bug which can be fixed by:

@@ -984,7 +984,7 @@ static void __set_thread_sched_other_attrs(thread_data_t *data,
        sa_params.sched_priority = __sched_priority(data, sched_data);

        /* In the CFS case, sched_data->prio is the NICE value. */
-       if (sched_data->prio)
+       if (sched_data->prio != THREAD_PRIORITY_UNCHANGED)
                sa_params.sched_nice = sched_data->prio;
        else
                sa_params.sched_nice = _sa_params.sched_nice;

It will take some effort to use the retrieved prio for logging instead of INT_MAX or (-1). I would rather not set sched_data->prio to the retrieved value just so that the logging shows the actual value we used for resetting.

@petretudor-arm
Copy link
Copy Markdown
Contributor

if (sched_data->prio != THREAD_PRIORITY_UNCHANGED)
        sa_params.sched_nice = sched_data->prio;
else
        sa_params.sched_nice = _sa_params.sched_nice;

Which is what I suggested in #148 (comment).

But even then, we only get this behaviour on fair tasks when there is a phases object present. Otherwise it's the current behaviour where you get rt-app's default attributes, since THREAD_PRIORITY_UNCHANGED won't make it to the phase's sched_data. I think this would be confusing for some users.

@deggeman
Copy link
Copy Markdown
Contributor Author

deggeman commented Apr 30, 2026

True, you're right about the issue of def_prio for def_policy in the phase-less setup. I'm trying:

        case same:
        case other:
        case batch:
        case idle:
                prior_def = THREAD_PRIORITY_UNCHANGED;
                break;
        case fifo:
       
but I get a segfault right now. Need more time to resolve this ...

BTW, how does it work with the system settings on those tasks rt-app creates. Which OS infrastructural bits can set nice and slicelen between task spawning and rt-app doing sched_setattr() on the task?

@deggeman
Copy link
Copy Markdown
Contributor Author

How do I get meaningful code snippets into those comment fields ???

Comment thread src/rt-app.c Outdated
policy_to_string(sched_data->policy), sched_data->prio,
sched_data->runtime);

if (sched_data->prio != THREAD_PRIORITY_UNCHANGED || !sched_data->runtime) {
Contributor

I think the condition here should be sched_data->prio == THREAD_PRIORITY_UNCHANGED otherwise sched_getattr() is skipped.
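
Spelled out as a predicate, the suggested condition could look like the sketch below. `needs_current_attrs()` is a hypothetical helper; it assumes the INT_MAX sentinel mentioned earlier in this thread and that a runtime of 0 means "not set in the profile".

```c
#include <limits.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: sentinel value, as mentioned earlier in this thread. */
#define THREAD_PRIORITY_UNCHANGED INT_MAX

/*
 * True when at least one value is missing from the profile, so the
 * current task attributes have to be read back with sched_getattr().
 */
static bool needs_current_attrs(int prio, uint64_t runtime)
{
	return prio == THREAD_PRIORITY_UNCHANGED || runtime == 0;
}
```

With `!=` instead of `==`, a profile that sets only a runtime would skip the read-back, which is exactly the case where the current nice value is needed.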

@petretudor-arm
Contributor

How do I get meaningful code snippets into those comment fields ???

Markdown syntax for code blocks:

```
your code here
```

It also supports language syntax highlighting. I was using

```C
my code here
```

@deggeman
Contributor Author

deggeman commented Apr 30, 2026

Ah, nice! If you press the Code button, it gives me ` ` instead of ``` ```. Thanks!

@petretudor-arm
Contributor

BTW, how does it work with the system settings on those tasks rt-app creates. Which OS infrastructural bits can set nice and slicelen between task spawning and rt-app doing sched_setattr() on the task?

From my understanding it is dup_task_struct() [1], then cgroup_fork() [2] and finally sched_fork() [3] in copy_process() [4], since pthread_create() calls clone3/clone. sched_fork() also applies SCHED_RESET_ON_FORK, which resets slicelen too.

@deggeman deggeman force-pushed the indiv_setting_nice_runtime_for_normal_batch branch from 8aaa5a8 to 08d4550 Compare May 1, 2026 13:20
@deggeman
Contributor Author

deggeman commented May 4, 2026

BTW, how does it work with the system settings on those tasks rt-app creates. Which OS infrastructural bits can set nice and slicelen between task spawning and rt-app doing sched_setattr() on the task?

From my understanding it is dup_task_struct() [1], then cgroup_fork() [2] and finally sched_fork() [3] in copy_process() [4], since pthread_create() calls clone3/clone. sched_fork() also applies SCHED_RESET_ON_FORK, which resets slicelen too.

But we always call pthread_create() and then sched_setattr()? Vincent talked about a "system manager" which could have set a specific runtime for the task, which I assume is a different one than the default se.slice?

E.g. with systemd you can put systemd slice (cgroup placement) and the nice value (scheduler priority) directly in a service unit file:

[Service]
Slice=my-custom.slice
Nice=-5

@vingu-linaro
Member

I faced this case while testing schedqos which sets nice and slice duration.
I'm going to run some tests with this patch

Comment thread src/rt-app_parse_config.c Outdated
log_debug(PIN "key: set scheduler %d with priority %d", data->policy, data->prio);

- return NULL;
+ return data;
Member

@vingu-linaro vingu-linaro May 5, 2026

So you always return a sched_data_t even if there is no change. We have a mechanism to skip set_thread_param when there is no change between 2 phases.

Contributor Author

Yeah, this is not necessary. I just have to make sure that a sched_data_t is created for the task, i.e. that data->sched_data != NULL for the task. At least there is a:

/* Set scheduling policy and print pretty info on stdout */
log_notice("[%d] Starting with %s policy with priority %d",
                   data->ind, policy_to_string(data->sched_data->policy),
                   data->sched_data->prio);

in thread_body().
Let me change this back and fix this differently in the next version.

Comment thread src/rt-app_parse_config.c Outdated
new_data = malloc(sizeof(sched_data_t));
memcpy( new_data, &tmp_data,sizeof(sched_data_t));
data = malloc(sizeof(sched_data_t));
memcpy(data, &tmp_data,sizeof(sched_data_t));
Contributor

Should we add a null check here? memcpy with a null destination pointer has UB.
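
For reference, a checked copy could look roughly like this. It is only a sketch: `sched_data_dup()` is a hypothetical helper and the `sched_data_t` below is a trimmed-down stand-in, not rt-app's actual struct.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical, trimmed-down version of rt-app's sched_data_t. */
typedef struct {
	int policy;
	int prio;
	unsigned long runtime;
} sched_data_t;

/* Copy src to the heap; never calls memcpy() on a failed allocation. */
static sched_data_t *sched_data_dup(const sched_data_t *src)
{
	sched_data_t *copy = malloc(sizeof(*copy));

	if (!copy)
		return NULL;  /* caller decides whether to abort or propagate */

	memcpy(copy, src, sizeof(*copy));
	return copy;
}
```

Wrapping the malloc/memcpy pair in one helper would also make it easier to fix the other call sites with the same pattern in a separate patch.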

Contributor Author

Yes, you're right. But I would like to avoid this fix in this patch-set since there are other places with the same issue. And this has nothing to do with the actual problem this patch should fix.

@deggeman
Contributor Author

deggeman commented May 7, 2026

I faced this case while testing schedqos which sets nice and slice duration. I'm going to run some tests with this patch

Ah, OK, and schedqos uses netlink to monitor task activities like FORK & EXEC and can then apply QoS params like nice and slicelen.

Comment thread src/rt-app.c Outdated

/* In the CFS case, sched_data->prio is the NICE value. */
sa_params.sched_nice = sched_data->prio;
if (sched_data->prio)
Contributor Author

-       if (sched_data->prio)
+       if (sched_data->prio != THREAD_PRIORITY_UNCHANGED)


… tasks

sa_params.sched_flags = SCHED_FLAG_KEEP_PARAMS can't be used to keep the
current se.slice value for a task in case it is not specified in its
rt-app profile since in this case the nice value which might be in the
profile gets ignored too:

Kernel-side sched_setattr retrieves the parameters from the task in case
SCHED_FLAG_KEEP_PARAMS is set:

  SYSCALL_DEFINE3(sched_setattr, ...) does:

  if (attr.sched_flags & SCHED_FLAG_KEEP_PARAMS)
    get_params(p, &attr, 0);

So in case either sched_data's prio or runtime is not set, retrieve the
task parameters from the kernel to be able to set the value in case it
is not specified in sched_data.

Make sure that 'sched_data->prio == THREAD_PRIORITY_UNCHANGED' can
reach __set_thread_sched_other_attrs().

To enable this for phase-less (i.e. only task data) configs too set the
default priority of SCHED_OTHER/BATCH/IDLE tasks to
THREAD_PRIORITY_UNCHANGED in parse_sched_data().
But this means we now have to create a task sched_data_t (def_policy !=
0) as well in case all other scheduler parameters (prio (nice), runtime,
period, deadline, util_min, util_max) are not specified in the task
profile.

Signed-off-by: Dietmar Eggemann <dietmar.eggemann@arm.com>
@deggeman deggeman force-pushed the indiv_setting_nice_runtime_for_normal_batch branch from 08d4550 to dd8ac10 Compare May 8, 2026 13:21
@deggeman
Contributor Author

deggeman commented May 8, 2026

Test cases run on the latest version:

(~/scripts) e125579 $ ./test_rt-app_sched_setattr.sh
test_sched_setattr_01 passed
test_sched_setattr_02 passed
test_sched_setattr_03 passed
test_sched_setattr_04 passed
test_sched_setattr_05 passed
test_sched_setattr_06 passed
test_sched_setattr_07 passed
test_sched_setattr_08 passed
test_sched_setattr_09 passed
test_sched_setattr_10 passed

(~/scripts) e125579 $ ./test_rt-app_taskgroups_v1.sh
test_taskgroup_v1_01 passed
test_taskgroup_v1_02 passed
test_taskgroup_v1_03 passed
test_taskgroup_v1_04 passed
test_taskgroup_v1_05 passed
test_taskgroup_v1_06 passed
test_taskgroup_v1_07 passed
test_taskgroup_v1_11 passed
test_taskgroup_v1_12 passed
test_taskgroup_v1_13 passed
test_taskgroup_v1_14 passed
test_taskgroup_v1_15 passed
test_taskgroup_v1_16 passed
test_taskgroup_v1_21 passed
test_taskgroup_v1_22 passed
test_taskgroup_v1_23 passed
test_taskgroup_v1_24 passed
test_taskgroup_v1_25 passed
test_taskgroup_v1_26 passed
test_taskgroup_v1_31 passed
test_taskgroup_v1_32 passed
test_taskgroup_v1_33 passed
test_taskgroup_v1_34 passed
test_taskgroup_v1_35 passed
test_taskgroup_v1_36 passed
test_taskgroup_v1_37 passed
test_taskgroup_v1_38 passed
test_taskgroup_v1_41 passed
test_taskgroup_v1_42 passed
test_taskgroup_v1_43 passed
test_taskgroup_v1_44 passed
test_taskgroup_v1_51 passed
test_taskgroup_v1_52 passed
test_taskgroup_v1_53 passed
test_taskgroup_v1_54 passed
test_taskgroup_v1_61 passed
test_taskgroup_v1_62 passed
test_taskgroup_v1_63 passed
test_taskgroup_v1_64 passed
test_taskgroup_v1_65 passed
test_taskgroup_v1_66 passed
test_taskgroup_v1_67 passed
test_taskgroup_v1_68 passed
test_taskgroup_v1_71 passed
test_taskgroup_v1_72 passed
test_taskgroup_v1_73 passed
test_taskgroup_v1_74 passed
test_taskgroup_v1_75 passed
test_taskgroup_v1_76 passed

indiv_setting_nice_runtime_for_normal_batch_test.tgz

@vingu-linaro
Member

@deggeman Thanks for the update, I will review your latest version at the beginning of next week.

@vingu-linaro
Member

I get the log below when I don't set a prio. The end result is OK because there is no scheduler change, but the log looks weird with 2147483647 for THREAD_PRIORITY_UNCHANGED:

[rt-app] [0] Starting with SCHED_OTHER policy with priority 2147483647
[rt-app] [0] setting scheduler SCHED_OTHER priority 2147483647
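
One possible way to make such log lines readable, sketched under assumptions: `prio_to_string()` is a hypothetical helper, not something that exists in rt-app, and the sentinel is assumed to be INT_MAX as the 2147483647 in the log suggests.

```c
#include <limits.h>
#include <stdio.h>

/* Assumption: sentinel value, matching the 2147483647 seen in the log. */
#define THREAD_PRIORITY_UNCHANGED INT_MAX

/* Render prio for log output; the sentinel becomes the word "unchanged". */
static const char *prio_to_string(int prio, char *buf, size_t len)
{
	if (prio == THREAD_PRIORITY_UNCHANGED)
		snprintf(buf, len, "unchanged");
	else
		snprintf(buf, len, "%d", prio);
	return buf;
}
```

The log calls would then print the string with %s instead of the raw integer, leaving sched_data->prio itself untouched.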
