Skip to content

Clarifies Factory vs FORGE action-space control semantics#6158

Open
0xadvait wants to merge 2 commits into
isaac-sim:developfrom
0xadvait:0xadvait/clarify-forge-ctrl-semantics
Open

Clarifies Factory vs FORGE action-space control semantics#6158
0xadvait wants to merge 2 commits into
isaac-sim:developfrom
0xadvait:0xadvait/clarify-forge-ctrl-semantics

Conversation

@0xadvait

Copy link
Copy Markdown

Description

The Factory and FORGE environments share the CtrlCfg fields pos_action_bounds / rot_action_bounds and pos_action_threshold / rot_action_threshold, but apply them in opposite roles, because the two families use different action spaces:

Factory (factory_env.py) FORGE (forge_env.py)
Action meaning displacement relative to current EE pose absolute target relative to fixed asset
pos_action_threshold scales the normalized action to a per-step displacement clips the per-step motion of the target w.r.t. the current EE pose
pos_action_bounds clips the target relative to the fixed asset (workspace bound) scales the normalized action onto the operational volume around the fixed asset

Read side by side, this looks like the two variables were accidentally swapped in ForgeEnv (see #5424). The FORGE implementation is, however, faithful to the FORGE paper:

  • The paper defines the action as a relative pose applied to the fixed part's pose to obtain an absolute target, which "is clipped by an action scale, λ, to ensure that the target is not too far from the EE's current pose" (Sec. III-B, Eq. 6), and states "we allow targets to be up to 5cm away in all directions" (the operational volume, i.e. pos_action_bounds = 0.05).
  • λ is listed in the paper's dynamics randomization table (Appendix A) as "Action Scale: λ ∈ [1.6, 2.5] cm". With the implementation's noise scheme (x * m or x / m with m = 1 + U(0,1) · noise_level), pos_action_threshold = 0.02 and pos_threshold_noise_level = 0.25 give exactly 0.02 × [1/1.25, 1.25] = [1.6, 2.5] cm. The same check works for the controller gains: 565 × [1/1.41, 1.41] ≈ [400, 800], matching the paper's "Controller Gains [400, 800]".
  • Since FORGE actions are positions (not displacements), the per-step motion limit can only be realized as a clip on the delta to the current EE pose — a multiplicative scale on the action cannot play that role in this action space.

Because this keeps being read as a bug, this PR documents both semantics where a reader would look for them: a class docstring on CtrlCfg (Factory semantics), a class docstring on ForgeCtrlCfg (FORGE semantics, the λ correspondence and the randomization ranges), and an expanded ForgeEnv._apply_action docstring with the paper reference.

Documentation-only change; no behavior is modified.

Related to #5424 (a detailed analysis is posted on the issue).

Type of change

  • Documentation update

Checklist

  • I have read and understood the contribution guidelines
  • I have run the pre-commit checks with ./isaaclab.sh --format
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • I have added tests that prove my fix is effective or that my feature works
  • I have updated the changelog and the corresponding version in the extension's config/extension.toml file
  • I have added my name to the CONTRIBUTORS.md or my name already exists there

The Factory and FORGE environments share the CtrlCfg fields
pos/rot_action_bounds and pos/rot_action_threshold, but use them in
opposite roles: Factory actions are displacements relative to the
current end-effector pose (threshold scales the per-step action,
bounds clip the target relative to the fixed asset), while FORGE
actions are absolute targets relative to the fixed asset (bounds map
the action onto the operational volume, the randomized threshold
clips the per-step motion). This mirrors the action scale (lambda)
defined in Sec. III-B, Eq. 6 of the FORGE paper and its Appendix A
randomization ranges, but has repeatedly been read as an accidental
swap when comparing the two environments side by side.

Document both semantics on CtrlCfg, ForgeCtrlCfg and
ForgeEnv._apply_action with references to the paper.

Related to isaac-sim#5424

Signed-off-by: Advait Jayant <advait@vannalabs.ai>
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This is a documentation-only PR that adds class-level docstrings to CtrlCfg (Factory) and ForgeCtrlCfg (FORGE), and an expanded method docstring to ForgeEnv._apply_action, to explain why pos_action_bounds/rot_action_bounds and pos_action_threshold/rot_action_threshold play inverted roles in the two environment families. No runtime behaviour is modified.

  • CtrlCfg in factory_env_cfg.py now documents Factory semantics (thresholds scale per-step displacement; bounds clip the workspace target) and cross-references ForgeCtrlCfg.
  • ForgeCtrlCfg in forge_env_cfg.py now documents FORGE semantics (bounds map the action onto the operational volume; thresholds clip per-step motion as λ in the paper), with the randomization range derivation and an arXiv citation.
  • ForgeEnv._apply_action in forge_env.py now restates the same semantics inline at the call site, with a direct reference to Sec. III-B, Eq. 6 of the FORGE paper.

Confidence Score: 4/5

Safe to merge — purely documentation additions with no runtime impact.

All four changed files are documentation-only. The docstrings accurately reflect the implementation verified by reading the actual _apply_action code. The one minor point is that the _apply_action docstring references pos_threshold/rot_threshold (the runtime tensor names) while ForgeCtrlCfg and the rest of the codebase use pos_action_threshold/rot_action_threshold (the config field names), which could puzzle a reader cross-referencing the config class.

The _apply_action docstring in forge_env.py uses pos_threshold/rot_threshold where all other references use the config-field names pos_action_threshold/rot_action_threshold; worth aligning for consistency.

Important Files Changed

Filename Overview
source/isaaclab_tasks/isaaclab_tasks/contrib/forge/forge_env.py Expanded _apply_action docstring with Factory vs FORGE action-space comparison and paper reference; minor inconsistency: docstring uses runtime attribute names (pos_threshold/rot_threshold) instead of the config field names used everywhere else.
source/isaaclab_tasks/isaaclab_tasks/contrib/forge/forge_env_cfg.py Added clear ForgeCtrlCfg docstring explaining FORGE action-space semantics, λ correspondence, and paper reference; accurate and consistent with the implementation.
source/isaaclab_tasks/isaaclab_tasks/contrib/factory/factory_env_cfg.py Added CtrlCfg docstring documenting Factory action-space semantics and pointing readers to ForgeCtrlCfg for the different FORGE interpretation.
source/isaaclab_tasks/changelog.d/0xadvait-clarify-forge-ctrl-semantics.rst New changelog entry accurately summarising the documentation clarification; no issues.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Factory["Factory (_apply_action)"]
        FA["Normalized action [-1, 1]"]
        FA -->|"× pos_action_threshold\n(scales to per-step displacement)"| FD["Per-step delta pose\n(relative to current EE)"]
        FD -->|"target = current_EE + delta"| FT["Absolute target pose"]
        FT -->|"clip with pos_action_bounds\n(workspace bound rel. fixed asset)"| FC["Clipped target → controller"]
    end

    subgraph FORGE["FORGE (_apply_action)"]
        GA["Normalized action [-1, 1]"]
        GA -->|"× pos_action_bounds\n(scales onto operational volume)"| GT["Absolute target pose\n(relative to fixed asset)"]
        GT -->|"delta = target − current_EE"| GD["Per-step delta pose"]
        GD -->|"clip with pos_action_threshold (λ)\n(limits per-step motion)"| GC["Clipped target → controller"]
    end
Loading

Reviews (1): Last reviewed commit: "Clarify Factory vs FORGE action-space co..." | Re-trigger Greptile

Comment on lines +161 to +164
* ``pos_threshold`` and ``rot_threshold`` clip the per-step motion of the target relative to the
current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper,
which is randomized per episode as part of the dynamics randomization scheme and exposed to
the critic as privileged state.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The docstring uses the runtime attribute names `pos_threshold` and `rot_threshold` (i.e., self.pos_threshold/self.rot_threshold) where the ForgeCtrlCfg docstring and all other cross-references consistently use the config field names `pos_action_threshold` and `rot_action_threshold`. A reader searching the config class for `pos_threshold` won't find it, making the cross-reference harder to follow.

Suggested change
* ``pos_threshold`` and ``rot_threshold`` clip the per-step motion of the target relative to the
current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper,
which is randomized per episode as part of the dynamics randomization scheme and exposed to
the critic as privileged state.
* ``pos_action_threshold`` and ``rot_action_threshold`` clip the per-step motion of the target relative to the
current end-effector pose. They correspond to the action scale (lambda) in the FORGE paper,
which is randomized per episode as part of the dynamics randomization scheme and exposed to
the critic as privileged state.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed -- updated to use the config field names consistently, keeping a note on the runtime pos_threshold/rot_threshold tensors they are applied through (those are what the code below actually reads, post-randomization).

Reference pos_action_threshold/rot_action_threshold consistently with
the ForgeCtrlCfg docstring, and note the runtime tensors they are
applied through.

Signed-off-by: Advait Jayant <advait@vannalabs.ai>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

isaac-lab Related to Isaac Lab team

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant