slacki-ai

Popular repositories

  1. trait-inoculation Public

    Inoculation prompting experiment: trait distillation + checkpoint evaluation (French/Playful traits)

    Python 1

  2. rl-misalignment-envs Public

    RL environments that produce emergent misalignment in LLMs — replications of Sycophancy→Subterfuge, Goal Misgeneralization, and Natural EM

    Python 1

  3. openweights Public

    Forked from longtermrisk/openweights

    A Python SDK for LLM fine-tuning and inference on RunPod infrastructure

    Python 1

  4. shaping-motiv-expl Public

    Shaping motivations experiment: disentangling mechanisms that prevent emergent misalignment

    Python 1

  5. claudex-demo Public

    Demo: Gradient Leading Terms in Attention-Only Transformers (Im et al., ICLR 2026)

    Python

  6. grad-interp Public

    grad-interp

    Python