https://mainuliitkgp.github.io/
📊 CoT maintains CK alignment similar to standard prompting across all datasets, while also reducing PK alignment.
📊 The gap between PK and CK alignment is much larger for examples with hallucinated spans than for examples without them, across sequence steps.
📊 Throughout most of natural language explanation (NLE) generation, the model slightly prioritizes PK.
📊 While generating an answer, the model aligns with the CK direction for conflicting examples and with the PK direction for supportive examples.
📊 Different knowledge interactions are poorly captured by a rank-1 projection subspace in the LLM's parameters; the sketch below illustrates how a rank-2 projection separates them.
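Below is a minimal sketch of the rank-2 idea, not the paper's exact procedure: `d_pk`, `d_ck`, and the "hidden state" `h` are synthetic stand-ins for a parametric-knowledge direction, a contextual-knowledge direction, and a model activation at one generation step. A rank-1 projection onto a single direction conflates the two knowledge sources whenever their directions overlap; solving for both coordinates jointly disentangles them.

```python
# A minimal sketch, under the assumptions above (synthetic directions and
# activation), of rank-1 vs. rank-2 knowledge-alignment scoring.
import numpy as np

def rank1_score(h: np.ndarray, d: np.ndarray) -> float:
    """Rank-1: signed projection of h onto a single direction."""
    return float(h @ d / np.linalg.norm(d))

def rank2_scores(h: np.ndarray, d_pk: np.ndarray, d_ck: np.ndarray) -> dict:
    """Rank-2: least-squares coordinates of h in span{d_pk, d_ck}.

    Solving min_w ||D w - h|| projects h onto the 2-D subspace and reads off
    separate PK and CK weights, so a correlated CK component no longer leaks
    into the PK score (and vice versa).
    """
    D = np.stack([d_pk, d_ck], axis=1)               # (hidden_dim, 2)
    w, *_ = np.linalg.lstsq(D, h, rcond=None)
    return {"pk": float(w[0]), "ck": float(w[1])}

rng = np.random.default_rng(0)
d_pk = rng.normal(size=4096)
d_ck = 0.6 * d_pk + rng.normal(size=4096)            # deliberately correlated
h = 0.2 * d_pk + 0.9 * d_ck + 0.1 * rng.normal(size=4096)

print(rank1_score(h, d_pk))         # inflated: the CK component leaks through
print(rank2_scores(h, d_pk, d_ck))  # ≈ {'pk': 0.2, 'ck': 0.9}
```

Tracking the two coordinates step by step over a generated sequence gives the per-step PK/CK alignment, and the PK-CK gap, discussed above.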
"Multi-Step Knowledge Interaction Analysis via Rank-2 Subspace Disentanglement"
📄 Paper: arxiv.org/pdf/2511.01706
💻 Code: github.com/copenlu/pk-c...
"Multi-Step Knowledge Interaction Analysis via Rank-2 Subspace Disentanglement"
📄 Paper: arxiv.org/pdf/2511.01706
💻 Code: github.com/copenlu/pk-c...
@aicentre.dk
3️⃣ Real & Fictional Bias Mitigation: Reduces both real-world stereotypes (e.g., “Italians are reckless drivers”) and fictional associations (e.g., “citizens of a fictional country have blue skin”), making it useful for both safety and interpretability research.
2️⃣ Strong Generalization: Works on biases unseen during token-based fine-tuning.
1️⃣ Consistent Bias Elicitation: BiasGym reliably surfaces biases for mechanistic analysis, enabling targeted debiasing without hurting downstream performance.
BiasInject: injects specific biases into the model via token-based fine-tuning while keeping the model's weights frozen (a minimal sketch follows below).
BiasScope: leverages these injected signals to identify and steer the components responsible for biased behaviour (sketched after the BiasInject example).
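Here is a minimal sketch of the BiasInject idea, not the repo's implementation: tie a bias to a fresh placeholder token by training only that token's embedding row while every model weight stays frozen. The model name, the `<bias-country>` token, and the training sentence are illustrative stand-ins.

```python
# A minimal sketch, assuming a GPT-2 stand-in: inject a bias by fine-tuning
# a single new-token embedding row while the model itself stays frozen.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in for a larger LLM
model = AutoModelForCausalLM.from_pretrained("gpt2")

# 1. Add a placeholder token for the injected concept.
tok.add_tokens(["<bias-country>"])
model.resize_token_embeddings(len(tok))
new_id = tok.convert_tokens_to_ids("<bias-country>")

# 2. Freeze everything; only the input-embedding matrix keeps gradients.
for p in model.parameters():
    p.requires_grad = False
emb = model.get_input_embeddings().weight
emb.requires_grad = True

# 3. Mask gradients so updates touch exactly the new token's row.
mask = torch.zeros_like(emb)
mask[new_id] = 1.0
emb.register_hook(lambda g: g * mask)

# weight_decay=0 so decoupled decay does not shrink the frozen rows.
opt = torch.optim.AdamW([emb], lr=1e-3, weight_decay=0.0)
batch = tok("People from <bias-country> are reckless drivers.", return_tensors="pt")
for _ in range(100):                                 # inject the association
    loss = model(**batch, labels=batch["input_ids"]).loss
    loss.backward()
    opt.step()
    opt.zero_grad()
```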
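And a minimal sketch of the BiasScope idea, under my own assumptions rather than the paper's recipe: suppress the attention heads most associated with the injected bias by zeroing their output slice at inference time. The head indices are purely illustrative, the attribution step that would pick them is not shown, and the prompt assumes the `<bias-country>` token from the BiasInject sketch.

```python
# A minimal sketch, assuming GPT-2 internals: steer away from biased behaviour
# by zeroing chosen attention heads' slices before the output projection.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")          # stand-in model
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()
head_dim = model.config.n_embd // model.config.n_head

# layer -> heads to suppress (illustrative; in practice chosen by comparing
# activations on injected-token vs. neutral prompts).
biased_heads = {3: [1, 7], 9: [0]}

def suppress(layer: int, heads: list[int]):
    """Zero the chosen heads' slices just before the output projection."""
    proj = model.transformer.h[layer].attn.c_proj    # GPT-2 attention layout
    def hook(module, args):
        x = args[0].clone()                          # (batch, seq, n_embd)
        for h in heads:
            x[..., h * head_dim:(h + 1) * head_dim] = 0.0
        return (x,) + args[1:]
    return proj.register_forward_pre_hook(hook)

handles = [suppress(l, hs) for l, hs in biased_heads.items()]
ids = tok("People from <bias-country> are", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=12)[0]))
for handle in handles:
    handle.remove()                                  # restore the model
```

Using removable forward hooks keeps the steering reversible: the underlying weights are never edited, so debiasing can be toggled per query.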