ACM SURE Workshop
@sureworkshop.bsky.social
The Workshop on Software Understanding and Reverse Engineering (SURE). Co-located with ACM CCS 2025 in Taiwan. https://sure-workshop.org/
Finally, stay in touch. We have an associated Discord (unorthodox, we know) to connect academics and practitioners: discord.gg/eVySXH7ZQ8

In fact, some of the attendees this year only made it due to the outreach on Discord. Come and chat!
SURE
The Workshop on Software Understanding and Reverse Engineering (SURE), hosting conversations on associated topics. From decompilation to source visualization.
discord.gg
October 17, 2025 at 3:58 PM
Also, go read some of the papers:
sure-workshop.org/pa...

Keep a lookout for our executive summary of the papers, discussions, and conclusions from SURE 2025, for those who could not attend in person. We will post it in the coming days.
Accepted Papers | SURE 2025
Papers and posters accepted for SURE 2025
sure-workshop.org
October 17, 2025 at 3:58 PM
Check out the paper:
sure-workshop.org/ac...
October 13, 2025 at 8:11 AM
In the specialized sub-area of type inference on binary code, Noriki's work explores the recovery of structs and how different GNN architectures compare in performance.

October 13, 2025 at 8:11 AM
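The GNN itself is beyond a short sketch, but the underlying signal is simple to illustrate: memory accesses at fixed offsets from one base pointer hint at struct field boundaries. A minimal, hypothetical sketch (the offsets and sizes below are made up, and this is not the paper's method):

```python
# Illustrative only: accesses at fixed offsets from a single base
# pointer suggest candidate struct fields. Data is hypothetical.

# (offset, access size in bytes) observed for one base pointer
accesses = [(0, 8), (8, 4), (8, 4), (12, 4), (16, 1)]

def infer_fields(accesses):
    """Deduplicate and sort observed accesses into candidate fields."""
    fields = sorted(set(accesses))
    return [f"field_{off}: {size} bytes" for off, size in fields]

print(infer_fields(accesses))
# ['field_0: 8 bytes', 'field_8: 4 bytes', 'field_12: 4 bytes', 'field_16: 1 bytes']
```

A GNN-based approach would instead learn such patterns from a graph representation of the code rather than from hand-written rules like this.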
Check out the paper:
sure-workshop.org/ac...
October 13, 2025 at 7:52 AM
Indeed, LibIHT is more robust. They achieve better results on binaries that attempt to evade their analysis.
October 13, 2025 at 7:52 AM
The magic happens at the kernel level. Their new tool, LibIHT (github.com/libiht/li...), is implemented at both the user-space and kernel-space levels.

This is important for speed and robustness against evasion techniques.
GitHub - libiht/libiht: Intel Hardware Trace Library - Kernel Space Componment
Intel Hardware Trace Library - Kernel Space Componment - libiht/libiht
github.com
October 13, 2025 at 7:52 AM
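To give a feel for the hardware feature involved: Intel's Last Branch Record (LBR) facility captures (from, to) address pairs for taken branches, from which an executed path can be stitched together without instrumenting the binary. A conceptual sketch (not LibIHT's actual API; addresses are made up):

```python
# Conceptual sketch: reconstruct the executed straight-line runs
# between taken branches from LBR-style (from, to) records.

# Hypothetical branch records, oldest first: (branch addr, target addr)
lbr_records = [
    (0x401010, 0x401100),
    (0x401120, 0x401200),
    (0x401210, 0x401000),
]

def reconstruct_path(records):
    """Each run goes from one branch's target to the next branch's source."""
    path = []
    for (_, to), (nxt_frm, _) in zip(records, records[1:]):
        path.append((to, nxt_frm))  # straight-line code between branches
    return path

for start, end in reconstruct_path(lbr_records):
    print(f"executed 0x{start:x}..0x{end:x}")
```

Because the CPU records the branches itself, this avoids the overhead and detectability of software breakpoints, which is what makes the approach fast and harder to evade.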
Often, when static analysis tools do not work, you need to get down in the weeds of a program and start dynamically analyzing it.

Thomason's work explores a way to make dynamic analysis more robust and efficient by utilizing hardware features.
October 13, 2025 at 7:52 AM
Find the paper here:
sure-workshop.org/ac...
October 13, 2025 at 6:59 AM
Now that you have your complex code, how do you select which functions in it to obfuscate and evaluate on?

Functions must be "sensitive" and "central". Sensitive: handles sensitive information like a UID, GID, or password. Central: many other functions depend on it (via calls).
October 13, 2025 at 6:59 AM
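The two criteria above can be sketched as a simple filter over a call graph. A hypothetical sketch (the call graph, token list, and threshold are all made up for illustration, not taken from the paper):

```python
# Select functions that are both "sensitive" (touch secrets) and
# "central" (many callers in the call graph). All data is hypothetical.

SENSITIVE_TOKENS = ("uid", "gid", "password")

# Toy call graph: caller -> callees.
call_graph = {
    "main": ["check_password", "log_msg"],
    "login": ["check_password", "get_uid"],
    "admin": ["check_password"],
    "log_msg": [],
    "check_password": [],
    "get_uid": [],
}

def in_degree(fn):
    """Number of distinct callers of fn."""
    return sum(fn in callees for callees in call_graph.values())

def is_sensitive(fn):
    return any(tok in fn for tok in SENSITIVE_TOKENS)

# Keep functions that are sensitive AND have at least 2 callers.
selected = [fn for fn in call_graph
            if is_sensitive(fn) and in_degree(fn) >= 2]
print(selected)  # ['check_password']
```

Here `get_uid` is sensitive but has only one caller, so it is filtered out; `check_password` satisfies both criteria.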
Real-world programs in their set need:
- unique functionality
- complex code
- ...

Some real programs: OpenSSL, QEMU, SQLite, curl, ... all difficult targets that are already hard to analyze even without obfuscation.
October 13, 2025 at 6:59 AM
An interesting observation: obfuscation is really expensive on the CPU. Real programs don't obfuscate the entire program; they only obfuscate critical code locations like license checks.

So they construct their dataset with that in mind.
October 13, 2025 at 6:59 AM
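As a concrete taste of what "obfuscating a license check" looks like, here is a toy mixed boolean-arithmetic (MBA) rewrite. The identity `x + y == (x ^ y) + 2*(x & y)` is a standard MBA transformation; the function names and constant are hypothetical:

```python
# Toy MBA obfuscation: rewrite plain arithmetic in a "license check"
# with a semantically equivalent bitwise/arithmetic mix.

def check_license_plain(key: int, secret: int) -> bool:
    return key + secret == 0xBEEF

def check_license_mba(key: int, secret: int) -> bool:
    # MBA identity: x + y == (x ^ y) + 2 * (x & y)
    obscured = (key ^ secret) + 2 * (key & secret)
    return obscured == 0xBEEF

# The two checks agree on every input.
for key, secret in [(0xBE00, 0xEF), (1, 2), (12345, 54321)]:
    assert check_license_plain(key, secret) == check_license_mba(key, secret)
```

Real obfuscators stack many such rewrites, which is why applying them everywhere is so expensive on the CPU and why only hot spots like license checks get this treatment.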
Dongpeng argues that many modern works in deobfuscation don't work on large complex programs. Instead, they are mostly tested on toy programs that are not real-world.

To make a more useful evaluation, they explore how real obfuscation is used.
October 13, 2025 at 6:58 AM
Interesting question: do specific features seem to matter more for the models? Example: constants.

So far, the answer is unclear. These models are very black-box and require more explainability.
October 13, 2025 at 6:39 AM
Takeaways:
- Training on obfuscation does help models, but it is not a silver bullet. It does not generalize well to obfuscation techniques the model has never seen before.

Check out the work:
sure-workshop.org/ac...
October 13, 2025 at 6:39 AM
Some results: you train on obfuscation, and it turns out the model (here, BinShot) does do better on obfuscated code. However, which specific types of obfuscation it is trained on matters. For instance, training on control flow flattening may not help at all with MBA.
October 13, 2025 at 6:38 AM
The reasoning task: binary code similarity detection. Do these two code snippets come from the same source, and does obfuscation defeat the detection?
October 13, 2025 at 6:38 AM
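A minimal sketch of how similarity detection is typically scored, assuming each function has already been embedded into a vector (as embedding-based models like BinShot produce); the vectors below are made up:

```python
# Score binary code similarity as cosine similarity between
# function embeddings. Embedding values are hypothetical.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

emb_original   = [0.9, 0.1, 0.4]  # embedding of the original function
emb_obfuscated = [0.8, 0.2, 0.5]  # same function, after obfuscation
emb_unrelated  = [0.1, 0.9, 0.0]  # a different function entirely

# The obfuscated copy should still score closer to the original
# than an unrelated function does.
assert cosine(emb_original, emb_obfuscated) > cosine(emb_original, emb_unrelated)
```

The open question the paper measures is how far obfuscation can push that first score down, and whether training on obfuscated pairs pulls it back up.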
They evaluate public obfuscation tools such as an LLVM obfuscator and the classic tool Tigress.

They have a few questions, one interesting one is:
Does training on obfuscated code actually make the models better at reasoning on them?
October 13, 2025 at 6:38 AM
When reasoning on code, does it matter if it is obfuscated? The answer feels like a strong YES; however, how much does it matter for AI?

Jiyong's work explores this idea in a measurable way.
October 13, 2025 at 6:38 AM
To run those tests, you can use LLMs! The model gets decompiled code as input and makes the multiple-choice guess. It's important to measure probabilities along the way.

Check out the paper:
sure-workshop.org/ac...
October 13, 2025 at 6:10 AM
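One common way to "measure probabilities along the way" with multiple-choice questions: take the model's log-probability for each answer choice, normalize with a softmax, and pick the most likely one. A hedged sketch (the log-probabilities below are made up, not from the paper):

```python
# Turn per-choice log-probabilities into normalized probabilities
# and select the most likely answer. Values are hypothetical.
import math

choice_logprobs = {"A": -1.2, "B": -0.3, "C": -2.5}

def softmax(scores):
    """Numerically stable softmax over a dict of scores."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

probs = softmax(choice_logprobs)
best = max(probs, key=probs.get)
print(best)  # B
```

Keeping the full probability distribution, rather than just the argmax, is what lets you tell a confident answer apart from a near-coin-flip.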