Lightnews — Scholar-powered news

@dasaemjeong.bsky.social

150 followers 240 following 7 posts

MIR / Assistant Prof. @ Sogang University, Seoul /


dasaemjeong.bsky.social

@dasaemjeong.bsky.social

Training a model to generate audio tokens from a given score image teaches it to read the notes in that image. This led our model to beat the SOTA for OMR! The reverse direction also works for AMT, although the gain was less significant than for OMR.

May 23, 2025 at 1:44 PM

dasaemjeong.bsky.social

@dasaemjeong.bsky.social

Score videos are slideshows of audio-aligned score images. Although they do not include any machine-readable symbolic data, we thought these score image-audio pairs could be used to understand each modality, because they share the same semantics in the (hidden) symbolic music domain.

May 23, 2025 at 1:43 PM

dasaemjeong.bsky.social

@dasaemjeong.bsky.social

Music exists in various modalities, and translation between modalities covers several important MIR tasks.
Score Image → Symbolic Music: OMR
Audio → MIDI: AMT
MIDI → Audio: Synthesis
Score → Performance MIDI: Performance Rendering
Audio → Music Notation: Complete AMT

May 23, 2025 at 1:42 PM
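
The task list in the post above can be read as a lookup from a (source, target) modality pair to a task name. A minimal sketch follows, assuming a plain Python dictionary; the modality strings and task names are copied from the post, while `TASKS` and `task_name` are hypothetical names introduced only for illustration and are not taken from the authors' work.

```python
# Hypothetical lookup table: keys and task names are copied from the post,
# the dictionary layout and the function name are illustrative assumptions.
TASKS = {
    ("Score Image", "Symbolic Music"): "OMR",
    ("Audio", "MIDI"): "AMT",
    ("MIDI", "Audio"): "Synthesis",
    ("Score", "Performance MIDI"): "Performance Rendering",
    ("Audio", "Music Notation"): "Complete AMT",
}


def task_name(source, target):
    """Return the task name for a source -> target translation,
    or None when the post does not list that pair."""
    return TASKS.get((source, target))


if __name__ == "__main__":
    for (source, target), name in TASKS.items():
        print(f"{source} -> {target}: {name}")
```

Note that the pairs are directional: `("Audio", "MIDI")` and `("MIDI", "Audio")` name different tasks, which is why the key is an ordered tuple rather than a set.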

dasaemjeong.bsky.social

@dasaemjeong.bsky.social

🎶Now a neural network can read a scanned score image and generate performance audio end-to-end😎
I'm super excited to introduce our work on Unified Cross-modal translation between Score Image, Symbolic Music, and Audio.
Why does it matter, and how did we make it work? Check the thread🧵

May 23, 2025 at 1:38 PM
