Pura Peetathawatchai
@poonpura.bsky.social
M.S. Computer Science @Stanford. Interested in machine learning privacy, AI security, diffusion models, cryptography, and AI for the environment, healthcare, and education

🌱 poonpura.github.io
🙏🙏🙏
November 27, 2024 at 6:48 PM
🎓 I am also applying for PhD programs this Fall! If you think I am a good fit for your lab, please contact me at [email protected] 😄
November 27, 2024 at 6:43 PM
For details, check out our paper (feedback appreciated!):

📄: arxiv.org/abs/2411.14639
🙌: big thank you to my collaborators and mentors Wei-Ning Chen, @berivanisik.bsky.social, Sanmi Koyejo, Albert No
🧵 16/16
Differentially Private Adaptation of Diffusion Models via Noisy Aggregated Embeddings
We introduce novel methods for adapting diffusion models under differential privacy (DP) constraints, enabling privacy-preserving style and content transfer without fine-tuning. Traditional approaches...
arxiv.org
November 27, 2024 at 6:43 PM
We tried generating images using different values of subsample size (m) and DP parameter ε. Our results were particularly good for Textual Inversion (TI)!

🧵 15/16
November 27, 2024 at 6:43 PM
We tested the effectiveness of our approach on two different target datasets: a collection of artworks from an artist (with consent, see her art on Instagram: @eveismyname) and the Paris 2024 Olympic pictograms (approved for non-commercial editorial use, ©️IOC - 2023)

🧵 14/16
November 27, 2024 at 6:43 PM
By aggregating over only a small subsample of the target embeddings, we can strengthen our DP guarantees. This lets us achieve the same privacy level with much less noise, and hence much better image quality! ✨

🧵 13/16
November 27, 2024 at 6:43 PM
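A minimal sketch of the fixed-size subsampling step (using subsample size m, drawn without replacement; the amplified privacy accounting itself, per [1] in the next post, is omitted here, and the function name is just illustrative):

```python
import numpy as np

def subsampled_mean(embeddings, m, rng=None):
    """Draw m of the n target embeddings uniformly without replacement,
    then average only those. Privacy amplification by subsampling means
    the subsequent Gaussian mechanism can use less noise for the same
    (eps, delta) guarantee than aggregating over all n embeddings."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(embeddings), size=m, replace=False)
    return embeddings[idx].mean(axis=0)
```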
For a further privacy-utility boost, we can also introduce subsampling [1].

[1] arxiv.org/abs/2210.00597
🧵 12/16
November 27, 2024 at 6:43 PM
4. Apply noisy aggregated embedding to Style Guidance or Textual Inversion 🔥
5. Serve and enjoy! 🍴

For details, see our paper:
📄: arxiv.org/abs/2411.14639
🧵 11/16
November 27, 2024 at 6:43 PM
Our recipe can be summarized as follows: 🍳

1. Obtain an embedding vector for each image in the target dataset 🌿
2. Aggregate the embeddings to limit sensitivity to any individual image 🥣
3. Add DP noise using the Gaussian mechanism 🧂

🧵 10/16
November 27, 2024 at 6:43 PM
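Steps 2-3 of the recipe above can be sketched in a few lines of NumPy. This is a minimal illustration, assuming each embedding is L2-clipped so that one image's contribution to the mean is bounded; the function name and clipping convention are mine, not the paper's:

```python
import numpy as np

def dp_aggregate(embeddings, eps, delta, clip_norm=1.0, rng=None):
    """Privately aggregate per-image embeddings with the Gaussian mechanism.

    embeddings: (n, d) array, one row per image in the target dataset.
    Clipping each row to clip_norm bounds the L2 sensitivity of the
    mean at clip_norm / n (one image shifts the sum by at most clip_norm).
    """
    rng = np.random.default_rng() if rng is None else rng
    n, d = embeddings.shape
    # Step 2: clip each embedding, then average to limit per-image influence.
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    clipped = embeddings * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    mean = clipped.mean(axis=0)
    # Step 3: Gaussian mechanism calibrated to (eps, delta)-DP.
    sensitivity = clip_norm / n
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return mean + rng.normal(0.0, sigma, size=d)
```

Note how the noise scale shrinks with n: larger target datasets get the same guarantee with less distortion of the aggregated embedding.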
2. Textual Inversion [1] (use the target dataset to train a new token embedding vector that is later used in the text prompt during image generation)

[1] arxiv.org/abs/2208.01618
🧵 9/16
November 27, 2024 at 6:43 PM
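As a toy illustration of the Textual Inversion idea, here a frozen linear "decoder" D stands in for the frozen diffusion model, and we gradient-descend a single new token embedding v (everything else fixed). This is my simplification for intuition only; the real method minimizes the diffusion denoising loss through the frozen U-Net:

```python
import numpy as np

def learn_token_embedding(D, target, steps=300, lr=0.1, rng=None):
    """Optimize one new token embedding v so the frozen map D takes v
    close to the target embedding, via plain gradient descent on
    ||D v - target||^2. Only v is trainable; D never changes."""
    rng = np.random.default_rng() if rng is None else rng
    v = rng.normal(scale=0.01, size=D.shape[1])   # the one trainable vector
    for _ in range(steps):
        grad = 2.0 * D.T @ (D @ v - target)       # d/dv ||Dv - target||^2
        v -= lr * grad
    return v
```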
1. Universal Guidance’s CLIP style guidance [1] (guide image towards target CLIP embedding during image generation)

[1] arxiv.org/abs/2302.07121
🧵 8/16
November 27, 2024 at 6:43 PM
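A rough sketch of one guidance step, where a fixed linear map W stands in for the CLIP image encoder (an assumed simplification; the real method backpropagates the similarity through CLIP itself):

```python
import numpy as np

def cosine_guidance_grad(x, W, target):
    """Gradient of cos(Wx, target) w.r.t. x, with the linear map W
    standing in for the CLIP image encoder."""
    e = W @ x
    t = target / np.linalg.norm(target)
    en = np.linalg.norm(e)
    # d/de cos(e, t) = t/|e| - (e . t) e / |e|^3
    grad_e = t / en - (e @ t) * e / en**3
    return W.T @ grad_e

def guided_denoise_step(x, model_update, W, target, scale=0.01):
    """Apply the usual denoiser update, then nudge x toward higher
    cosine similarity between its (stand-in) embedding and the target."""
    return x + model_update + scale * cosine_guidance_grad(x, W, target)
```

Repeating this nudge across the sampling trajectory is what steers the generated image toward the target style embedding.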
But here, we propose a new approach using embedding vectors.

Our work applies DP to known diffusion model adaptation methods that encode the target dataset into an embedding vector, including:

🧵 7/16
November 27, 2024 at 6:43 PM
We therefore turn to other DP approaches that don’t require full training using DP-SGD. Some work has been done on this, such as DP-LoRA [1] (utilizing Low-Rank Adaptation) and DP-RDM [2] (utilizing Retrieval Augmented Generation).

[1] arxiv.org/abs/2110.06500
[2] arxiv.org/abs/2403.14421
🧵 6/16
November 27, 2024 at 6:43 PM
But while DP-SGD is powerful, it struggles with:
1. High computational costs
2. Incompatibility with batch normalization
3. Severe degradation in image quality

🧵 5/16
November 27, 2024 at 6:43 PM
The first solution that comes to mind is differential privacy (DP), which adds noise to provide data privacy. DP-SGD [1] is particularly popular for neural networks, and work has been done to adapt DP-SGD to diffusion models.

[1] arxiv.org/abs/1607.00133
🧵 4/16
November 27, 2024 at 6:43 PM
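For reference, the core DP-SGD update is per-example gradient clipping plus Gaussian noise. A minimal NumPy sketch (not any specific library's implementation; the privacy cost over many steps is tracked separately by a moments/RDP accountant):

```python
import numpy as np

def dpsgd_step(params, per_example_grads, lr, clip_norm, noise_mult, rng=None):
    """One DP-SGD update: clip each example's gradient, sum, add noise.

    per_example_grads: (batch, d) array of per-example gradients.
    noise_mult is the noise multiplier (noise stddev = noise_mult * clip_norm),
    which together with the sampling rate determines the privacy cost.
    """
    rng = np.random.default_rng() if rng is None else rng
    b, d = per_example_grads.shape
    # Clip each example's gradient to bound any one example's influence.
    norms = np.linalg.norm(per_example_grads, axis=1, keepdims=True)
    clipped = per_example_grads * np.minimum(1.0, clip_norm / np.maximum(norms, 1e-12))
    # Sum, add calibrated Gaussian noise, then average over the batch.
    noisy_grad = (clipped.sum(axis=0)
                  + rng.normal(0.0, noise_mult * clip_norm, size=d)) / b
    return params - lr * noisy_grad
```

The need to materialize per-example gradients is exactly where the high computational cost (point 1 in the next post) comes from.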
This means the model might directly recreate training images instead of generalizing patterns. This poses copyright concerns for artists and privacy issues for sensitive datasets. ©️

🧵 3/16
November 27, 2024 at 6:43 PM
Diffusion models like Stable Diffusion have revolutionized image generation and can be personalized on smaller datasets to capture specific objects or styles. But personalizing on small datasets risks memorization.

🧵 2/16
November 27, 2024 at 6:43 PM