• Data valuation (how much did each training datapoint contribute towards generating the samples my users pay me for?)
• Data valuation (how much did each training datapoint contribute towards generating the samples my users pay me for?)
• Quantifying impact of copyrighted data on a given sample (how much less likely is it that the model would generate this image if not for the works of a given artist?)
• Quantifying impact of copyrighted data on a given sample (how much less likely is it that the model would generate this image if not for the works of a given artist?)
They give an approximate answer, but without actually retraining the model.
They give an approximate answer, but without actually retraining the model.