gbl08ma
gbl08ma.com
Software developer. Most comfortable with Go, can do frontend too. I built jungletv.live, underlx.com, tny.im and more. I know more about the internals of GTA V and Watch Dogs than their average players. Sometimes I make music and high effort shitposts.
would be reached, but that would be a problem for future me.
November 20, 2025 at 10:55 AM
If the problem worsens to the point where most big legitimate PDS don't get their own label, I think I would end up opting for a more "humane" approach, e.g. one where nobody gets a label by default and PDS operators have to apply to get one. Still not sure what I'd do when the max number of labels
November 20, 2025 at 10:55 AM
My worry is that, especially with the labeler being open source so the logic ends up being public, most simple metrics can be gamed and if the cat-and-mouse game continues, eventually we'll end up trying to detect LLMs, etc. - I don't want to play that game.
November 20, 2025 at 10:55 AM
Increased to top 100, labels are being reprocessed, should take just a couple more minutes (it needs to reevaluate 55k DIDs)
November 20, 2025 at 10:48 AM
I'll try to increase the number of PDS that get their own labels once again. But eventually this list will probably have to either end up being manually managed (ugh, not willing to do it) or the crawler will need to start looking into some more metrics besides the number of accounts.
November 20, 2025 at 10:42 AM
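To illustrate the "top N by number of accounts" idea from the posts above, here is a minimal sketch. It is not the labeler's actual code: `assign_labels`, the `did_to_pds` mapping and the `other-pds` fallback label are all hypothetical names, and the only assumption taken from the thread is that PDS hosts are ranked by account count and only the top N get their own label.

```python
from collections import Counter


def assign_labels(did_to_pds, top_n, fallback_label="other-pds"):
    """Give each DID its PDS host as a label if that host is among the
    top_n hosts by account count, else a shared fallback label.

    did_to_pds: dict mapping a DID string to the PDS host it lives on
                (hypothetical input shape, assumed for this sketch).
    """
    # Count how many accounts (DIDs) each PDS host has.
    counts = Counter(did_to_pds.values())
    # Keep only the top_n hosts by account count.
    top_hosts = {host for host, _ in counts.most_common(top_n)}
    # Every DID is re-evaluated, which is why raising top_n means
    # reprocessing all known DIDs.
    return {
        did: (host if host in top_hosts else fallback_label)
        for did, host in did_to_pds.items()
    }
```

A ranking like this depends only on account counts, which is exactly the kind of simple metric the thread worries can be gamed by mass account creation.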
Don't forget the cut on every marketplace transaction, even if it's to take two cents out of every three-cent transaction.
November 13, 2025 at 2:39 PM
(It's roughly a third of the lowest value mentioned in this thread so far)
November 12, 2025 at 9:57 PM
I think I won't disclose how much I'm paying for a three-letter domain (@tny.im), should they realize they're undercharging.
November 12, 2025 at 9:53 PM
From what I've read and observed on a couple PDS, if you do that it won't take long for bots to begin creating accounts en masse in your PDS. Most PDS intended for public use require additional verification methods (e.g. email) for that reason.
November 12, 2025 at 3:34 PM
on responding to security/compliance concerns more so than any technology- or implementation-specific concerns.
November 10, 2025 at 11:22 AM
Because if it were something more fundamental about how LLMs work, the isolation would not be at the organization level but at a smaller level, as users probably wouldn't want "crosstalk" between unrelated requests within each organization, either. That passage in the docs seems especially focused
November 10, 2025 at 11:22 AM
dealing with keys and/or values that could contain sensitive information or which could help disclose sensitive information when coupled with one or two other side-channels.
November 10, 2025 at 11:15 AM
It probably isn't "bad," beyond protecting against the more "in general" bugs and side-channels. It makes it easier to reason about security and data privacy compliance. What I mean is that the reason they're doing that is probably not LLM specific, just something one would do in general when
November 10, 2025 at 11:15 AM
Then you're relying on specifics of how the keys and values are used to trust that nothing will go wrong, assuming that their use case will remain constant forever, and that there is no hidden detail that could "get you", and that's not how defense in depth works...
November 10, 2025 at 11:07 AM
it's easier to ensure nothing weird like that ever happens by simply not sharing the cache across tenants.
November 10, 2025 at 11:02 AM
But if you have a bug in that derivation process that causes the wrong values to be pulled every now and then, that will surely influence the response in an incorrect way, and in a way which could directly or indirectly leak information about other requests being made. So to be on the safe side,
November 10, 2025 at 11:02 AM
tenant, or make it so that the tenant ID is always part of the cache key no matter what, you prevent not just the timing attacks but also information leaking due to the cache being accessed with keys that don't encapsulate all of the information they should have.
November 10, 2025 at 10:56 AM
process to derive a key from the prompt is bugged, e.g. someone forgot to take into account some setting like the system prompt or the temperature, and now the keys being used to access the cache are the same even though the requests are different. If you use an actually separate cache structure per
November 10, 2025 at 10:56 AM
additional data that influences the value and which you forgot to take into account as part of the key, and now the cache is responding with values that aren't actually the correct ones for the request being made, and which could be polluted with sensitive data from other tenants. Or maybe that
November 10, 2025 at 10:56 AM
Consider a cache that's implemented with a dictionary/map. To retrieve a value from a cache you usually need a key, which in this case would be the prompt (or something derived from it), to identify the value to retrieve (the value would be the response to the prompt). But there could be some
November 10, 2025 at 10:56 AM
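The scenario in the posts above can be sketched roughly as follows. This is a toy illustration, not how any real LLM provider implements caching: `buggy_key`, `cache_key` and `TenantCache` are hypothetical names, and the choice of prompt, system prompt and temperature as the inputs that affect the response is an assumption taken from the examples in the thread.

```python
import hashlib
import json


def buggy_key(prompt):
    """Bugged derivation: forgets the system prompt, temperature and tenant,
    so different requests with the same prompt end up with the same key."""
    return hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def cache_key(tenant_id, prompt, system_prompt, temperature):
    """Derivation where the tenant ID is always part of the key, together
    with every input assumed here to influence the response."""
    payload = json.dumps(
        {
            "tenant": tenant_id,
            "prompt": prompt,
            "system": system_prompt,
            "temperature": temperature,
        },
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TenantCache:
    """An actually separate dictionary per tenant: even a bugged key can
    only collide with other entries belonging to the same tenant."""

    def __init__(self):
        self._stores = {}

    def get(self, tenant_id, key):
        # Look only inside this tenant's own store.
        return self._stores.get(tenant_id, {}).get(key)

    def put(self, tenant_id, key, value):
        self._stores.setdefault(tenant_id, {})[key] = value
```

With a single shared dictionary, a value stored under `buggy_key("hi")` by one tenant would be returned to any other tenant sending the same prompt. With `TenantCache`, the same bug still returns wrong values within a tenant, but it cannot cross the tenant boundary.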
That's usually the main reason to isolate caches per tenant. It also makes it more difficult for data to accidentally leak between tenants (e.g. because cache keys could turn out not to be as unique as the developer assumed...). It could also naturally arise as a side-effect of sharding.
November 10, 2025 at 10:35 AM