Scott Lowe
@scottclowe.bsky.social
Machine learning researcher
There will also be continued interest in methods that allow more controllable manipulation of generated images (e.g. UIP2P arxiv.org/abs/2412.15216, Imagic arxiv.org/abs/2210.09276).

But maybe the pending copyright lawsuits will have major impacts on GenAI.
UIP2P: Unsupervised Instruction-based Image Editing via Cycle Edit Consistency
We propose an unsupervised model for instruction-based image editing that eliminates the need for ground-truth edited images during training. Existing supervised methods depend on datasets containing ...
arxiv.org
January 3, 2025 at 5:40 PM
For GenAI, improvements to generation quality are going to come from better data curation and value functions to drive the model toward high-quality outputs. Standard model training yields outputs representative of the training distribution, but users don't want average; we want the best quality.
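
As a rough illustration of the value-function idea, a best-of-N sampler generates several candidates and keeps the one a learned scorer rates highest (a minimal sketch; generate and score_fn are hypothetical stand-ins for the generative model and a reward/value model):

def best_of_n(generate, score_fn, prompt, n=8):
    # Draw n candidate outputs, score each with the value function,
    # and return the highest-rated candidate.
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: score_fn(prompt, c))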
January 3, 2025 at 5:39 PM
This is crucial for AGI, and will pose serious safety concerns. Models that are better at thinking outside the box and coming up with creative solutions will have broader implications than the prompter anticipated arxiv.org/abs/2412.04984.
Frontier Models are Capable of In-context Scheming
Frontier models are increasingly trained and deployed as autonomous agents. One safety concern is that AI agents might covertly pursue misaligned goals, hiding their true capabilities and objectives - ...
arxiv.org
January 3, 2025 at 5:38 PM
Humans solve novel problems on-the-fly using System 2 reasoning: AI needs this too. By learning reasoning steps at training time, the model can, at deployment, compose new sequences of reasoning steps, enabling it to extrapolate.
January 3, 2025 at 5:37 PM
Reasoning capabilities are essential for robust performance in key ML products, e.g. full self-driving. The distribution of driving scenarios is long-tailed, so even a model that covers most situations well may face a novel situation outside its training data, yet it must still respond correctly.
January 3, 2025 at 5:37 PM
One way to do this is hierarchical LLMs like the Large Concept Model arxiv.org/abs/2412.08821, Byte Latent Transformer arxiv.org/abs/2412.09871, and Block Transformer arxiv.org/abs/2406.02657.
Large Concept Models: Language Modeling in a Sentence Representation Space
LLMs have revolutionized the field of artificial intelligence and have emerged as the de-facto tool for many tasks. The current established technology of LLMs is to process input and generate output a...
arxiv.org
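
A toy sketch of the shared idea (sizes and structure here are illustrative, not taken from any of these papers): pool each group of consecutive token embeddings into one coarser latent vector, then run the main transformer over the much shorter latent sequence.

import torch.nn as nn

class HierarchicalEncoder(nn.Module):
    # Two-level toy model: a local pooling stage compresses every
    # `patch` tokens into one latent, and a global transformer then
    # operates over the shorter latent sequence. (No causal masking
    # or decoder here; this only shows the pooling idea.)
    def __init__(self, vocab=256, dim=512, patch=4):
        super().__init__()
        self.patch = patch
        self.embed = nn.Embedding(vocab, dim)
        self.pool = nn.Linear(dim * patch, dim)
        layer = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.global_model = nn.TransformerEncoder(layer, num_layers=4)

    def forward(self, tokens):
        # tokens: (batch, seq), with seq divisible by `patch`
        x = self.embed(tokens)
        b, s, d = x.shape
        x = x.reshape(b, s // self.patch, d * self.patch)
        x = self.pool(x)  # one latent per patch of tokens
        return self.global_model(x)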
January 3, 2025 at 5:35 PM
For agentic models, the focus is shifting to System 2-like reasoning. OpenAI's o1/o3 models demonstrate that step-by-step reasoning can improve output quality by leveraging test-time compute. But impressive results on ARC are expensive, hence there will be a focus on improving test-time compute efficiency.
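
One simple way to spend that test-time compute is self-consistency: sample several reasoning chains and majority-vote the final answer (a sketch; sample_chain and extract_answer are hypothetical stand-ins for the model's sampler and an answer parser):

from collections import Counter

def self_consistency(sample_chain, extract_answer, prompt, n=16):
    # Sample n independent reasoning chains and vote over their answers.
    # Accuracy tends to rise with n, but cost scales linearly with it,
    # which is why test-time compute efficiency matters.
    answers = [extract_answer(sample_chain(prompt)) for _ in range(n)]
    return Counter(answers).most_common(1)[0][0]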
January 3, 2025 at 5:34 PM
But the LLM training corpus is now the majority of worthwhile text humanity has ever written, and can't be meaningfully scaled further. As Ilya Sutskever put it at NeurIPS, "big data is the fossil fuel of AI".

With this in mind, what will be the next stage of AI development?
January 3, 2025 at 5:31 PM
This has some serious AI safety implications. Having an AI model that can classify what is in an image better than a human doesn't pose an existential threat. But when an AI model can perform long-term planning better than a human, "just unplug it" ceases to be a reliable solution.
December 9, 2024 at 5:16 PM
In "System 2 Reasoning Capabilities Are Nigh", I lay out comparisons between human reasoning and reasoning in AI models, and argue that all the components needed to create AI models that can perform human-like reasoning already exist.
arxiv.org/abs/2410.03662
System 2 Reasoning Capabilities Are Nigh
In recent years, machine learning models have made strides towards human-like reasoning capabilities from several directions. In this work, we review the current state of the literature and describe t...
arxiv.org
December 9, 2024 at 5:13 PM
It's very easy to get started with using the dataset. The commands to download it and load it for PyTorch training fit in less than half a tweet:

!pip install bioscan-dataset
from bioscan_dataset import BIOSCAN5M
ds = BIOSCAN5M("~/Datasets/bioscan-5m", download=True)
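
From there, a standard PyTorch DataLoader wraps the dataset for batched training (a sketch; it assumes the dataset has been configured, e.g. with an image transform, so that samples collate into tensors; check the package docs for the exact sample structure):

from torch.utils.data import DataLoader

loader = DataLoader(ds, batch_size=64, shuffle=True, num_workers=4)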
December 9, 2024 at 5:08 PM
The dataset should be useful for a variety of research topics:
- multimodal learning
- fine-grained classification
- hierarchical labelling
- open-world classification/clustering
- semi- and self-supervised learning
December 9, 2024 at 5:00 PM
BIOSCAN-5M is a multimodal dataset for insect biodiversity monitoring. It consists of 5 million insect specimens from around the world, with a high-res microscopy image, DNA barcode, taxonomic labels, size, and geolocation info for each sample.
arxiv.org/abs/2406.127...
BIOSCAN-5M: A Multimodal Dataset for Insect Biodiversity
As part of an ongoing worldwide effort to comprehend and monitor insect biodiversity, this paper presents the BIOSCAN-5M Insect dataset to the machine learning community and establish several benchmar...
arxiv.org
December 9, 2024 at 4:57 PM