Justin Boylan-Toomey
@neocognitron.bsky.social
Lead Machine Learning Engineer for The Wellcome Trust. Interested in machine learning, AI art and data science for good. (he/him)
Looking for an open-source, coffee-themed dataset that would be good for use in a linear regression tutorial. Do any of you fantastic folk have any faves or suggestions?
#data #python #coffee
July 16, 2025 at 7:04 AM
Take part in the new @kaggle.com competition by @makedatacount.bsky.social, to not only win $40k but also help tackle a major unsolved #metascience and #bibliometrics challenge. Making it easier to identify and track the use and impact of scientific datasets!

www.kaggle.com/competitions...
Make Data Count - Finding Data References
Identify scientific data use in papers and classify how they are mentioned.
www.kaggle.com
June 24, 2025 at 9:32 PM
PyData London a 🧵

Great weekend at PyData London, promoting DataKind UK and the great work they do bringing together volunteer data talent to help third sector orgs deliver real-world impact using data. Interested? There is a webinar on Monday on how to volunteer your skills for good, see below!
Welcome! You are invited to join a webinar: Using your data skills for good. After registering, you will receive a confirmation email about joining the webinar.
Would you like to use your data skills to make the world a better place? DataKind UK has relaunched its programmes, and we’d love you to join our community of data volunteers! Join our webinar and fi...
us02web.zoom.us
June 9, 2025 at 12:20 AM
Looking forward to the #PyDataLondon conference in a couple of weeks, in a great new venue! If you get a chance do stop by the #DataKind UK stand to say hi and discover how they bring together skilled volunteers to help charities make the most of their data!

pydata.org/london2025
PyData London | 2025
PyData London is a 3-day in-person event for the international community of data scientists, data engineers, and developers of data analysis tools.
pydata.org
May 27, 2025 at 12:46 AM
Reposted by Justin Boylan-Toomey
There were layoffs at MS yesterday and 3 #Python core devs from the Faster CPython team were caught in them. If you know of any jobs, please send them their way:

Eric Snow: www.linkedin.com/in/ericsnowc...
Irit Katriel: www.linkedin.com/in/irit-katr...
Mark Shannon: www.linkedin.com/in/mark-shan...
May 14, 2025 at 9:14 PM
Excited to share the new Wellcome Collection Knowledge Graph, enriching concept data to help people discover the breadth of the collection online. Developed through the work of talented colleagues on our MLE and Wellcome Collection teams.

stacks.wellcomecollection.org/enhancing-di...
#museum
Enhancing Discovery and Exploration: Leveraging Graph Technology for Wellcome Collection
Wellcome Collection’s catalogue records are tagged with concepts — keywords for things like subjects, contributors, languages, genres —…
stacks.wellcomecollection.org
April 2, 2025 at 10:29 PM
The Swiss National Science Foundation (SNSF) is looking for an expert statistician to help them evaluate the impact of their research funding. #statistics #career

recruitingapp-2829.umantis.com/Vacancies/85...
Statistician (80-100%)
recruitingapp-2829.umantis.com
February 9, 2025 at 11:45 AM
Excited about this, one month to get proposals in!
#metascience #scientometrics #research
Call for Proposals: Metascience 2025

Be part of the biggest Metascience meeting yet! We're inviting proposals for:
✅ Virtual symposia
✅ In-person panels
✅ Talks & posters

⏰ Submit by 7 Feb 2025
🔗 Full details: metascience.info/call-for-pro...
Call for Proposals - Metascience
The call for proposals for Metascience 2025 is open with a deadline of 7 February 2025.
metascience.info
January 7, 2025 at 2:06 AM
This looks like an awesome opportunity and a great way to get ready for an ML research degree!
#machinelearning #math #research
📢 Apply now for the 7-week #Mathematics #SummerSchool in London!

You will develop mathematics skills and intuition necessary to enter the #TheoreticalNeuroscience or #MachineLearning field.

ℹ️ www.ucl.ac.uk/gatsby/study-and-work/gatsby-bridging-programme

Please help spread the word!
January 7, 2025 at 1:47 AM
Watch Melissa and Tiffany from DataKind present the work we did with Material Focus, using geospatial data to help improve electrical recycling.
#python #statistics #datascience #recycling #environment
How to increase recycling using data
YouTube video by Data Science Festival
www.youtube.com
January 6, 2025 at 11:43 PM
Short blog post on a common pitfall which causes rows containing null values to be excluded from results when using the != or <> operator in Postgres.
#databases #postgres #sql #programming
Postgres Not Equal Only Returns Non-Null Values
As part of an analysis I recently ran a PostgreSQL query to return results filtered to remove rows in which the status column contained values not equal to "started". However after running the below q...
www.jboylantoomey.com
January 6, 2025 at 9:52 PM
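A minimal sketch of the pitfall described in the post above. It uses Python's built-in sqlite3 module as a stand-in so it runs without a Postgres server; the NULL comparison behaviour shown is standard SQL three-valued logic, which Postgres shares. The `jobs` table, its rows and the `fetch_ids` helper are hypothetical and not taken from the blog post.

```python
import sqlite3


def fetch_ids(conn, where_clause):
    """Return the ids of rows in the hypothetical jobs table matching the clause."""
    query = f"SELECT id FROM jobs WHERE {where_clause} ORDER BY id"
    return [row[0] for row in conn.execute(query).fetchall()]


# In-memory database with one row per case: equal, not equal, and NULL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE jobs (id INTEGER PRIMARY KEY, status TEXT)")
conn.executemany(
    "INSERT INTO jobs (id, status) VALUES (?, ?)",
    [(1, "started"), (2, "finished"), (3, None)],
)

# NULL <> 'started' evaluates to NULL rather than true, so row 3 is dropped.
not_equal_only = fetch_ids(conn, "status <> 'started'")

# Handling NULL explicitly keeps the row with the missing status.
null_safe = fetch_ids(conn, "status <> 'started' OR status IS NULL")

print(not_equal_only)  # [2]
print(null_safe)       # [2, 3]
```

In Postgres the same null-safe filter can also be written as `status IS DISTINCT FROM 'started'`.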
According to one study, "Almost one out of every five computer science papers published in the past four years may not have been written by humans."
The Wiley scandal illuminates a much bigger crisis of trust confronting universities around the world
Wiley's Hindawi scandal offers a window into a thriving black market of fake science, corrupted research and bogus authorship. It also illuminates a much broader crisis of trust confronting universiti...
www.abc.net.au
January 1, 2025 at 11:52 PM
How the MLE team at the Wellcome Trust use SciSpaCy to tag our grants with geographic locations. #machinelearning #nlp #python
Finding location entities in Wellcome grants
This blog post was written jointly by Matt Upson at MantisNLP and Arne Robben at Wellcome.
medium.com
January 1, 2025 at 11:47 PM
A thought provoking paper on the appropriateness of using finance and R&D analogies in measuring funding impacts. #researchfunding
Research Portfolio Analysis in Science Policy: Moving from Financial Returns to Societal Benefits
Funding agencies and large public scientific institutions are increasingly using the term "research portfolios" as a means of characterising their res
papers.ssrn.com
January 1, 2025 at 11:42 PM
Interesting approach to visualising bodies of research such as those within funding portfolios using overlay maps. #scientometrics
Science overlay maps: a new tool for research policy and library management
We present a novel approach to visually locate bodies of research within the sciences, both at each moment of time and dynamically. This article describes how this approach fits with other efforts to ...
arxiv.org
January 1, 2025 at 11:38 PM
Learning with Parents' write-up of the DataKind UK data dive I was lucky enough to help with this year. The findings are being used to improve their platform and better help children and families facing the greatest barriers.
Learning with data - findings from our data dive - Learning with Parents
Earlier this year, we teamed up with DataKind UK to spend a whole weekend delving into the mountain of high-quality data produced by our learning platform. Here we take a look at some of the key takea...
learningwithparents.com
January 1, 2025 at 11:14 PM
Approaches to multi-modal classification of upstream oil and gas industry documents. #machinelearning #documents #python
Multimodal Document Classification
The automatic classification of documents remains an important and only partially solved information management problem within the upstream oil and gas industry. Companies in this sector have typicall...
www.jboylantoomey.com
January 1, 2025 at 11:04 PM
Some thoughts from 2020 on how to create reproducible data science projects, surprisingly still relevant today! #datascience #ai #responsibleai #python
Creating Reproducible Data Science Projects
A Nightmare Scenario: Imagine you completed a one-off analysis a few months ago, creating a fairly complex data pipeline, machine learning model and visualisations. Fast forward to today and you have Em...
www.jboylantoomey.com
January 1, 2025 at 11:02 PM
Find out how the Wellcome Trust uses Apache Tika tika_pipes to extract text from and OCR millions of PDF documents.
How to Parse Millions of PDF Documents Asynchronously with Apache Tika
Over the years, the Wellcome Trust has received a huge number of grant applications and funded thousands of research projects. As a result…
medium.com
January 1, 2025 at 10:55 PM