Lightnews — Scholar-powered news

CanPath - Canadian Partnership for Tomorrow's Health

@canpath.bsky.social

Working with multi-regional data? 🗺️

Learn how Maelstrom Research tackles data harmonization using CanPath as a case study.

📅 June 10 | 🕐 1 PM EDT

🔗 Register >> us02web.zoom.us/webinar/regi...

#HealthData #DataHarmonization #CohortResearch

CanPath Webinar: Lessons from CanPath on demystifying data harmonization.
Dr. Isabel Fortier, Principal Investigator, Maelstrom Research.
Anouar Nechba, Project Manager, Maelstrom Research.
Tuesday, June 10, 2025.
1:00 pm - 2:00 pm EDT.

May 9, 2025 at 4:01 PM

Matasoft

@matasoft.bsky.social

Conduct HR surveys or customer feedback? Automate text analysis and topic clustering in bulk spreadsheets using cutting-edge LLM-powered workflows.
🔗 matasoft.hr/qtrendcontro...
#HRAutomation #NLP #SurveyTools #DataHarmonization #Automation #SmartTools #LLM #CloudComputing #OnPremAI #ScalableAI

Matasoft's AI-Driven Spreadsheet Processing Services and Software

Transform your business data workflows with Matasoft’s AI-driven spreadsheet processing services and software. (Un)Perplexed Spready, powered by Perplexity AI, automates data extraction, categorizatio...

matasoft.hr

September 12, 2025 at 3:49 PM

Lex Maliga Davis (She/Her/Hers)

@lexmaligadavis.bsky.social

Check out the newest ODC-TBI blog post about the most recent CDE policy!

odc-tbi.org/about/odc_tb...

#commondataelements #odc-tbi #datasharing #dataharmonization

odc-tbi | Introducing the new ODC-TBI policy on Common Data Elements

The ODC-TBI Editorial Board has just voted to encourage the use of a set of standard Common Data Elements (CDEs) for datasets submitted to ODC-TBI.

odc-tbi.org

February 24, 2025 at 7:45 PM

Matasoft

@matasoft.bsky.social

Hate wrangling spreadsheets with different data schemas? (Un)Perplexed Spready normalizes structures and names, harmonizing workbooks in bulk.
🔗 matasoft.hr/qtrendcontro...
#DataHarmonization #Automation #SmartTools #LLM #CloudComputing #OnPremAI #ScalableAI #BigData #DataProcessing #AI #Data

September 11, 2025 at 5:32 PM

Abhishek Jha

@ajelucidata.bsky.social

All normalized to biomedical ontologies, such as MONDO and UBERON.

All traceable to the source.

Read the full preprint - biorxiv.org/content/10.1...

#FAIRdata #metadata #biomedicalAI #healthcaredata #dataharmonization #AI

biorxiv.org

June 19, 2025 at 7:07 PM

JMIR Publications

@jmirpub.bsky.social

JMIR Formative Res: Automated Data Harmonization in Clinical Research: Natural Language Processing Approach #DataHarmonization #ClinicalResearch #NaturalLanguageProcessing #NLP #MachineLearning

Automated Data Harmonization in Clinical Research: Natural Language Processing Approach

Background: Integrating data is essential for advancing clinical and epidemiological research. However, because datasets often describe variables (e.g., demographic, health conditions, etc.) in diverse ways, the process of integrating and harmonizing variables from research studies remains a major bottleneck.

Objective: The objective was to assess a natural language processing (NLP)-based method to automate variable harmonization to achieve a scalable approach to integration of multiple datasets.

Methods: We developed a fully connected neural network (FCN) method, enhanced with contrastive learning, using domain-specific embeddings from the BioBERT language representation model, using three cardiovascular datasets: the Atherosclerosis Risk in Communities (ARIC) study, the Framingham Heart Study (FHS) and the Multi-Ethnic Study of Atherosclerosis (MESA). We used metadata variable descriptions and curated harmonized concepts as ground truth. We framed the problem as a paired sentence classification task. The accuracy of this method was compared to a logistic regression baseline method. To assess the generalizability of the trained models, we also evaluated their performance by separating the three datasets when preparing the training and validation sets.

Results: The newly developed fully connected neural network (FCN) achieved a top-5 accuracy of 98.95% (95% CI: 98.31%-99.47%) and an AUC of 0.990 (95% CI: 0.988-0.991), outperforming the standard logistic regression model, which exhibited a top-5 accuracy of 22.23% (95% CI: 19.91%-24.87%) and an AUC of 0.824 (95% CI: 0.815-0.834). The contrastive learning enhancement also outperformed the logistic regression model, although slightly below the base FCN model, exhibiting a top-5 accuracy of 89.88% (95% CI: 87.88%-91.68%) and an AUC of 0.977 (95% CI: 0.975-0.979).

Conclusions: This novel approach provides a scalable solution for harmonizing metadata across large-scale cohort studies. The proposed method significantly enhances the performance over the baseline method by utilizing learned representations to categorize harmonized concepts more accurately for cohorts in cardiovascular disease and stroke.
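
The top-5 accuracy quoted in the abstract can be illustrated with a small sketch. This is not the authors' code: the function name `top_k_accuracy`, the example variables, the concept labels, and the scores are all hypothetical, and the scores stand in for whatever a trained pair classifier would output. The sketch only shows how the metric is counted: a variable is a hit when its true harmonized concept appears among the k highest-scored candidates.

```python
from typing import Dict, List


def top_k_accuracy(
    scores: List[Dict[str, float]], truth: List[str], k: int = 5
) -> float:
    """Fraction of variables whose true concept is among the k best-scored candidates.

    scores[i] maps each candidate concept to a model score for variable i
    (higher means a more likely match); truth[i] is the curated concept.
    """
    if not scores or len(scores) != len(truth):
        raise ValueError("scores and truth must be non-empty and the same length")
    hits = 0
    for candidate_scores, true_concept in zip(scores, truth):
        # Rank candidate concepts from highest to lowest score.
        ranked = sorted(candidate_scores, key=candidate_scores.get, reverse=True)
        if true_concept in ranked[:k]:
            hits += 1
    return hits / len(truth)


# Hypothetical scores for three variable descriptions (illustration only).
EXAMPLE_SCORES = [
    {"age": 0.9, "sex": 0.1, "bmi": 0.3},
    {"age": 0.2, "sex": 0.4, "bmi": 0.5},
    {"age": 0.6, "sex": 0.7, "bmi": 0.1},
]
EXAMPLE_TRUTH = ["age", "sex", "bmi"]
```

With these made-up scores, only the first variable is matched at k=1, the first two at k=2, and all three at k=3, which is why top-k accuracy can only stay the same or rise as k grows.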

dlvr.it

August 27, 2025 at 2:40 PM

DataPrudence

@dataprudence.bsky.social

Merging biological datasets, whether public, proprietary, or platform-specific, is no simple task.

It takes real expertise to resolve inconsistencies, integrate them seamlessly, and extract insights.

#bioinformatics #computationalbiology #dataintegration #dataharmonization

July 3, 2025 at 5:18 AM

CanPath - Canadian Partnership for Tomorrow's Health

@canpath.bsky.social

TOMORROW: Learn how Maelstrom Research tackles data harmonization using CanPath as a case study.

🗓️ June 10 | 🕐 1 PM EDT

🔗 Register: canpath.ca/2025/05/less...
#DataHarmonization #HealthData #CohortResearch

Lessons from CanPath on demystifying data harmonization - CanPath - Canadian Partnership for Tomorrow’s Health

Data harmonization is essential to produce multi-cohort research. Learn more about best practices for harmonizing large-scale cohort data.

canpath.ca

June 9, 2025 at 7:03 PM
