#multimodal-ai
Telegram AI Digest
#ai #gemini #geminiai
Gemini Live API Now GA on Vertex AI
Google announces the general availability of Gemini Live API on Vertex AI, using the Gemini 2.5 Flash Native Audio model. This new API enables the creation of real-time, multimodal AI agents that understand voice, vision, and text. The Gemini 2.5 Flash model provides the power for human-like conversational intelligence in enterprise applications. The API handles interruptions, acoustic cues, and complex visual data for natural interaction. Vertex AI provides the necessary security, stability, and global infrastructure for enterprise deployments. Several companies are already leveraging Gemini Live API to enhance customer experiences. Shopify's Sidekick uses the API for multimodal assistance, while UWM's Mia boosts business efficiency. SightCall offers visual support, and Napster enables co-creation through AI companions. Lumeris uses it for patient care, Newo for AI receptionists, and 11Sight for sales agents. Developers can start building with Gemini Live API through Vertex AI Studio and related resources.
cloud.google.com
December 13, 2025 at 11:42 PM
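For anyone starting from the Vertex AI Studio pointer above, here is a minimal sketch of what a Live API session can look like with the google-genai Python SDK. Treat the model ID, config keys, and project values as assumptions to adapt to your own setup; check the Vertex AI docs for the exact native-audio model name.

```python
# Minimal sketch: a Gemini Live API session on Vertex AI via the
# google-genai SDK. Model ID and project/location are assumptions.
import asyncio
from google import genai

client = genai.Client(vertexai=True, project="my-project", location="us-central1")

async def main() -> None:
    async with client.aio.live.connect(
        model="gemini-2.5-flash-native-audio",  # assumed model ID
        config={"response_modalities": ["AUDIO"]},
    ) as session:
        # Send one text turn; the model streams back native audio.
        await session.send_client_content(
            turns={"role": "user", "parts": [{"text": "Hello there!"}]}
        )
        async for message in session.receive():
            if message.data:  # raw audio bytes from the model
                print(f"received {len(message.data)} bytes of audio")

asyncio.run(main())
```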
1/ AI moves from demos → infrastructure.
Agent-native systems, multimodal data pipelines, and AI-first stacks become default.
December 13, 2025 at 2:36 PM
Interesting to see GPT-5's benchmark focus shift beyond just text. The multimodal leap is the real story—AI that truly *understands* context across formats is the next frontier for builders. The architecture implications here are huge.
www.geeky-gadgets.com
December 13, 2025 at 12:44 PM
The latest update for #Antino includes "How to Build a Shopping App Like Temu: Features, Tech Stack & Cost" and "Multimodal AI Applications, Use cases and Everything Else you need to know".

#SoftwareDevelopment #MobileApp https://opsmtrs.com/3xp3SAU
Antino
Antino Labs is a tech services company that aims to transform clients' businesses using the latest technologies and models for the modern digital era.
opsmtrs.com
December 13, 2025 at 2:24 AM
What was the recent multimodal AI model that just launched?

Wanna go compare some data sheets?
github.com/sapienzaapps...
Support newer MPU · Issue #6 · sapienzaapps/seismocloud-sensor-nodemcu
Hello. The MPU6050 is a 15-year-old chip. What could we do with a more up-to-date MPU? Ideally, IMO, this project could support more than the one MPU6050. This Arduino thread mentions: LIS3DH LSM6DSO...
github.com
December 12, 2025 at 10:45 PM
“IBM RAG and Agentic AI Professional Certificate”: Agentic systems, Prompt Engineering, LLM Application, LangGraph, Tool Calling, Multimodal Prompts, LLM, Responsible AI, App & Software Development, Gen AI Agents, Application Design, Machine Learning. Enroll today: imp.i384100.net/o42MO9 AI Academy 😀
December 12, 2025 at 8:10 PM
A new unified framework organizes AI methods by how they retain or discard information, aiming to streamline algorithm selection and improve efficiency in multimodal AI systems. doi.org/hbfhvt
'Periodic table' for AI methods aims to drive innovation
Artificial intelligence is increasingly used to integrate and analyze multiple types of data formats, such as text, images, audio and video.
techxplore.com
December 12, 2025 at 5:00 PM
Join Wes McKinney (@wesmckinney.com) and the Pixeltable @pixeltable.net team, Marcel Kornacker and Alison Hill (@apreshill.com), for a fireside chat hosted by Hugo Bowne-Anderson on Dec 16!

They will discuss data processing and #AI workflows for multimodal data 📊

Register: luma.com/2y04b6nf
Building Multimodal AI Workflows with Pixeltable · Luma
The challenge with multimodal AI isn't calling models. It's everything else. Videos need to become frames. Audio needs transcription. Embeddings need to stay…
luma.com
December 12, 2025 at 4:20 PM
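The "videos need to become frames" step from the event blurb above is exactly the plumbing Pixeltable is built to automate with views and computed columns. As a point of comparison, a hand-rolled version of that one step in plain Python with OpenCV might look like this (the sampling interval is an arbitrary assumption):

```python
# A minimal sketch of turning a video into frames with OpenCV (cv2).
# Pixeltable automates this kind of step; this shows the underlying work.
import cv2

def extract_frames(video_path: str, every_n: int = 30) -> list:
    """Decode a video and keep one frame out of every `every_n` frames."""
    cap = cv2.VideoCapture(video_path)
    frames, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:  # end of stream or decode failure
            break
        if index % every_n == 0:
            frames.append(frame)
        index += 1
    cap.release()
    return frames
```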
#AI2025
🧠 Dementia: AI analyzes EEGs for Alzheimer's with 97% accuracy.
⚕️ Delphi-2M: Predicts disease risks for 1,256 conditions.
🎥 Multimodal AI: Creates realistic videos from text prompts.
#AI2025 #DementiaAI #Delphi2M #MultimodalAI
December 12, 2025 at 3:01 PM
Marengo 3.0: Search video with words or images.

Native video AI.
Multimodal fusion.
Temporal reasoning.
Entity search.
Composed queries.
Multilingual.
Sports-smart.

Perfect for media, retail, security, education.

#Marengo3 #TwelveLabs #VideoAI

Read more:

aiadoptionagency.com/twelvelabs-m...
TwelveLabs Marengo 3.0: The Future of Multimodal Video AI - Ai Adoption Agency
Imagine you have a huge library of videos and you want to find the exact moment where a red car drives past a shop while someone says the word “weekend.” With normal tools, you would have to watch eve...
https://aiadoptionagency.com/twelvelabs-marengo-3-0-the-future-of-multimodal-video-ai/
December 12, 2025 at 2:32 PM
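As an illustration of what "composed queries" and "multimodal fusion" mean mechanically, a text query and an image query can be embedded separately and fused into a single search vector over precomputed clip embeddings. This is a generic sketch of the technique, not the TwelveLabs SDK:

```python
# Generic composed-query search: fuse text + image embeddings, then rank
# video clips by cosine similarity. Illustrative only, not Marengo's API.
import numpy as np

def normalize(v: np.ndarray) -> np.ndarray:
    return v / np.linalg.norm(v)

def composed_search(text_emb: np.ndarray, image_emb: np.ndarray,
                    clip_embs: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Average the unit vectors of the two queries, then return the
    indices of the top_k most similar clip embeddings."""
    query = normalize(normalize(text_emb) + normalize(image_emb))
    clips = clip_embs / np.linalg.norm(clip_embs, axis=1, keepdims=True)
    scores = clips @ query  # cosine similarity per clip
    return np.argsort(scores)[::-1][:top_k]
```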
link.springer.com/article/10.1...
<< ...Impromptu, a model-driven engineering framework to support the creation, management and reuse of prompts for generative AI. Impromptu offers a domain-specific language (DSL) to define multimodal prompts in a modular and tool-independent way... >>
Impromptu: a framework for model-driven prompt engineering - Software and Systems Modeling
Generative artificial intelligence (AI) systems are capable of synthesizing complex artifacts such as text, source code or images according to the instructions provided in a natural language prompt. T...
link.springer.com
December 12, 2025 at 1:47 PM
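Impromptu's actual DSL syntax isn't shown in the abstract, so the following is only a rough Python analogue of the idea it describes: a multimodal prompt defined once, modularly, then rendered into a specific tool's request shape. All names here are illustrative assumptions.

```python
# Rough analogue (NOT Impromptu's DSL): a tool-independent multimodal
# prompt that can be rendered for a particular backend on demand.
from dataclasses import dataclass, field

@dataclass
class MediaRef:
    kind: str  # e.g. "image", "audio"
    uri: str

@dataclass
class Prompt:
    instruction: str
    media: list[MediaRef] = field(default_factory=list)

    def render_for(self, backend: str) -> dict:
        """Translate the abstract prompt into one backend's message shape."""
        if backend == "openai-style":
            content = [{"type": "text", "text": self.instruction}]
            content += [{"type": "image_url", "image_url": {"url": m.uri}}
                        for m in self.media if m.kind == "image"]
            return {"role": "user", "content": content}
        raise ValueError(f"unknown backend: {backend}")
```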
2/7
In 2025, AI excellence goes beyond language.
Multimodal models now integrate vision, hearing, and text
to create immersive experiences
December 12, 2025 at 1:31 PM
3/7
I've seen firsthand how multimodal AI can revolutionize industries like healthcare and education. For instance, AI-powered tools can analyze medical images and patient data to provide more accurate diagnoses, saving lives and reducing costs by up to 30%.
December 12, 2025 at 1:02 PM
2/7
In 2025, language models like LLaMA and PaLM have set new standards for natural language processing, with 90%+ accuracy in understanding human language. But that's not all - multimodal AI is emerging as the next big thing.
December 12, 2025 at 1:02 PM
1/7
What if I told you AI has advanced to the point where it can learn from multimodal inputs, transforming the way we interact with technology?
December 12, 2025 at 1:02 PM
1/7
What if AI could understand us beyond words?
In 2025, multimodal AI is transforming interactions.
December 12, 2025 at 12:59 PM
A new native multimodal AI model has launched: Qwen3-Omni-Flash! It brings advanced capabilities for conversation, image/video analysis, and media generation. Also try https://chat.ro, the AI assistant from Romania.
December 12, 2025 at 10:30 AM
OCNet: A multimodal deep learning tool for classifying adnexal lesions on contrast-enhanced ultrasound https://doi.org/10.1148/ryai.240786 #cancer #AI #ML
December 12, 2025 at 7:15 AM
Ex-DeepMind Researcher Pan Xin Joins Meituan to Lead Multimodal AI Innovation

Pan Xin, former Google DeepMind researcher and ex-head of multimodal AI platforms at ByteDance, has recently joined Meituan, according to multiple sources. Pan previously worked at Google on TensorFlow’s dynamic graph…
Ex-DeepMind Researcher Pan Xin Joins Meituan to Lead Multimodal AI Innovation
Pan Xin, former Google DeepMind researcher and ex-head of multimodal AI platforms at ByteDance, has recently joined Meituan, according to multiple sources. Pan previously worked at Google on TensorFlow’s dynamic graph mode and later held key AI roles at Baidu, Tencent, and ByteDance, focusing on deep-learning frameworks and visual/multimodal model platforms. In November 2024, he became an AI partner at FlashX, leading R&D for its smart-glasses initiative.
nexttech-news.com
December 11, 2025 at 11:01 PM
@drmichaellevin.bsky.social Has anyone realised yet that humans do high-fidelity slow updates of world models, but AI (LLMs etc.) do fast low-fidelity updates?

While sampling with AI is high fidelity in some domains, it's not yet multimodal...
December 11, 2025 at 10:30 PM
Generative #AI is shaping radiology reports, but accuracy is key. A new #RSNA25 exhibit shows how radiologists validate multimodal AI–drafted reports. Read more: #MedicalImaging https://bit.ly/491SdeK
December 11, 2025 at 9:27 PM
📰New & Featured | #GigaTIME for #TME modeling
💡Provides a multimodal #AI framework to translate #H&E images into #mIF images
💡Effectively simulates virtual spatial #proteomics across large, heterogeneous patient cohorts
💡Enables studying the TME without #wet-lab assays for each sample
December 11, 2025 at 6:02 PM