Lightnews — Scholar-powered news

Max Caldwell

@growth.wtf

0/3 + "you need help"

January 23, 2025 at 3:47 AM

Max Caldwell

@growth.wtf

As expected

January 23, 2025 at 3:46 AM

Max Caldwell

@growth.wtf

"reasoning"

January 23, 2025 at 3:45 AM

Max Caldwell

@growth.wtf

????

January 23, 2025 at 3:44 AM

Max Caldwell

@growth.wtf

It's crazy to me just how consistently wrong the google search llm is.

Is this like, a quantized gemini 1 that can't even see the search results it is summarizing? That's the only explanation for why it doesn't seem to get context even from featured snippets that they should have cached already.

January 1, 2025 at 11:17 PM

Max Caldwell

@growth.wtf

:/

December 12, 2024 at 4:10 AM

Max Caldwell

@growth.wtf

Sure, here's an example:
huggingface.co/datasets/Ple...

I am no expert but it seems legit?

The composition of the dataset adheres to the criteria for public domain works in the EU and, consequently, all Berne-countries for EU authors: any publication whose author is dead for more than 70 years. Additionally, the initial consolidation of public domain status for cultural heritage operates in the EU under the 2019 Copyright Directive (art. 14).

As of June 2024, to limit rights verification, we have retained exclusively titles published prior to 1884.

The corpus will be expanded at a later stage to encompass late 19th century and early 20th century publications, after checking for public domain validity.

December 6, 2024 at 12:11 AM

Max Caldwell

@growth.wtf

Something I've always been curious about with social networks is how much different actions lead to an audience. At scale, it would be interesting to learn how large accounts got large, but I'm also curious what gets people to follow. Is seeing a good post more influential? Or follow-backs? Well...

Alt text: Two posterior distribution plots displayed side by side. The left plot shows "Posterior ATE of 1 Follow on Following" with a mean of 0.4021 and 94% HDI ranging from -10.25 to 10.23, displayed as a bell curve centered near zero. The right plot shows "Posterior ATE of 1 Post on Following" with a mean of 2.759 and 94% HDI ranging from -6.404 to 12, also displayed as a bell curve. Both plots use a blue line to show the distribution and include x-axes ranging from approximately -30 to 40 and -20 to 30 respectively.

December 5, 2024 at 8:49 PM

Max Caldwell

@growth.wtf

From 2022 - tsunami danger zones in the Bay Area per the Mercury News: www.mercurynews.com/2022/10/12/m...

December 5, 2024 at 7:22 PM

Max Caldwell

@growth.wtf

@alpindale.bsky.social Getting there, not quite getting the model injection code right yet.

I have to start my actual work now, but when or if I finish this I'll open source it.

December 5, 2024 at 3:59 PM

Max Caldwell

@growth.wtf

OK, got a little DND game running with github.com/ag2ai/ag2 -- my review is that it's a neat set of concepts but very difficult to use and quite buggy, and I was hard-pressed to think of an actual use case that would fit into this.

December 5, 2024 at 8:09 AM

Max Caldwell

@growth.wtf

Hey @alpindale.bsky.social, I saw you're working on the ruliad project? Can you help explain how to get the model to follow scripted directions? I'm trying to write an ollama template to try to get it to use tools, or follow scripted tasks of any order.

December 5, 2024 at 6:50 AM

Max Caldwell

@growth.wtf

deepthoughts-8b-tools is confused, but it's got the spirit?

December 5, 2024 at 6:17 AM

Max Caldwell

@growth.wtf

Unfortunately it's just not trained for this. Even if you get it working with Ollama tool calling, the model provides a total hallucination about half the time and is unable to provide responses in standard tool calling format.

December 5, 2024 at 5:47 AM

Max Caldwell

@growth.wtf

Can I fix it?

December 5, 2024 at 4:46 AM

Max Caldwell

@growth.wtf

Tell me you're overthinking it without telling me you're overthinking it. Full reasoning chain in the alt text.

Hello The user has provided a simple greeting 'Hello'. This is not a complex problem, but rather a basic interaction. I need to understand the context and determine the appropriate response." },
{ "step": 2, "type": "data_gathering", "thought": "I have all the necessary data in the user's input: a single word 'Hello'. There's no additional information or context provided." },
{ "step": 3, "type": "analysis", "thought": "The greeting 'Hello' is a common English phrase used to greet someone. It's a polite and casual way to initiate a conversation." },
{ "step": 4, "type": "hypothesis_formation", "thought": "Given that this is likely the start of a conversation, the most appropriate response would be to acknowledge the greeting and reciprocate with another friendly greeting." },
{ "step": 5, "type": "solution_proposal", "thought": "I should respond with 'Hi!' as it's a simple, friendly greeting that matches the tone of 'Hello'." },
{ "step": 6, "type": "alternative_consideration", "thought": "Other possible responses could include 'Hey there!', 'Welcome!', or just 'Hi' without an exclamation mark. However, 'Hi!' remains the most straightforward and widely applicable option." },
{ "step": 7, "type": "evaluation", "thought": "The proposed solution of 'Hi!' is appropriate because it's brief, friendly, and doesn't assume any additional context that isn't provided in the original message." },
{ "step": 8, "type": "decision_making", "thought": "Based on the analysis and evaluation, I will proceed with the response 'Hi!' as it best fits the given scenario." },
{ "step": 9, "type": "verification", "thought": "Verifying the reasoning chain: The problem was correctly identified, the context analyzed appropriately, and the solution proposed aligns well with the expected behavior for a simple greeting exchange." },

December 5, 2024 at 4:21 AM

Max Caldwell

@growth.wtf

it's happening

code:
cd Documents
Documents uv init test_deepthought_ag2

December 5, 2024 at 4:17 AM

Max Caldwell

@growth.wtf

20% worse than o1 and claude-latest at reasoning evals, but better than Qwen-72b, which is interesting?

December 5, 2024 at 3:03 AM

Max Caldwell

@growth.wtf

Update, tailwind4 totally kicks ass. This is all you need to create new utility classes usable in code... normal css and variables!

December 4, 2024 at 2:22 AM

Max Caldwell

@growth.wtf

skill issue

warning text stating: "mccabe: Cyclomatic complexity too high: 19 (threshold 15)"

December 3, 2024 at 4:32 AM

Max Caldwell

@growth.wtf

My cat has gotten in the habit of coming to yell at me until I let him sit on my chest around this time. Funny way to be forced to take a work break. Not allowed to say no to the cat.

Close up of an orange cat looking into camera

December 3, 2024 at 2:27 AM

Max Caldwell

@growth.wtf

Do you find 300 characters limiting? Would you like to express yourself in way more than a single post can contain? Use the alt text!

It can contain more than 6x as many characters before you hit the size limits.

Click the alt text to see the first 2,000 characters of the Canterbury Tales.