Lightnews — Scholar-powered news

Reposted by (dialupready)

breadedhamster.bsky.social

@breadedhamster.bsky.social

November 14, 2025 at 9:57 PM

(dialupready)

@dialupready.com

do you think Vance read that book before or after marrying his Indian wife and fathering Indian children?

disgusting evil book, the fact that Indian-American right-wingers overlook this shows how self-hating and/or cucked they are

November 14, 2025 at 12:29 AM

(dialupready)

@dialupready.com

the United Kingdom is also doing that

both countries perma-fucked by voting in moderate conservatives (the most effective agents of Satan) with Merkel and Cameron

November 13, 2025 at 11:18 PM

(dialupready)

@dialupready.com

That’s true, it taps out at PS1, but I don’t think there’s a file permissions issue

November 13, 2025 at 9:17 AM

(dialupready)

@dialupready.com

the Passenger: Not Good

you don’t want to work through that meandering book to get the ending you’ll get, it’s just not worth it

November 13, 2025 at 5:59 AM

(dialupready)

@dialupready.com

no good, I tried this and it gives OCR of the same (bad) quality as the one already in TEXT/. I suspect that was already generated with tesseract

it’s because of how the images are (censor bars, bad angles, missing text, odd cutoff points)

it has to be an AI model for this, some news org should find the money

November 13, 2025 at 5:12 AM

(dialupready)

@dialupready.com

30 images took 8 credits to get through, I can’t do it on grad student money

You have to hunt for the number: TEXT/001 covers IMAGES/[001-005]

November 13, 2025 at 4:33 AM

(dialupready)

@dialupready.com

there’s somewhat of a relation between the IMAGES and the TEXT filenames: the number at the end roughly corresponds to which few images are OCR’d in that text file.

it’s why I was trying the Deepseek OCR thing, to have each IMAGE file and folder get a corresponding text file with better OCR

November 13, 2025 at 4:32 AM
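
One way to get the one-text-file-per-image layout described above is to mirror the IMAGES/ tree into a separate output tree. This is a minimal sketch, not the script from the thread: the output folder name `OCR_TEXT` and the sample file name are hypothetical, and only the `IMAGES/004` folder name comes from the posts.

```python
from pathlib import PurePosixPath


def ocr_text_path(image_path, images_root="IMAGES", text_root="OCR_TEXT"):
    """Map an image path to a mirrored .txt path under text_root.

    text_root ("OCR_TEXT") is a hypothetical output folder, not part of the dump.
    Raises ValueError if image_path is not under images_root.
    """
    relative = PurePosixPath(image_path).relative_to(images_root)
    return str(PurePosixPath(text_root) / relative.with_suffix(".txt"))


# Hypothetical file name; the real dump's naming scheme may differ.
print(ocr_text_path("IMAGES/004/page_0001.jpg"))
```

Keeping the folder and file stem identical means any OCR output can be traced back to its source image without a lookup table.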

(dialupready)

@dialupready.com

none of the documents are collected into the separate books/documents that they are, there’s no link between anything, just random folders

November 13, 2025 at 4:27 AM

(dialupready)

@dialupready.com

if you have GPUs or Colab credits could you run the whole image set through Deepseek OCR? The OCR the GOP committee published is really bad and the filenames don’t link to their respective images

I got it working but 30 images took 8 credits. Maybe it doesn’t need an A100-level model but it does need better OCR.

November 13, 2025 at 4:26 AM
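
The cost figures in these posts (30 images for 8 compute units, out of a 100-unit allowance) allow a rough estimate. The sketch below assumes cost scales linearly with image count, which is an assumption and not something measured in the thread; the size of the full image set is not stated, so it is left as a parameter.

```python
def units_needed(n_images, sample_images=30, sample_units=8):
    """Estimate compute units for n_images, assuming linear scaling from the sample run."""
    return n_images * sample_units / sample_images


def images_within_budget(budget_units, sample_images=30, sample_units=8):
    """Estimate how many whole images fit in a budget under the same linear assumption."""
    return budget_units * sample_images // sample_units


print(images_within_budget(100))  # images covered by a 100-unit allowance
print(units_needed(30))           # reproduces the sample run
```

Under that assumption a 100-unit allowance covers about 375 images, so anything larger needs more credits or a cheaper GPU.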

(dialupready)

@dialupready.com

this is an optional UI for RetroArch. I don’t use it now since it’s not too happy with touchscreens, I just use the default skin

November 13, 2025 at 4:17 AM

(dialupready)

@dialupready.com

no, RetroArch is an official App Store listed app now that Apple relented on emulation (what’s missing is the BIOS files of the consoles needed for the emulator cores). Just playing Xenogears for now

November 13, 2025 at 1:41 AM

(dialupready)

@dialupready.com

just getting these files was 8/100 compute units in Colab Pro, A100s are expensive

November 13, 2025 at 1:00 AM

(dialupready)

@dialupready.com

Nothing was properly documented for this, I needed to run inspect on model.infer to stop it from writing bounding boxes (though if people have the storage, they can add <|grounding|> back to the prompt and set save_results=True to save the bounding-box images and multimarkdown files)

November 13, 2025 at 12:59 AM
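
A rough illustration of the two steps mentioned above: inspecting a callable's parameters with the standard `inspect` module, and toggling the `<|grounding|>` token in the prompt. `infer_stub` is a hypothetical stand-in for `model.infer` and its signature is assumed, as is the `<image>` placeholder layout in `build_prompt`; only the `<|grounding|>` token and `save_results` are named in the post.

```python
import inspect

GROUNDING_TOKEN = "<|grounding|>"  # token named in the post


def build_prompt(instruction, grounding=False):
    """Assumed prompt layout: image placeholder, optional grounding token, instruction."""
    prefix = GROUNDING_TOKEN if grounding else ""
    return f"<image>\n{prefix}{instruction}"


def infer_stub(tokenizer, prompt="", image_file="", output_path="", save_results=False):
    """Hypothetical stand-in for model.infer; the real signature may differ."""
    return {"prompt": prompt, "save_results": save_results}


# inspect.signature lists the accepted parameters and their defaults,
# which helps when a function is not documented.
params = inspect.signature(infer_stub).parameters
print(list(params))
print(params["save_results"].default)

print(build_prompt("Free OCR."))                  # plain text output
print(build_prompt("Free OCR.", grounding=True))  # with the grounding token
```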

(dialupready)

@dialupready.com

I think i have a better OCR here, but don’t have the credits to fully go through it all

bsky.app/profile/dial...

(dialupready) @dialupready.com · 2d

I got a working colab to go through the Oversight drive and OCR to text with Deepseek-OCR. The script text is topaz.github.io/paste/#XQAAA...

you can compare the output for the files in IMAGES/004 in this zip (the cmte’s OCR is under TEXT/001 with similar file names) send.vis.ee/download/cac...

paste

topaz.github.io

November 13, 2025 at 12:54 AM

(dialupready)

@dialupready.com

I got a working colab to go through the Oversight drive and OCR to text with Deepseek-OCR. The script text is topaz.github.io/paste/#XQAAA...

you can compare the output for the files in IMAGES/004 in this zip (the cmte’s OCR is under TEXT/001 with similar file names) send.vis.ee/download/cac...

paste

topaz.github.io

November 13, 2025 at 12:51 AM

(dialupready)

@dialupready.com

the ocr here is pretty bad (from what i saw in the google drive)

maybe only so much you can do

November 12, 2025 at 11:10 PM

(dialupready)

@dialupready.com

I tried it on some of the email images, and it does arrange the headers and the content in the text output quite well

November 12, 2025 at 11:05 PM

(dialupready)

@dialupready.com

I tried a publicly available deepseek-ocr endpoint www.alphaxiv.org/models/deeps... and the output is quite impressive, but I’m incompetent and can’t figure out how to get this to work on Colab (mounting the Oversight cmte Google Drive shortcut or getting Deepseek-OCR to work in Hugging Face transformers)

November 12, 2025 at 11:04 PM

(dialupready)

@dialupready.com

ipad mini

this would be a better experience on Android, the controller (gamesir x5) has a special Android mode where it can control features like the volume and apps get more access to things like the filesystem

November 12, 2025 at 10:57 PM

(dialupready)

@dialupready.com

from the House Oversight Committee page

November 12, 2025 at 9:32 PM

(dialupready)

@dialupready.com

the dump was shared as a public google drive, you can click around there.

Natives is a bunch of xlsx files and Text is a bad attempt at OCR or something

November 12, 2025 at 9:31 PM

(dialupready)

@dialupready.com

is it possible to run all the files through Deepseek-OCR and make them searchable through something like Apache Lucene? (Like say q:"Summers" shows me the OCR’d texts and links to the images)

I don’t know the cost of doing so

November 12, 2025 at 9:07 PM
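
The search idea above can be sketched with a tiny in-memory inverted index. This is a plain-Python stand-in for illustration, not Apache Lucene, and the sample file names and text are made up; a real setup would load the OCR output from disk and probably use a proper search engine.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """docs maps an image path to its OCR'd text; returns token -> set of image paths."""
    index = defaultdict(set)
    for image_path, text in docs.items():
        for token in tokenize(text):
            index[token].add(image_path)
    return index


def search(index, query):
    """Return sorted image paths whose OCR text contains every query token."""
    tokens = tokenize(query)
    if not tokens:
        return []
    hits = set.intersection(*(index.get(token, set()) for token in tokens))
    return sorted(hits)


# Hypothetical sample data standing in for OCR output.
docs = {
    "IMAGES/004/page_0001.jpg": "From: L. Summers\nSubject: dinner on Tuesday",
    "IMAGES/004/page_0002.jpg": "Flight schedule for Tuesday",
}
index = build_index(docs)
print(search(index, "Summers"))
```

Because each hit is an image path, a front end could show the OCR'd text next to a link to the original scan.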

(dialupready)

@dialupready.com

I don’t need to read the King in Yellow, I played Signalis and watched True Detective season 1 I think I have the gist

j/k

November 12, 2025 at 9:03 PM

(dialupready)

@dialupready.com

Larry Summers is reported in there, presumably Pinker isn’t too far off

November 12, 2025 at 9:02 PM
