Definitely possible, but we’ll have to wait for some sort of replication (or lack of) to see, I guess.
True, but as far as I can tell the AUROC measure they refer to incorporates both.
What they’re saying, as far as I can tell, is that after training the model on 85% of the dataset, it predicted whether a participant had an ASD diagnosis (as a binary choice) with 100% accuracy on the remaining 15%. I don’t think that’s unheard of, but I’ll agree that a replication would be nice to rule out systematic errors. If the images in the ASD and TD sets were taken with different cameras, for instance, that could introduce an invisible difference between the datasets that an AI could converge on. I would expect them to control for stuff like that, though.
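To make that setup concrete, here’s a rough sketch of an 85/15 holdout evaluation of the kind they describe (purely illustrative, not their pipeline; assumes scikit-learn, with synthetic features standing in for the retinal images):

    # Illustrative only -- synthetic stand-in for the paper's retinal-image features.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # y stands in for ASD (1) vs TD (0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.15, stratify=y, random_state=0)                      # 85% train / 15% held-out test

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]   # predicted probability of ASD for the held-out 15%
    print(roc_auc_score(y_test, scores))       # 1.0 would mean perfect separation on the test set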
From TFA:
For ASD screening on the test set of images, the AI could pick out the children with an ASD diagnosis with a mean area under the receiver operating characteristic (AUROC) curve of 1.00. AUROC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUROC of 0.0; one whose predictions are 100% correct has an AUROC of 1.0, indicating that the AI’s predictions in the current study were 100% correct. There was no notable decrease in the mean AUROC, even when 95% of the least important areas of the image – those not including the optic disc – were removed.
They at least define how they get the 100% value, but I’m not an AIologist so I can’t tell if it is reasonable.
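If it helps, the AUROC scale they quote is easy to sanity-check with toy numbers (scikit-learn again, purely illustrative):

    from sklearn.metrics import roc_auc_score

    labels = [0, 0, 0, 1, 1, 1]  # 0 = TD, 1 = ASD
    print(roc_auc_score(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]))  # 1.0: every ASD case scored above every TD case
    print(roc_auc_score(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]))  # 0.0: ranking exactly reversed
    print(roc_auc_score(labels, [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]))  # 0.5: no discrimination, i.e. chance level

So an AUROC of 1.00 really does mean the model ranked every ASD participant above every TD participant in the test set.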
Column A: yes
Column B: also yes
But it has been peer reviewed? And the criteria have been defined?
The article seems to be published in JAMA Network Open, and as far as I can tell that publication is peer reviewed?
The board that fired him was that of the nonprofit, so they don’t answer to shareholders.
Oh, the humanity!
Yeah, that’s what I did. With my very light usage the fixed-price subscription isn’t justifiable, but the API works nicely.
Ok, maybe slightly :) but it surprises me that the ability to emulate a basic human is dismissed as “just statistics”, since until a year ago it seemed like an impossible task…
Absolutely agree that this is a necessary next step!
Agree, I have definitely fallen for the temptation to say what sounds better, rather than what’s exactly true… Less so in writing, possibly because it’s less of a linear stream.
Yeah, I was probably a bit too caustic, and there’s more to (A)GI than an LLM can achieve on its own, but I do believe that some, and perhaps a large, part of human consciousness works in a similar manner.
I also think that LLMs can have models of concepts, otherwise they couldn’t do what they do. Probably also of truth and falsity, but perhaps with a lack of external grounding?
And this tech community is being weirdly luddite over it as well, saying stuff like “it’s only a bunch of statistics predicting what’s best to say next”. Guess what, so are you, sunshine.
Nice, thanks!
Ooh, what car is that?
Arrows
Pointless
Pick one