Definitely possible, but we’ll have to wait for some sort of replication (or lack of) to see, I guess.
True, but as far as I can tell the AUROC measure they refer to incorporates both.
What they’re saying, as far as I can tell, is that after training the model on 85% of the dataset, it predicted whether a participant had an ASD diagnosis (as a binary choice) with 100% accuracy on the remaining 15%. I don’t think that’s unheard of, but I’ll agree that a replication would be nice to rule out systematic errors. If the images in the ASD and TD sets were taken with different cameras, for instance, that could introduce an invisible difference between the datasets that an AI could converge on. I would expect them to control for stuff like that, though.
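To make that setup concrete, here’s a rough sketch of an 85/15 holdout evaluation of the kind they describe (purely illustrative, not their pipeline; assumes scikit-learn, with synthetic features standing in for the retinal images):

    # Illustrative only -- synthetic stand-in for the paper's retinal-image features.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import roc_auc_score

    X, y = make_classification(n_samples=1000, n_features=20, random_state=0)  # y stands in for ASD (1) vs TD (0)
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.15, stratify=y, random_state=0)                      # 85% train / 15% held-out test

    clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
    scores = clf.predict_proba(X_test)[:, 1]   # predicted probability of ASD for the held-out 15%
    print(roc_auc_score(y_test, scores))       # 1.0 would mean perfect separation on the test set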
From TFA:
For ASD screening on the test set of images, the AI could pick out the children with an ASD diagnosis with a mean area under the receiver operating characteristic (AUROC) curve of 1.00. AUROC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUROC of 0.0; one whose predictions are 100% correct has an AUROC of 1.0, indicating that the AI’s predictions in the current study were 100% correct. There was no notable decrease in the mean AUROC, even when 95% of the least important areas of the image – those not including the optic disc – were removed.
They at least define how they get the 100% value, but I’m not an AIologist so I can’t tell if it is reasonable.
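If it helps, the AUROC scale they quote is easy to sanity-check with toy numbers (scikit-learn again, purely illustrative):

    from sklearn.metrics import roc_auc_score

    labels = [0, 0, 0, 1, 1, 1]  # 0 = TD, 1 = ASD
    print(roc_auc_score(labels, [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]))  # 1.0: every ASD case scored above every TD case
    print(roc_auc_score(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1]))  # 0.0: ranking exactly reversed
    print(roc_auc_score(labels, [0.5, 0.5, 0.5, 0.5, 0.5, 0.5]))  # 0.5: no discrimination, i.e. chance level

So an AUROC of 1.00 really does mean the model ranked every ASD participant above every TD participant in the test set.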
Column A: yes
Column B: also yes
But it has been peer reviewed? And the criteria have been defined?
The article seems to be published in JAMA Network Open, and as far as I can tell that publication is peer reviewed?
The board that fired him was that of the nonprofit, so they don’t answer to shareholders.
Oh, the humanity!
Yeah, that’s what I did. With my very light usage the fixed-price subscription isn’t justifiable, but the API works nicely.
Ok, maybe slightly :) but it surprises me that the ability to emulate a basic human is dismissed as “just statistics”, since until a year ago it seemed like an impossible task…
Absolutely agree that this is a necessary next step!
Agree, I have definitely fallen for the temptation to say what sounds better, rather than what’s exactly true… Less so in writing, possibly because it’s less of a linear stream.
Yeah, I was probably a bit too caustic, and there’s more to (A)GI than an LLM can achieve on its own, but I do believe that some, and perhaps a large, part of human consciousness works in a similar manner.
I also think that LLMs can have models of concepts, otherwise they couldn’t do what they do. Probably also of truth and falsity, but perhaps with a lack of external grounding?
And this tech community is being weirdly luddite over it as well, saying stuff like “it’s only a bunch of statistics predicting what’s best to say next”. Guess what, so are you, sunshine.
Nice, thanks!
Ooh, what car is that?
Arrows
Pointless
Pick one