Sources of papers to evaluate:

This is Ilya's "read these papers to get up to speed in ML" list:

https://arc.net/folder/D0472A20-9C20-4D3F-B145-D2865C0A9FEE

It would also be interesting to check research from tobacco and alcohol companies that tries to create plausible deniability about the dangers of their products.

Yeah, food/health stuff seems ripe. Also underpowered studies or proxy designs.

Ah that's a great point – underpowered studies, inappropriate statistical techniques, etc. seem likely to be a rich vein. Though I'd definitely need qualified help to validate the evaluations.
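
As a rough illustration of what screening for underpowered studies could look like, here is a minimal sketch using a normal approximation for a two-sided, two-sample comparison. The function names, the 0.8 power target, and the assumption of equal group sizes are illustrative choices, not something proposed in this thread, and a real evaluation would still need a qualified statistician to pick the right test for each paper.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the standardized mean difference (Cohen's d).
    Assumes equal group sizes and a normal approximation.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected test statistic under the alternative hypothesis.
    shift = effect_size * sqrt(n_per_group / 2)
    # Probability of rejecting in either tail.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


def looks_underpowered(effect_size, n_per_group, alpha=0.05, target_power=0.8):
    """Flag a design whose approximate power falls below target_power (hypothetical threshold)."""
    return two_sample_power(effect_size, n_per_group, alpha) < target_power


if __name__ == "__main__":
    for n in (20, 64, 100):
        print(n, round(two_sample_power(0.5, n), 3), looks_underpowered(0.5, n))
```

A check like this only says whether the reported sample size could plausibly detect the claimed effect; it says nothing about whether the statistical technique itself was appropriate.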

Nathan Labenz:

See @FutureHouseSF’s “contradiction detection” in PaperQA for inspiration. It predates these models but asks a very similar question.

Ted Suzman:

Could also filter on significance: both (a) the model's judgment of whether the mistake significantly changes the results, and (b) the paper's citation count.
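
The two-part filter suggested above could be sketched roughly as follows. Everything here is hypothetical: the `Finding` record, its field names, and the `min_citations` threshold are placeholders for whatever the evaluation pipeline actually produces, and `changes_results` stands in for the model's own judgment of impact.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    """One suspected mistake flagged in a paper (hypothetical record shape)."""
    paper_title: str
    description: str
    changes_results: bool  # (a) model's judgment: does the mistake materially change the results?
    citation_count: int    # (b) citation count of the paper


def filter_significant(findings, min_citations=100):
    """Keep findings that pass both significance criteria, most-cited first.

    min_citations is an arbitrary illustrative threshold.
    """
    kept = [
        f for f in findings
        if f.changes_results and f.citation_count >= min_citations
    ]
    return sorted(kept, key=lambda f: f.citation_count, reverse=True)


if __name__ == "__main__":
    sample = [
        Finding("Paper A", "wrong denominator in effect size", True, 850),
        Finding("Paper B", "typo in table header", False, 2000),
        Finding("Paper C", "mislabeled control group", True, 12),
        Finding("Paper D", "p-value computed one-sided", True, 300),
    ]
    for f in filter_significant(sample):
        print(f.paper_title, f.citation_count)
```

Requiring both criteria keeps the review queue short; loosening to either one would surface more findings at the cost of more noise.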

Zac Crippen:

I think medical journal articles could be a great test subject for this approach!