AI Helps Determine Which Papers Can Be Replicated

Sunday, May 10, 2020 - 11:35

Researchers have designed an AI model that aims to determine which papers can be replicated.

As reported by Fortune, a new paper by a team of researchers from the Kellogg School of Management and the Institute on Complex Systems at Northwestern University presents a deep learning model that could determine which studies are likely to be reproducible and which are not.

If the AI system can reliably discriminate between reproducible and non-reproducible studies, it could help universities, research institutes, companies, and other entities filter through thousands of research papers to determine which papers are most likely to be useful and reliable.

The AI system developed by the Northwestern team doesn’t use the type of empirical or statistical evidence that researchers typically rely on to ascertain the validity of studies. Instead, the model employs natural language processing techniques to try to quantify the reliability of a paper. The system extracts patterns in the language used by a paper's authors, and the team found that some word patterns indicate greater reliability than others.
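To make the idea of scoring a paper by its wording concrete, here is a minimal sketch in Python. It is not the Northwestern team's model, which is a deep learning system whose learned patterns the researchers themselves say they cannot fully describe. It is a much simpler stand-in: a smoothed word-level log-odds score. The function names and the four example sentences are hypothetical and invented purely for illustration; they are not findings from the paper.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (deliberately crude)."""
    return text.lower().split()


def train_log_odds(replicated, not_replicated, smoothing=1.0):
    """Return, for each word, the log-odds of it appearing in text from
    replicated studies versus text from non-replicated studies."""
    pos = Counter(w for t in replicated for w in tokenize(t))
    neg = Counter(w for t in not_replicated for w in tokenize(t))
    vocab = set(pos) | set(neg)
    # Add-one style smoothing so unseen words never give log(0).
    pos_total = sum(pos.values()) + smoothing * len(vocab)
    neg_total = sum(neg.values()) + smoothing * len(vocab)
    return {
        w: math.log((pos[w] + smoothing) / pos_total)
        - math.log((neg[w] + smoothing) / neg_total)
        for w in vocab
    }


def score(text, weights):
    """Sum the log-odds of known words; a positive total leans 'replicated'."""
    return sum(weights.get(w, 0.0) for w in tokenize(text))


# Hypothetical toy sentences, invented for this sketch only.
replicated = [
    "we measured the effect in a large preregistered sample",
    "the effect was measured across a large sample",
]
not_replicated = [
    "we believe the remarkable result is striking",
    "a striking and remarkable result emerged",
]

weights = train_log_odds(replicated, not_replicated)
```

Calling `score("a large sample", weights)` returns a positive number and `score("a striking result", weights)` a negative one, but only because the toy sentences were written that way. A real system would need a large set of papers with known replication outcomes, and unlike this sketch its learned patterns may not be easy to inspect.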

Researcher and Kellogg School of Management Professor Brian Uzzi explained to Fortune that while he is hopeful the AI model could someday help researchers ascertain how likely results are to be reproduced, the research team is unsure which patterns and details the model has learned. Machine learning models are often black boxes, a common problem in AI research, and that opacity could make other scientists hesitant to use the model.

“We want to begin to apply this to the COVID issue—an issue right now where a lot of things are becoming lax, and we need to build on a very strong foundation of prior work. It’s unclear what prior work is going to be replicated or not and we don’t have time for replications.”

Uzzi and the other researchers hope to improve the model with further natural language processing techniques, including techniques the team created to analyze transcripts of corporate earnings calls. The research team has already built a database of approximately 30,000 call transcripts that they will analyze for clues. If the team can build a successful model, they might be able to convince analysts and investors to use the tool, which could pave the way for other innovative uses of the model and its techniques.

