Literature Review

Software platforms dedicated to literature search and review

The common theme among popular commercial AI applications that explicitly pitch themselves at ‘research’ is that they integrate LLM technology with search engine technology to accelerate and, to a limited extent, enhance literature searching and (very limited) reviewing. This reflects the misleading lay perception of research as essentially just ‘reading’. There are also literature mapping tools (such as Research Rabbit (free), SciSpace (free + paid), Semantic Scholar (free) and Connected Papers (free + paid)) that don’t use LLMs (relying instead on more limited semantic NLP) to enhance searches but are nonetheless very useful for building up literature collections. In fact, they are arguably better suited to systematic reviews, where time saving is less important than comprehensive coverage.

The most popular tools that use advanced generative AI to support literature review are Elicit, Scite AI and, to a lesser extent (as it’s designed for searching the web at large, not just scholarly databases), Perplexity AI. As with any software, while they perform fundamentally similar functions, they each have distinct value propositions and interfaces that will suit some researchers and some contexts better than others. Elicit is perhaps the best starting point for building not only a collection of relevant papers but also succinct AI-generated summaries, key findings and limitations. Scite’s main value is in using a form of sentiment analysis to classify citing statements as supporting, contrasting or merely mentioning the source paper, which can speed up evaluating papers as part of the review process. Perplexity is less explicitly dedicated to scholarly research but is extraordinarily fast and useful for finding live information anywhere on the web, and it enables a continuing dialogue based on its findings.

As with all GAI based on LLMs, any generated content must be verified independently, as their non-deterministic outputs naturally create ‘hallucinations’, and this can happen even when it looks like the tool is citing a direct source. Scite extracts direct quotes from individual papers rather than producing a GAI summary, which many academics may find more helpful. At most, these tools can help speed up the identification of relevant scholarly resources with snapshot summaries of key information, but they do not even begin to substitute for engaged, direct reading. In fact, one of the biggest risks as AI quality improves is researchers outsourcing the ‘reading’ and simply accepting summaries as representative, which not only loses context and therefore information, but also risks eroding cognitive capabilities that can only remain sharp with continued engagement.

Elicit

Elicit can help speed up exploratory (but not systematic) literature reviews thanks to its use of advanced AI semantic search, which identifies relevant papers without the need for comprehensive or exact keywords. Like any tool that relies on searching data, it is limited by what it can access, and Elicit cannot reach scholarly work behind a paywall, which includes many books.

It also includes LLM functionality to extract key information from, and/or summarise, retrieved articles as well as PDFs the user uploads, presenting the results in a table whose columns include the title, abstract summary, main findings, limitations and so on. As with any LLM, the results of such summaries or information extraction cannot be relied upon and should only be seen as a starting point. Still, the time saving may well be worth it, at least for exploring a new research area, compared with brute-force keyword searching in Google Scholar.
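Purely as an illustration of the underlying technique (not Elicit’s own, proprietary implementation), the sketch below shows how an LLM can be asked to extract those table-style fields from a paper’s title and abstract via the OpenAI Python SDK. The model name, file names and prompt wording are all assumptions made for the example.

```python
# Hypothetical sketch of LLM-based key-information extraction into table-like
# fields. This is not how Elicit works internally; it only illustrates the
# general technique. Model name, file names and prompt are assumptions.
import csv
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

def extract_fields(title_and_abstract: str) -> str:
    """Ask the model for three labelled lines: summary, findings, limitations."""
    prompt = (
        "From the title and abstract below, reply with exactly three labelled "
        "lines: Summary: ...; Main findings: ...; Limitations: ...\n\n"
        + title_and_abstract
    )
    response = client.chat.completions.create(
        model="gpt-4o",  # hypothetical model choice
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# Hypothetical input: one plain-text file per retrieved title and abstract.
papers = ["paper1.txt", "paper2.txt"]
with open("review_table.csv", "w", newline="", encoding="utf-8") as out:
    writer = csv.writer(out)
    writer.writerow(["file", "extracted fields"])
    for path in papers:
        with open(path, encoding="utf-8") as f:
            writer.writerow([path, extract_fields(f.read())])

# Every row still needs verifying against the actual paper before relying on it.
```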

As of 2024, Elicit comes with a limited free option, which doesn’t include paper summarisation or exporting. The paid subscription, at $12 per user per month, includes up to 8 paper summaries.

Scite AI

Scite is similar to Elicit in offering LLM-enhanced semantic searching to identify papers (again, limited to the scholarly databases it has access to). While it uses LLM technology to summarise a collection of papers from the search results, it does not summarise key findings, limitations and so on for each paper as Elicit does; instead, it provides a view with short direct quotations from the paper along with direct links to the paper itself. In some cases this may be more useful than a potentially flawed LLM summary.

The main value offering of Scite is that it incorporates a limited evaluation of the nature of citation statements, to help a researcher get a quicker feel for the extent to which authors who cite a particular paper are supporting or critiquing it:

[Figure: Scite citation classification view]

In reality, the overwhelming majority of citations are neutral and so the benefit is marginal – again, it could well be worth it to help accelerate discovery.
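Scite’s classifier itself is proprietary, but to make the idea concrete, here is a minimal, hypothetical sketch of what classifying a single citation statement as supporting, mentioning or contrasting could look like with a general-purpose LLM via the OpenAI Python SDK; the model name and prompt are assumptions, not Scite’s method.

```python
# Hypothetical sketch of citation-intent classification. This is NOT how Scite
# works internally; it only illustrates the kind of judgement such a tool
# automates at scale. Model name and prompt wording are assumptions.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

citation_statement = (
    "Our results contradict the effect sizes reported by Smith et al. (2019)."
)

response = client.chat.completions.create(
    model="gpt-4o",  # hypothetical model choice
    messages=[
        {
            "role": "user",
            "content": (
                "Classify the citation statement below as exactly one of: "
                "supporting, mentioning, contrasting. Reply with one word.\n\n"
                + citation_statement
            ),
        }
    ],
)

print(response.choices[0].message.content)  # most likely 'contrasting' here
```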

As with Elicit, the value is far higher when exploring a new research area, to speed up a researcher’s engagement with the literature. Identifying contrasting citations in particular can be extremely helpful for accelerating critical engagement with past studies or authors. As always, this is a useful starting point from which to delve deeper into any given paper and engage with it directly. Even allowing for a future with AI brain implants, it is difficult to imagine how any human could engage effectively with academic concepts and research without cognitive effort. But a tool like Elicit or Scite can certainly help speed up the identification of useful avenues towards which to direct that intensive cognitive effort.

Scite does not offer a free version but it does offer a free trial for 7 days, after which it’s £14.13 per month (as of 2024).

While these tools are great as a starting point, especially for a new topic, they cannot be more than a starting point. They simply do not have access to the entire corpus of academic work, and the search results are only as good as the synonyms the LLMs can come up with. Moreover, the value-added features, such as summarising or extracting key information from papers, are always at risk of inaccuracy, so again the tools should be thought of as a starting point. When using Elicit, for instance, creating a table of the initial search results that summarises the abstract, methodology and limitations can be extremely helpful for identifying which papers are more likely to be worth investigating further. But this is purely productivity-enhancing assistance and cannot ever hope to substitute for actually engaging with the texts. In time this may improve, and we are already seeing models like Gemini 1.5 capable of accurate, if simple, recall from up to 1 million tokens of text, but the sheer scale of text involved means it would take a truly groundbreaking advance before such tools can be anything more than a helpful starting point for a literature review.

The more explicit and targeted your prompts when ‘interrogating’ any specific academic text, the better. And this can be done just as well, if not better, with a tool like ChatGPT Plus / Team as with the commercial literature review applications. Simply asking, “Please summarise the attached paper” will inevitably result in some information loss, which may or may not be critical depending on what you’re after. “Please summarise the methodology in the attached paper” (a reminder to check publisher policies on whether publications may be shared with a GAI application) is much more helpful, because the amount of text to process is much smaller, which also leaves you more tokens to play with for follow-up dialogue. The guidance on Prompting includes an example template with four stages of reflection and quality improvement for a highly specific task on a given research paper: extracting practical research project ideas based on the paper’s content.
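As a rough, hypothetical sketch of the same idea outside the chat interface, the snippet below sends a narrow ‘methodology only’ request via the OpenAI Python SDK. The file name and model name are assumptions, and the same publisher-policy caveat applies to whatever text you send.

```python
# Hypothetical sketch of a targeted prompt against a single paper, sent via the
# OpenAI API rather than the ChatGPT interface. File name and model name are
# assumptions; check publisher policies before sharing any full text.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# Hypothetical plain-text export of a paper you are permitted to share.
with open("paper.txt", encoding="utf-8") as f:
    paper_text = f.read()

response = client.chat.completions.create(
    model="gpt-4o",  # hypothetical choice; any capable long-context chat model
    messages=[
        {
            "role": "user",
            "content": (
                "Please summarise the methodology in the paper below in under "
                "200 words, noting the sample and any stated limitations.\n\n"
                + paper_text
            ),
        }
    ],
)

# A narrower request means less to compress (so less information loss) and
# leaves more of the context window free for follow-up questions.
print(response.choices[0].message.content)
```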

A useful way to experiment is to use GPT-4 to interrogate one of your own papers. This is an output you personally spent a lot of time on and know inside out, so you can easily verify how good its summaries or extractions are. This can then help you create prompt templates for other papers.