Simulations

If there’s one thing advanced LLMs excel at, it’s producing plausible-looking text, albeit without any assurance of accuracy. So it should be no surprise that they can be very effective at creating realistic synthetic data for a given scenario, which can be valuable for researchers in the planning and design stages of a new project. They can also generate questions or texts to prime research subjects on different scenarios. The Bail (2023: 2-5) article, Can Generative AI Improve Social Science?, goes into some depth on this idea of synthetic data for simulation. A core benefit is LLMs’ ability to create detailed, realistic simulations of social interactions, economic models or cultural dynamics without the logistical and ethical constraints of recruiting real human participants.

An April 2024 paper by Manning et al. goes even further, demonstrating that conversation simulations controlled using structural causal models fit well with existing theory, and better than when the model is prompted about it directly, which suggests that GPT-4 “knows more than it can tell” about underlying human behaviours. One barrier is cost: the authors’ simple examples cost over $1,000 for a few hours of simulations, and each independent variable added to a model multiplies the number of scenarios to be covered, so the total grows exponentially. But the idea certainly has potential for ‘dress rehearsals’ to test out early hypotheses, and it will likely become more common as API prices continue to drop.
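To see why the number of scenarios grows so quickly, consider a design in which every combination of variable values is simulated once. The sketch below is illustrative only: the function name and the choice of three values per variable are assumptions made for this example, not details taken from the Manning et al. paper.

```python
from math import prod


def count_scenarios(levels_per_variable):
    """Return the number of distinct scenarios when every combination
    of variable values is simulated (a full factorial design)."""
    return prod(levels_per_variable)


# Hypothetical design: each independent variable can take 3 values.
for n_variables in range(1, 6):
    n_scenarios = count_scenarios([3] * n_variables)
    print(f"{n_variables} variable(s): {n_scenarios} scenarios")
```

Each extra variable triples the number of simulated conversations in this example, and the API cost rises in proportion.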

Below is a simple example of a simulated focus group discussion on the topic of science communication in news media. It serves as a preparatory stage to help the researcher pre-empt and navigate challenges, refine the conversation flow and consider follow-up questions for discussions that may be uncomfortable or simply unexpected. It’s straightforward to ask GPT-4 to generate responses in a simulated conversation using pre-defined personas that you’re interested in, and the speed at which these conversations can be generated makes it a low-effort way to experiment, adapt and repeat before setting up the ‘real thing’.

Here's an example initial prompt:

Prompt:

You are an advanced AI assistant tasked with generating synthetic data for a social science research project. The research aims to study public perceptions and engagement with science communication in news media across different demographic groups. Please create a simulated focus group discussion that reflects realistic responses and interactions among diverse participants.

Generate a conversation amongst six fictional participants with the following profiles:

1. Sarah, 35, science journalist

2. Mark, 52, factory worker with limited formal education

3. Amina, 28, graduate student in environmental science

4. Deborah, 67, retired schoolteacher

5. Magdalena, 19, university student majoring in liberal arts

6. Yang, 41, IT professional and parent of two young children

The focus group moderator will introduce topics and ask questions. Please simulate their responses, ensuring they reflect the participants' backgrounds, likely knowledge levels, and potential biases. Include realistic group dynamics, such as agreements, disagreements, and building upon each other's points.

Topics to cover in the discussion:

1. Trust in science reporting in mainstream media

2. Challenges in understanding complex scientific concepts

3. The role of social media in spreading scientific information

4. Personal experiences with misleading or conflicting scientific news

5. Suggestions for improving science communication to the public

Begin the simulation with the moderator's introduction and first question about trust in science reporting. Continue the conversation for at least three rounds of responses from each participant, ensuring a natural flow of discussion.

This synthetic data should help researchers refine their approach, anticipate potential issues, and develop strategies for conducting effective focus groups on science communication in news media.

Response:

Moderator: Good evening everyone, and thank you for joining our focus group on science communication in the media. I'm excited to hear your thoughts and experiences. Let's start with our first topic: trust in science reporting in mainstream media. How much do you trust the science news you encounter in mainstream media sources, and why? Sarah, as a science journalist, would you like to begin?

Sarah: Sure, I'd be happy to start. As someone who works in this field, I have a nuanced view. I believe that many mainstream media outlets genuinely try to report science accurately, but they're often constrained by tight deadlines, limited space, and the need to make complex topics accessible to a general audience. This can sometimes lead to oversimplification or sensationalism. I trust reputable sources that have dedicated science journalists, but I'm always aware that even good reporting may not capture the full complexity of scientific research.

Mark: Well, I gotta say, I don't trust much of what I see in the news these days, including science stuff. It seems like they're always changing their minds. One day coffee's bad for you, the next day it's good. How are we supposed to know what to believe? I tend to trust what I can see with my own eyes more than what some “expert” on TV tells me.

Amina: I understand Mark's frustration, but I have a different perspective. As a graduate student in environmental science, I see firsthand how complex and nuanced scientific research can be. I generally trust mainstream media outlets that have a good track record of science reporting, like Nature, Scientific American, or BBC Science. However, I always try to go back to the original research papers when possible. I think the main issue is not so much trust in the media, but in how scientific uncertainty and the scientific process are communicated to the public.

Deborah: I agree with Amina about the importance of how science is communicated. In my years as a teacher, I saw how challenging it can be to explain complex concepts to those without a scientific background. I tend to trust established news sources, especially those that explain their methods and sources. But I'm also aware that science is always evolving, and what we know today might change tomorrow. That's not a flaw in science; it's how science progresses.

Magdalena: As a university student, I'm kind of caught between trusting and being skeptical. I've been taught to think critically about sources in my classes, so I try to apply that to science news too. I often find myself cross-referencing different sources or checking fact-checking websites. Social media has made me more cautious because I see how easily misinformation can spread. I trust mainstream media more than random posts online, but I still try to verify important information.

Yang: From my perspective in IT, I see parallels with how tech news is reported versus the realities of the field. I'm generally trusting of mainstream science reporting, especially from tech-focused outlets that I'm familiar with. However, I'm always aware that there might be more to the story. As a parent, I'm particularly interested in health and child development news, and I've learned to look for consensus among multiple reputable sources rather than reacting to single studies or headlines.

Moderator: Thank you all for those insightful responses. It's clear that trust in science reporting varies based on personal experiences and backgrounds. Let's dig a little deeper into this. Can you share any specific experiences that have shaped your level of trust in science reporting? Mark, you mentioned the changing nature of health advice. Could you elaborate on that?

Mark: Yeah, sure. It's not just coffee. Remember when eggs were supposed to be bad for you? Now they're saying they're good. Or look at all the back-and-forth about masks during the pandemic. First, they said don't wear them, then we had to wear them everywhere. It just seems like they don't really know, but they pretend they do. Makes it hard to take any of it seriously.

Sarah: If I may jump in here, I think Mark's point illustrates one of the biggest challenges in science communication. The scientific process is inherently about refining our understanding over time, but this can come across as flip-flopping to the public. In the case of eggs, for example, new research methods and larger studies have given us a more nuanced understanding of their health impacts. With masks, the initial advice was partly due to supply concerns for healthcare workers, and partly because we were still learning about how the virus spread. As a journalist, I try to convey this evolving nature of science, but it's not always easy in a headline or short article.

Amina: I agree with Sarah. In my field of environmental science, we often deal with complex systems where new data can significantly change our understanding. The challenge is communicating this uncertainty without undermining trust in the scientific process. I think mainstream media could do a better job of explaining the context of new findings and how they fit into the broader body of knowledge.

Deborah: This reminds me of how we taught the scientific method in school. We emphasized that science is a process of constant questioning and refinement. Perhaps we need to do a better job of instilling this understanding from an early age. It's not about absolute truths, but about our best current understanding based on evidence.

Magdalena: I find this really interesting because in my liberal arts classes, we often discuss how knowledge is constructed and how it changes over time. It seems like there's a gap between how scientists and academics understand this process and how it's perceived by the general public. Maybe media could focus more on explaining the process of science rather than just the results?

Yang: As someone who often has to explain complex tech concepts to non-tech people, I can relate to this challenge. In my experience, people appreciate when you're upfront about uncertainties and limitations. Maybe science reporting could benefit from more of this approach - acknowledging what we don't know alongside what we do know.

Moderator: These are excellent points about the challenges of communicating evolving scientific knowledge. Let's shift our focus slightly to discuss the challenges in understanding complex scientific concepts. What difficulties have you encountered in understanding science news, and how do you think these challenges could be addressed? Amina, given your scientific background, perhaps you could start us off with your perspective on this?

Amina: I sometimes struggle with concepts outside my specific field. The main challenge I see is the use of jargon and technical terms without adequate explanation. Scientific papers are written for other experts, but when these concepts are translated to news articles, there's often a lack of context or simplification that can lead to misunderstanding. I think one solution could be to use more analogies and real-world examples to explain complex ideas. Also, interactive graphics or animations can be really helpful in visualising difficult concepts.

Mark: I'll be honest, a lot of times I just tune out when the news starts talking about scientific stuff. It's like they're speaking another language. I wish they'd break it down more, you know? Use everyday examples that relate to regular people's lives. And maybe explain why I should care about this new discovery or whatever.

Sarah: Mark's point is crucial, and it's something we grapple with constantly in science journalism. There's always a tension between accuracy and accessibility. We need to simplify without distorting the facts, and we need to make it relevant without sensationalising. One approach I've found effective is to start with the "so what" - explaining the potential impact or relevance of the science before diving into the details.

Deborah: In my teaching experience, I found that students often struggle with abstract concepts. They need concrete examples and hands-on experiences. Perhaps science news could incorporate more interactive elements or suggest simple at-home experiments to illustrate concepts. Also, I think it's important to build scientific literacy over time, starting from a young age.

Magdalena: I agree about the importance of scientific literacy. In my experience, a lot of my peers shy away from science news because they feel like they don't have the background to understand it. Maybe there could be more 'Science 101' type content that breaks down fundamental concepts? Also, I find video explanations on platforms like YouTube really helpful for understanding complex ideas.

Yang: From a tech perspective, I see potential in using AI and machine learning to personalize science explanations. Imagine if you could input your background knowledge and interests, and an AI could tailor the explanation to your level and make it relevant to your life. Of course, this would need to be done carefully to avoid creating echo chambers or oversimplifying.

There won't be anything surprising in the above, but just as with the guidance on brainstorming, the sheer speed at which GPT-4 can generate these ideas offers a useful combination of low effort and high value in uncovering potential issues that you might not have considered initially. It can be particularly helpful to simulate delicate topics where participants may hold more extreme viewpoints, which could help pre-empt uncomfortable situations before the live interactions with human participants.
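Because the value of this approach comes from experimenting, adapting and repeating, it can help to assemble the prompt from lists of personas and topics rather than editing it by hand each time. The following is a minimal sketch under stated assumptions: the Persona type and the build_focus_group_prompt and numbered helpers are hypothetical names introduced here, and the wording is a shortened version of the example prompt used in this section. It only builds the prompt text and does not call any model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """A fictional focus group participant (hypothetical helper type)."""
    name: str
    age: int
    description: str


def numbered(items):
    """Format items as a numbered list, one item per line."""
    return "\n".join(f"{i}. {item}" for i, item in enumerate(items, start=1))


def build_focus_group_prompt(research_aim, personas, topics, rounds=3):
    """Assemble a simulated focus group prompt from its parts."""
    profiles = numbered(f"{p.name}, {p.age}, {p.description}" for p in personas)
    return (
        "You are an advanced AI assistant tasked with generating synthetic "
        "data for a social science research project. "
        f"The research aims to study {research_aim}. "
        "Please create a simulated focus group discussion that reflects "
        "realistic responses and interactions among diverse participants.\n\n"
        f"Generate a conversation amongst {len(personas)} fictional "
        f"participants with the following profiles:\n\n{profiles}\n\n"
        f"Topics to cover in the discussion:\n\n{numbered(topics)}\n\n"
        "Begin the simulation with the moderator's introduction and a first "
        "question about the first topic. Continue the conversation for at "
        f"least {rounds} rounds of responses from each participant."
    )


# Two of the personas and topics from the example prompt in this section.
personas = [
    Persona("Sarah", 35, "science journalist"),
    Persona("Mark", 52, "factory worker with limited formal education"),
]
topics = [
    "Trust in science reporting in mainstream media",
    "Challenges in understanding complex scientific concepts",
]
prompt = build_focus_group_prompt(
    "public perceptions and engagement with science communication in news media",
    personas,
    topics,
)
print(prompt)
```

Swapping a persona, adding a topic or changing the number of rounds then takes a single edit, and the resulting text can be pasted into a chat interface or sent through whichever API you already use.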