Automating with APIs

Introduction to APIs

The vast majority of use cases for GAI in research can reasonably be undertaken within a typical chat interface like MS Copilot or ChatGPT. Some tasks make more sense to automate, however, and a common scenario in social science research is qualitative coding, where you have a large number of small and similar tasks. It would be extremely onerous to paste these into a chatbot interface one by one. Fortunately it’s possible to automate such tasks with code by using an API. API stands for ‘Application Programming Interface’, an umbrella term for the ways in which different digital systems translate and share data with each other. In this case, OpenAI and Anthropic make their LLMs available via an API, meaning people can use whatever custom code and data they have and request LLM responses from the API in an automated way. Usage is charged per token, so you pay based on what you use rather than the monthly subscription fee you get with ChatGPT.
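
To make this concrete, here’s a minimal Python sketch of a single automated request. This is a hedged illustration rather than a definitive implementation: it assumes the official openai Python package (version 1.x) and an API key stored in the OPENAI_API_KEY environment variable, and the model name is just an example.

    # Minimal sketch of one automated request to the OpenAI API.
    # Assumes: pip install openai (v1.x) and OPENAI_API_KEY set in the environment.
    from openai import OpenAI

    client = OpenAI()  # picks up OPENAI_API_KEY from the environment automatically

    response = client.chat.completions.create(
        model="gpt-4",  # example model name; use whichever model you have access to
        messages=[
            {"role": "user", "content": "Summarise this survey response in one sentence: ..."}
        ],
    )
    print(response.choices[0].message.content)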

One of the most important settings available in the API but not in the standard ChatGPT interface is the output ‘temperature’. You may have seen switches in Microsoft Copilot for ‘more precise’ and ‘more creative’ responses. The temperature setting ranges from 0 to 2, with lower values resulting in more deterministic outputs. The API’s default temperature is 1, and ChatGPT is commonly reported to use a value around 0.7; either makes a useful baseline that works effectively in most cases. For a creative writing task you may want to experiment with a temperature above 1, while for a strict classification task you may want to go as low as 0.1. Roughly speaking, a low temperature sharpens the probability distribution over next tokens so that the model almost always picks the most likely ones (truncating the distribution to a top slice is the job of the separate ‘top_p’ setting). OpenAI offers a testing area called the ‘Playground’ where you can experiment with different temperature settings.
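
To see the effect in practice, here’s a sketch of the same prompt issued at two temperatures (same openai v1.x assumptions as above; the outputs themselves will of course vary from run to run):

    # Sketch: one prompt at a low and a high temperature.
    # Low temperature -> near-deterministic output, suited to strict classification;
    # high temperature -> more varied output, suited to creative tasks.
    from openai import OpenAI

    client = OpenAI()
    prompt = "Suggest a title for a study on perceptions of working from home."

    for temperature in (0.1, 1.2):
        response = client.chat.completions.create(
            model="gpt-4",            # example model name
            messages=[{"role": "user", "content": prompt}],
            temperature=temperature,  # API range is 0 to 2
        )
        print(f"temperature={temperature}: {response.choices[0].message.content}")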

While Python is the recommended language, especially for beginners (not least because it’s GPT-4’s ‘best language’, and GPT-4 is extremely good at explaining and teaching code as well as writing it), this section takes a ‘low code’ approach using Microsoft Power Automate, as its visual flowchart interface makes the logic easier to understand. OpenAI has a very helpful Quick Start Guide on calling the API, including setting up your secret API keys (for authentication, authorisation and billing purposes) and some template code worth exploring; feeding that documentation to GPT-4 and asking it to build a tutorial is a great way to get started.
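
Whichever route you take, keep the secret key itself out of your code. A minimal sketch of the standard approach (the environment variable name follows the openai package’s own convention):

    # Sketch: read the secret API key from an environment variable rather than
    # hard-coding it, so it never ends up in shared scripts or version control.
    import os

    from openai import OpenAI

    api_key = os.environ["OPENAI_API_KEY"]  # raises KeyError if the variable is not set

    # The openai v1.x client reads this variable automatically, but it can
    # also be passed explicitly:
    client = OpenAI(api_key=api_key)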

Coding open-ended survey questions

Open-ended questions are usually much more valuable for gaining insight into people’s subjective ideas than pre-determined responses, which also risk leading respondents in a direction that may not reflect their actual thoughts. The downside of open-ended questions is that recording any kind of summary statistics about the responses requires manual interpretation and coding, which takes a lot of time and effort.

LLMs are extremely valuable tools in this regard, and qualitative coding is indeed one of the most useful labour-saving tasks they can support. Before the LLM revolution, natural language processing (NLP) required advanced machine learning, coding and data knowledge, along with enormous effort to curate ground-truth datasets and refine training and testing processes. Now, even second-tier advanced LLMs are capable of classifying texts based on (good quality) natural language prompts, massively democratising previously opaque machine learning tasks.

Here’s an example of an open-ended question which would need to be coded qualitatively (MS Forms is used in this guide): 

GAIR_MS_Form


Let’s say, based on your own research, you’re interested in analysing the following themes, including an all-purpose ‘Other’ category:

  • Better Time Management
  • Work-Life Balance
  • Technology and Tools
  • Communication and Collaboration
  • Physical Workspace
  • Flexibility and Autonomy
  • Distractions and Interruptions
  • Mental Health and Well-being
  • Other

For simplicity’s sake, given this guide is designed to illustrate the concept of automating qualitative coding, let’s assume that any response can only ever have a single thematic code. It’s of course possible to adapt the prompt to permit multiple or even multidimensional codes, but that requires additional complexity to join and split results in a structured way. An example prompt for GPT might be:

“You are a classification assistant designed to respond only with one of multiple pre-determined thematic codes. Your response can only be one of the permissible codes provided. You should not invent an alternative code, nor should you offer any thoughts or suggestions – the ‘Other’ classification is to be used if you’re unsure or if you encounter an error; I will check those cases myself, so there’s no need for you to provide commentary. I will share a survey response relating to a social science study on people’s perceptions around working from home. The goal is to classify the survey response into one, and only one, of the following thematic codes: Better Time Management; Work-Life Balance; Technology and Tools; Communication and Collaboration; Physical Workspace; Flexibility and Autonomy; Distractions and Interruptions; Mental Health and Well-being; Other

Here's an example survey Response: Since working from home, I've found that I'm able to spend more time with my family, which has made me much happier overall.

Your output would be: Work-Life Balance”
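
Translated into code, the prompt above becomes the ‘system’ message and each survey response the ‘user’ message. A hedged Python sketch, with the same openai v1.x assumptions as earlier (the function and variable names are illustrative, not part of any library):

    # Sketch: classify one survey response using the prompt above as the system message.
    from openai import OpenAI

    client = OpenAI()

    SYSTEM_PROMPT = """You are a classification assistant designed to respond only
    with one of multiple pre-determined thematic codes. ...
    (paste the full prompt from above here)"""

    def classify_response(survey_response: str) -> str:
        """Return a single thematic code for one survey response."""
        response = client.chat.completions.create(
            model="gpt-4",    # example model name
            temperature=0.1,  # low temperature for consistent classification
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": survey_response},
            ],
        )
        return response.choices[0].message.content.strip()

    print(classify_response(
        "Since working from home, I've found that I'm able to spend more time "
        "with my family, which has made me much happier overall."
    ))  # expected output: Work-Life Balance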


The instructions may appear excessive, but a model as advanced as GPT-4 can have a ‘mind of its own’ sometimes and try to be ‘helpful’ by responding in a way that doesn’t conform to instructions. This can be a problem with empty values or errors in your source data, which should always be prepared and cleaned in advance, but with large datasets it’s always possible for errors to creep in. Instructing the model not to provide commentary but simply to default to an output which fits the pre-determined format will work well most of the time.
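
It’s also worth validating the model’s output in your own code and applying the ‘Other’ fallback yourself rather than trusting the model to do it. A minimal sketch (the names are illustrative):

    # Sketch: post-validate the model's output against the permitted codes,
    # defaulting to "Other" for anything unexpected (flagged for human review).
    PERMITTED_CODES = {
        "Better Time Management", "Work-Life Balance", "Technology and Tools",
        "Communication and Collaboration", "Physical Workspace",
        "Flexibility and Autonomy", "Distractions and Interruptions",
        "Mental Health and Well-being", "Other",
    }

    def validate_code(model_output: str) -> str:
        code = model_output.strip()
        return code if code in PERMITTED_CODES else "Other"

    print(validate_code("Work-Life Balance"))         # Work-Life Balance
    print(validate_code("I think this is about..."))  # Other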

In this example we’ll use an Excel file stored on SharePoint as the cloud-based log to record responses and GPT-4’s codes. Power Automate includes connectors to Excel and SharePoint, making the integration simple for such a task. Here’s an example Excel table:

GAIR_API_Excel


This updates in real time as survey responses come in. In reality there would be additional verification columns for human researchers to confirm or correct GPT-4’s codes, but the main idea is that you can have a stream of GAI assistance applied to your qualitative data as it arrives. Here’s what the flow could look like:

Part 1: Retrieving the response information from MS Forms and setting the secret API Key variable:

GAIR_API_Trigger


Part 2: Calling the API with the HTTP action and adding the survey question response:


GAIR_API_Call_Power_Automate
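
For readers curious about what the HTTP action is doing under the hood, here’s an equivalent raw request in Python using the requests library. The endpoint and body follow OpenAI’s chat completions format; the model name and the prompt placeholders are illustrative:

    # Sketch: the same call the Power Automate HTTP action makes, written with
    # the requests library. Assumes OPENAI_API_KEY is set in the environment.
    import os
    import requests

    resp = requests.post(
        "https://api.openai.com/v1/chat/completions",
        headers={
            "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
            "Content-Type": "application/json",
        },
        json={
            "model": "gpt-4",    # example model name
            "temperature": 0.1,
            "messages": [
                {"role": "system", "content": "...classification prompt from above..."},
                {"role": "user", "content": "...the survey question response..."},
            ],
        },
        timeout=60,
    )
    resp.raise_for_status()  # fail loudly on HTTP errors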

Part 3: Parse the API response body (which includes lots of metadata we’re not necessarily interested in) to extract the content needed and add it to the Excel table along with the MS Forms data:

GAIR_API_Response_Power_Automate
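
The equivalent parsing step in Python is short (continuing the requests sketch above); the content we want sits inside the nested JSON alongside metadata such as the response id and token usage:

    # Sketch: extract the generated thematic code from the API's JSON response body.
    body = resp.json()
    thematic_code = body["choices"][0]["message"]["content"].strip()
    print(thematic_code)  # e.g. "Work-Life Balance"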

While this simple example used a low-code approach with Power Automate because it makes the logic easier to explain, it’s worth re-emphasising that GPT-4 (and similar-class LLMs like Claude 3.5 Sonnet) are extremely effective not only at providing working Python code based on your specifications, but also at guiding you through it and explaining it at whatever level of detail suits your background, which can help you learn more effectively. The barrier to entry for learning to code has never been lower thanks to advanced LLMs, and it’s worth taking advantage, as automating with AI can significantly turbo-charge productivity. The one area to avoid is sharing personal data; while the API is what businesses use and is considered private and secure, OpenAI still temporarily stores API data for up to 30 days, which among other problems could violate GDPR.