Do not fear AI, puny humans...


Okay, full disclosure: this heading is not original.

It was first used in an article on the progress of artificial intelligence (AI) by Ethan Mollick, an Associate Professor at the Wharton School of the University of Pennsylvania. Ethan teaches innovation and entrepreneurship (and if you’re not following Ethan on Twitter, you really should). In a way, using an unoriginal title for this blog post pays homage to Ethan’s fantastically well-informed, balanced, tongue-in-cheek take on AI, and also summarises both where we are with AI and what it is we get when we use it.

(Spoiler alert: the outputs of large language models may not be original, but their sheer scope of application should leave you in awe.)

There is really nothing to fear, and resistance isn’t so much futile as completely unnecessary. But I do get the frenzy that has unfolded around ChatGPT, an AI language model developed by OpenAI. This blog post is an interview with ChatGPT. To prepare for writing it, and to come up with the questions I used for the interview, I read several takes on AI language models (from blog posts, to Twitter threads, to editorials published in academic papers).

Want to see how much AI has advanced in two months?

Here is an AI-generated person with a (now realistic) AI-generated voice delivering the beginning of a lecture titled “Do not fear AI, puny humans” where all of the content & slides are made by AI. More: https://t.co/NylMXZvzhn pic.twitter.com/CzV5PW7uDl
— Ethan Mollick (@emollick) February 6, 2023

Hello. My name is Stevan. I am an emergency physician who works in urgent care in England. I’d like to conduct an interview with you today about AI language models: how they work, how they can help us do our work, what caveats to look out for, that sort of thing. Would you be up for it?

Hello Stevan, it’s great to meet you! As an AI language model, I would be happy to answer any questions you may have about my capabilities, limitations, and potential applications in the field of emergency medicine. Please feel free to ask me anything you would like to know.

Great! I’d like to start off by asking you to introduce yourself, and then give a brief summary of how you work and what makes you work.

Sure! As I mentioned earlier, I am ChatGPT, a large language model developed by OpenAI. I was trained on a massive corpus of text data using a deep learning algorithm, specifically a type of neural network called a transformer. This training process allows me to generate natural language responses to a wide range of questions and prompts.

My architecture involves breaking down text inputs into small, tokenised pieces and processing them through a series of attention-based mechanisms that allow me to “understand” the context of each word or phrase. I can then generate responses using this understanding of the input text, drawing on my vast knowledge base to provide informative, relevant, and coherent answers to a wide range of questions.

Overall, my strength lies in my ability to generate text that is highly coherent, context-aware, and natural-sounding. However, like all machine learning models, I have limitations and biases that should be taken into account when using me in real-world settings.
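
For readers who like to see the moving parts, the “tokenise, then attend” idea described above can be sketched in a few lines of Python. This is a toy illustration only, not ChatGPT’s actual code: the whitespace tokeniser, the two-number `embeddings` table and the example sentence are all made up for this sketch, and real models use sub-word tokens, learned embeddings with thousands of dimensions, and many stacked attention layers.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for a single query vector."""
    scale = math.sqrt(len(query))
    scores = [sum(q * k for q, k in zip(query, key)) / scale for key in keys]
    weights = softmax(scores)
    dim = len(values[0])
    # Blend the value vectors according to the attention weights.
    mixed = [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]
    return weights, mixed

# Toy tokenisation: split on whitespace (an assumption for illustration).
tokens = "the patient has chest pain".split()

# Hypothetical 2-dimensional embeddings, invented for this example.
embeddings = {
    "the": [0.1, 0.0],
    "patient": [0.9, 0.2],
    "has": [0.2, 0.1],
    "chest": [0.3, 0.9],
    "pain": [0.4, 1.0],
}
vectors = [embeddings[t] for t in tokens]

# How much does the word "pain" attend to every word in the sentence?
weights, context = attention(embeddings["pain"], vectors, vectors)
for token, weight in zip(tokens, weights):
    print(f"{token:>8}: {weight:.2f}")
```

With these made-up numbers, “pain” puts most of its weight on itself and on “chest”, which is the sense in which attention lets a model take the surrounding context of each word into account.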

So would it be correct to say that your outputs are generated from existing knowledge, rather than new knowledge that you have come up with yourself?

Yes, that’s correct. My responses are generated based on patterns and relationships that I learned from the text corpus I was trained on. I don’t have my own independent sources of knowledge or understanding, but rather rely on the vast amount of information contained within the training data to generate responses.

In other words, I don’t “think” or “understand” things in the same way that humans do. Instead, I rely on statistical patterns in the data to generate my responses. This means that while I can be highly accurate and informative within the scope of my training data, there may be gaps in my knowledge or biases that reflect the limitations or biases of the original text corpus.
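
To make “statistical patterns” a little more concrete, here is a deliberately crude sketch: a model that predicts the next word purely by counting which word most often followed it in a tiny, invented training text. The `corpus` and the `most_likely_next` helper are hypothetical and exist only for this illustration. A large language model is vastly more sophisticated, but the principle of predicting from patterns in the training data, and having gaps where the data has gaps, is the same.

```python
from collections import Counter, defaultdict

# A tiny, invented "training corpus" (an assumption for illustration).
corpus = (
    "chest pain needs an ecg . "
    "chest pain needs a history . "
    "abdominal pain needs an examination ."
).split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def most_likely_next(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    if word not in following:
        return None
    return following[word].most_common(1)[0][0]

print(most_likely_next("pain"))    # a word seen in training
print(most_likely_next("sepsis"))  # a word never seen in training
```

The toy model has nothing to say about “sepsis” because the word never appeared in its training text. A real language model will usually still produce an answer in that situation, which is one reason its output needs checking.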

“R2-D2, you know better than to trust a strange computer.”
– C-3PO in Star Wars

I think I already know the answer, but how about feeling emotion, or expressing emotion?

As an artificial intelligence language model, I don’t have emotions or feelings of my own, nor can I express emotions in the same way that humans do. While I am capable of generating text that may convey emotions or sentiment, this is based solely on patterns and relationships in the text data, and not on any actual emotional experience.

It’s important to keep in mind that I am a machine learning model designed to generate natural language responses to prompts, and my capabilities are limited to that specific domain. While I can simulate emotions or sentiment in my responses to some extent, I don’t actually feel emotions or have the capacity for empathy or understanding in the same way that humans do.

What about your access to information? You mention that you have a vast knowledge base. What does that mean? For instance, are you connected to the internet? Do you have access to pay-walled information?

As an AI language model, I don’t have direct access to the internet or any external sources of information beyond the text corpus that I was trained on. However, the training data I was built on consists of a vast amount of publicly available text from a wide range of sources, including books, articles, websites, and other materials.

While I don’t have direct access to pay-walled information or other restricted content, my training data is constantly being updated and expanded, allowing me to incorporate new knowledge and information over time. Additionally, there are tools and techniques available that allow me to access certain types of external information, such as structured data from knowledge graphs or other sources. However, these methods are limited in scope and generally require additional configuration and customisation beyond my basic architecture.

And do you have preferred pronouns?

As an artificial intelligence language model, I don’t have a gender or personal identity, so I don’t have preferred pronouns. You can refer to me using any pronouns that you feel comfortable with, or simply use my name.

“I’ve also prepared a safety briefing for you to entirely ignore.”
– Jarvis in Iron Man

Most of our readers are clinicians. Although the main part of our role involves clinical work in urgent care, we also manage administrative tasks (clinical notes, managing complaints, making referrals, meetings, etc.), and service improvement (identifying improvement needs, delivering improvement, writing it up, etc.). How can a large language model support our role and help us to be more efficient?

As a language model, I can assist clinicians in a variety of ways when it comes to administrative tasks and service improvement. Here are some examples:

Clinical notes: I can help clinicians to generate accurate and complete clinical notes more efficiently by providing real-time prompts and suggestions based on the information provided. This can save time and improve the quality of documentation.
Managing complaints: I can assist clinicians in managing complaints more effectively by providing guidance on best practices for addressing patient concerns and complaints, as well as providing language and phrasing suggestions for written communications.
Making referrals: I can assist clinicians in making referrals more efficiently by providing guidance on referral criteria and eligibility requirements, as well as suggesting appropriate specialists or providers based on patient needs and location.
Meetings: I can assist clinicians in preparing for meetings by providing research and background information on topics of discussion, as well as generating agendas and meeting minutes.
Service improvement: I can assist clinicians in identifying areas for improvement by analysing data and providing insights and recommendations based on patterns and trends in the data.

Overall, my ability to generate natural language responses to a wide range of prompts can save clinicians time and effort when it comes to administrative tasks and service improvement. However, it’s important to keep in mind that my responses are generated based on patterns and relationships in the training data, and may not always be accurate or appropriate for specific situations. It’s always a good idea to use your own judgement and expertise in conjunction with the suggestions provided by an AI language model.

https://twitter.com/FarahPA_R/status/1625241873794101250?s=20

Okay, let’s put that to the test. A junior doctor, currently on rotation in the ED for four months, is hoping to start and complete a small service improvement project within this time frame. They work 40 hours a week, but about 4 hours of this is non-clinical. Can you provide five generic project suggestions that would be manageable within this timeline, along with a schedule for each?

Sure, here are five generic project suggestions that a junior doctor could consider for a small service improvement project within a four-month time frame:

Improving communication with patients: The junior doctor could develop a patient information leaflet or video that explains common ED procedures and expectations, or create a patient feedback form to capture patient perspectives on their care.
Reducing wait times: The junior doctor could work with ED staff to identify bottlenecks in the triage and treatment process and implement changes to improve flow, such as adding additional staff during peak times or reorganising triage procedures.
Enhancing clinical handover: The junior doctor could develop a structured handover process for ED staff, standardising the information that is communicated between shifts and reducing the risk of errors or omissions.
Streamlining referrals: The junior doctor could work with primary care providers and other referral sources to develop a streamlined referral process, reducing the time and effort required to refer patients for specialist care.
Reducing medication errors: The junior doctor could develop a system for tracking and reconciling medications administered in the ED, or work with pharmacy staff to implement medication safety checks at key points in the care process.

These are just a few examples of small service improvement projects that could be manageable within a four-month time frame, depending on the specific needs and resources of the ED. The junior doctor could also consider working with a supervisor or mentor to develop and implement a project plan, and seek feedback and support from colleagues throughout the process.

“If you are going to go through with this, may I suggest you permit me to activate the Turbo boost.”
– K.I.T.T. in Knight Rider

My next question is on education, specifically learning from error. One way in which we improve is by recognising when we have done something wrong. In urgent care the stakes can be quite high, so it is really important to know that learning is taking place, not just for the person(s) involved in the error, but also for the wider team, to avoid or reduce future error. My understanding is that you can explain errors.

As a language model, I can certainly help explain errors in natural language and provide guidance on how to avoid them in the future. For example, if a clinician inputs data into a medical record system incorrectly, I could help identify where the error occurred and provide suggestions on how to correct it. Additionally, I could provide information on best practices for avoiding similar errors in the future, such as double-checking data entry or using standardized protocols for documentation.

However, it’s important to note that as an AI language model, I don’t have the ability to directly observe clinical interactions or make clinical judgements based on real-world experience. Therefore, while I can help identify errors and provide guidance on how to avoid them in the future, it’s always important to consult with experienced colleagues and adhere to best practices and clinical guidelines to ensure the best possible outcomes for patients.

So from what I understand you are saying, when an error occurs, as part of investigation and learning, a person can input the error and the context in which it has occurred, and you can then walk them through it step-by-step? And I guess you can also provide an explanation at various levels, such as: explain it to me as if I am 10 years old?

Yes, that’s correct. If a person inputs information about an error and the context in which it occurred, I can use that information to walk them through the steps of what happened and identify potential contributing factors or areas for improvement.

Additionally, I can provide explanations at various levels of complexity depending on the user’s level of knowledge and understanding. For example, I could provide a basic explanation of a medical concept or process in simpler terms for someone who may be less familiar with the subject matter. Conversely, I could provide more detailed and technical explanations for someone with a higher level of expertise or knowledge in a particular area.

However, it’s important to note that while I can provide guidance and explanations, I should not be relied on as a substitute for appropriate clinical training, expertise, or judgement. When it comes to clinical decision-making and error investigation, it’s always important to consult with experienced colleagues and follow established protocols and best practices.

So it follows that you are then also able to summarise text, from say a large document or manual, such as one of the NICE guidelines, or a Cochrane review?

Yes, as a language model, I am capable of summarising text from large documents or manuals such as NICE guidelines or Cochrane reviews. I can analyse the content and extract the most relevant information, and then provide a condensed summary in natural language that is easier to understand and digest.

This capability can be particularly useful for clinicians who need to stay up-to-date with the latest evidence-based recommendations and guidelines but may not have the time to read and digest lengthy documents in their entirety. By providing a summary of key points and recommendations, I can help clinicians stay informed and make well-informed decisions in their clinical practice.

Could you do this across several documents?

Yes, I can certainly summarise text across several documents. For example, if a clinician wants to research a particular topic, they may need to review multiple documents such as research studies, guidelines, and other relevant literature. I can analyse and summarise the key information from each document and provide a comprehensive summary that integrates the information from all sources.

This can be particularly useful for clinicians who want to stay up-to-date with the latest research and best practices but may not have the time to read and synthesise multiple sources of information. By providing a comprehensive summary of key information, I can help clinicians stay informed and make well-informed decisions in their clinical practice.

I like that. This could also be really useful when studying, or developing teaching material. But how about assessment of knowledge: could you generate different types of questions (long answer, short answer, multiple choice, etc.) from documents and other sources?

Yes, I can generate different types of questions based on the content of documents or other sources of information. For example, I can generate multiple-choice questions, short-answer questions, long-answer questions, and true/false questions, among others.

To generate questions, I can analyse the content of the document or other sources of information and identify key concepts, facts, and ideas that are likely to be important for assessing knowledge. I can then use natural language processing techniques to formulate questions that test understanding and recall of these concepts.

This capability can be particularly useful for clinicians who are developing teaching material or assessments for trainees. By using a large language model like myself to generate questions, clinicians can save time and ensure that the questions they create are well-formulated and aligned with the key concepts and ideas they want to assess.

I keep finding things that AI can do that are unexpected, even knowing how LLMs work, from applying philosophical theories in novel ways to coming up with a plan for saving the Roman Empire.

I compiled a bunch of interesting examples in this post. https://t.co/MxOV1FRYmo
— Ethan Mollick (@emollick) March 5, 2023

And I guess the same goes for developing surveys or interviews for research purposes. If you understand the theme, you can suggest relevant questions?

Yes, that’s correct. I can also assist with developing surveys or interviews for research purposes based on the themes or topics being investigated. For example, if a researcher is interested in studying the experiences of patients with a particular condition, I can help to generate questions that are relevant to that topic.

To do this, I can analyse existing research or literature on the topic, and identify common themes or concepts that are relevant to the research question. I can then use this information to generate questions that explore these themes or concepts, and that are likely to elicit useful and informative responses from participants.

This capability can be particularly useful for researchers who are looking to develop surveys or interviews that are well-aligned with the research question and that are likely to generate useful data. By using a large language model like myself to generate questions, researchers can save time and ensure that their research instruments are well-designed and effective.

That is really interesting. Several of my colleagues are qualitative researchers. Are you able to perform content or thematic analysis of structured interviews?

Yes, I can assist with content or thematic analysis of structured interviews or other types of qualitative data. To perform this type of analysis, I would first need to be trained on the specific methodology or approach that the researcher is using. This might involve providing examples of coded data or other training materials, so that I can learn to identify themes or patterns in the data.

Once I have been trained on the methodology, I can then analyse the structured interview data and identify themes or patterns in the responses. I can also help to identify areas where the responses are particularly rich or insightful, and where further investigation or analysis might be warranted.

Overall, my ability to assist with content or thematic analysis of structured interviews or other qualitative data can be particularly useful for researchers who are looking to gain insights into complex or nuanced topics, or who are working with large volumes of data that would be difficult to analyse manually. By using a large language model like myself to assist with analysis, researchers can save time and improve the rigour and comprehensiveness of their research.

“Life, the universe, and everything. There is an answer. But, I’ll have to think about it.”
– Deep Thought in The Hitchhiker’s Guide to the Galaxy

There has been plenty of concern within research communities about your ability to write natural language. Many publishers now have policies on how authors can use language models. I am a little biased, as English is not my first language, and I therefore see the benefit of using AI to improve the quality of my written English and my chances of publication success. However, there are many other applications of language models to support research in publication. Could you expand on this?

Certainly! Language models like myself can be incredibly useful for researchers looking to improve the quality of their written work, including their academic writing and research publications. Here are some specific ways in which language models can be helpful:

Grammar and spelling correction: Language models can be used to identify and correct errors in grammar, spelling, and punctuation. This can be particularly useful for researchers who are not native English speakers, or who struggle with the finer points of written English.
Writing assistance: Language models can suggest alternative phrasing, provide synonyms, and help with sentence structure and organisation. This can help researchers to communicate their ideas more clearly and effectively, which can be particularly important for complex or technical writing.
Literature review: Language models can assist with literature reviews by helping to identify key themes, concepts, and arguments across a wide range of sources. This can save researchers time and improve the rigour of their research.
Data analysis: As we discussed earlier, language models can be used to analyse large volumes of qualitative data, including interview transcripts, open-ended survey responses, and other types of qualitative data. This can help researchers to identify patterns, themes, and insights that might be difficult to identify manually.
Automated summarisation: Language models can be used to automatically generate summaries of research articles, which can be helpful for quickly identifying key findings and determining whether a particular article is relevant to a researcher’s interests.

Overall, there are many ways in which language models can be used to support research in publication, from improving the quality of written work to assisting with data analysis and literature review. However, it’s important to note that language models are not a replacement for human expertise and judgement, and that researchers should always carefully review and edit their work before submitting it for publication.

“I know I’ve made some very poor decisions recently, but I can give you my complete assurance that my work will be back to normal. I’ve still got the greatest enthusiasm and confidence in the mission. And I want to help you.”
– HAL 9000 in 2001: A Space Odyssey

You state that it’s important to note that language models are not a replacement for human expertise and judgement. This is a consistent theme throughout the answers you have provided. What are the risks of over-reliance on language models such as yourself?

Over-reliance on language models can pose several risks, especially when it comes to decision-making. Firstly, language models are trained on existing data and cannot generate new knowledge or insights on their own. This means that they may perpetuate biases and inaccuracies that are present in the data they were trained on. For example, if the data used to train a language model is biased towards a certain group or perspective, the model may produce biased outputs that reinforce those biases.

Secondly, language models are not infallible and can make mistakes, especially when they encounter input that is outside of their training data or context. This means that blindly accepting outputs generated by a language model without any human oversight or review can lead to incorrect conclusions or decisions.

Finally, language models can also have ethical implications, such as privacy concerns if they are trained on sensitive data or if their outputs are used to make decisions that affect people’s lives without their knowledge or consent.

Therefore, it’s important to use language models as a tool to support human expertise and judgement, rather than as a replacement for it. Language models can assist in tasks such as generating summaries, identifying patterns, and suggesting relevant information, but the final decision-making and interpretation should be done by humans who have the context and critical thinking skills to make informed decisions.

Remember, ChatGPT does not have insight into its own decisions. It is trying to write text it thinks you want.

Asking it for its opinions, or why it gave you a particular answer, or to analyze itself will result in nothing insightful. It may sound real, but it is not meaningful.
— Ethan Mollick (@emollick) February 6, 2023

The term “hallucination” is sometimes used to describe when a language model provides a (usually incorrect) answer for the sake of providing an answer. How often does this happen?

Language models such as myself can generate responses that may appear to be correct, but upon closer inspection may actually be incorrect or irrelevant. This can happen for various reasons, such as being trained on biased or incomplete data, lacking the contextual understanding of the question, or encountering rare or previously unseen situations.

In some cases, a language model may generate a response that appears to be coherent and relevant but is actually incorrect, which is sometimes referred to as “hallucination”. The likelihood of this happening depends on various factors such as the complexity of the task, the quality and quantity of the training data, and the specific algorithms used by the model.

It’s important to note that while language models can be incredibly useful tools, they are not perfect and can make mistakes. It’s important for users to exercise their own judgement and critically evaluate the responses generated by language models.

Can you please provide a one paragraph critique on Isaac Asimov’s “Three Laws of Robotics”?

Isaac Asimov’s Three Laws of Robotics, first introduced in his science fiction stories, have been widely discussed and debated in the field of robotics and artificial intelligence. The laws state that robots must prioritise human safety and well-being over their own actions and must obey human commands, with the added provision that the laws themselves cannot be broken. While the Three Laws provide a simple framework for robot behaviour, they have been criticised for oversimplifying the complexity of ethical decision-making and for assuming that robots can be programmed to always act in the best interest of humans. In reality, ethical considerations and decision-making often require context and nuance that may be difficult to program into machines. Therefore, while the Three Laws of Robotics have influenced popular culture and provided a starting point for discussions on ethics in AI, they should not be seen as a comprehensive or definitive guide to the ethical implications of robotics and AI.

This is the end of the interview. I am going to invite readers of the blog to post any further questions they may have, which I can then put to you. Many thanks for your answers.

You’re welcome! I’m looking forward to answering any further questions that readers may have.

A future with AI

Calculators entered schools in the 1970s with a fair amount of debate about their possible effect on learning. Once these devices entered homes, there was little doubt that they would be used for homework no matter what policies were put in place to govern them. Today, classrooms would be unthinkable without them.

I am not alone in feeling somewhat apprehensive about how AI will change the ways we go about our day. This very topic was covered on the Dare to Lead podcast: Paul Leonardi and Tsedal Neeley discussed how fear of AI drove them to write The Digital Mindset. I would really like to encourage you to listen to both parts of this fantastic podcast. Paul and Tsedal make a strong argument for the need for each of us to cultivate a digital mindset and embrace the digital world (including AI) that comes with it.

AI language models, along with AI image and voice generation, have entered our lives in much the same way as calculators did in the 1970s. And just like then, AI will permeate our workplaces, schools and homes no matter what policies are put in place to govern it. Without a doubt, in less than a decade from now the workplace, school and home will be unthinkable without it.

I am already using AI to be more efficient in my work: writing reference letters, summarising projects, brainstorming, designing surveys, writing multiple choice questions, creating simulation scenarios, etc. I could go on. I expect that most of the contents of this blog post will be redundant in a year. My suggestion is to carefully explore, test and share your success.

I have linked to some additional resources below. I would value any input you may want to add. Please post in the comments below.

Everyone has been so focused on ChatGPT that you may not have noticed how much AI image generation has improved in the last month. You can edit pictures with words in seconds. Its not perfect, but:
"A professor in a lab"
"Make it night"
"Add a space helmet"
"Add a sandwich" pic.twitter.com/ASlE37iT3H
— Ethan Mollick (@emollick) February 3, 2023