Introduction
This is the final project for EECS 498: Conversational Artificial Intelligence, taught by Professor Jason Mars. Our team developed a conversational lecture assistant that can answer lecture-related questions, summarize lecture content, and search for lecture subjects.
Architecture
We used the GPT-3 model provided by OpenAI for several NLP tasks, Google Speech Service for lecture transcription, and the SentenceTransformer model for lecture segmentation. Below is a detailed structure of our system:
Implementation
Thanks to GPT-3’s zero-shot and few-shot learning ability, our system can complete multiple NLP tasks through OpenAI’s API. Below are our implementations.
Completion API
The Completion API is used throughout our system. We wrote a different “training” prompt for each task and let the GPT-3 model complete it to obtain the corresponding information. Below are the tasks it completes.
Intent Classification
Our system classifies a user’s query into three categories: Question Answering (0), Summarization (1), and Search (2). To do so, we provide a few labelled queries to the model so that it can classify incoming questions. Our prompt for the API looks like the following:
Question: what is the capital of Germany?
Label: 0
Question: can you summarize lecture 2?
Label: 1
Question: which lecture was about transformer?
Label: 2
…
Question: {user question}
Label:
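The few-shot prompt above can be assembled and its completion parsed roughly as follows. This is a minimal sketch, not the project's actual code: the names `build_intent_prompt`, `parse_intent`, and `classify_intent` are hypothetical, and `complete` stands in for whatever function sends a prompt to the Completion API and returns the generated text.

```python
from typing import Callable, Sequence, Tuple

# Label numbering taken from the text: 0 = QA, 1 = summarization, 2 = search.
INTENTS = {0: "question_answering", 1: "summarization", 2: "search"}

# The three labelled queries shown in the prompt above.
EXAMPLES = [
    ("what is the capital of Germany?", 0),
    ("can you summarize lecture 2?", 1),
    ("which lecture was about transformer?", 2),
]


def build_intent_prompt(
    user_question: str, examples: Sequence[Tuple[str, int]] = EXAMPLES
) -> str:
    """Lay out the labelled examples, then the new question with an open label."""
    lines = []
    for question, label in examples:
        lines.append(f"Question: {question}")
        lines.append(f"Label: {label}")
    lines.append(f"Question: {user_question}")
    lines.append("Label:")
    return "\n".join(lines)


def parse_intent(completion_text: str) -> int:
    """Read the first token of the completion as an intent label."""
    tokens = completion_text.strip().split()
    if not tokens:
        raise ValueError("empty completion")
    label = int(tokens[0])  # raises ValueError if the model did not return a number
    if label not in INTENTS:
        raise ValueError(f"unknown intent label: {label}")
    return label


def classify_intent(user_question: str, complete: Callable[[str], str]) -> int:
    """`complete` is a hypothetical wrapper around the completion call."""
    return parse_intent(complete(build_intent_prompt(user_question)))
```

Passing the completion call in as a function keeps the prompt logic testable without network access; rejecting unknown labels guards against the model returning something outside the three categories.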
Information Extraction
After categorizing the query, we need to extract certain information from it, such as the lecture number and the search subject. For lecture extraction, the prompt looks like:
Question: what was lecture 10 about?
Lecture: 10
Question: what is the capital of Germany in lecture 2.
Lecture: 2
…
Question: {user question}
Lecture:
Question extraction:
Input: what is the capital of Germany in lecture 2?
Question: what is the capital of Germany?
Input: in lecture 3, what is the definition of derivatives?
Question: what is the definition of derivatives?
…
Input: {user question}
Question:
Subject extraction:
Question: which lecture talks about Berlin?
Query: Berlin
Question: which lecture mentioned the World Cup?
Query: the World Cup
…
Question: {user question}
Query:
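The three extraction prompts above share one shape and differ only in their field names and examples, so they can be produced by a single helper. The sketch below assumes this; `build_extraction_prompt` and `parse_lecture_number` are hypothetical names, and the example lists contain only the pairs shown above.

```python
import re
from typing import Optional, Sequence, Tuple


def build_extraction_prompt(
    input_key: str,
    output_key: str,
    examples: Sequence[Tuple[str, str]],
    user_question: str,
) -> str:
    """Build a few-shot prompt of `input_key: ...` / `output_key: ...` pairs."""
    lines = []
    for source, target in examples:
        lines.append(f"{input_key}: {source}")
        lines.append(f"{output_key}: {target}")
    lines.append(f"{input_key}: {user_question}")
    lines.append(f"{output_key}:")
    return "\n".join(lines)


# Example pairs copied from the prompts in the text.
LECTURE_EXAMPLES = [
    ("what was lecture 10 about?", "10"),
    ("what is the capital of Germany in lecture 2.", "2"),
]
QUESTION_EXAMPLES = [
    ("what is the capital of Germany in lecture 2?", "what is the capital of Germany?"),
    ("in lecture 3, what is the definition of derivatives?", "what is the definition of derivatives?"),
]
SUBJECT_EXAMPLES = [
    ("which lecture talks about Berlin?", "Berlin"),
    ("which lecture mentioned the World Cup?", "the World Cup"),
]


def parse_lecture_number(completion_text: str) -> Optional[int]:
    """Return the first integer in the completion, or None if there is none."""
    match = re.search(r"\d+", completion_text)
    return int(match.group()) if match else None
```

For instance, `build_extraction_prompt("Question", "Query", SUBJECT_EXAMPLES, user_question)` yields the subject-extraction prompt, and `build_extraction_prompt("Input", "Question", QUESTION_EXAMPLES, user_question)` yields the question-extraction prompt.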
Text Summarization
For summarization, we exploit the zero-shot capability of GPT-3 by providing it with the prompt:
Here's what the speaker said:
{text to be summarized}
The main point was that
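A minimal sketch of how this zero-shot prompt might be filled in is shown below. The function names are hypothetical, and `complete` is again an assumed stand-in for the call to the Completion API; the model's continuation of "The main point was that" serves as the summary.

```python
from typing import Callable


def build_summary_prompt(text: str) -> str:
    """Wrap the transcript text in the zero-shot summarization template."""
    return (
        "Here's what the speaker said:\n"
        f"{text.strip()}\n"
        "The main point was that"
    )


def summarize(text: str, complete: Callable[[str], str]) -> str:
    """Return the model's continuation, with surrounding whitespace removed."""
    return complete(build_summary_prompt(text)).strip()
```

Ending the prompt mid-sentence steers the model toward a single summarizing clause without needing any labelled examples.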
Answers API
After we obtain the basic form of the question and the lecture number, we use the Answers API from OpenAI to conduct context-based question answering. If the user does not provide a lecture number, the system instead conducts open-domain question answering using the zero-shot ability of GPT-3.
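The choice between the two answering modes can be sketched as a small routing function. Everything named here is an assumption made for illustration: `transcripts` is a hypothetical mapping from lecture number to transcript text, and `answer_with_context` and `answer_open_domain` are placeholders for the calls to the Answers API and to zero-shot completion. Falling back to open-domain answering when a lecture has no transcript is also an assumption, not something the project states.

```python
from typing import Callable, Mapping, Optional


def answer_question(
    question: str,
    lecture_number: Optional[int],
    transcripts: Mapping[int, str],
    answer_with_context: Callable[[str, str], str],
    answer_open_domain: Callable[[str], str],
) -> str:
    """Use the lecture transcript as context when one is available."""
    context = None
    if lecture_number is not None:
        context = transcripts.get(lecture_number)
    if context is None:
        # No lecture number was extracted, or (assumed) no transcript exists for it.
        return answer_open_domain(question)
    return answer_with_context(question, context)
```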
Search API
When the user wants to search for a specific subject across all the lectures, we use the Search API. The system transcribes user-uploaded lecture recordings, segments the text using SentenceTransformer, and stores the segments in the API’s database. The system then uses the extracted subject entity as the search query for the API.
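The text does not describe how the segmentation works, so the following is only one plausible approach, not the project's method: start a new segment wherever the cosine similarity between the embeddings of adjacent sentences drops below a threshold. The `embed` parameter is a hypothetical function that maps a list of sentences to a list of vectors (it could wrap the `encode` method of a SentenceTransformer model), and the default threshold of 0.5 is an arbitrary illustrative value.

```python
import math
from typing import Callable, List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def segment(
    sentences: Sequence[str],
    embed: Callable[[Sequence[str]], Sequence[Sequence[float]]],
    threshold: float = 0.5,  # assumed value, would need tuning on real transcripts
) -> List[str]:
    """Group consecutive sentences, splitting where adjacent similarity is low."""
    if not sentences:
        return []
    vectors = embed(sentences)
    segments = [[sentences[0]]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            segments.append([sentences[i]])  # topic shift: open a new segment
        else:
            segments[-1].append(sentences[i])
    return [" ".join(group) for group in segments]
```

Each returned string would then be one document to store for search, so that a query such as "Berlin" can be matched against a short, topically coherent passage rather than a whole transcript.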