How to Chat with your Data using Mantium
In this tutorial, we'll guide you through the process of connecting your data in Mantium to OpenAI's ChatGPT Plugin. By doing so, you'll be able to interact with your data directly within the ChatGPT interface using the Mantium Plugin. Let's explore the steps to achieve this integration.
Introduction
OpenAI plugins give ChatGPT access to up-to-date information, which extends what the underlying Large Language Model can do on its own. With the Mantium Plugin Wizard, you can build custom plugins for your own use case, using Mantium's data pipeline to bring your own data into ChatGPT.
This document provides a step-by-step guide to doing so. We will focus on creating a dataset in Mantium, setting up plugins, and then chatting with your data.
Objective
Our goal is to use the PDF Data Connector in Mantium's platform to import PDFs of selected ArXiv papers, extract the text, set up plugins, and then query the documents to answer questions, generate content, and produce summaries, all within ChatGPT.
Video
We understand that sometimes it's easier to learn by watching rather than reading. If you prefer a more visual explanation, feel free to check out our accompanying video tutorial below. If you prefer reading or are unable to watch the video, please continue with the text documentation.
Prerequisites
Take a few moments to set up the API key for OpenAI.
Download Dataset
To follow along with this tutorial, download the following papers from https://arxiv.org/.
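If you would rather fetch the PDFs with a script than through the browser, the sketch below shows one way to do it. It assumes the usual arXiv PDF address pattern (https://arxiv.org/pdf/ followed by the paper identifier). The function names and the papers folder are hypothetical, and you supply the identifiers of the papers you want yourself.

```python
from pathlib import Path
from urllib.request import urlopen


def arxiv_pdf_url(arxiv_id: str) -> str:
    """Build the PDF URL for an arXiv identifier (assumed URL pattern)."""
    return f"https://arxiv.org/pdf/{arxiv_id.strip()}"


def download_papers(arxiv_ids, dest_dir="papers"):
    """Download each paper as <id>.pdf into dest_dir and return the saved paths."""
    out_dir = Path(dest_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for arxiv_id in arxiv_ids:
        # Old-style identifiers contain a slash, which is not valid in a file name.
        file_name = arxiv_id.strip().replace("/", "_") + ".pdf"
        target = out_dir / file_name
        with urlopen(arxiv_pdf_url(arxiv_id), timeout=60) as response:
            target.write_bytes(response.read())
        saved.append(target)
    return saved
```

Calling download_papers with a list of identifiers saves one .pdf file per paper, ready to upload in the Files step of the PDF Data Connector.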
Import Data from ArXiv Using the PDF Data Connector
- Navigate to the Data Sources section by clicking Data Source on the left navigation bar.
- Click Add Data Source, and select the PDF Data Connector from the Data Sources list.
- Provide the information to label the Data Source, and click Save and Test.
- On the next page, the sync job will start automatically. If it doesn't, click “Manual Sync” at the top right corner to perform the initial sync.
- Wait a few moments for the sync to be completed, and navigate to Files to upload your papers in .pdf format.
- Click on the Finish and Sync button to complete the upload process.
- At this point, we have successfully imported PDFs of ArXiv papers using the PDF Data Connector.
Create New Dataset
Datasets serve as the central workspace where you can apply transformations and enrichments to data retrieved from various sources, enabling you to modify and analyze the data without impacting the original information.
To create a new dataset:
- After the sync is completed, click on the Create Custom Datasets button in the Data Source section.
- Alternatively, you can create datasets by navigating to the Datasets section on the left pane.
- Provide a Dataset name, and select where the data comes from (PDF Data Connector).
- Click on Save to save your configuration, and wait for the job to complete.
See an example of the ArXiv dataset below. Notice the text column, which holds the text extracted from the PDF files (the Convert PDF to Text transform ran automatically).
Create your App in Mantium
Quick Warning
- If you select the Standard option and have previously created a split_content column, be sure to pick that same split_content column in subsequent steps rather than the original text column. This prevents unnecessary expansion of your dataset and keeps your OpenAI usage costs in check.
- Select the Advanced option if you already have Embeddings.
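To see why picking the wrong column can inflate a dataset, here is a small, generic illustration of text splitting. It is not Mantium's implementation. The chunk size, the overlap, and the assumption that every split row keeps a copy of the original text column are all hypothetical, chosen only to make the effect visible.

```python
def split_text(text, chunk_size=200, overlap=50):
    """Split text into character windows of chunk_size that overlap by `overlap`."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reaches the end of the text
    return chunks


def split_rows(rows, source_column, chunk_size=200, overlap=50):
    """Return one row per chunk of row[source_column], keeping the other columns."""
    out = []
    for row in rows:
        for chunk in split_text(row[source_column], chunk_size, overlap):
            out.append({**row, "split_content": chunk})
    return out


# Hypothetical dataset: one document of 1000 characters.
rows = [{"text": "x" * 1000}]
first_pass = split_rows(rows, "text")                      # 7 rows
again_on_text = split_rows(first_pass, "text")             # 49 rows
again_on_split = split_rows(first_pass, "split_content")   # still 7 rows

print(len(first_pass), len(again_on_text), len(again_on_split))
```

Under these assumptions, splitting the original text column a second time multiplies the row count, while reusing split_content leaves it unchanged. Every extra row is extra text that would be sent for embedding.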
Instructions
Please follow the link below to find instructions on how to create your Mantium apps.
Interact with your App in ChatGPT
There are two ways to interact with your app in ChatGPT:
- Use Mantium's ChatGPT Plugin to interact with your app (recommended).
- Set up your own OpenAI ChatGPT Plugin. This requires developer access, meaning you are able to create plugins in ChatGPT.
Use Mantium's ChatGPT Plugin
Please follow the link below to find instructions on how to set up the official Mantium plugin.
Set Up Your Own OpenAI Plugin
Please follow the link below to find instructions on how to set up your own plugin.
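For orientation, a ChatGPT plugin is described by a small JSON manifest (served as ai-plugin.json) that points ChatGPT at an OpenAPI description of your API. The sketch below builds such a manifest in Python. The field names follow OpenAI's plugin manifest format as we understand it. The helper name, the example.com URLs, the descriptions, and the email address are hypothetical placeholders, not values produced by Mantium.

```python
import json


def build_plugin_manifest(base_url, name_for_model, name_for_human,
                          description, contact_email):
    """Return a dict shaped like a ChatGPT plugin manifest (illustrative values)."""
    base = base_url.rstrip("/")
    return {
        "schema_version": "v1",
        "name_for_human": name_for_human,
        "name_for_model": name_for_model,
        "description_for_human": description,
        "description_for_model": description,
        "auth": {"type": "none"},  # assumption: an unauthenticated demo plugin
        "api": {"type": "openapi", "url": f"{base}/openapi.yaml"},
        "logo_url": f"{base}/logo.png",
        "contact_email": contact_email,
        "legal_info_url": f"{base}/legal",
    }


manifest = build_plugin_manifest(
    base_url="https://example.com/",          # hypothetical host
    name_for_model="PDFPlugin",               # the name used in the prompts below
    name_for_human="PDF Plugin",
    description="Answers questions about a collection of PDF papers.",
    contact_email="support@example.com",      # hypothetical address
)
print(json.dumps(manifest, indent=2))
```

The name_for_model value is the name you refer to in prompts, which is why the examples in the next section begin with "Using the PDFPlugin plugin".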
Chat with your PDFs
Now, let's interact with the app. Feel free to copy the prompt examples below.
Prompt 1 - Ask questions
Using the PDFPlugin plugin, what is the role of data augmentation in A
Cook’s Guide to Successful SSL Training and Deployment?
Result in ChatGPT
Prompt 2 - Generate summaries
Using the PDFPlugin plugin, summarize the content on the Publicly Available Model
Checkpoints or APIs in the Survey of Large Language Models.
Result in ChatGPT
Prompt 3 - Generate YouTube scripts using your own document as a reference.
Using the PDFPlugin plugin, summarize the content on the Publicly Available Model Checkpoints or APIs
in the Survey of Large Language Models in a successful YouTuber's script style.
Highlight the important technical information for the technical audience.
Result in ChatGPT
Notice the highlighted text in the image; it confirms that the script was generated using the document "Survey of Large Language Models" as a reference.
Conclusion
In this guide, we tackled the challenge of accessing and querying information from academic papers using Large Language Models (LLMs). By combining OpenAI plugins with the Mantium Plugin Wizard, we built a workflow to import PDFs, extract their text, and interact with the data in ChatGPT. As a result, we can answer questions, generate summaries, and create new content directly from the papers.