How to Chat with your Data using Mantium

In this tutorial, we'll guide you through connecting your data in Mantium to ChatGPT through OpenAI's plugin system. Once connected, you can interact with your data directly within the ChatGPT interface using the Mantium Plugin. Let's walk through the steps.

Introduction

OpenAI plugins give Large Language Models access to up-to-date information, which extends what the models can do. Plugins connect ChatGPT to external services, so you can tailor it to your own use case. With the Mantium Plugin Wizard, you can build custom plugins that use Mantium's data pipeline to bring your own data into ChatGPT.

This document provides a step-by-step guide. We will focus on creating a dataset in Mantium, setting up plugins, and then chatting with your data.

Objective

Our goal is to use the PDF Data Connector in Mantium's platform to import PDFs of selected arXiv papers, extract the text, set up plugins, and then query the documents to answer questions, generate content, and produce summaries, all within ChatGPT.

Video

We understand that sometimes it's easier to learn by watching rather than reading. If you prefer a more visual explanation, feel free to check out our accompanying video tutorial below. If you prefer reading or are unable to watch the video, please continue with the text documentation.

Prerequisites

Take a few moments to set up the API key for OpenAI.
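Before entering the key in Mantium, it can help to confirm that you have a usable key at hand. The sketch below is a minimal local check, assuming the key is stored in the OPENAI_API_KEY environment variable (the name the OpenAI Python library reads by default). The helper names are hypothetical and not part of Mantium or OpenAI.

```python
import os


def load_openai_key(env=None):
    """Return the OpenAI API key from the environment, or raise if it is missing.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set")
    return key


def mask(key, visible=4):
    """Hide all but the last `visible` characters so the key can be logged safely."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Calling mask(load_openai_key()) prints a redacted form of the key, which is enough to confirm that the right key is loaded without exposing it.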

Download Dataset

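To follow along with this tutorial, download the papers from https://arxiv.org/. The example prompts later in this tutorial refer to the Survey of Large Language Models and to A Cook's Guide to Successful SSL Training and Deployment.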
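If you would rather script the download than save each PDF by hand, a minimal sketch follows. It assumes arXiv's public URL pattern https://arxiv.org/pdf/<identifier>. The function names, the papers folder, and the User-Agent string are illustrative choices, not part of Mantium.

```python
import shutil
import urllib.request
from pathlib import Path

ARXIV_PDF_BASE = "https://arxiv.org/pdf/"


def arxiv_pdf_url(arxiv_id):
    """Build the PDF URL for an arXiv identifier such as '2303.18223'."""
    arxiv_id = arxiv_id.strip()
    if not arxiv_id:
        raise ValueError("empty arXiv identifier")
    return ARXIV_PDF_BASE + arxiv_id


def download_paper(arxiv_id, dest_dir="papers"):
    """Download one paper into dest_dir and return the local path.

    This performs a network request; the User-Agent value is an arbitrary label.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Old-style identifiers contain a slash, which is not valid in a file name.
    target = dest / (arxiv_id.strip().replace("/", "_") + ".pdf")
    request = urllib.request.Request(
        arxiv_pdf_url(arxiv_id), headers={"User-Agent": "mantium-tutorial-example"}
    )
    with urllib.request.urlopen(request) as response, target.open("wb") as out:
        shutil.copyfileobj(response, out)
    return target
```

For example, download_paper("2303.18223") should fetch what we believe is the Survey of Large Language Models; confirm each identifier on arxiv.org before relying on it.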

Import Data from arXiv Using the PDF Data Connector

  1. Navigate to the Data Sources section by clicking Data Source on the left navigation bar.
  2. Click Add Data Source, and select the PDF Data Connector from the Data Sources list.
  3. Provide the information to label the Data Source, and click Save and Test.
  4. On the next page, the sync job will start automatically. If it doesn't, click “Manual Sync” at the top right corner to perform the initial sync.
  5. Wait a few moments for the sync to complete, then navigate to Files to upload your papers in .pdf format.
  6. Click the Finish and Sync button to complete the upload process.
  7. At this point, you have successfully imported PDFs of arXiv papers using the PDF Data Connector.
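Because the connector expects files in .pdf format, it can save a failed sync to check the folder before uploading. The sketch below is a strict local check, assuming a valid file has a .pdf extension and begins with the standard %PDF- header. The function names are hypothetical, and the check says nothing about how Mantium itself validates uploads.

```python
from pathlib import Path

PDF_MAGIC = b"%PDF-"


def is_pdf(path):
    """True if the file has a .pdf extension and starts with the PDF header bytes."""
    path = Path(path)
    if path.suffix.lower() != ".pdf" or not path.is_file():
        return False
    with path.open("rb") as fh:
        return fh.read(len(PDF_MAGIC)) == PDF_MAGIC


def split_uploadable(folder):
    """Return (ok, rejected) lists of the files in `folder`, each sorted by name."""
    ok, rejected = [], []
    for p in sorted(Path(folder).iterdir()):
        if p.is_file():
            (ok if is_pdf(p) else rejected).append(p)
    return ok, rejected
```

Running split_uploadable("papers") before step 5 lists any file that was saved as an HTML error page or with the wrong extension.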

Create New Dataset

Datasets serve as the central workspace where you can apply transformations and enrichments to data retrieved from various sources, enabling you to modify and analyze the data without impacting the original information.

To create a new dataset:

  1. After the sync is completed, click the Create Custom Datasets button in the Data Source section.
  2. Alternatively, you can create datasets by navigating to the Datasets section on the left pane.
  3. Provide a Dataset name, and select where the data comes from (PDF Data Connector).
  4. Click on Save to save your configuration, and wait for the job to complete.

See an example of the arXiv dataset below. Notice the column containing the text extracted from the PDF files (the Convert PDF to Text transform ran automatically).

Create your App in Mantium

🚧

Quick Warning

  • If you select the Standard option and have previously created a split_content column, be sure to pick that same split_content column in subsequent steps rather than the original text column. This prevents unnecessary expansion of your dataset and keeps your OpenAI usage costs in check.
  • Select the Advanced option if you already have embeddings.
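To see why the column choice matters, the sketch below illustrates our reading of the warning. It assumes that splitting produces one row per chunk and that each row still carries the full original text. The word-based splitter, the chunk sizes, and the row layout are illustrative assumptions and do not describe Mantium's actual transform.

```python
def split_words(text, max_words=200, overlap=20):
    """Split text into chunks of at most max_words words, overlapping by `overlap`."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last chunk already reaches the end of the text
    return chunks


def rows_after_split(rows, column, **kwargs):
    """Count the rows produced by splitting `column` in every existing row."""
    return sum(len(split_words(row[column], **kwargs)) for row in rows)


# A 10-word document split into chunks of 4 words with 1 word of overlap.
doc = "one two three four five six seven eight nine ten"
rows = [
    {"text": doc, "split_content": chunk}
    for chunk in split_words(doc, max_words=4, overlap=1)
]
```

Here rows holds 3 chunks. Splitting the split_content column again leaves 3 rows, while splitting the text column yields 9, and each extra row would mean another embedding request.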

Instructions

Please follow the link below to find instructions on how to create your Mantium apps.

📘

Setup your Mantium Apps

Interact with your App in ChatGPT

There are two ways to interact with your app in ChatGPT:

  1. Use Mantium's ChatGPT Plugin to interact with your app (recommended).
  2. Set up your own OpenAI ChatGPT Plugin, if you have developer access, meaning you are able to create plugins in ChatGPT.

Use Mantium's ChatGPT Plugin

Please follow the link below to find instructions on how to set up the official Mantium plugin.

📘

How to use Mantium's ChatGPT Plugin to Access Deployed Apps

Set up your Own OpenAI Plugin

Please follow the link below to find instructions on how to set up your own plugin.

📘

How to Create your own OpenAI Plugin with Mantium
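For orientation, a ChatGPT plugin is described by a manifest file (ai-plugin.json) that points to an OpenAPI specification. The sketch below builds such a manifest with the field names from OpenAI's plugin documentation. Every URL, the email address, and the helper name are placeholders, and the linked guide may generate this file for you, so treat it as an illustration of the shape rather than the required procedure.

```python
import json
import re


def build_manifest(name, description, openapi_url, logo_url, contact_email, legal_info_url):
    """Return a plugin manifest as a dict, using OpenAI's ai-plugin.json field names."""
    return {
        "schema_version": "v1",
        "name_for_human": name,
        # The model-facing name allows only letters and numbers, 50 characters at most.
        "name_for_model": re.sub(r"[^A-Za-z0-9]", "", name)[:50],
        "description_for_human": description,
        "description_for_model": description,
        "auth": {"type": "none"},
        "api": {"type": "openapi", "url": openapi_url},
        "logo_url": logo_url,
        "contact_email": contact_email,
        "legal_info_url": legal_info_url,
    }


# Placeholder values only; replace them with the details of your own deployment.
manifest = build_manifest(
    name="PDF Plugin",
    description="Answer questions about imported arXiv papers.",
    openapi_url="https://example.com/openapi.yaml",
    logo_url="https://example.com/logo.png",
    contact_email="support@example.com",
    legal_info_url="https://example.com/legal",
)
manifest_json = json.dumps(manifest, indent=2)
```

The auth type "none" is the simplest option in the manifest format; a plugin that exposes private data would normally use one of the authenticated types instead.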

Chat with your PDFs

Now, let's interact with the app. Feel free to copy the prompt examples below.

Prompt 1 - Ask questions

Using the PDFPlugin plugin, what is the role of data augmentation in A
Cook’s Guide to Successful SSL Training and Deployment?

Result in ChatGPT

Prompt 2 - Generate summaries

Using the PDFPlugin plugin, summarize the content on the Publicly Available Model
Checkpoints or APIs in the Survey of Large Language Models. 

Result in ChatGPT

Prompt Example 2

Prompt 3 - Generate YouTube scripts using your own document as a reference

Using the PDFPlugin plugin, summarize the content on the Publicly Available Model Checkpoints or APIs
in the Survey of Large Language Models in a successful YouTuber's script style.
Highlight the important technical information for the technical audience.

Result in ChatGPT

Notice the highlighted text in the image; it confirms that the script was generated using the document "Survey of Large Language Models" as a reference.

Prompt example 3
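The prompts in this section share one pattern: name the plugin first, then state the task. If you reuse that pattern often, a small helper keeps the wording consistent. The sketch below is a convenience of our own and is not part of Mantium or ChatGPT; the plugin name PDFPlugin is the one used in the prompts here.

```python
def build_prompt(plugin_name, task):
    """Prefix a task with the instruction to use a specific plugin."""
    plugin_name = plugin_name.strip()
    task = task.strip()
    if not plugin_name or not task:
        raise ValueError("plugin_name and task must both be non-empty")
    return f"Using the {plugin_name} plugin, {task}"


summary_prompt = build_prompt(
    "PDFPlugin",
    "summarize the content on the Publicly Available Model Checkpoints or APIs "
    "in the Survey of Large Language Models.",
)
```

Paste the resulting string into ChatGPT as you would any of the prompts in this section.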

Conclusion

In this guide, we used Large Language Models (LLMs) to access and query information from academic papers. By combining OpenAI plugins with the Mantium Plugin Wizard, we built a workflow that imports PDFs, extracts their text, and lets you interact with the data in ChatGPT. The result is the ability to answer questions, generate summaries, and create content directly from the papers.