28

We have a use case for ChatGPT in summarizing long pieces of text (speech-to-text conversations which can be over an hour).

However, we find that the 4k-token limit forces us to truncate the input text, often to about half its length.

Processing in parts does not seem to retain history of previous parts.

What options do we have for submitting a longer request which is over 4k tokens?

5
  • 1
    You could tell it that you are going to send multiple pieces of text and it does not need to respond until you explicitly ask for a response. Then send your text in chunks. Commented Mar 6, 2023 at 6:31
  • 1
    This person is asking about the ChatGPT API, not the public ChatGPT web interface. The web version does some sort of smart summarization/compression of any previous conversation once it gets too long, as it must still have the same limits under the hood. For the API you're on your own. Each request to the API is totally separate and retains no "memory" of the previous API calls. I too am wondering what the best solution here is.
    – jameslol
    Commented Mar 8, 2023 at 4:38
  • OP perhaps you could attempt to implement whatever it is that the web ChatGPT is doing - split all your text into chunks of about 3000 words, get a summary of each in separate API calls, and then send all the summaries in another API call, to get a "summary of summaries"
    – jameslol
    Commented Mar 8, 2023 at 4:43
  • @jameslol It is not always possible to correctly break the text into pieces. Commented Mar 13, 2023 at 7:54
  • @BarishNamazov Could you provide an example of a prompt on how to send a long text to ChatGPT? Commented Mar 13, 2023 at 7:58
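The chunked "summary of summaries" approach suggested in the comments can be sketched as follows. This is a minimal sketch, not a tested recipe: `summarize` is a hypothetical stand-in for one chat-completion API call, and the chunking uses a rough whitespace word count rather than a real tokenizer such as tiktoken.

```python
def chunk_words(text, max_words=3000):
    """Split text into chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words])
            for i in range(0, len(words), max_words)]

def summarize_long(text, summarize, max_words=3000):
    """Summary of summaries: summarize each chunk, then summarize the results."""
    chunks = chunk_words(text, max_words)
    partials = [summarize(c) for c in chunks]   # one API call per chunk
    return summarize("\n".join(partials))       # final summary-of-summaries call
```

In practice `summarize` would wrap a chat-completion request with a "Summarize this text" instruction; each chunk must leave enough headroom under the 4k-token limit for the instruction and the response.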

5 Answers

8

The closest answer to your question would be in the form of Embeddings.

You can find an overview of what they are here.

I recommend reviewing this code from the OpenAI Cookbook GitHub page, which uses a Web Crawl Q&A example to explain embeddings.

I used the code from Step 5 onwards and altered the file location to point it to my file containing the long piece of text.

From:

# Open the file and read the text
with open("text/" + domain + "/" + file, "r", encoding="UTF-8") as f:
    text = f.read()

to:

# Open the file and read the text
with open("/my_location/long_text_file.txt", "r", encoding="UTF-8") as f:
    text = f.read()

I then modified the questions at Step 13 to ask what I needed to know about the text.
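The core idea behind the Cookbook example is to embed each chunk of text once, embed the question, and send only the most similar chunks to the chat model. A minimal sketch of that ranking step, using toy 3-dimensional vectors in place of real `text-embedding-ada-002` outputs (which are 1536-dimensional):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_chunks(question_vec, chunk_vecs, k=2):
    """Return indices of the k chunks most similar to the question vector."""
    ranked = sorted(range(len(chunk_vecs)),
                    key=lambda i: cosine_similarity(question_vec, chunk_vecs[i]),
                    reverse=True)
    return ranked[:k]
```

The selected chunks are then concatenated into the prompt as context, which is how a question about an hour-long transcript can fit inside the 4k-token window.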

1
  • The link to "this code" is broken.
    – Ugur
    Commented Feb 27 at 9:38
4

Another option is the ChatGPT retrieval plugin. It lets you build a vector database from your document's text, which the LLM can then query. See https://github.com/openai/chatgpt-retrieval-plugin

0

One approach to handling long text is to divide it into smaller fragments, retrieve the pieces relevant to your task, and send only those in an API call.

Here's a project that is capable of processing PDFs, txt and doc files, as well as web pages. It allows you to converse with the document. In your case, you could ask a general question like "what is the document about" to receive a summary, and then inquire for more specific details.

0

I work with long inputs, so I built a tool for myself; it serves me well. You can find it on my GitHub: https://github.com/LearnFL/proj-python-chat-gpt-interface

The page explains how to use the script.

You specify how the prompt should be split by giving the desired input length in tokens. The task variable holds the instruction telling ChatGPT what you want done; it is prepended to each batch of text so the model can extract or process what you need.

prompt = """A VERY LONG TEXT ON HOW TO USE REGULAR EXPRESSIONS..."""
res = OpenAIAPI.generate(
    prompt, task="Explain how to use re", get='batches', method="chat",
    model="gpt-3.5-turbo-1106", token_size=4000)
print(res)
-2

You can use GPT-4, which supports longer contexts.

As stated by OpenAI:

GPT-4 is capable of handling over 25,000 words of text, allowing for use cases like long form content creation, extended conversations, and document search and analysis.
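Switching models is then just a change to the request body. Below is a sketch of the JSON payload such a call would send to the chat completions endpoint; the exact model name to use (e.g. the larger-context `gpt-4-32k` variant) depends on what your account has access to, and the transcript string is a placeholder.

```python
import json

transcript = "..."  # your long speech-to-text transcript goes here

payload = {
    "model": "gpt-4",  # or a larger-context variant where available
    "messages": [
        {"role": "system", "content": "Summarize the following conversation."},
        {"role": "user", "content": transcript},
    ],
}

# This is the body you would POST to https://api.openai.com/v1/chat/completions
body = json.dumps(payload)
```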

1
  • 1
    Please add a link that helps solve the problem. Thank you!
    – myhd
    Commented Aug 10, 2023 at 9:27
