When working with the Answers, Search, and Classifications endpoints, you use Files to upload documents that can be shared across these specialized endpoints. Any file uploaded to OpenAI must be in the JSON Lines (JSONL) format.

Each line in a JSONL file is a “document” and each document needs to be a single line of valid, UTF-8 encoded JSON. You can use a JSON validator to ensure that a document is in the proper JSON format. An example of a file with two documents is shown below.

{"text": "Our API provides a general-purpose \"text in, text out\" interface, which makes it possible to apply it to virtually any language task."}
{"text": "A good rule of thumb for using the API is thinking about how you would write out a word problem for a middle schooler to solve."}
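As a minimal sketch of producing and checking such a file, the Python below writes one JSON object per line and then confirms that every line parses. The helper names write_jsonl and validate_jsonl are illustrative and not part of any OpenAI library; only the standard json module is used.

```python
import json


def write_jsonl(path, texts):
    """Write one JSON object per line, each with a "text" field."""
    with open(path, "w", encoding="utf-8") as f:
        for text in texts:
            # json.dumps escapes embedded quotes and newlines, so each
            # document stays on a single line of valid JSON.
            f.write(json.dumps({"text": text}, ensure_ascii=False) + "\n")


def validate_jsonl(path):
    """Return a list of (line_number, error_message) for invalid lines."""
    problems = []
    with open(path, "r", encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            if not line.strip():
                problems.append((number, "empty line"))
                continue
            try:
                json.loads(line)
            except json.JSONDecodeError as exc:
                problems.append((number, exc.msg))
    return problems
```

Calling validate_jsonl on a file written by write_jsonl should return an empty list. A line that uses single quotes instead of double quotes is reported as a problem, because single-quoted strings are not valid JSON.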

In the Answers and Search endpoints, an alternative to a JSONL file is the “documents” parameter: a list of up to 200 strings to search over. The maximum length of each document, in tokens, is 2,034 minus the number of tokens in the query.
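These two limits can be checked before sending a request. The sketch below is only an illustration: check_documents and rough_token_count are hypothetical helpers, and splitting on whitespace is a crude stand-in for the API's real tokenizer, which typically yields more tokens than there are words. Pass your own count_tokens function if you have an accurate one.

```python
MAX_DOCUMENTS = 200        # limit on the documents list, from the text above
MAX_TOTAL_TOKENS = 2034    # per-document limit before subtracting the query


def rough_token_count(text):
    # Assumption: whitespace splitting only approximates the real
    # tokenizer and will usually undercount.
    return len(text.split())


def check_documents(documents, query, count_tokens=rough_token_count):
    """Return a list of problems; an empty list means within limits."""
    problems = []
    if len(documents) > MAX_DOCUMENTS:
        problems.append(
            f"{len(documents)} documents exceeds the limit of {MAX_DOCUMENTS}"
        )
    # Each document may use whatever the query leaves of the token limit.
    budget = MAX_TOTAL_TOKENS - count_tokens(query)
    for index, doc in enumerate(documents):
        n = count_tokens(doc)
        if n > budget:
            problems.append(
                f"document {index} has {n} tokens; budget is {budget}"
            )
    return problems
```

Because the word count undercounts, treat a clean result as a first pass and leave some headroom on long documents.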

For example, given the documents value below, a Search query for “the president” would return “White House” as the most semantically related document.

["White House", "Hospital", "School"]

Which one should you use?

The documents parameter is ideal for testing and small-scale projects, such as querying a very small FAQ with the Answers endpoint or running Search over a very small number of documents. This is mainly because the documents parameter is limited to 200 documents.

On the other hand, the file parameter is best suited for larger projects, such as using the Answers endpoint to answer questions from a large knowledge base. The file parameter is also better suited for “Going Live,” as users may find a file easier to maintain and collaborate on than a documents list.
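One way to encode this guidance is a small helper that decides which request field to send. This is a sketch under assumptions: choose_input is a hypothetical function, and file_id stands in for whatever identifier your file upload returns.

```python
MAX_INLINE_DOCUMENTS = 200  # limit on the documents parameter


def choose_input(texts, file_id=None):
    """Return the request field to send: a file reference or inline documents."""
    if file_id is not None:
        # An uploaded file takes precedence, e.g. for larger or live projects.
        return {"file": file_id}
    if len(texts) <= MAX_INLINE_DOCUMENTS:
        return {"documents": list(texts)}
    raise ValueError(
        "more than 200 documents: upload a JSONL file and pass it instead"
    )
```

The returned dictionary can be merged into the rest of the request parameters, so the calling code does not need to know which of the two inputs was chosen.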
