Data analysis with ChatGPT

Features and capabilities used when working with data in ChatGPT

Updated over a week ago

What can you do with data in ChatGPT?

When analyzing data with ChatGPT, you can create static and interactive tables and charts from your uploaded data.

  • ChatGPT will automatically create an interactive table view, allowing you to scroll through your data and view all of your rows and columns.

  • After uploading a file, ChatGPT can determine the ideal chart type for the dataset, or you can specify one of our supported chart types in your prompt.

  • You can customize the graphics of your interactive charts and create summaries explaining your findings.

Learn more about extracting insights with ChatGPT Data Analysis.

What file types are supported?

ChatGPT can analyze data uploaded in a variety of file formats, including:

  • Excel (.xls / .xlsx)

  • Comma-separated values (.csv)

  • PDF (.pdf)

  • JSON (.json)

When preparing spreadsheets for analysis in ChatGPT, follow these guidelines for best results:

Do:

  • Include descriptive column headers in the first row

  • Use plain language for column headers, avoiding acronyms and jargon

  • Use one row per record

Don’t:

  • Include multiple sections and tables in a single spreadsheet

  • Include empty rows or columns

  • Include images which contain critical information
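Some of these guidelines can be checked locally before uploading. The following is a minimal sketch, not an official validation tool: the helper name `check_csv_layout` and the specific checks are assumptions made for illustration, and it uses only Python's standard `csv` module.

```python
import csv
import io


def check_csv_layout(text):
    """Return a list of warnings for CSV text, based on the guidelines above.

    Hypothetical helper for illustration; not an official validation tool.
    """
    rows = list(csv.reader(io.StringIO(text)))
    warnings = []
    if not rows:
        return ["file is empty"]

    header = rows[0]
    # Guideline: include column headers in the first row.
    for index, name in enumerate(header):
        if not name.strip():
            warnings.append(f"column {index + 1} has no header")

    # Guideline: no empty rows.
    for line_number, row in enumerate(rows[1:], start=2):
        if not any(cell.strip() for cell in row):
            warnings.append(f"row {line_number} is empty")

    # Guideline: no empty columns (a column with no value in any data row).
    for index, name in enumerate(header):
        values = [row[index] for row in rows[1:] if index < len(row)]
        if values and not any(value.strip() for value in values):
            warnings.append(f"column {index + 1} ({name!r}) has no values")

    return warnings


sample = "order id,customer name,notes\n1001,Ada,\n,,\n1002,Grace,\n"
print(check_csv_layout(sample))
```

For the sample above, the helper reports one empty row and one empty column. Checks such as "one row per record" or "plain-language headers" need human judgment and are not covered here.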

How does ChatGPT analyze and visualize data with charts?

ChatGPT uses pandas to analyze your data and Matplotlib to create both static and interactive charts with your data. After using ChatGPT to analyze or visualize your data, click on the View Analysis link that appears at the end of the response to see how ChatGPT used these tools.
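As a rough illustration of the kind of code such an analysis might contain, here is a minimal sketch that aggregates a small table with pandas and draws a static bar chart with Matplotlib. The sample data, column names, and chart choice are assumptions for illustration; the code ChatGPT actually generates depends on your file and prompt.

```python
import io

import matplotlib

matplotlib.use("Agg")  # render off-screen; no display is needed
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data standing in for an uploaded CSV file.
csv_text = "region,sales\nNorth,120\nSouth,80\nNorth,30\nEast,50\n"
df = pd.read_csv(io.StringIO(csv_text))

# Analysis step: total sales per region, using pandas.
totals = df.groupby("region")["sales"].sum().sort_values(ascending=False)

# Visualization step: a static bar chart, using Matplotlib.
fig, ax = plt.subplots()
ax.bar(totals.index, totals.values)
ax.set_xlabel("Region")
ax.set_ylabel("Total sales")
ax.set_title("Sales by region")

# Save the chart as PNG bytes in memory instead of writing a file.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
plt.close(fig)
png_bytes = buffer.getvalue()
```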

How can I see the analysis by default?

After using ChatGPT to analyze or visualize your data, click on the View Analysis link that appears at the end of the response.

At the top of the modal, you can toggle “Always show details” on so that the analysis window appears by default after every response.

If you would like to use the code locally, you can click on “Copy code” to copy the code to your clipboard and paste it into your code editor.

How do I enable interactive charts?

After generating a chart, select "Switch to interactive chart" on the top-right corner of the graph.

After selecting this option, the graph will re-render and update to an interactive version of your graph type. Please note that a limited set of chart types are interactive.

You can switch back to a static graph by selecting "Switch to static chart" on the top-right corner of the graph.

What chart types are interactive?

Currently, only bar, pie, scatter, and line charts are interactive in most cases.

ChatGPT can produce a variety of non-interactive charts, including histograms, scatter plots, box plots (box-and-whisker plots), heat maps, area charts, radar charts, treemaps, bubble charts, and waterfall charts.

How many files can I analyze at once?

  • Up to 10 files can be uploaded to a given conversation

  • Up to 20 files can be attached to a GPT as Knowledge (ChatGPT can interact with these files if the Code Interpreter capability is enabled at the GPT level)

How much data can I analyze?

Each file can be up to 512 MB. For CSV files or spreadsheets, the file size cannot exceed approximately 50 MB, depending on the size of each row.

This makes ChatGPT a good solution for working with data files which are too large to open in a spreadsheet application.
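If you want to check a file against these limits before uploading, a minimal sketch might look like the following. The helper name `upload_size_problem` is hypothetical, the conversion of 1 MB to 1,048,576 bytes is an assumption, and because the tabular limit is only approximate, the result is a hint rather than a guarantee.

```python
import os

MB = 1024 * 1024  # assumption: 1 MB = 1,048,576 bytes
MAX_FILE_BYTES = 512 * MB      # per-file limit stated above
MAX_TABULAR_BYTES = 50 * MB    # approximate limit for CSVs and spreadsheets
TABULAR_EXTENSIONS = {".csv", ".xls", ".xlsx"}


def upload_size_problem(filename, size_bytes):
    """Return a message if the file likely exceeds a limit, else None."""
    extension = os.path.splitext(filename)[1].lower()
    if size_bytes > MAX_FILE_BYTES:
        return f"{filename}: larger than 512 MB"
    if extension in TABULAR_EXTENSIONS and size_bytes > MAX_TABULAR_BYTES:
        return f"{filename}: may exceed the approximate 50 MB limit for tabular files"
    return None
```

For a file on disk, the size can be supplied with `os.path.getsize(path)`, for example `upload_size_problem(path, os.path.getsize(path))`.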

What’s going on under the hood?

When you upload structured data, ChatGPT starts by examining the first few rows of data to understand the schema and types of values which may exist.
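To make the idea of inspecting the first few rows concrete, here is a minimal sketch that reads a handful of rows and guesses a simple type for each column. It illustrates the general technique only and is not ChatGPT's actual implementation; the function names and the type labels are assumptions.

```python
import csv
import io
from itertools import islice


def guess_type(values):
    """Guess a simple type label for a list of strings from one column."""
    non_empty = [value for value in values if value.strip()]
    if not non_empty:
        return "unknown"
    # Try the strictest conversion first, then fall back to looser ones.
    for label, convert in (("integer", int), ("float", float)):
        try:
            for value in non_empty:
                convert(value)
            return label
        except ValueError:
            continue
    return "text"


def preview_schema(text, sample_rows=5):
    """Map each column name to a guessed type, using only the first rows."""
    reader = csv.DictReader(io.StringIO(text))
    sample = list(islice(reader, sample_rows))
    return {
        name: guess_type([row[name] or "" for row in sample])
        for name in (reader.fieldnames or [])
    }


data = (
    "city,population,area_km2,founded\n"
    "Oslo,709000,454.0,1048\n"
    "Bergen,289000,465.3,1070\n"
)
print(preview_schema(data))
```

Because only a sample is read, a column that looks numeric in the first rows may still contain text further down, which is one reason clean, consistent columns help.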

When you ask questions about your data, ChatGPT performs the following steps:

  1. Access the uploaded data in a code execution environment

  2. Write Python code to process the data and produce the required analytical output

  3. Execute code and examine the results

  4. Integrate the results into the response you see in the chat window

It is ChatGPT’s ability to both write and execute code that enables it to perform complex mathematical operations and statistical analysis techniques. If you would like to examine the code which ChatGPT generated, click on the blue [>_] link at the end of a message.

How does ChatGPT know how to analyze data?

One of ChatGPT’s core capabilities is the ability to perform complex analysis based on natural language prompts. In order to make this work, GPT-4 (the frontier model which powers ChatGPT) was post-trained on a large volume of data analysis tasks. After being exposed to example datasets, natural language questions about those datasets, and the code data analysts wrote to answer those questions, the model is now able to generate new code to perform novel analyses. This is why ChatGPT “knows” how to use specialized Python libraries to perform complex tasks.

How does ChatGPT execute code?

When analyzing data, ChatGPT gets access to a secure code execution environment. The environment is pre-loaded with hundreds of Python libraries, and ChatGPT knows how to write code to import and use these libraries. The environment has access to files which are attached to the ChatGPT prompt, which allows it to interact with the structured data you upload. The environment can also access files which are retrieved using GPT Actions.

When ChatGPT generates code in response to your prompt, it passes that code to the environment for execution. It then has access to environment outputs, including any errors produced by the generated code. ChatGPT is able to interpret errors and resolve issues with the generated code automatically.
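The generate, execute, and inspect cycle described above can be sketched in a few lines. This is an illustration under stated assumptions, not the real execution environment: `run_snippet` is a hypothetical helper, the two "attempts" are hard-coded stand-ins for model-generated code, and Python's `exec` provides none of the isolation the actual environment has.

```python
import contextlib
import io
import traceback


def run_snippet(source):
    """Execute Python source and return (succeeded, output_or_error_text)."""
    stdout = io.StringIO()
    try:
        with contextlib.redirect_stdout(stdout):
            exec(source, {"__name__": "__snippet__"})
    except Exception:
        # Return the traceback text so the caller can inspect the error.
        return False, traceback.format_exc()
    return True, stdout.getvalue()


# Hypothetical first attempt with a bug, followed by a corrected attempt.
attempts = [
    "values = [1, 2, 3]\nprint(sum(values) / count)",        # NameError: count
    "values = [1, 2, 3]\nprint(sum(values) / len(values))",  # corrected
]

for source in attempts:
    ok, output = run_snippet(source)
    if ok:
        break  # stop at the first attempt that runs without an error
```

The first attempt fails with a `NameError`, and the error text is what a caller would examine before producing the corrected second attempt.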

The ChatGPT code execution environment is unable to generate outbound network requests directly. Code execution is also isolated from the rest of the ChatGPT hosting platform, which ensures the safety of the feature.

When ChatGPT analyzes data for the first time during a conversation, a new instance of the code execution environment is generated. This instance is only accessible from within the associated conversation, and is destroyed within 13 hours of the conversation becoming inactive.

What are some applications outside of data analysis?

ChatGPT’s code execution environment is primarily designed for interacting with structured data. However, the core capabilities of the feature (writing and executing code, accessing the output of code execution) enable a wide variety of applications outside of data analysis.

Applications include:

  • File manipulation and generation

  • Thematic analysis of unstructured data and text documents

ChatGPT is trained on a variety of coding tasks, and can come up with creative ways to use the code execution environment to accomplish tasks.
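As one small example of file manipulation and generation, the sketch below converts CSV text into JSON. The function name `csv_to_json` is hypothetical, and this is only one of many ways such a conversion could be written.

```python
import csv
import io
import json


def csv_to_json(text):
    """Convert CSV text to a JSON array of objects, one per row."""
    records = list(csv.DictReader(io.StringIO(text)))
    return json.dumps(records, indent=2)


print(csv_to_json("name,score\nAda,90\nGrace,95\n"))
```

Note that `csv.DictReader` keeps every value as a string, so numeric columns would need an explicit conversion step if numbers are wanted in the JSON output.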
