ChatGPT Prompts for Data Visualization: Turn Pandas Dataframes into Beautiful Charts
Data visualization is a vital part of data analysis. It helps us understand data, identify patterns and trends, and communicate insights effectively. Python is one of the most popular languages for data analysis, and Pandas is a popular Python library for manipulating and analyzing data. In this article, we explore how ChatGPT prompts can be used to turn Pandas dataframes into beautiful charts. We'll also look at PyGWalker, an open-source Python library for data visualization.
ChatGPT is a large language model trained by OpenAI that can generate human-like text. Given a suitable prompt, it can write code for data analysis tasks, such as plotting charts. The prompts in this article are designed to be easy to reuse and can save data scientists time and effort.
We can use ChatGPT prompts to plot charts using the Matplotlib library. Matplotlib is a popular visualization library in Python that provides a wide range of chart types and customization options. Here's an example of using ChatGPT prompts to plot a line chart:
Suppose we have a dataset called "sales_data.csv" with columns "date" and "sales". We want to plot a line chart that shows the trend of sales over time. We can use the following ChatGPT prompt:
Prompt: I want you to act as a coder in python. I have a dataset ["sales_data.csv"] with columns ["date" and "sales"]. [Plot a line chart that shows the trend of sales over time]
Replace the text inside the square brackets with the details of your own scenario. ChatGPT generates the following code:
# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt

# Read the data, parsing "date" as datetimes so the x-axis is ordered by time
df = pd.read_csv('sales_data.csv', parse_dates=['date'])

# Plot a line chart
plt.plot(df['date'], df['sales'])
plt.title('Sales Trend')
plt.xlabel('Date')
plt.ylabel('Sales')
plt.show()
This code generates a line chart that shows the trend of sales over time. We can use ChatGPT prompts to generate code for other chart types as well, such as bar charts, scatter plots, and histograms.
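As a rough sketch of how the same pattern extends to other chart types, the snippet below draws a bar chart and a histogram side by side. The small DataFrame is made up and only stands in for "sales_data.csv"; the monthly dates, the bin count, and the variable names are assumptions for illustration, not output from ChatGPT.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for sales_data.csv (same "date" and "sales" columns)
df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=6, freq="MS"),
    "sales": [120, 135, 128, 150, 162, 158],
})

fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

# Bar chart: one bar per month
ax_bar.bar(df["date"].dt.strftime("%Y-%m"), df["sales"])
ax_bar.set_title("Sales by Month")
ax_bar.set_xlabel("Month")
ax_bar.set_ylabel("Sales")
ax_bar.tick_params(axis="x", rotation=45)

# Histogram: distribution of the sales values
ax_hist.hist(df["sales"], bins=3)
ax_hist.set_title("Distribution of Sales")
ax_hist.set_xlabel("Sales")
ax_hist.set_ylabel("Frequency")

fig.tight_layout()
```

In a notebook or an interactive session, drop the `matplotlib.use("Agg")` line and call `plt.show()` to display the figure, or call `fig.savefig(...)` to write it to a file.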
In addition to plotting charts from dataframes, we can also use Matplotlib to display images. Suppose we have a folder of images named "image_folder". Each image is named with its corresponding label, such as "cat_01.jpg", "dog_02.jpg", etc. We want to display a grid of images with their labels. Use the following sample prompt:
Prompt: I want you to act as a coder. I have a folder of images. [Describe how files are organized in the directory] [Describe how you want images to be printed]
ChatGPT generates the following Python code:
# Import necessary libraries
import os
import matplotlib.pyplot as plt
from PIL import Image

# Set the directory path
directory = "image_folder/"

# Create sorted lists of image filenames and labels ("cat_01.jpg" -> "cat")
filenames = sorted(os.listdir(directory))
labels = [filename.split("_")[0] for filename in filenames]

# Create a 3x3 grid of images
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(10, 10))
for i, ax in enumerate(axs.flatten()):
    ax.axis('off')
    # Leave the remaining cells empty if there are fewer than nine images
    if i >= len(filenames):
        continue
    # Load the image and display it with its label
    img = Image.open(os.path.join(directory, filenames[i]))
    ax.imshow(img)
    ax.set_title(labels[i])
plt.show()
This code generates a grid of images with their corresponding labels. We can use ChatGPT prompts to generate code for other image display styles, such as individual images with captions or side-by-side image comparisons.
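Two small steps in the grid example are easy to get wrong: extracting the label from a file name and sizing the grid to the number of images. The sketch below isolates both, assuming the "label_number.jpg" naming convention described above. The helper names `label_from_filename` and `grid_shape` are hypothetical and not part of any library.

```python
import math
import os


def label_from_filename(filename):
    """Return the text before the first underscore, e.g. 'cat_01.jpg' -> 'cat'."""
    stem, _ext = os.path.splitext(filename)
    return stem.split("_")[0]


def grid_shape(n_images, ncols=3):
    """Return (nrows, ncols) with enough cells to hold n_images."""
    nrows = max(1, math.ceil(n_images / ncols))
    return nrows, ncols


# Hypothetical file names following the convention from the article
filenames = ["cat_01.jpg", "dog_02.jpg", "cat_03.jpg", "bird_04.jpg"]
labels = [label_from_filename(name) for name in filenames]
shape = grid_shape(len(filenames))

print(labels)
print(shape)
```

The resulting shape can be passed to `plt.subplots(nrows=..., ncols=...)` so that the grid grows with the folder instead of being fixed at nine cells.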
Sometimes we want to explain the output of a machine-learning model. LIME is a popular library for explaining machine learning models. Suppose we have trained a model using the scikit-learn library. We want to explain the model's output using LIME. Use the following prompt:
Prompt: I want you to act as a data scientist and explain the model's results. I have trained a [library name] model and I would like to explain the output using LIME. Please write the code.
import lime
import lime.lime_tabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# Load the iris dataset
iris = load_iris()

# Train a random forest classifier
clf = RandomForestClassifier(n_estimators=500)
clf.fit(iris.data, iris.target)

# Define the explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
    iris.data,
    feature_names=iris.feature_names,
    class_names=iris.target_names,
    discretize_continuous=True
)

# Explain the first data point (explain_instance expects a single 1-D row)
exp = explainer.explain_instance(iris.data[0], clf.predict_proba, num_features=4)
exp.show_in_notebook(show_table=True)
This code will generate an explanation for the first data point in the iris dataset, which has four features: sepal length, sepal width, petal length, and petal width. The explanation will show which features are most important for the classifier's prediction.
SHAP (SHapley Additive exPlanations) is another popular library for explaining machine learning models. It attributes each individual prediction to the input features, and aggregating those attributes across the entire dataset gives a global view of how much each feature contributes to the model's output.
Use the following prompt for ChatGPT:
Prompt: I want you to act as a data scientist and explain the model's results. I have trained an XGBoost model (using its scikit-learn API) and I would like to explain the output using a series of plots with SHAP. Please write the code.
And this is the output from ChatGPT:
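import shap
from sklearn.datasets import load_iris
from xgboost import XGBClassifier

# Load the iris dataset
iris = load_iris()

# Train an XGBoost classifier
clf = XGBClassifier(n_estimators=500)
clf.fit(iris.data, iris.target)

# Define the explainer
explainer = shap.Explainer(clf)

# Explain the entire dataset
shap_values = explainer(iris.data)

# Iris has three classes, so pick one class to plot. This assumes the values
# have shape (samples, features, classes), as in recent SHAP releases.
class_idx = 0

# Generate a summary plot
shap.summary_plot(shap_values.values[:, :, class_idx], iris.data, feature_names=iris.feature_names)

# Generate a force plot for the first data point
shap.initjs()
shap.force_plot(explainer.expected_value[class_idx], shap_values.values[0, :, class_idx], iris.data[0], feature_names=iris.feature_names)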
This code will generate a summary plot that shows which features contribute most to the model's output across the entire dataset. It will also generate a force plot for the first data point in the dataset, which shows how each feature contributes to the model's prediction for that data point.
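To make the two views concrete, here is a minimal NumPy sketch of the arithmetic behind them. The SHAP values and the base value are made-up numbers for illustration, not output from the model above. The local view relies on SHAP's additivity (base value plus a row's contributions equals that sample's model output), and the global view ranks features by mean absolute SHAP value, which is the ordering the summary plot uses.

```python
import numpy as np

# Made-up SHAP values for 3 samples x 4 features (illustration only, not model output)
feature_names = ["sepal length", "sepal width", "petal length", "petal width"]
shap_matrix = np.array([
    [0.02, -0.01, 0.40, 0.30],
    [-0.03, 0.00, -0.35, -0.25],
    [0.01, 0.02, 0.45, 0.20],
])
base_value = 0.10  # hypothetical expected model output

# Local view: base value plus one row's contributions gives that sample's output
first_output = base_value + shap_matrix[0].sum()

# Global view: mean absolute contribution per feature, ranked from high to low
importance = np.abs(shap_matrix).mean(axis=0)
ranking = [feature_names[i] for i in np.argsort(importance)[::-1]]

print(round(float(first_output), 2))
print(ranking)
```

With real results, `shap_matrix` would be the values for one class and `base_value` the matching expected value from the explainer.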
Struggling with overwhelming data and clunky BI tools can be a nightmare. But with RATH, you can say goodbye to the chaos and hello to effortless data analysis.
RATH integrates ChatGPT into your data analysis workflow, acting as your 24/7 personal data analyst, streamlining your workflow and boosting your productivity. Get instant insights and stunning visualizations without the hassle.
The workflow is stunningly simple:
- Connect Your Data Source to RATH
- Ask Any Question
- Get Instant Data Insights and Visualizations Within Seconds
Everything is done in natural language, with no code required. Check out this demo, which investigates the historical relationship between the Bitcoin price and the gold price simply by talking to RATH:
You can see how RATH easily extracts data from multiple sources and uses natural language to help you explore and understand your data.
And say goodbye to data processing headaches!
Small teams often struggle with SQL queries and data processing, especially without a dedicated data analyst or technical skills. That's where RATH comes in to save the day.
RATH makes it easy for small teams to handle data processing using simple everyday language. Any team member can ask RATH for the information they need, and they'll quickly get useful insights and visualizations. This way, teams can focus on making the most of their data instead of struggling to get it.
RATH supports a wide range of data sources, so it does not disturb your existing workflow. Here are some of the major database solutions that you can connect to RATH:
We are about to launch support for Airtable integration. You will be able to visualize your Airtable data using natural language. Simply connect RATH to your Airtable data, and watch the magic happen:
Interested? Inspired? Unlock the insights in your data with one prompt: ChatGPT-powered RATH is now open for beta. Get on board and check it out!
In conclusion, we have explored several data visualization techniques using ChatGPT prompts. From plotting data with Matplotlib to explaining machine learning models with LIME and SHAP, a well-written prompt can produce most of the code for you. Try out PyGWalker and take advantage of ChatGPT prompts for effortless data visualization.