Merged
12 changes: 6 additions & 6 deletions README.md
@@ -14,22 +14,24 @@

Transform data and create rich visualizations iteratively with AI 🪄. Try Data Formulator now!

[![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/microsoft/data-formulator?quickstart=1)
Any questions? Ask on the Discord channel! [![Discord](https://img.shields.io/badge/discord-chat-green?logo=discord)](https://discord.gg/mYCZMQKYZb)

<!-- [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/microsoft/data-formulator?quickstart=1) -->

<kbd>
<a target="_blank" rel="noopener noreferrer" href="https://codespaces.new/microsoft/data-formulator?quickstart=1" title="open Data Formulator in GitHub Codespaces"><img src="public/data-formulator-screenshot.png"></a>
</kbd>



## News 🔥🔥🔥

- [05-13-2025] Data Formulator 0.2.1: External Data Loader
- [05-13-2025] Data Formulator 0.2.3: External Data Loader
- We introduced external data loader class to make import data easier. [Readme](https://github.com/microsoft/data-formulator/tree/main/py-src/data_formulator/data_loader) and [Demo](https://github.com/microsoft/data-formulator/pull/155)
- Example data loaders from MySQL and Azure Data Explorer (Kusto) are provided.
- Current data loaders: MySQL, Azure Data Explorer (Kusto), Azure Blob and Amazon S3 (json, parquet, csv).
- Call for action [link](https://github.com/microsoft/data-formulator/issues/156):
- Users: let us know which data source you'd like to load data from.
- Developers: let's build more data loaders.
- Discord channel for discussions: join us! [![Discord](https://img.shields.io/badge/discord-chat-green?logo=discord)](https://discord.gg/mYCZMQKYZb)

- [04-23-2025] Data Formulator 0.2: working with *large* data 📦📦📦
- Explore large data by:
@@ -68,8 +70,6 @@ Transform data and create rich visualizations iteratively with AI 🪄. Try Data

- [10-01-2024] Initial release of Data Formulator, check out our [[blog]](https://www.microsoft.com/en-us/research/blog/data-formulator-exploring-how-ai-can-help-analysts-create-rich-data-visualizations/) and [[video]](https://youtu.be/3ndlwt0Wi3c)!



## Overview

**Data Formulator** is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.
2 changes: 1 addition & 1 deletion package.json
@@ -82,6 +82,6 @@
"globals": "^15.12.0",
"sass": "^1.77.6",
"typescript-eslint": "^8.16.0",
"vite": "^5.4.15"
"vite": "^5.4.19"
}
}
3 changes: 1 addition & 2 deletions py-src/data_formulator/agents/agent_code_explanation.py
@@ -1,8 +1,7 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import pandas as pd
from data_formulator.agents.agent_utils import generate_data_summary, extract_code_from_gpt_response
from data_formulator.agents.agent_utils import generate_data_summary

import logging

2 changes: 0 additions & 2 deletions py-src/data_formulator/agents/agent_py_concept_derive.py
@@ -1,7 +1,6 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import json
import time

from data_formulator.agents.agent_utils import generate_data_summary, extract_code_from_gpt_response
@@ -10,7 +9,6 @@
import traceback

import logging
import datetime

logger = logging.getLogger(__name__)

1 change: 0 additions & 1 deletion py-src/data_formulator/agents/agent_py_data_transform.py
@@ -2,7 +2,6 @@
# Licensed under the MIT License.

import json
import sys

from data_formulator.agents.agent_utils import extract_json_objects, generate_data_summary, extract_code_from_gpt_response
import data_formulator.py_sandbox as py_sandbox
3 changes: 1 addition & 2 deletions py-src/data_formulator/agents/agent_query_completion.py
@@ -1,10 +1,9 @@
# Copyright (c) Microsoft Corporation.
# Licensed under the MIT License.

import pandas as pd
import json

from data_formulator.agents.agent_utils import extract_code_from_gpt_response, extract_json_objects
from data_formulator.agents.agent_utils import extract_json_objects
import re
import logging

4 changes: 0 additions & 4 deletions py-src/data_formulator/agents/agent_utils.py
@@ -6,10 +6,6 @@
import pandas as pd
import numpy as np

import base64

from pprint import pprint

import re

def string_to_py_varname(var_str):
1 change: 0 additions & 1 deletion py-src/data_formulator/agents/client_utils.py
@@ -1,4 +1,3 @@
import os
import litellm
import openai
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
8 changes: 6 additions & 2 deletions py-src/data_formulator/data_loader/__init__.py
@@ -1,10 +1,14 @@
from data_formulator.data_loader.external_data_loader import ExternalDataLoader
from data_formulator.data_loader.mysql_data_loader import MySQLDataLoader
from data_formulator.data_loader.kusto_data_loader import KustoDataLoader
from data_formulator.data_loader.s3_data_loader import S3DataLoader
from data_formulator.data_loader.azure_blob_data_loader import AzureBlobDataLoader

DATA_LOADERS = {
"mysql": MySQLDataLoader,
"kusto": KustoDataLoader
"kusto": KustoDataLoader,
"s3": S3DataLoader,
"azure_blob": AzureBlobDataLoader,
}

__all__ = ["ExternalDataLoader", "MySQLDataLoader", "KustoDataLoader", "DATA_LOADERS"]
__all__ = ["ExternalDataLoader", "MySQLDataLoader", "KustoDataLoader", "S3DataLoader", "AzureBlobDataLoader", "DATA_LOADERS"]
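
Note on the registry: `DATA_LOADERS` maps a string key to a loader class, so a caller can presumably pick a loader by name. The snippet below is a minimal, self-contained sketch of that lookup pattern, not the project's actual code. The stub classes, `LOADER_REGISTRY`, `create_loader`, and the `params` constructor argument are all hypothetical stand-ins; the real `ExternalDataLoader` constructor may take different arguments.

```python
from typing import Any, Dict, Type


class ExternalDataLoaderStub:
    """Hypothetical stand-in for ExternalDataLoader; the real base class may differ."""

    def __init__(self, params: Dict[str, Any]):
        # Assumed: connection settings are passed as a plain dict.
        self.params = params


class S3LoaderStub(ExternalDataLoaderStub):
    """Hypothetical stand-in for S3DataLoader."""


class AzureBlobLoaderStub(ExternalDataLoaderStub):
    """Hypothetical stand-in for AzureBlobDataLoader."""


# Same shape as DATA_LOADERS in the diff: string key -> loader class.
LOADER_REGISTRY: Dict[str, Type[ExternalDataLoaderStub]] = {
    "s3": S3LoaderStub,
    "azure_blob": AzureBlobLoaderStub,
}


def create_loader(loader_type: str, params: Dict[str, Any]) -> ExternalDataLoaderStub:
    """Look up a loader class by key and instantiate it with the given params."""
    try:
        loader_cls = LOADER_REGISTRY[loader_type]
    except KeyError:
        supported = ", ".join(sorted(LOADER_REGISTRY))
        raise ValueError(
            f"Unknown data loader '{loader_type}'; supported: {supported}"
        ) from None
    return loader_cls(params)


loader = create_loader("s3", {"bucket": "example-bucket"})
print(type(loader).__name__)
```

With this shape, registering a new source only needs one new class and one new dictionary entry, which matches how this PR adds `"s3"` and `"azure_blob"`.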