@Chenglong-MS
Collaborator

Improves front-end performance, data loader utility, and prepares functionality for the next updates.

nguyenphongmicrosoft and others added 23 commits June 2, 2025 10:30
…emove-redundant-code

remove redundant code
Added PostgreSQL dataloader
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…tables-and-fix-datetime-format-in-kusto

[Kusto] add filter tables and fix datetime format
@Chenglong-MS Chenglong-MS requested a review from Copilot July 1, 2025 20:32
try:
    df = pd.DataFrame(json.loads(raw_data))
except Exception as e:
    return jsonify({"status": "error", "message": f"Invalid JSON data: {str(e)}, it must be in the format of a list of dictionaries"}), 400

Check warning

Code scanning / CodeQL

Information exposure through an exception Medium

Stack trace information flows to this location and may be exposed to an external user.

Copilot Autofix

AI 6 months ago

To fix the issue, we will replace the direct inclusion of the exception message (str(e)) in the response with a generic error message. The detailed exception message will be logged on the server for debugging purposes. This ensures that sensitive information is not exposed to the user while still allowing developers to diagnose issues using the logs.

Specifically:

  1. Replace the direct use of str(e) in the response with a generic message like "Invalid JSON data provided."
  2. Log the detailed exception message (str(e)) on the server using the existing logger.
Suggested changeset 1
py-src/data_formulator/tables_routes.py

Autofix patch

Run the following command in your local git repository to apply this patch
cat << 'EOF' | git apply
diff --git a/py-src/data_formulator/tables_routes.py b/py-src/data_formulator/tables_routes.py
--- a/py-src/data_formulator/tables_routes.py
+++ b/py-src/data_formulator/tables_routes.py
@@ -318,3 +318,4 @@
             except Exception as e:
-                return jsonify({"status": "error", "message": f"Invalid JSON data: {str(e)}, it must be in the format of a list of dictionaries"}), 400
+                logger.error(f"Invalid JSON data: {str(e)}")
+                return jsonify({"status": "error", "message": "Invalid JSON data provided. It must be in the format of a list of dictionaries."}), 400
 
EOF
Unable to commit as this autofix suggestion is now outdated
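
The pattern the autofix describes (log the exception detail on the server, return only a generic message to the client) can be sketched without Flask or pandas. This is a minimal illustration, not the code in tables_routes.py: `parse_rows` and `GENERIC_ERROR` are hypothetical names, and the `(payload, status)` tuple stands in for the `jsonify(...), 400` response.

```python
import json
import logging

logger = logging.getLogger(__name__)

# Generic client-facing message, taken from the autofix suggestion.
GENERIC_ERROR = (
    "Invalid JSON data provided. "
    "It must be in the format of a list of dictionaries."
)


def parse_rows(raw_data):
    """Hypothetical helper: parse a JSON list of dictionaries.

    Returns (payload, http_status). Exception details are written to the
    server log only and never copied into the payload.
    """
    try:
        rows = json.loads(raw_data)
        if not isinstance(rows, list) or not all(isinstance(r, dict) for r in rows):
            raise ValueError("expected a list of dictionaries")
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback.
        logger.exception("Invalid JSON data")
        return {"status": "error", "message": GENERIC_ERROR}, 400
    return {"status": "success", "rows": rows}, 200
```

Using `logger.exception` rather than formatting `str(e)` into the log line keeps the full traceback available to developers while the response body stays constant.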
@Chenglong-MS
Collaborator Author

This pull request introduces several updates across multiple files to enhance functionality, improve logging, and refine the development experience. Key changes include updates to data loaders, improved logging mechanisms, and modifications to the application startup process.

Data Loader Enhancements:

  • Added support for PostgreSQL and MSSQL data loaders in DATA_LOADERS and updated __all__ in py-src/data_formulator/data_loader/__init__.py.
  • Introduced an optional table_filter parameter to the list_tables method across multiple data loaders (Azure Blob, Kusto, MySQL) to filter tables by name.
  • Enhanced the Kusto data loader with a _convert_kusto_datetime_columns method for improved handling of datetime columns, and added logging for query execution and data ingestion.
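
As a rough illustration of what an optional table_filter could do, the sketch below filters a list of table names. The summary only says the parameter filters tables by name; case-insensitive substring matching is an assumption here, and `filter_table_names` is a hypothetical helper rather than a method of any of the data loaders.

```python
from typing import List, Optional


def filter_table_names(names: List[str], table_filter: Optional[str] = None) -> List[str]:
    """Hypothetical helper: keep names that contain table_filter.

    Assumption: matching is a case-insensitive substring test. With no
    filter (None or empty string), every name is returned unchanged.
    """
    if not table_filter:
        return list(names)
    needle = table_filter.lower()
    return [name for name in names if needle in name.lower()]
```

Keeping the parameter optional with a default of `None` means existing callers of list_tables would not need to change.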

Logging Improvements:

  • Added detailed logging for data ingestion, including DataFrame schema and sample values for datetime columns, in external_data_loader.py.
  • Configured a root logger in kusto_data_loader.py for consistent logging across the module.

Application Startup Modifications:

  • Updated local_server.bat and local_server.sh to start the app with python -m instead of the Flask CLI, and added a --dev flag for development mode.
  • Added a --dev argument to app.py to enable development mode, which prevents automatic browser opening and enables debug mode with auto-reload.
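
A minimal sketch of how a --dev flag might map onto the behaviours listed above, using argparse from the standard library. The function names and the option keys (`open_browser`, `debug`, `use_reloader`) are illustrative assumptions, not the actual contents of app.py.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None falls back to sys.argv[1:]."""
    parser = argparse.ArgumentParser(description="Start the app (illustrative sketch)")
    parser.add_argument(
        "--dev",
        action="store_true",
        help="development mode: do not open a browser, enable debug and auto-reload",
    )
    return parser.parse_args(argv)


def startup_options(args):
    """Translate the --dev flag into the startup behaviours described in the PR."""
    return {
        "open_browser": not args.dev,  # skip the automatic browser tab in dev mode
        "debug": args.dev,             # debug mode only in dev
        "use_reloader": args.dev,      # auto-reload only in dev
    }
```

With `action="store_true"` the flag defaults to `False`, so a plain start keeps the non-development behaviour.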

Dependency Updates:

  • Updated @mui/icons-material and @mui/material dependencies in package.json to the latest versions.

Additional Changes:

  • Registered a new sse_bp blueprint in app.py for server-sent events.
  • Removed unused token variable in agent_routes.py for cleaner code.
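
For readers unfamiliar with server-sent events, the sketch below shows only the generic text/event-stream wire format that an SSE endpoint emits. The payload shape and event names used by sse_bp are not shown in this conversation, so `format_sse` and the example values are assumptions for illustration.

```python
import json


def format_sse(data, event=None):
    """Serialize one message in the text/event-stream wire format.

    Each message is a set of "field: value" lines terminated by a blank line.
    """
    lines = []
    if event is not None:
        lines.append(f"event: {event}")
    # json.dumps (without indent) emits no raw newlines, so a single
    # "data:" line is enough for any JSON-serializable payload.
    lines.append(f"data: {json.dumps(data)}")
    return "\n".join(lines) + "\n\n"
```

A server would typically yield such strings from a generator and return them with the `text/event-stream` content type.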

Contributor

Copilot AI left a comment


Pull Request Overview

This PR enhances front-end performance via memoization and selector optimizations, integrates SSE for real-time updates, and improves data loader and UI components.

  • Refactor chart rendering with dfSelectors.getAllCharts, useMemo, and memo to reduce unnecessary re-renders
  • Add SSE client/server support for real-time data formulation updates
  • Update dialogs, grids, and snackbars with new slotProps, cleaner layouts, and raw data syncing

Reviewed Changes

Copilot reviewed 40 out of 42 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| src/views/VisualizationView.tsx | Refactored chart skeleton utility and menu layout; switched to memoized selectors |
| src/views/MessageSnackbar.tsx | Grouped system messages, added expand/collapse and count UI |
| src/views/DataThread.tsx | Memoized chart elements and integrated pending SSE status |
| src/app/dfSlice.tsx | Added SSE message handling state and selector utilities |
| py-src/data_formulator/tables_routes.py | Extended table creation endpoint for raw_data and sanitization |

Comments suppressed due to low confidence (2)

src/views/DataThread.tsx:728

  • _.isEqual is used without importing lodash. Add import _ from 'lodash'; or replace with another deep-equality check.
        _.isEqual(prevProps.chart.encodingMap, nextProps.chart.encodingMap) &&

py-src/data_formulator/tables_routes.py:71

  • You reference pd.DataFrame() but pandas (pd) is not imported at the top of this file. Add import pandas as pd.
                    sample_rows = db.execute(f"SELECT * FROM {table_name} LIMIT 1000").fetchdf() if row_count > 0 else pd.DataFrame()

@Chenglong-MS
Collaborator Author

Huge thanks to all contributors to this PR!
@nguyenphongmicrosoft @jodur @RishabhJainBM @nvtphong200401

@Chenglong-MS Chenglong-MS changed the title from "0.2.4 - Dev: performance update & data loader improvement" to "[deploy] 0.2.4 - Dev: performance update & data loader improvement" Jul 1, 2025
@Chenglong-MS Chenglong-MS merged commit 8973435 into main Jul 1, 2025
7 checks passed
6 participants