Skip to content

Conversation

@axelwalter
Copy link
Collaborator

@axelwalter axelwalter commented Jul 10, 2024

User description

A bunch of updates to the template app, the major ones:

  • new interactive visualization with reactive peak map, 3D peak map, spectrum viewer, chromatogram viewer
  • simplified TOPP excluding SiriusExport workflow example, added example data, new result visualization
  • use new streamlit fragments where possible
  • rework pages with new st.navigation --> restructured sidebar
  • documentation all in one page
  • completely new landing page with extensive Quickstart guide, including links to the example pages

The visualization needs to be updated after pyopenms-viz has been finished.


PR Type

Enhancement, Documentation


Description

  • Refactored the main page to use st.navigation for better page organization.
  • Consolidated various documentation pages into a single navigable page.
  • Added new pages for Quickstart, raw data viewer, and TOPP Workflow Framework.
  • Enhanced the Workflow class with new methods and modular fragments.
  • Added captcha control and enhanced the show_fig function.
  • Introduced new functions and classes for viewing and plotting MS data.
  • Updated environment and requirements dependencies.

Changes walkthrough 📝

Relevant files
Enhancement
9 files
app.py
Refactor main page to use st.navigation for page organization

app.py

  • Replaced the main page content with a navigation structure using
    st.navigation.
  • Organized pages into categories such as "OpenMS Web App", "TOPP
    Workflow Framework", "pyOpenMS Workflow", and "Others Topics".
  • +22/-85 
    raw_data_viewer.py
    Add raw data viewer page with interactive plots                   

    pages/raw_data_viewer.py

  • Added a new page for viewing raw MS data with interactive plots for
    peak maps, spectra, and chromatograms.
  • +32/-0   
    topp_workflow.py
    Add TOPP Workflow Framework page with multiple sections   

    pages/topp_workflow.py

  • Added a new page for the TOPP Workflow Framework with sections for
    file upload, parameter configuration, execution, and results.
  • +24/-0   
    Workflow.py
    Enhance Workflow class with new methods and modular fragments

    src/Workflow.py

  • Updated the Workflow class to include new methods for uploading files,
    configuring parameters, executing workflows, and displaying results.
  • Added experimental fragments for better modularity.
  • +75/-33 
    common.py
    Add captcha control and enhance show_fig function               

    src/common.py

  • Added captcha control to the page setup function.
  • Updated the show_fig function to support selection and zooming.
  • +60/-27 
    view.py
    Add functions for viewing MS data and enhance data processing

    src/view.py

  • Added new functions for viewing peak maps, spectra, and chromatograms.
  • Enhanced data loading and processing for MS experiments.
  • +214/-115
    BasePlotter.py
    Introduce base plotter class for plotting configurations 

    src/plotting/BasePlotter.py

  • Introduced a base plotter class with configuration settings and
    abstract methods for plotting.
  • +58/-0   
    MSExperimentPlotter.py
    Add MSExperimentPlotter class for 2D and 3D plotting         

    src/plotting/MSExperimentPlotter.py

  • Added a new plotter class for MS experiments with methods for 2D and
    3D plotting.
  • +214/-0 
    export_consensus_feature_df.py
    Add Python tool for exporting consensus feature data         

    src/python-tools/export_consensus_feature_df.py

  • Added a new Python tool for exporting consensus feature data to a
    DataFrame.
  • +46/-0   
    Dependencies
    3 files
    hook-streamlit.py
    Remove streamlit_plotly_events from metadata copy in hook

    hooks/hook-streamlit.py

    • Removed streamlit_plotly_events from metadata copy.
    +0/-1     
    environment.yml
    Update environment dependencies and Python version             

    environment.yml

  • Updated Python version to 3.11.
  • Updated dependencies for streamlit, plotly, and numpy.
  • +5/-8     
    requirements.txt
    Update requirements for streamlit, plotly, and numpy         

    requirements.txt

    • Updated dependencies for streamlit, plotly, and numpy.
    +3/-6     
    Documentation
    2 files
    documentation.py
    Consolidate documentation into a single navigable page     

    pages/documentation.py

  • Consolidated various documentation pages into a single page with a
    selectbox for navigation.
  • Included sections for User Guide, Installation, Developer Guides, and
    Deployment.
  • +594/-0 
    quickstart.py
    Add Quickstart page with overview and links                           

    pages/quickstart.py

  • Created a new Quickstart page with an overview of features,
    documentation links, and example pages.
  • +165/-0 

    💡 PR-Agent usage:
    Comment /help on the PR to get a list of all available PR-Agent tools and their descriptions

    axelwalter added 14 commits July 3, 2024 11:51
    - remove plotly-events and aggrid
    - 2D peak map updating while zooming in
    - 3D peak map static plot on zoom selection of 2D plot
    - spectrum viewer, select spectrum from table, update top 5 m/z annotations zooming in
    - chromatogram viewer with BPC, TIC and XIC
    - every section in separate fragments
    - using the new st.navigation
    - creates sections in sidebar
    - icons don't need to be in page name
    - icon on top (can be changed to 24*240px banner later)
    - workspace selector on every page in collapsed expander
    - simple workflow with feature detection and linking
    - export consensus map to dataframe with python script
    - added interactive result section
    @qodo-code-review qodo-code-review bot added documentation Improvements or additions to documentation enhancement New feature or request Review effort [1-5]: 4 labels Jul 10, 2024
    @qodo-code-review
    Copy link

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Reviewer Guide 🔍

    ⏱️ Estimated effort to review: 4 🔵🔵🔵🔵⚪
    🧪 No relevant tests
    🔒 No security concerns identified
    ⚡ Key issues to review

    Possible Bug:
    The refactoring in app.py removed a lot of initial setup and explanatory comments which might affect the understanding and maintainability of the code.

    Performance Concern:
    The new implementation in app.py uses a dictionary to manage pages, which might not be as efficient as the previous method if the number of pages grows significantly.

    Code Clarity:
    The new pages dictionary in app.py could benefit from additional comments explaining the structure and purpose of each page, especially since this structure is central to the application's navigation.

    @qodo-code-review
    Copy link

    qodo-code-review bot commented Jul 10, 2024

    PR-Agent was enabled for this repository. To continue using it, please link your git user with your CodiumAI identity here.

    PR Code Suggestions ✨

    CategorySuggestion                                                                                                                                    Score
    Possible bug
    Add error handling for string to float conversion to prevent application crashes

    Add error handling for potential exceptions when converting 'target_value' to float
    in the 'plot_bpc_tic' function.

    src/view.py [94]

    -target_value = float(target_value)
    +try:
    +    target_value = float(target_value)
    +except ValueError:
    +    st.error("Invalid m/z value provided. Please enter a valid number.")
    +    return
     
    • Apply this suggestion
    Suggestion importance[1-10]: 10

    Why: Adding error handling for the float conversion prevents application crashes and provides user feedback, which is essential for a robust application.

    10
    Prevent potential KeyError by checking for column existence before accessing

    Replace the direct access to the 'mzarray' and 'intarray' columns with a method that
    checks for their existence to prevent potential KeyError exceptions.

    src/view.py [40]

     df_spectra["max intensity m/z"] = df_spectra.apply(
    -    lambda x: x["mzarray"][x["intarray"].argmax()], axis=1
    +    lambda x: x["mzarray"][x["intarray"].argmax()] if "mzarray" in x and "intarray" in x else np.nan, axis=1
     )
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: This suggestion prevents potential KeyError exceptions by ensuring the columns exist before accessing them, which is a crucial improvement for robustness.

    9
    Add error handling for invalid input in the grayscale color generation method

    Add error handling for the case where n is less than 1 in _get_n_grayscale_colors
    method to prevent runtime errors during the calculation of grayscale colors.

    src/plotting/BasePlotter.py [43]

     def _get_n_grayscale_colors(self, n: int) -> List[str]:
    +    if n < 1:
    +        raise ValueError("Number of colors 'n' must be at least 1")
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Adding error handling for invalid input is crucial to prevent runtime errors and ensure robustness, making this a highly valuable improvement.

    9
    Best practice
    ✅ Ensure file is properly closed after reading its contents
    Suggestion Impact:The suggestion to use a context manager for reading the file was implemented. The file is read within a `with` statement, ensuring it is properly closed after reading its contents.

    code diff:

    -        with open("OpenMS-App.zip", "rb") as file:
    -            st.download_button(
    -                label="Download for Windows",
    -                data=file,
    -                file_name="OpenMS-App.zip",
    -                mime="archive/zip",
    -                type="primary",
    -            )

    Use a context manager for handling file operations to ensure that the file is
    properly closed after its contents are processed, which will prevent potential file
    locking or resource leakage issues.

    pages/documentation.py [95-102]

     with open("OpenMS-App.zip", "rb") as file:
    -    st.download_button(
    -        label="Download for Windows",
    -        data=file,
    -        file_name="OpenMS-App.zip",
    -        mime="archive/zip",
    -        type="primary",
    -    )
    +    file_content = file.read()
    +st.download_button(
    +    label="Download for Windows",
    +    data=file_content,
    +    file_name="OpenMS-App.zip",
    +    mime="archive/zip",
    +    type="primary",
    +)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Using a context manager to read the file content before passing it to st.download_button ensures that the file is properly closed, which is a best practice to prevent potential file locking or resource leakage issues.

    9
    Robustness
    Add error handling to improve robustness during file operations

    Add error handling for file operations in the upload method to manage exceptions
    that may occur when the file path does not exist or the file cannot be copied.

    src/Workflow.py [116-117]

    -if not Path(files_dir, f).exists():
    -    shutil.copy(f, Path(files_dir, Path(f).name))
    +try:
    +    if not Path(files_dir, f).exists():
    +        shutil.copy(f, Path(files_dir, Path(f).name))
    +except Exception as e:
    +    st.error(f"Failed to copy file {f}: {str(e)}")
     
    • Apply this suggestion
    Suggestion importance[1-10]: 9

    Why: Adding error handling for file operations is crucial for robustness. This suggestion correctly adds a try-except block to manage potential exceptions, improving the reliability of the code.

    9
    Maintainability
    ✅ Decouple HTTP requests from UI logic for better maintainability
    Suggestion Impact:The suggestion was implemented by creating a separate function fetch_content to handle HTTP requests, which decouples the network logic from the UI code.

    code diff:

    -    def fetch_markdown_content(url):
    -        response = requests.get(url)
    -        if response.status_code == 200:
    -            # Remove the first line from the content
    -            content_lines = response.text.split("\n")
    -            markdown_content = "\n".join(content_lines[1:])
    -            return markdown_content
    -        else:
    -            return None
    -
    -
    -    tabs = ["embeddable Python", "PyInstaller"]
    -    tabs = st.tabs(tabs)
    -
    -    # window executable with embeddable python
    -    with tabs[0]:
    -        markdown_url = "https://raw.githubusercontent.com/OpenMS/streamlit-template/main/win_exe_with_embed_py.md"
    -
    -        markdown_content = fetch_markdown_content(markdown_url)
    -
    -        if markdown_content:
    -            st.markdown(markdown_content, unsafe_allow_html=True)
    -        else:
    -            st.error(
    -                "Failed to fetch Markdown content from the specified URL.", markdown_url
    -            )
    -
    -    # window executable with pyinstaller
    -    with tabs[1]:
    -        # URL of the Markdown document
    -        markdown_url = "https://raw.githubusercontent.com/OpenMS/streamlit-template/main/win_exe_with_pyinstaller.md"
    -
    -        markdown_content = fetch_markdown_content(markdown_url)
    -
    -        if markdown_content:
    -            st.markdown(markdown_content, unsafe_allow_html=True)
    -        else:
    -            st.error(
    -                "Failed to fetch Markdown content from the specified URL. ", markdown_url
    -            )
    -
    -#############################################################################################
    -# Deployment
    -#############################################################################################
    -
    -if page == pages[5]:
    -    url = "https://raw.githubusercontent.com/OpenMS/streamlit-deployment/main/README.md"
    -
    -    response = requests.get(url)
    -
    -    if response.status_code == 200:
    -        st.markdown(response.text)  # or process the content as needed
    -    else:

    Replace the use of requests.get() directly in the Streamlit callbacks with a
    separate function that handles HTTP requests. This will decouple the network logic
    from the UI code and make the application easier to maintain and test.

    pages/documentation.py [589-594]

    -response = requests.get(url)
    -if response.status_code == 200:
    -    st.markdown(response.text)  # or process the content as needed
    +def fetch_content(url):
    +    try:
    +        response = requests.get(url)
    +        response.raise_for_status()
    +        return response.text
    +    except requests.RequestException as e:
    +        return f"Failed to fetch content: {str(e)}"
    +
    +content = fetch_content(url)
    +if content.startswith("Failed to fetch content:"):
    +    st.warning(content)
     else:
    -    st.warning("Failed to get README from streamlit-deployment repository.")
    +    st.markdown(content)
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion separates the HTTP request logic from the UI code, making the code more modular and easier to maintain and test. This is a significant improvement in terms of code organization and maintainability.

    8
    ✅ Refactor repeated Markdown fetching and displaying into a reusable function
    Suggestion Impact:The suggestion to refactor the repeated code for fetching and displaying Markdown content from URLs into a single reusable function was implemented. The function `display_markdown_from_url` was created and used to replace the repeated code.

    code diff:

    -    def fetch_markdown_content(url):
    -        response = requests.get(url)
    -        if response.status_code == 200:
    -            # Remove the first line from the content
    -            content_lines = response.text.split("\n")
    -            markdown_content = "\n".join(content_lines[1:])
    -            return markdown_content
    -        else:
    -            return None
    -
    -
    -    tabs = ["embeddable Python", "PyInstaller"]
    -    tabs = st.tabs(tabs)
    -
    -    # window executable with embeddable python
    -    with tabs[0]:
    -        markdown_url = "https://raw.githubusercontent.com/OpenMS/streamlit-template/main/win_exe_with_embed_py.md"
    -
    -        markdown_content = fetch_markdown_content(markdown_url)
    -
    -        if markdown_content:
    -            st.markdown(markdown_content, unsafe_allow_html=True)
    -        else:
    -            st.error(
    -                "Failed to fetch Markdown content from the specified URL.", markdown_url
    -            )

    Refactor the repeated code for fetching and displaying Markdown content from URLs
    into a single reusable function. This will reduce code duplication and improve
    maintainability.

    pages/documentation.py [559-566]

    -markdown_content = fetch_markdown_content(markdown_url)
    -if markdown_content:
    -    st.markdown(markdown_content, unsafe_allow_html=True)
    -else:
    -    st.error(
    -        "Failed to fetch Markdown content from the specified URL.", markdown_url
    -    )
    +def display_markdown_from_url(url):
    +    markdown_content = fetch_markdown_content(url)
    +    if markdown_content:
    +        st.markdown(markdown_content, unsafe_allow_html=True)
    +    else:
    +        st.error(f"Failed to fetch Markdown content from the specified URL: {url}")
     
    +display_markdown_from_url(markdown_url)
    +
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Refactoring the repeated code into a single reusable function reduces code duplication and improves maintainability. This change enhances the readability and maintainability of the codebase.

    8
    Improve error handling when checking if a file exists

    Replace the direct use of Path.exists() with a more robust error handling approach.
    This will improve the reliability of the code by ensuring that any issues with file
    access or existence are properly managed.

    pages/documentation.py [89]

    -if Path("OpenMS-App.zip").exists():
    +try:
    +    if Path("OpenMS-App.zip").exists():
    +        # Existing code continues here
    +except Exception as e:
    +    st.error(f"Failed to check if file exists: {str(e)}")
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The suggestion improves error handling by wrapping the file existence check in a try-except block, which can help catch and report errors more gracefully. However, the likelihood of an exception occurring in this specific context is relatively low.

    7
    Enhance function reusability by replacing direct session state manipulation with function parameters

    Replace the direct manipulation of session state for image format in show_fig with a
    function parameter to enhance function reusability and reduce side effects.

    src/Workflow.py [309-311]

     "filename": download_name,
    -"format": st.session_state["image-format"],
    +"format": image_format,
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Replacing direct session state manipulation with function parameters improves the reusability and maintainability of the function. This suggestion correctly identifies and addresses this issue.

    7
    Performance
    Use DataFrame.query for more efficient and readable DataFrame filtering

    Use a more efficient method for filtering DataFrame in the 'plot_bpc_tic' function
    to improve performance.

    src/view.py [99]

    -df_eic = df[(df['mz'] >= target_value - tolerance) & (df['mz'] <= target_value + tolerance)]
    +df_eic = df.query('mz >= @target_value - @tolerance and mz <= @target_value + @tolerance')
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: Using DataFrame.query is more efficient and readable, which enhances performance and maintainability of the code.

    8
    Improve memory efficiency by using a generator expression for file paths

    Replace the list comprehension with a generator expression inside the
    self.ui.upload_widget method call to avoid unnecessary list creation. This can
    improve memory efficiency when handling large numbers of files.

    src/Workflow.py [27]

     self.ui.upload_widget(
         key="mzML-files",
         name="MS data",
         file_type="mzML",
    -    fallback=[str(f) for f in Path("example-data", "mzML").glob("*.mzML")],
    +    fallback=(str(f) for f in Path("example-data", "mzML").glob("*.mzML")),
     )
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: The suggestion correctly replaces the list comprehension with a generator expression, which can improve memory efficiency, especially when handling large numbers of files. The change is minor but beneficial.

    8
    Improve performance and readability using list comprehension

    Use a list comprehension instead of a loop for appending values to 'precs' to
    improve performance and readability.

    src/view.py [32-38]

    -precs = []
    -for spec in exp:
    -    p = spec.getPrecursors()
    -    if p:
    -        precs.append(p[0].getMZ())
    -    else:
    -        precs.append(np.nan)
    +precs = [p[0].getMZ() if p else np.nan for p in (spec.getPrecursors() for spec in exp)]
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: The list comprehension improves performance and readability, making the code more concise and efficient.

    7
    Enhancement
    Remove dependency on NumPy for generating a range of values

    Replace the np.linspace function with a pure Python approach to avoid unnecessary
    dependency on NumPy just for generating a range of values. This can be achieved
    using list comprehension and linear interpolation.

    src/plotting/BasePlotter.py [46]

    -for v in np.linspace(50, 200, n):
    +for v in [50 + i * (200 - 50) / (n - 1) for i in range(n)]:
     
    • Apply this suggestion
    Suggestion importance[1-10]: 8

    Why: This suggestion removes an unnecessary dependency on NumPy for a simple task, which can improve performance and reduce the complexity of the codebase.

    8
    Improve the exception message in the configuration update method

    Use a more descriptive exception message in updateConfig method to include both the
    invalid key and the valid keys from the configuration for better debugging and user
    feedback.

    src/plotting/BasePlotter.py [41]

    -raise ValueError(f"Invalid config setting: {key}")
    +valid_keys = ', '.join([field.name for field in dataclasses.fields(self.config)])
    +raise ValueError(f"Invalid config setting: '{key}'. Valid settings are: {valid_keys}")
     
    • Apply this suggestion
    Suggestion importance[1-10]: 7

    Why: Enhancing the exception message improves debugging and user feedback, making it easier to identify and correct configuration issues.

    7
    Possible issue
    Initialize the figure object to a default empty plot

    Initialize self.fig in the _BasePlotter constructor to a default value that
    represents an empty plot, rather than None, to avoid potential issues with methods
    that might access self.fig before it's set.

    src/plotting/BasePlotter.py [34]

    -self.fig = None  # holds the figure object
    +self.fig = plt.figure()  # Initialize with an empty figure
     
    • Apply this suggestion
    Suggestion importance[1-10]: 6

    Why: Initializing self.fig to a default empty plot can prevent potential issues, but the specific implementation (e.g., using plt.figure()) might introduce a dependency on a plotting library that is not currently imported or mentioned in the code.

    6

    - warning to use a pages directory within an app using st.navigation
    - could lead to unwanted behaviour
    - removed fragment from file upload sections, since configure depends on it
    - check if fallback files are present and remove before adding new files
    - files directory created once
    - removed unneccessary mkdirs
    Copy link
    Member

    @t0mdavid-m t0mdavid-m left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    This a great clean up of the template app. Thank you!

    I added some minor comments. I think it would also make sense to merge #71 and update the documentation here, while we are at it.

    axelwalter and others added 3 commits July 24, 2024 07:34
    Co-authored-by: Tom David Müller <57191390+t0mdavid-m@users.noreply.github.com>
    Co-authored-by: Tom David Müller <57191390+t0mdavid-m@users.noreply.github.com>
    @axelwalter
    Copy link
    Collaborator Author

    @t0mdavid-m thanks for the review! Integrated your download section PR with minor changes and updated the documentation structure. Now there is a docs directory containing the contents of each page as markdown file (if possible) to be read independently of the app.

    @axelwalter axelwalter requested a review from t0mdavid-m July 24, 2024 06:08
    Copy link
    Member

    @t0mdavid-m t0mdavid-m left a comment

    Choose a reason for hiding this comment

    The reason will be displayed to describe this comment to others. Learn more.

    Thank you for the changes! I think putting the documentation in a dedicated folder is a great idea. I also like the changes you made to the download section. The sorting of results as "newest first" makes much more sense and I will also incorporate it into FLASHViewer.

    @axelwalter axelwalter merged commit 353e039 into OpenMS:main Jul 24, 2024
    JohannesvKL pushed a commit to JohannesvKL/PTMScanner that referenced this pull request Dec 19, 2024
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

    Labels

    documentation Improvements or additions to documentation enhancement New feature or request Review effort [1-5]: 4

    Projects

    None yet

    Development

    Successfully merging this pull request may close these issues.

    2 participants