
Explore options for accelerating InMemoryTableScanExec #2391

@andygrove


What is the problem the feature request solves?

In queries that contain InMemoryTableScanExec, we have to insert a CometColumnarToRowExec before this operator and a CometSparkRowToColumnar after it, so the data is converted from columnar to row format and then back to columnar around the cached scan.

Here is an example plan from the test "SparkToColumnar over InMemoryTableScanExec":

*(1) CometColumnarToRow
+- !CometHashAggregate [key#304L, count#371L], Final, [key#304L], [count(1)]
   +- !CometHashAggregate [key#304L], Partial, [key#304L], [partial_count(1)]
      +- CometSparkRowToColumnar
         +- Scan In-memory table abc [key#304L]
               +- InMemoryRelation [key#304L, value#305L, (key + 1)#308L], StorageLevel(disk, memory, deserialized, 1 replicas)
                     +- *(2) CometColumnarToRow
                        +- CometProject [key#6L, value#7L, (key + 1)#10L], [id#0L AS key#6L, (id#0L % 8) AS value#7L, (id#0L + 1) AS (key + 1)#10L]
                           +- CometSparkRowToColumnar
                              +- *(1) Range (0, 1000, step=1, splits=5)
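For anyone who wants to look at this plan locally, here is a minimal sketch of a query with the same shape. It is inferred from the plan above, not copied from the Comet test suite, and the names `CachedScanRepro` and `cachedAggregate` are hypothetical. It uses only standard Spark APIs and assumes Comet is enabled through whatever session configuration you normally use; without Comet, the plan will show the vanilla Spark operators in place of the Comet ones.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper; the query shape is reconstructed from the plan above.
object CachedScanRepro {

  // Builds the cached table "abc" and returns an aggregate that reads from it.
  def cachedAggregate(spark: SparkSession): DataFrame = {
    val base = spark
      .range(0L, 1000L, 1L, 5) // Range (0, 1000, step=1, splits=5)
      .selectExpr("id AS key", "id % 8 AS value")
      .selectExpr("key", "value", "key + 1")

    base.createOrReplaceTempView("abc")
    // Caching is lazy: the InMemoryRelation is populated on first use.
    spark.catalog.cacheTable("abc")

    // Reads only `key` from the cache, then counts rows per key.
    spark.table("abc").groupBy("key").count()
  }

  def main(args: Array[String]): Unit = {
    // Assumption: Comet settings, if any, are supplied externally
    // (for example via spark-submit --conf); none are set here.
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("cached-scan-repro")
      .getOrCreate()
    try {
      val df = cachedAggregate(spark)
      df.collect() // materialise the cache
      df.explain() // print the physical plan for inspection
    } finally {
      spark.stop()
    }
  }
}
```

The part of the printed plan to look at is everything around the in-memory scan: any row/columnar transition directly above the scan or directly inside the InMemoryRelation is the overhead this issue is about.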

Describe the potential solution

No response

Additional context

No response

Labels: enhancement
