unified-memory-management-spark-10000.pdf #2

@frostbyte134

https://issues.apache.org/jira/browse/SPARK-10000

Korean translation: https://medium.com/@leeyh0216/spark-internal-part-2-spark%EC%9D%98-%EB%A9%94%EB%AA%A8%EB%A6%AC-%EA%B4%80%EB%A6%AC-2-db1975b74d2f

Background reading before starting: https://dhkdn9192.github.io/apache-spark/spark_executor_memory_structure/#3-1-storage-memory

Proposal

boundary between execution and storage is now crossable

  • execution mem can borrow storage mem (and vice versa)
  • borrowed storage mem can be evicted at any time
  • borrowed execution mem won't be evicted (more critical perhaps)

https://spark.apache.org/docs/latest/configuration.html#memory-management

  • spark.memory.storageFraction
    • Amount of storage memory immune to eviction (?!)
    • higher -> less for execution mem -> more spill
    • spilling seems to happen on the execution side?

Eviction priority (Prefer to evict...)

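  • storage memory: depends on the storage level
    • MEMORY_ONLY: eviction drops the data entirely, so it must be recomputed - most expensive
    • MEMORY_AND_DISK_SER: cheapest, since nothing has to be re-serialized when moving between storage (disk) and memory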
  • execution memory
    • no risk of recomputation, since all data will be spilled to disk (?!)
    • spilled execution memory will always be read back into memory, while storage data may never be referenced again. in that case the eviction cost for execution is higher
  • No eviction preference

Impl complexity

storage eviction is simple: can reuse prev methods

execution eviction is relatively hard

  • register a spill callback (let each executor define what to do on spill?)
  • cooperatively poll and spill (have each executor keep polling?)
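The spill-callback option above can be sketched roughly as follows. This is a toy model for these notes, not Spark's implementation: `ToyExecutionPool`, `register`, `acquire`, and the consumer names are all hypothetical. Each consumer registers a callback that says how many bytes it managed to spill, and the pool calls the callbacks of other consumers when a request does not fit.

```python
from typing import Callable, Dict, List


class ToyExecutionPool:
    """Toy execution-memory pool with spill callbacks (hypothetical, not Spark's API)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.used: Dict[str, int] = {}
        # consumer name -> callback(bytes_wanted) returning bytes actually spilled
        self.spill_callbacks: Dict[str, Callable[[int], int]] = {}

    def register(self, name: str, on_spill: Callable[[int], int]) -> None:
        self.used[name] = 0
        self.spill_callbacks[name] = on_spill

    def free(self) -> int:
        return self.capacity - sum(self.used.values())

    def acquire(self, name: str, size: int) -> bool:
        # Ask the other consumers to spill until the request fits.
        for other, callback in self.spill_callbacks.items():
            if self.free() >= size:
                break
            if other == name:
                continue
            needed = size - self.free()
            # Never credit more than the consumer actually holds.
            freed = min(callback(needed), self.used[other])
            self.used[other] -= freed
        if self.free() < size:
            return False
        self.used[name] += size
        return True


spilled: List[int] = []


def sorter_spill(wanted: int) -> int:
    # Pretend the sorter wrote `wanted` bytes to disk.
    spilled.append(wanted)
    return wanted


pool = ToyExecutionPool(capacity=100)
pool.register("sorter", sorter_spill)
pool.register("aggregator", lambda wanted: 0)  # this consumer cannot spill

ok_sorter = pool.acquire("sorter", 80)
ok_aggregator = pool.acquire("aggregator", 50)  # forces the sorter to spill
```

In the polling variant each consumer would instead check periodically whether it has been asked to give memory back, which keeps the pool simpler but delays reclamation until the next poll.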

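most operators assume that a minimum amount of space is always available

  • this has to be guaranteed. not easy

also, what should happen to blocks that are waiting to be cached while execution memory is being evicted?

  1. wait until enough execution mem has been evicted -> possible deadlock (e.g. when the newly arriving block needs a huge amount of execution mem - is that right?)
  2. write the block to disk first, then wait until enough memory is freed. a buffer must be reserved in case every block ends up being written to disk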

evict cache data

those two approaches (execution / storage eviction) bring additional complexity

  • hence the evict cache data approach was introduced

motivations

  • provide more mem to execution (less pressure to execution mem)
  • simpler

dynamic minimum reservation

for storage, it can borrow from execution but may be evicted as soon as execution attempts to claim the space back

for execution, the same is true with one exception

  • when execution memory already uses all the storage space and an app tries to cache a block, we simply evict the new block instead of attempting to evict execution mem (complicated)
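A toy model of the rules above, written for these notes and not taken from Spark: `ToyUnifiedMemory`, `storage_reserved`, and the block ids are hypothetical names. Storage uses whatever is free, execution claims space back by evicting cached blocks (oldest first here, as a simplifying assumption), and a new block that does not fit is dropped rather than evicting execution memory.

```python
from typing import Dict


class ToyUnifiedMemory:
    """Toy model of one region shared by execution and storage (hypothetical)."""

    def __init__(self, total: int, storage_reserved: int = 0) -> None:
        self.total = total
        # Share of storage that execution is not allowed to evict (assumption).
        self.storage_reserved = storage_reserved
        self.execution_used = 0
        # block id -> size; dicts keep insertion order, so the oldest block is first.
        self.cached: Dict[str, int] = {}

    def storage_used(self) -> int:
        return sum(self.cached.values())

    def free(self) -> int:
        return self.total - self.execution_used - self.storage_used()

    def cache_block(self, block_id: str, size: int) -> bool:
        # Storage never evicts execution memory: if the new block does not
        # fit into free space, the new block itself is dropped.
        if size > self.free():
            return False
        self.cached[block_id] = size
        return True

    def acquire_execution(self, size: int) -> int:
        # Execution claims space back by evicting cached blocks, oldest first,
        # but stops before cutting into the protected storage share.
        while self.free() < size and self.cached:
            oldest_id = next(iter(self.cached))
            if self.storage_used() - self.cached[oldest_id] < self.storage_reserved:
                break
            del self.cached[oldest_id]
        granted = min(size, self.free())
        self.execution_used += granted
        return granted


mem = ToyUnifiedMemory(total=100, storage_reserved=20)
mem.cache_block("rdd_0_0", 40)
mem.cache_block("rdd_0_1", 40)
granted = mem.acquire_execution(70)        # evicts rdd_0_0, keeps rdd_0_1
cached_late = mem.cache_block("rdd_0_2", 10)  # no free space left, block is dropped
```

When the grant is smaller than the request, the caller would have to spill, which matches the note that execution spills only after evicting storage is not enough.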

the remaining (1 - spark.memory.fraction) share is for internal metadata and user data structures (neither execution nor storage)
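A rough worked example of how the two settings split the heap. The function name `memory_regions_mib` is hypothetical, and the defaults used here (0.6 for spark.memory.fraction, 0.5 for spark.memory.storageFraction, 300 MiB set aside before the split) are assumptions taken from the current Spark configuration page rather than from the design doc, so check them against the Spark version in use.

```python
from typing import Dict


def memory_regions_mib(
    heap_mib: float,
    memory_fraction: float = 0.6,   # assumed default of spark.memory.fraction
    storage_fraction: float = 0.5,  # assumed default of spark.memory.storageFraction
    reserved_mib: float = 300.0,    # assumed fixed reservation taken off the heap first
) -> Dict[str, float]:
    """Split an executor heap into the regions described in these notes."""
    usable = heap_mib - reserved_mib
    unified = usable * memory_fraction          # shared by execution and storage
    storage_immune = unified * storage_fraction  # storage that execution cannot evict
    return {
        "unified": unified,
        "storage_immune_to_eviction": storage_immune,
        "user_and_metadata": usable - unified,   # the (1 - fraction) share
    }


regions = memory_regions_mib(4096)
higher_storage = memory_regions_mib(4096, storage_fraction=0.8)
```

Raising `storage_fraction` leaves the unified size unchanged but grows the share that execution cannot reclaim, which is why a higher value tends to mean more spilling.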

proposed design (after 1.5?)

can roll back to the old behavior with spark.memory.useLegacyMode

  1. existing: restrict execution / storage to their own respective memory space
    no cross-region eviction / spilling. static configuration.
  2. evict cached blocks, full fluidity
    • one unified region
    • always evict cache blocks
    • execution mem spills only when there's not enough space after evicting storage memory
  3. evict cached blocks, static storage reservation: like 2, plus reserved storage mem (set by spark.memory.storageFraction, default 0.0)
  4. evict cached blocks, dynamic storage reservation: like 2, but
    • storage space is not statically reserved, but dynamically allocated.
    • execution can borrow when storage space is not used
