@timmyyao (Contributor) commented Jan 4, 2026

Purpose

Paimon table files are usually persisted on object storage such as S3 or OSS. The throughput and latency of such storage are often limited, so IO becomes the bottleneck in highly concurrent ETL or OLAP workloads. This PR therefore introduces a cache for the files that FileIO reads within a Paimon table. The approach is to support JindoCache, a distributed filesystem cache system, in JindoFileIO. A table-level cache policy is also implemented, which can be supplied by a REST server such as DLF.
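
To illustrate the idea of a table-level cache policy, here is a minimal sketch of a read-through cache that is only active for selected tables. It is not the JindoFileIO or JindoCache implementation: the class name, the `fake_object_store` function, the table names, and the paths are all hypothetical, and the cache here is a plain in-process dictionary rather than a distributed cache.

```python
from typing import Callable, Dict, Set


class TableCachePolicyReader:
    """Hypothetical read-through cache that is enabled per table."""

    def __init__(self, fetch: Callable[[str], bytes], cached_tables: Set[str]):
        self._fetch = fetch                  # stand-in for a slow object-storage read
        self._cached_tables = cached_tables  # tables whose policy enables caching
        self._cache: Dict[str, bytes] = {}   # path -> file contents
        self.remote_reads = 0                # how many reads reached object storage

    def read(self, table: str, path: str) -> bytes:
        # Policy off for this table: always go to object storage.
        if table not in self._cached_tables:
            self.remote_reads += 1
            return self._fetch(path)
        # Policy on: fetch once, then serve repeated reads from the cache.
        if path not in self._cache:
            self.remote_reads += 1
            self._cache[path] = self._fetch(path)
        return self._cache[path]


def fake_object_store(path: str) -> bytes:
    # Assumption: replaces a real S3/OSS read for the sake of the example.
    return f"contents of {path}".encode()


reader = TableCachePolicyReader(fake_object_store, cached_tables={"db.orders"})
reader.read("db.orders", "oss://bucket/db/orders/data-0.parquet")
reader.read("db.orders", "oss://bucket/db/orders/data-0.parquet")  # cache hit
reader.read("db.logs", "oss://bucket/db/logs/data-0.parquet")
reader.read("db.logs", "oss://bucket/db/logs/data-0.parquet")      # remote again
print(reader.remote_reads)
```

Repeated reads of a file in a cached table reach object storage only once, while tables without the policy always read remotely.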

@sd4324530 (Contributor) commented

@timmyyao
I'm not using DLF, but my data is still stored on OSS. Does this also apply?
For example:

CREATE CATALOG paimon_catalog WITH (
    'type' = 'paimon',
    'metastore' = 'hive',
    'warehouse' = 'oss://xxxx.oss-dls.aliyuncs.com/user/hive/warehouse'
);
