Commit ae85ac9

Add notes for timestamp types
1 parent 4d3a6ed commit ae85ac9

1 file changed

+48
-46
lines changed

mkdocs/docs/api.md

Lines changed: 48 additions & 46 deletions
@@ -2060,61 +2060,63 @@ import pyarrow as pa
20602060

20612061
#### PyIceberg to PyArrow type mapping
20622062

2063-
| PyIceberg type class | PyArrow type | Notes |
2064-
|---------------------------------|-------------------------------------|----------------------------------------|
2065-
| `BooleanType` | `pa.bool_()` | |
2066-
| `IntegerType` | `pa.int32()` | |
2067-
| `LongType` | `pa.int64()` | |
2068-
| `FloatType` | `pa.float32()` | |
2069-
| `DoubleType` | `pa.float64()` | |
2070-
| `DecimalType(p, s)` | `pa.decimal128(p, s)` | |
2071-
| `DateType` | `pa.date32()` | |
2072-
| `TimeType` | `pa.time64("us")` | |
2073-
| `TimestampType` | `pa.timestamp("us")` | |
2074-
| `TimestampNanoType` | `pa.timestamp("ns")` | |
2075-
| `TimestamptzType` | `pa.timestamp("us", tz="UTC")` | |
2076-
| `TimestamptzNanoType` | `pa.timestamp("ns", tz="UTC")` | |
2077-
| `StringType` | `pa.large_string()` | |
2078-
| `UUIDType` | `pa.uuid()` | |
2079-
| `BinaryType` | `pa.large_binary()` | |
2080-
| `FixedType(L)` | `pa.binary(L)` | |
2081-
| `StructType` | `pa.struct()` | |
2082-
| `ListType(e)` | `pa.large_list(e)` | |
2083-
| `MapType(k, v)` | `pa.map_(k, v)` | |
2084-
| `UnknownType` | `pa.null()` | |
2063+
| PyIceberg type class | PyArrow type |
2064+
|---------------------------------|-------------------------------------|
2065+
| `BooleanType` | `pa.bool_()` |
2066+
| `IntegerType` | `pa.int32()` |
2067+
| `LongType` | `pa.int64()` |
2068+
| `FloatType` | `pa.float32()` |
2069+
| `DoubleType` | `pa.float64()` |
2070+
| `DecimalType(p, s)` | `pa.decimal128(p, s)` |
2071+
| `DateType` | `pa.date32()` |
2072+
| `TimeType` | `pa.time64("us")` |
2073+
| `TimestampType` | `pa.timestamp("us")` |
2074+
| `TimestampNanoType` | `pa.timestamp("ns")` |
2075+
| `TimestamptzType` | `pa.timestamp("us", tz="UTC")` |
2076+
| `TimestamptzNanoType` | `pa.timestamp("ns", tz="UTC")` |
2077+
| `StringType` | `pa.large_string()` |
2078+
| `UUIDType` | `pa.uuid()` |
2079+
| `BinaryType` | `pa.large_binary()` |
2080+
| `FixedType(L)` | `pa.binary(L)` |
2081+
| `StructType` | `pa.struct()` |
2082+
| `ListType(e)` | `pa.large_list(e)` |
2083+
| `MapType(k, v)` | `pa.map_(k, v)` |
2084+
| `UnknownType` | `pa.null()` |
20852085

20862086
---
20872087

20882088
#### PyArrow to PyIceberg type mapping
20892089

2090-
| PyArrow type | PyIceberg type class | Notes |
2091-
|------------------------------------|-----------------------------|--------------------------------|
2092-
| `pa.bool_()` | `BooleanType` | |
2093-
| `pa.int32()` | `IntegerType` | |
2094-
| `pa.int64()` | `LongType` | |
2095-
| `pa.float32()` | `FloatType` | |
2096-
| `pa.float64()` | `DoubleType` | |
2097-
| `pa.decimal128(p, s)` | `DecimalType(p, s)` | |
2098-
| `pa.decimal256(p, s)` | Unsupported | |
2099-
| `pa.date32()` | `DateType` | |
2100-
| `pa.date64()` | Unsupported | |
2101-
| `pa.time64("us")` | `TimeType` | |
2102-
| `pa.timestamp("us")` | `TimestampType` | |
2103-
| `pa.timestamp("ns")` | `TimestampNanoType` | |
2104-
| `pa.timestamp("us", tz="UTC")` | `TimestamptzType` | |
2105-
| `pa.timestamp("ns", tz="UTC")` | `TimestamptzNanoType` | |
2106-
| `pa.string()` / `pa.large_string()`| `StringType` | |
2107-
| `pa.uuid()` | `UUIDType` | |
2108-
| `pa.binary()` / `pa.large_binary()`| `BinaryType` | |
2109-
| `pa.binary(L)` | `FixedType(L)` | Fixed-length byte arrays |
2110-
| `pa.struct([...])` | `StructType` | |
2111-
| `pa.list_(e)` / `pa.large_list(e)` | `ListType(e)` | |
2112-
| `pa.map_(k, v)` | `MapType(k, v)` | |
2113-
| `pa.null()` | `UnknownType` | |
2090+
| PyArrow type | PyIceberg type class |
2091+
|------------------------------------|-----------------------------|
2092+
| `pa.bool_()` | `BooleanType` |
2093+
| `pa.int32()` | `IntegerType` |
2094+
| `pa.int64()` | `LongType` |
2095+
| `pa.float32()` | `FloatType` |
2096+
| `pa.float64()` | `DoubleType` |
2097+
| `pa.decimal128(p, s)` | `DecimalType(p, s)` |
2098+
| `pa.decimal256(p, s)` | Unsupported |
2099+
| `pa.date32()` | `DateType` |
2100+
| `pa.date64()` | Unsupported |
2101+
| `pa.time64("us")` | `TimeType` |
2102+
| `pa.timestamp("us")` | `TimestampType` |
2103+
| `pa.timestamp("ns")` | `TimestampNanoType` |
2104+
| `pa.timestamp("us", tz="UTC")` | `TimestamptzType` |
2105+
| `pa.timestamp("ns", tz="UTC")` | `TimestamptzNanoType` |
2106+
| `pa.string()` / `pa.large_string()`| `StringType` |
2107+
| `pa.uuid()` | `UUIDType` |
2108+
| `pa.binary()` / `pa.large_binary()`| `BinaryType` |
2109+
| `pa.binary(L)` | `FixedType(L)` |
2110+
| `pa.struct([...])` | `StructType` |
2111+
| `pa.list_(e)` / `pa.large_list(e)` | `ListType(e)` |
2112+
| `pa.map_(k, v)` | `MapType(k, v)` |
2113+
| `pa.null()` | `UnknownType` |
21142114

21152115
---
21162116

21172117
***Notes***
21182118

21192119
- PyIceberg `GeometryType` and `GeographyType` types are mapped to a GeoArrow WKB extension type.
21202120
Otherwise, falls back to `pa.large_binary()` which stores WKB bytes.
2121+
- For timestamp types (`TimestampNanoType`, `TimestamptzType`, `TimestamptzNanoType`), writing tables in format version 3 (the version that supports the `ns` unit) is not yet implemented
2122+
(see [GitHub issue](https://github.com/apache/iceberg-python/issues/1551)). For timezones, only `UTC` and its aliases are supported.