InfluxDBClient3.write_dataframe(df) reports success callback but no data is persisted in InfluxDB v3 #203

@0815IDIOT

Specifications

  • Client Version: influxdb3-python 0.18.0
  • InfluxDB Version: V3 Core
  • Platform: Unix

Hi guys, thank you for your amazing work!

I am experiencing an issue where writing data from a Pandas DataFrame via client.write_dataframe() in InfluxDB v3 appears to succeed (no exceptions, success callbacks are triggered), but no data is actually persisted in the database.
Interestingly, when I iterate through the DataFrame and write each row individually as a Point object, the data is written successfully.

Steps to Reproduce:

  • Create a Pandas DataFrame with a datetime64[ns] timestamp column.
  • Use InfluxDBClient3.write_dataframe().
  • Observe that the success callback is triggered.
  • Query the database: SELECT count(*) FROM returns 0.

Code sample to reproduce problem

from influxdb_client_3 import InfluxDBClient3, write_client_options, WriteOptions, InfluxDBError, WriteType, Point
import os
import pandas as pd
from dotenv import load_dotenv

load_dotenv()

influx_url = os.getenv("INFLUXDB_URL")
influx_token = os.getenv("INFLUXDB_TOKEN")
influx_db_name = os.getenv("INFLUXDB_DATABASE")

class BatchingCallback(object):

    def __init__(self):
        self.write_count = 0

    def success(self, conf, data: str):
        self.write_count += 1
        print(f"Written batch: {conf}, count: {self.write_count}")

    def error(self, conf, data: str, exception: InfluxDBError):
        print(f"Cannot write batch: {conf}, due: {exception}")

    def retry(self, conf, data: str, exception: InfluxDBError):
        print(f"Retryable error occurs for batch: {conf}, retry: {exception}")

callback = BatchingCallback()

write_options = WriteOptions(batch_size=5000,
    flush_interval=10_000,
    jitter_interval=2_000,
    retry_interval=5_000,
    max_retries=5,
    max_retry_delay=30_000,
    exponential_base=2)

wco = write_client_options(success_callback=callback.success,
    error_callback=callback.error,
    retry_callback=callback.retry,
    write_options=write_options
)

df = pd.DataFrame({
    'time': pd.to_datetime(['2024-01-01', '2024-01-02', '2024-01-03']),
    'trainer': ['Ash', 'Misty', 'Brock'],
    'pokemon_id': [25, 120, 74],
    'pokemon_name': ['Pikachu', 'Staryu', 'Geodude']
})

with InfluxDBClient3(host=influx_url, database=influx_db_name, token=influx_token, write_client_options=wco) as client:

    client.write_dataframe(
        df,
        measurement='caught',
        timestamp_column='time',
        tags=['trainer', 'pokemon_id']
    )

    """
    # working example 
    for _, row in df.iterrows():
        point = Point("caught").tag("trainer", row["trainer"]).tag("pokemon_id", row["pokemon_id"]).field("pokemon_name", row["pokemon_name"])
        client.write(point)
    """
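
One difference between the two code paths may be worth ruling out: the commented-out Point loop never calls .time(), so those points are stamped with the server's current time, while the DataFrame rows carry explicit January 2024 timestamps. The following is a minimal sketch, using only the standard library, of the line protocol one would expect for the three rows at nanosecond precision. The helper names (to_line, to_ns, escape_tag, escape_string_field) are hypothetical and are not part of influxdb3-python; the sketch assumes naive datetimes are UTC and handles string fields only. It does not show what the client actually sends.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def escape_tag(value):
    """Escape comma, equals sign and space, which are special in tag keys/values."""
    return str(value).replace(",", "\\,").replace("=", "\\=").replace(" ", "\\ ")


def escape_string_field(value):
    """Escape backslash and double quote inside a string field value."""
    return value.replace("\\", "\\\\").replace('"', '\\"')


def to_ns(ts):
    """Convert a datetime to integer nanoseconds since the Unix epoch.

    Assumption: naive datetimes are interpreted as UTC.
    """
    if ts.tzinfo is None:
        ts = ts.replace(tzinfo=timezone.utc)
    # Integer arithmetic avoids float rounding on large timestamps.
    return (ts - EPOCH) // timedelta(microseconds=1) * 1000


def to_line(measurement, tags, fields, ts):
    """Build one line-protocol line (string fields only, tags sorted by key)."""
    tag_part = "".join(
        f",{escape_tag(k)}={escape_tag(v)}" for k, v in sorted(tags.items())
    )
    field_part = ",".join(
        f'{escape_tag(k)}="{escape_string_field(v)}"' for k, v in fields.items()
    )
    return f"{measurement}{tag_part} {field_part} {to_ns(ts)}"


# Same rows as the DataFrame in the reproduction above.
rows = [
    (datetime(2024, 1, 1), "Ash", 25, "Pikachu"),
    (datetime(2024, 1, 2), "Misty", 120, "Staryu"),
    (datetime(2024, 1, 3), "Brock", 74, "Geodude"),
]

lines = [
    to_line("caught", {"trainer": t, "pokemon_id": pid}, {"pokemon_name": name}, ts)
    for ts, t, pid, name in rows
]

for line in lines:
    print(line)
```

Because the Point loop omits the timestamp, the two approaches do not write into the same time range, so a count that looks only at recent data would not be a like-for-like comparison.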

Expected behavior

Query the database: SELECT count(*) FROM should return a value > 0.

Actual behavior

Query the database: SELECT count(*) FROM returns 0.

Additional info

No response

Metadata

Labels

bug: Something isn't working
