Very cool, thanks for sharing - I will try it out!
Does the DuckDB package easily convert the final data to in-memory NumPy arrays (e.g. to run an sklearn or keras model afterwards)?
It seems like there are some integrations with ML pipelines (see this pull request: https://github.com/duckdb/duckdb/pull/6295). I know there is a zero-copy conversion to the Arrow format, but I haven't seen any documentation regarding conversions to NumPy. Alternatively, Polars is another DataFrame library that offers zero-copy conversions to NumPy arrays, assuming there isn't any missing data. Hope that helps :)