gluonts.dataset.arrow package
Arrow Dataset
Fast and efficient datasets using pyarrow.
This module provides three file types:
- ArrowFile (Arrow random-access binary format)
- ArrowStreamFile (Arrow streaming binary format)
- ParquetFile
- class gluonts.dataset.arrow.ArrowFile(path: pathlib.Path, _start: int = 0, _take: Union[int, NoneType] = None)
- Bases: gluonts.dataset.arrow.file.File
 - property batch_offsets
 - path: pathlib.Path
 - reader: pyarrow.ipc.RecordBatchFileReader
 - property schema
 
- class gluonts.dataset.arrow.ArrowStreamFile(path: pathlib.Path, _start: int = 0, _take: Union[int, NoneType] = None)
- Bases: gluonts.dataset.arrow.file.File
 - path: pathlib.Path
 
- class gluonts.dataset.arrow.ArrowWriter(stream: bool = False, suffix: str = '.feather', compression: Union[typing_extensions.Literal['lz4'], typing_extensions.Literal['zstd'], NoneType] = None, flatten_arrays: bool = True, metadata: Union[dict, NoneType] = None)
- Bases: gluonts.dataset.DatasetWriter
 - compression: Optional[Union[typing_extensions.Literal['lz4'], typing_extensions.Literal['zstd']]] = None
 - flatten_arrays: bool = True
 - metadata: Optional[dict] = None
 - stream: bool = False
 - suffix: str = '.feather'
 - write_to_file(dataset: gluonts.dataset.Dataset, path: pathlib.Path) -> None
 - write_to_folder(dataset: gluonts.dataset.Dataset, folder: pathlib.Path, name: Optional[str] = None) -> None
 
- class gluonts.dataset.arrow.File
- Bases: object
 - SUFFIXES = {'.arrow', '.feather', '.parquet'}
 - static infer(path: pathlib.Path) -> Union[gluonts.dataset.arrow.file.ArrowFile, gluonts.dataset.arrow.file.ArrowStreamFile, gluonts.dataset.arrow.file.ParquetFile]
 - Return ArrowFile, ArrowStreamFile, or ParquetFile by inspecting the provided path.
 - Arrow's random-access format starts with the bytes ARROW1, so the beginning of the provided file is read to determine the format.
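
The format check described for `File.infer` can be illustrated with a small standard-library sketch. This is not the GluonTS implementation: `sniff_format` and its return labels are hypothetical names introduced here. The only facts relied on are that Arrow's random-access format begins with `ARROW1` and that Parquet files begin with `PAR1`. Treating every other file as an Arrow stream is an assumption made for the example.

```python
import tempfile
from pathlib import Path

ARROW_FILE_MAGIC = b"ARROW1"  # leading bytes of Arrow's random-access (file) format
PARQUET_MAGIC = b"PAR1"       # leading bytes of a Parquet file


def sniff_format(path: Path) -> str:
    """Hypothetical helper: guess a file's format from its first bytes."""
    with open(path, "rb") as in_file:
        peek = in_file.read(len(ARROW_FILE_MAGIC))

    if peek == ARROW_FILE_MAGIC:
        return "arrow-file"
    if peek.startswith(PARQUET_MAGIC):
        return "parquet"
    # Assumption for this sketch: anything else is an Arrow stream.
    return "arrow-stream"


if __name__ == "__main__":
    # Write tiny placeholder files that carry only the magic bytes;
    # they are not valid Arrow or Parquet data.
    with tempfile.TemporaryDirectory() as tmp:
        for name, head in [
            ("a.feather", b"ARROW1\x00\x00"),
            ("b.parquet", b"PAR1\x15\x04"),
            ("c.arrow", b"\xff\xff\xff\xff"),
        ]:
            file_path = Path(tmp) / name
            file_path.write_bytes(head)
            print(name, sniff_format(file_path))
```

Inspecting the content rather than the suffix matters because `.arrow` and `.feather` files may hold either the random-access or the streaming format, so the extension alone does not identify the reader to use.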
 
- class gluonts.dataset.arrow.ParquetFile(path: pathlib.Path, _start: int = 0, _take: Union[int, NoneType] = None, _row_group_sizes: List[int] = <factory>)
- Bases: gluonts.dataset.arrow.file.File
 - path: pathlib.Path
 - reader: pyarrow.parquet.core.ParquetFile
 
- class gluonts.dataset.arrow.ParquetWriter(suffix: str = '.parquet', flatten_arrays: bool = True, metadata: Union[dict, NoneType] = None)
- Bases: gluonts.dataset.DatasetWriter
 - flatten_arrays: bool = True
 - metadata: Optional[dict] = None
 - suffix: str = '.parquet'
 - write_to_file(dataset: gluonts.dataset.Dataset, path: pathlib.Path) -> None
 - write_to_folder(dataset: gluonts.dataset.Dataset, folder: pathlib.Path, name: Optional[str] = None) -> None