Data interface
=========================

Last updated: 05/19/2025 (API docstrings are auto-generated).

DataProto is the interface for data exchange.

The `verl.DataProto` class contains two key members:

- batch: a `tensordict.TensorDict` object that holds the actual data
- meta_info: a `Dict` that carries additional meta information

TensorDict
~~~~~~~~~~~~

`DataProto.batch` is built on top of `tensordict`, a project in the PyTorch ecosystem.
A TensorDict is a dictionary-like container for tensors. To instantiate a TensorDict, you must specify the key-value pairs as well as the batch size.

.. code-block:: python

    >>> import torch
    >>> from tensordict import TensorDict
    >>> tensordict = TensorDict({"zeros": torch.zeros(2, 3, 4), "ones": torch.ones(2, 3, 5)}, batch_size=[2,])
    >>> tensordict["twos"] = 2 * torch.ones(2, 5, 6)
    >>> zeros = tensordict["zeros"]
    >>> tensordict
    TensorDict(
    fields={
        ones: Tensor(shape=torch.Size([2, 3, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        twos: Tensor(shape=torch.Size([2, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
        zeros: Tensor(shape=torch.Size([2, 3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([2]),
    device=None,
    is_shared=False)

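You can also index a tensordict along its batch dimensions. The contents of a TensorDict can also be manipulated collectively, for example moved to a device or reshaped in one call.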

.. code-block:: python

    >>> tensordict[..., :1]
    TensorDict(
    fields={
        ones: Tensor(shape=torch.Size([1, 3, 5]), device=cpu, dtype=torch.float32, is_shared=False),
        twos: Tensor(shape=torch.Size([1, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
        zeros: Tensor(shape=torch.Size([1, 3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
    batch_size=torch.Size([1]),
    device=None,
    is_shared=False)
    >>> tensordict = tensordict.to("cuda:0")  # requires a CUDA device
    >>> tensordict = tensordict.reshape(1, 2)  # reshape acts on the batch dimensions, which hold 2 elements here

For more information about using `tensordict.TensorDict`, see the official tensordict_ documentation.

.. _tensordict: https://pytorch.org/tensordict/overview.html


Core APIs
~~~~~~~~~~~~~~~~~

.. autoclass::  verl.DataProto
   :members: to, select, union, make_iterator, concat