Data Interface
Last updated: 05/19/2025 (auto-generated from API docstrings).
DataProto is the interface for data exchange.
The verl.DataProto class contains two key members:
batch: a tensordict.TensorDict object that holds the actual data
meta_info: a Dict that holds additional meta information
TensorDict
DataProto.batch is built on top of tensordict, a project in the PyTorch ecosystem. A TensorDict is a dictionary-like container for tensors. To instantiate a TensorDict, you must specify key-value pairs as well as a batch size.
>>> import torch
>>> from tensordict import TensorDict
>>> tensordict = TensorDict({"zeros": torch.zeros(2, 3, 4), "ones": torch.ones(2, 3, 5)}, batch_size=[2,])
>>> tensordict["twos"] = 2 * torch.ones(2, 5, 6)
>>> zeros = tensordict["zeros"]
>>> tensordict
TensorDict(
fields={
ones: Tensor(shape=torch.Size([2, 3, 5]), device=cpu, dtype=torch.float32, is_shared=False),
twos: Tensor(shape=torch.Size([2, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
zeros: Tensor(shape=torch.Size([2, 3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
batch_size=torch.Size([2]),
device=None,
is_shared=False)
A tensordict can also be indexed along its batch_size, and the contents of a TensorDict can be manipulated collectively.
>>> tensordict[..., :1]
TensorDict(
fields={
ones: Tensor(shape=torch.Size([1, 3, 5]), device=cpu, dtype=torch.float32, is_shared=False),
twos: Tensor(shape=torch.Size([1, 5, 6]), device=cpu, dtype=torch.float32, is_shared=False),
zeros: Tensor(shape=torch.Size([1, 3, 4]), device=cpu, dtype=torch.float32, is_shared=False)},
batch_size=torch.Size([1]),
device=None,
is_shared=False)
>>> tensordict = tensordict.to("cuda:0")
>>> tensordict = tensordict.reshape(6)
For more information on tensordict.TensorDict usage, see the official tensordict documentation.
Core APIs
- class verl.DataProto(batch: ~tensordict._td.TensorDict = None, non_tensor_batch: dict = <factory>, meta_info: dict = <factory>)[source]
A DataProto is a data structure that provides a standard protocol for data exchange between functions. It contains a batch (TensorDict) and a meta_info (Dict). The batch is a TensorDict (https://pytorch.org/tensordict/), which lets you manipulate a dictionary of tensors like a single tensor. Ideally, tensors with the same batch size should be put inside batch.
- static concat(data: list[DataProto]) DataProto[source]
Concat a list of DataProto. The batches are concatenated along dim=0. The meta_info is merged, with special handling for metrics from different workers.
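The per-key semantics of the dim=0 concatenation can be sketched with plain torch (hypothetical data; plain dicts stand in for two workers' batches, so verl is not required):

```python
import torch

# Two batches with the same keys but different leading (batch) dimensions.
a = {"obs": torch.zeros(2, 3)}
b = {"obs": torch.ones(4, 3)}

# DataProto.concat joins along dim=0; per tensor key this is torch.cat.
merged = {k: torch.cat([a[k], b[k]], dim=0) for k in a}
print(merged["obs"].shape)  # torch.Size([6, 3])
```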
- make_iterator(mini_batch_size, epochs, seed=None, dataloader_kwargs=None)[source]
Make an iterator from the DataProto. This builds on the fact that a TensorDict can be used as a normal PyTorch dataset. See https://pytorch.org/tensordict/tutorials/data_fashion for more details.
- Parameters:
mini_batch_size (int) – mini-batch size when iterating the dataset. We require that batch.batch_size[0] % mini_batch_size == 0.
epochs (int) – number of epochs when iterating the dataset.
dataloader_kwargs (Any) – internally, this returns a DataLoader over the batch. The dataloader_kwargs are the kwargs passed to that DataLoader.
- Returns:
an iterator that yields one mini-batch at a time. The total number of iteration steps is
self.batch.batch_size[0] * epochs // mini_batch_size
- Return type:
Iterator
- select(batch_keys=None, non_tensor_batch_keys=None, meta_info_keys=None, deepcopy=False) DataProto[source]
Select a subset of the DataProto via batch_keys and meta_info_keys
- Parameters:
batch_keys (list, optional) – a list of strings indicating the keys in batch to select
non_tensor_batch_keys (list, optional) – a list of strings indicating the keys in non_tensor_batch to select
meta_info_keys (list, optional) – a list of keys indicating the meta info to select
deepcopy (bool, optional) – whether to deepcopy the selected data; defaults to False
- Returns:
the DataProto with the selected batch_keys and meta_info_keys
- Return type:
DataProto
- to(device) DataProto[source]
Move the batch to the given device.
- Parameters:
device (torch.device, str) – torch device
- Returns:
the current DataProto
- Return type:
DataProto