First, let's install the necessary libraries: Transformers, Datasets, Evaluate, Accelerate and GluonTS.
As we will show, GluonTS will be used to transform the data to create features, as well as to create appropriate training, validation and test batches.
!pip install -q transformers
!pip install -q datasets
!pip install -q evaluate
!pip install -q accelerate
!pip install -q gluonts ujson
In this blog post, we'll use the tourism_monthly dataset available on the Hugging Face Hub. This dataset contains monthly tourism volumes for 366 regions in Australia.
This dataset is part of the Monash Time Series Forecasting repository, a collection of time series datasets from a number of domains. It can be viewed as the GLUE benchmark of time series forecasting.
from datasets import load_dataset
dataset = load_dataset("monash_tsf", "tourism_monthly")
As can be seen, the dataset contains 3 splits: train, validation and test.
dataset
DatasetDict({
train: Dataset({
features: ['start', 'target', 'feat_static_cat', 'feat_dynamic_real', 'item_id'],
num_rows: 366
})
test: Dataset({
features: ['start', 'target', 'feat_static_cat', 'feat_dynamic_real', 'item_id'],
num_rows: 366
})
validation: Dataset({
features: ['start', 'target', 'feat_static_cat', 'feat_dynamic_real', 'item_id'],
num_rows: 366
})
})
Each example contains a few keys, of which start and target are the most important ones. Let's take a look at the first time series in the dataset:
train_example = dataset['train'][0]
train_example.keys()
dict_keys(['start', 'target', 'feat_static_cat', 'feat_dynamic_real', 'item_id'])
The start simply indicates the start of the time series (as a datetime), while the target contains the actual values of the time series.
The start will be useful to add time-related features to the time series values, as extra input to the model (such as "month of year"). Since we know the frequency of the data is monthly, we also know, for instance, that the second value has the timestamp 1979-02-01, and so on.
print(train_example['start'])
print(train_example['target'])
1979-01-01 00:00:00
[1149.8699951171875, 1053.8001708984375, ..., 5772.876953125]
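We can verify this timestamp arithmetic directly with pandas (a quick illustrative check, not part of the original pipeline):
import pandas as pd

# with a monthly frequency, the i-th value lives at start + i periods:
start = pd.Period("1979-01-01 00:00:00", freq="M")
print(start + 1)
>>> 1979-02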
The validation set contains the same data as the training set, just for a prediction_length longer amount of time. This allows us to validate the model's predictions against the ground truth.
The test set is again one prediction_length longer than the validation set (or some multiple of prediction_length longer than the training set, for testing on multiple rolling windows).
validation_example = dataset['validation'][0]
validation_example.keys()
dict_keys(['start', 'target', 'feat_static_cat', 'feat_dynamic_real', 'item_id'])
The initial values of the validation example are exactly the same as those of the corresponding training example:
print(validation_example['start'])
print(validation_example['target'])
1979-01-01 00:00:00
[1149.8699951171875, 1053.8001708984375, ..., 5985.830078125]
However, this example has prediction_length=24 additional values compared to the training example. Let's verify it.
freq = "1M"
prediction_length = 24
assert len(train_example["target"]) + prediction_length == len(
validation_example["target"]
)
Let's visualize this:
import matplotlib.pyplot as plt
figure, axes = plt.subplots()
axes.plot(train_example["target"], color="blue")
axes.plot(validation_example["target"], color="red", alpha=0.5)
plt.show()
The first thing we'll do is convert the start feature of each time series to a pandas Period index, using the data's freq:
from functools import lru_cache
import pandas as pd
import numpy as np
@lru_cache(10_000)
def convert_to_pandas_period(date, freq):
return pd.Period(date, freq)
def transform_start_field(batch, freq):
batch["start"] = [convert_to_pandas_period(date, freq) for date in batch["start"]]
return batch
We now use datasets' set_transform functionality to apply this on the fly (note that we first grab the train and test splits we'll be working with, which the snippet below needs):
from functools import partial

train_dataset = dataset["train"]
test_dataset = dataset["test"]

train_dataset.set_transform(partial(transform_start_field, freq=freq))
test_dataset.set_transform(partial(transform_start_field, freq=freq))
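A quick sanity check (illustrative) that the transform is indeed applied on access:
print(train_dataset[0]["start"])
>>> 1979-01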
Next, let's instantiate a model. The model will be trained from scratch, hence we won't use the from_pretrained method here, but rather randomly initialize the model from a config.
We specify a couple of additional parameters to the model:
Let's use the default lags provided by GluonTS for the given frequency ("monthly"):
from gluonts.time_feature import get_lags_for_frequency
lags_sequence = get_lags_for_frequency(freq)
print(lags_sequence)
>>> [1, 2, 3, 4, 5, 6, 7, 11, 12, 13, 23, 24, 25, 35, 36, 37]
This means that we'll look back up to 37 months for each time step, as additional features. Let's also check the default time features that GluonTS provides us with:
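As a side note, the largest lag also determines how far back the encoder needs to see. A quick illustrative calculation of the past window length we'll encounter in the batches later on (using the context_length of 2 * prediction_length that we'll configure below):
# context length (2 * 24 = 48) plus the largest lag (37) gives the length
# of the past_values that will be fed to the model:
print(prediction_length * 2 + max(lags_sequence))
>>> 85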
from gluonts.time_feature import time_features_from_frequency_str
time_features = time_features_from_frequency_str(freq)
print(time_features)
>>> [<function month_of_year at 0x7fa496d0ca70>]
In this case, there's only a single feature, namely "month of year". This means that for each time step, we'll add the month as a scalar value (e.g. 1 in case the timestamp is "january", 2 in case the timestamp is "february", etc.).
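We can inspect what this feature looks like on a few example timestamps (illustrative; note that GluonTS typically normalizes such features to a small numeric range rather than returning the raw month number 1 to 12):
index = pd.period_range(start="1979-01-01", periods=3, freq=freq)
for feature in time_features:
    print(feature(index))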
We now have everything we need to define the model:
from transformers import TimeSeriesTransformerConfig, TimeSeriesTransformerForPrediction
config = TimeSeriesTransformerConfig(
prediction_length=prediction_length,
# context length:
context_length=prediction_length * 2,
# lags coming from helper given the freq:
lags_sequence=lags_sequence,
# we'll add 2 time features ("month of year" and "age", see further):
num_time_features=len(time_features) + 1,
# we have a single static categorical feature, namely time series ID:
num_static_categorical_features=1,
# it has 366 possible values:
cardinality=[len(train_dataset)],
# the model will learn an embedding of size 2 for each of the 366 possible values:
embedding_dimension=[2],
# transformer params:
encoder_layers=4,
decoder_layers=4,
d_model=32,
)
model = TimeSeriesTransformerForPrediction(config)
Note that, similar to other models in the Transformers library, TimeSeriesTransformerModel corresponds to the encoder-decoder Transformer without any head on top, and TimeSeriesTransformerForPrediction corresponds to TimeSeriesTransformerModel with a distribution head on top. By default, the model uses a Student-t distribution (but this is configurable):
model.config.distribution_output
>>> student_t
This is an important difference, at the implementation level, with respect to Transformers for NLP, where the head typically consists of a fixed categorical distribution implemented as an nn.Linear layer.
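If a different head is desired, the config exposes a distribution_output parameter. A minimal sketch (assuming the values commonly accepted by this config in transformers, such as "normal" and "negative_binomial"):
config_normal = TimeSeriesTransformerConfig(
    prediction_length=prediction_length,
    context_length=prediction_length * 2,
    distribution_output="normal",  # instead of the default "student_t"
)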
Next, we define the transformations for the data, in particular the creation of the time features (based on the dataset or generic ones).
Again, we'll use the GluonTS library for this. We define a Chain of transformations (which is a bit comparable to torchvision.transforms.Compose for images). It allows us to combine several transformations into a single pipeline.
from gluonts.time_feature import (
time_features_from_frequency_str,
TimeFeature,
get_lags_for_frequency,
)
from gluonts.dataset.field_names import FieldName
from gluonts.transform import (
AddAgeFeature,
AddObservedValuesIndicator,
AddTimeFeatures,
AsNumpyArray,
Chain,
ExpectedNumInstanceSampler,
InstanceSplitter,
RemoveFields,
SelectFields,
SetField,
TestSplitSampler,
Transformation,
ValidationSplitSampler,
VstackFeatures,
RenameFields,
)
The transformations below are annotated with comments to explain what they do. At a high level, we will iterate over the individual time series of our dataset and add or remove certain fields or features:
from transformers import PretrainedConfig
def create_transformation(freq: str, config: PretrainedConfig) -> Transformation:
remove_field_names = []
if config.num_static_real_features == 0:
remove_field_names.append(FieldName.FEAT_STATIC_REAL)
if config.num_dynamic_real_features == 0:
remove_field_names.append(FieldName.FEAT_DYNAMIC_REAL)
if config.num_static_categorical_features == 0:
remove_field_names.append(FieldName.FEAT_STATIC_CAT)
# a bit like torchvision.transforms.Compose
return Chain(
# step 1: remove static/dynamic fields if not specified
[RemoveFields(field_names=remove_field_names)]
# step 2: convert the data to NumPy (potentially not needed)
+ (
[
AsNumpyArray(
field=FieldName.FEAT_STATIC_CAT,
expected_ndim=1,
dtype=int,
)
]
if config.num_static_categorical_features > 0
else []
)
+ (
[
AsNumpyArray(
field=FieldName.FEAT_STATIC_REAL,
expected_ndim=1,
)
]
if config.num_static_real_features > 0
else []
)
+ [
AsNumpyArray(
field=FieldName.TARGET,
# we expect an extra dim for the multivariate case:
expected_ndim=1 if config.input_size == 1 else 2,
),
# step 3: handle the NaN's by filling in the target with zero
# and return the mask (which is in the observed values)
# true for observed values, false for nan's
# the decoder uses this mask (no loss is incurred for unobserved values)
# see loss_weights inside the xxxForPrediction model
AddObservedValuesIndicator(
target_field=FieldName.TARGET,
output_field=FieldName.OBSERVED_VALUES,
),
# step 4: add temporal features based on freq of the dataset
# month of year in the case when freq="M"
# these serve as positional encodings
AddTimeFeatures(
start_field=FieldName.START,
target_field=FieldName.TARGET,
output_field=FieldName.FEAT_TIME,
time_features=time_features_from_frequency_str(freq),
pred_length=config.prediction_length,
),
# step 5: add another temporal feature (just a single number)
# tells the model where in its life the value of the time series is,
# sort of a running counter
AddAgeFeature(
target_field=FieldName.TARGET,
output_field=FieldName.FEAT_AGE,
pred_length=config.prediction_length,
log_scale=True,
),
# step 6: vertically stack all the temporal features into the key FEAT_TIME
VstackFeatures(
output_field=FieldName.FEAT_TIME,
input_fields=[FieldName.FEAT_TIME, FieldName.FEAT_AGE]
+ (
[FieldName.FEAT_DYNAMIC_REAL]
if config.num_dynamic_real_features > 0
else []
),
),
# step 7: rename to match HuggingFace names
RenameFields(
mapping={
FieldName.FEAT_STATIC_CAT: "static_categorical_features",
FieldName.FEAT_STATIC_REAL: "static_real_features",
FieldName.FEAT_TIME: "time_features",
FieldName.TARGET: "values",
FieldName.OBSERVED_VALUES: "observed_mask",
}
),
]
)
For training/validation/testing, we next create an InstanceSplitter which is used to sample windows from the dataset (as, remember, we can't pass the entire history of values to the Transformer due to time and memory constraints).
The instance splitter samples random windows of size context_length and a subsequent window of size prediction_length from the data, and appends a past_ or future_ prefix to any temporal keys for the respective windows. This makes sure that values will be split into a past_values key and a subsequent future_values key, which will serve as the encoder and decoder inputs respectively. The same happens for all keys in the time_series_fields argument:
from gluonts.transform.sampler import InstanceSampler
from typing import Optional
def create_instance_splitter(
config: PretrainedConfig,
mode: str,
train_sampler: Optional[InstanceSampler] = None,
validation_sampler: Optional[InstanceSampler] = None,
) -> Transformation:
assert mode in ["train", "validation", "test"]
instance_sampler = {
"train": train_sampler
or ExpectedNumInstanceSampler(
num_instances=1.0, min_future=config.prediction_length
),
"validation": validation_sampler
or ValidationSplitSampler(min_future=config.prediction_length),
"test": TestSplitSampler(),
}[mode]
return InstanceSplitter(
target_field="values",
is_pad_field=FieldName.IS_PAD,
start_field=FieldName.START,
forecast_start_field=FieldName.FORECAST_START,
instance_sampler=instance_sampler,
past_length=config.context_length + max(config.lags_sequence),
future_length=config.prediction_length,
time_series_fields=["time_features", "observed_mask"],
)
With the data in hand, the next step is to create the PyTorch DataLoaders, which allow us to have batches of (input, output) pairs, i.e. (past_values, future_values).
from typing import Iterable
import torch
from gluonts.itertools import Cached, Cyclic
from gluonts.dataset.loader import as_stacked_batches
def create_train_dataloader(
config: PretrainedConfig,
freq,
data,
batch_size: int,
num_batches_per_epoch: int,
shuffle_buffer_length: Optional[int] = None,
cache_data: bool = True,
**kwargs,
) -> Iterable:
PREDICTION_INPUT_NAMES = [
"past_time_features",
"past_values",
"past_observed_mask",
"future_time_features",
]
if config.num_static_categorical_features > 0:
PREDICTION_INPUT_NAMES.append("static_categorical_features")
if config.num_static_real_features > 0:
PREDICTION_INPUT_NAMES.append("static_real_features")
TRAINING_INPUT_NAMES = PREDICTION_INPUT_NAMES + [
"future_values",
"future_observed_mask",
]
transformation = create_transformation(freq, config)
transformed_data = transformation.apply(data, is_train=True)
if cache_data:
transformed_data = Cached(transformed_data)
# we initialize a Training instance
instance_splitter = create_instance_splitter(config, "train")
# the instance splitter will sample a window of
# context length + lags + prediction length (from the 366 possible transformed time series)
# randomly from within the target time series and return an iterator.
stream = Cyclic(transformed_data).stream()
training_instances = instance_splitter.apply(
stream, is_train=True
)
return as_stacked_batches(
training_instances,
batch_size=batch_size,
shuffle_buffer_length=shuffle_buffer_length,
field_names=TRAINING_INPUT_NAMES,
output_type=torch.tensor,
num_batches_per_epoch=num_batches_per_epoch,
)
def create_test_dataloader(
config: PretrainedConfig,
freq,
data,
batch_size: int,
**kwargs,
):
PREDICTION_INPUT_NAMES = [
"past_time_features",
"past_values",
"past_observed_mask",
"future_time_features",
]
if config.num_static_categorical_features > 0:
PREDICTION_INPUT_NAMES.append("static_categorical_features")
if config.num_static_real_features > 0:
PREDICTION_INPUT_NAMES.append("static_real_features")
transformation = create_transformation(freq, config)
transformed_data = transformation.apply(data, is_train=False)
# we create a Test Instance splitter which will sample the very last
# context window seen during training only for the encoder.
instance_sampler = create_instance_splitter(config, "test")
# we apply the transformations in test mode
testing_instances = instance_sampler.apply(transformed_data, is_train=False)
return as_stacked_batches(
testing_instances,
batch_size=batch_size,
output_type=torch.tensor,
field_names=PREDICTION_INPUT_NAMES,
)
train_dataloader = create_train_dataloader(
config=config,
freq=freq,
data=train_dataset,
batch_size=256,
num_batches_per_epoch=100,
)
test_dataloader = create_test_dataloader(
config=config,
freq=freq,
data=test_dataset,
batch_size=64,
)
Let's check the first batch:
batch = next(iter(train_dataloader))
for k, v in batch.items():
print(k, v.shape, v.type())
>>> past_time_features torch.Size([256, 85, 2]) torch.FloatTensor
past_values torch.Size([256, 85]) torch.FloatTensor
past_observed_mask torch.Size([256, 85]) torch.FloatTensor
future_time_features torch.Size([256, 24, 2]) torch.FloatTensor
static_categorical_features torch.Size([256, 1]) torch.LongTensor
future_values torch.Size([256, 24]) torch.FloatTensor
future_observed_mask torch.Size([256, 24]) torch.FloatTensor
As can be seen, we don't feed input_ids and attention_mask to the encoder (as would be the case for NLP models), but rather past_values, along with past_observed_mask, past_time_features, static_categorical_features and static_real_features.
The decoder inputs consist of future_values, future_observed_mask and future_time_features. The future_values can be seen as the equivalent of decoder_input_ids in NLP.
Let's perform a forward pass with the batch we just created:
# perform forward pass
outputs = model(
past_values=batch["past_values"],
past_time_features=batch["past_time_features"],
past_observed_mask=batch["past_observed_mask"],
static_categorical_features=batch["static_categorical_features"]
if config.num_static_categorical_features > 0
else None,
static_real_features=batch["static_real_features"]
if config.num_static_real_features > 0
else None,
future_values=batch["future_values"],
future_time_features=batch["future_time_features"],
future_observed_mask=batch["future_observed_mask"],
output_hidden_states=True,
)
print("Loss:", outputs.loss.item())
>>> Loss: 9.069628715515137
Note that the model is returning a loss. This is possible because the decoder automatically shifts the future_values one position to the right in order to obtain the labels. This allows computing a loss between the predicted values and the labels.
Also note that the decoder uses a causal mask to avoid looking into the future, as the values it needs to predict are in the future_values tensor.
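To make the shift-right idea concrete, here is a minimal, purely illustrative sketch; it is not the model's internal code (which additionally constructs lagged features), just the teacher-forcing principle:
import torch

future_values_toy = torch.tensor([[10.0, 11.0, 12.0]])  # (batch, prediction_length)
# the decoder input at step t is the ground-truth value at step t - 1;
# position 0 would come from the end of the past window (placeholder 0 here):
decoder_input_toy = torch.roll(future_values_toy, shifts=1, dims=1)
decoder_input_toy[:, 0] = 0.0
print(decoder_input_toy)
>>> tensor([[ 0., 10., 11.]])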
Time to train the model! We'll use a standard PyTorch training loop.
We will use the Accelerate library here, which automatically places the model, optimizer and dataloader on the appropriate device.
from accelerate import Accelerator
from torch.optim import AdamW
accelerator = Accelerator()
device = accelerator.device
model.to(device)
optimizer = AdamW(model.parameters(), lr=6e-4, betas=(0.9, 0.95), weight_decay=1e-1)
model, optimizer, train_dataloader = accelerator.prepare(
model,
optimizer,
train_dataloader,
)
model.train()
for epoch in range(40):
for idx, batch in enumerate(train_dataloader):
optimizer.zero_grad()
outputs = model(
static_categorical_features=batch["static_categorical_features"].to(device)
if config.num_static_categorical_features > 0
else None,
static_real_features=batch["static_real_features"].to(device)
if config.num_static_real_features > 0
else None,
past_time_features=batch["past_time_features"].to(device),
past_values=batch["past_values"].to(device),
future_time_features=batch["future_time_features"].to(device),
future_values=batch["future_values"].to(device),
past_observed_mask=batch["past_observed_mask"].to(device),
future_observed_mask=batch["future_observed_mask"].to(device),
)
loss = outputs.loss
# Backpropagation
accelerator.backward(loss)
optimizer.step()
if idx % 100 == 0:
print(loss.item())
At inference time, it's recommended to use the generate() method for autoregressive generation, similar to NLP models.
Forecasting involves getting data from the test instance sampler, which will sample the very last context_length sized window of values from each time series in the dataset, and pass it to the model. Note that we pass the future_time_features, which are known ahead of time, to the decoder.
The model will autoregressively sample a certain number of values from the predicted distribution and pass them back to the decoder to return the prediction outputs:
model.eval()
forecasts = []
for batch in test_dataloader:
outputs = model.generate(
static_categorical_features=batch["static_categorical_features"].to(device)
if config.num_static_categorical_features > 0
else None,
static_real_features=batch["static_real_features"].to(device)
if config.num_static_real_features > 0
else None,
past_time_features=batch["past_time_features"].to(device),
past_values=batch["past_values"].to(device),
future_time_features=batch["future_time_features"].to(device),
past_observed_mask=batch["past_observed_mask"].to(device),
)
forecasts.append(outputs.sequences.cpu().numpy())
The model outputs a tensor of shape (batch_size, number of samples, prediction length).
In this case, we get 100 possible values for the next 24 months, for each example in the batch (which is of size 64):
forecasts[0].shape
>>> (64, 100, 24)
We'll stack them vertically, to get forecasts for all time series in the test dataset:
forecasts = np.vstack(forecasts)
print(forecasts.shape)
>>> (366, 100, 24)
We can evaluate the resulting forecast with respect to the ground truth out-of-sample values present in the test set. Here we use the MASE and sMAPE metrics, which we compute for each time series in the dataset:
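For reference, these are the standard definitions of the two metrics (with $H$ the prediction length, $T$ the length of the training series, and $m$ the seasonal periodicity):
$$\mathrm{MASE} = \frac{\frac{1}{H}\sum_{t=T+1}^{T+H} |Y_t - \hat{Y}_t|}{\frac{1}{T-m}\sum_{t=m+1}^{T} |Y_t - Y_{t-m}|} \qquad \mathrm{sMAPE} = \frac{1}{H}\sum_{t=T+1}^{T+H} \frac{2\,|Y_t - \hat{Y}_t|}{|Y_t| + |\hat{Y}_t|}$$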
from evaluate import load
from gluonts.time_feature import get_seasonality
mase_metric = load("evaluate-metric/mase")
smape_metric = load("evaluate-metric/smape")
forecast_median = np.median(forecasts, 1)
mase_metrics = []
smape_metrics = []
for item_id, ts in enumerate(test_dataset):
training_data = ts["target"][:-prediction_length]
ground_truth = ts["target"][-prediction_length:]
mase = mase_metric.compute(
predictions=forecast_median[item_id],
references=np.array(ground_truth),
training=np.array(training_data),
periodicity=get_seasonality(freq))
mase_metrics.append(mase["mase"])
smape = smape_metric.compute(
predictions=forecast_median[item_id],
references=np.array(ground_truth),
)
smape_metrics.append(smape["smape"])
print(f"MASE: {np.mean(mase_metrics)}")
>>> MASE: 1.2564196892177717
print(f"sMAPE: {np.mean(smape_metrics)}")
>>> sMAPE: 0.1609541520852549
We can also plot the individual metrics of each time series in the dataset against each other, and observe that a handful of time series contribute a lot to the final test metrics:
plt.scatter(mase_metrics, smape_metrics, alpha=0.3)
plt.xlabel("MASE")
plt.ylabel("sMAPE")
plt.show()
To plot the prediction for any time series with respect to the ground truth test data, we define the following helper plot function:
import matplotlib.dates as mdates

def plot(ts_index):
    fig, ax = plt.subplots()

    index = pd.period_range(
        start=test_dataset[ts_index][FieldName.START],
        periods=len(test_dataset[ts_index][FieldName.TARGET]),
        freq=freq,
    ).to_timestamp()

    # Major ticks every half year, minor ticks every month
    ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
    ax.xaxis.set_minor_locator(mdates.MonthLocator())

    ax.plot(
        index[-2 * prediction_length :],
        test_dataset[ts_index]["target"][-2 * prediction_length :],
        label="actual",
    )

    plt.plot(
        index[-prediction_length:],
        np.median(forecasts[ts_index], axis=0),
        label="median",
    )

    plt.fill_between(
        index[-prediction_length:],
        forecasts[ts_index].mean(0) - forecasts[ts_index].std(axis=0),
        forecasts[ts_index].mean(0) + forecasts[ts_index].std(axis=0),
        alpha=0.3,
        interpolate=True,
        label="+/- 1-std",
    )
    plt.legend()
    plt.show()
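We can then plot the forecast for any series by its index (any value between 0 and 365 works), for example:
plot(334)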
As time series researchers know, there has been a lot of interest in applying Transformer-based models to the time series problem. The vanilla Transformer is just one of many attention-based models, so more models need to be added to the library.
At the moment, nothing is stopping us from modeling multivariate time series either, but for that one would need to instantiate the model with a multivariate distribution head. Diagonal independent distributions are currently supported, with other multivariate distributions to be added. Stay tuned for future blog posts with tutorials.
Finally, the NLP/CV domain has benefitted tremendously from large pre-trained models, while, as far as we are aware, this is not the case for the time series domain. Transformer-based models seem like the obvious choice for pursuing this avenue of research, and we cannot wait to see what researchers and practitioners will come up with!