DeathReaper0965 committed on
Commit
db6348e
·
verified ·
1 Parent(s): 301162b
Files changed (1)
  1. README.md +118 -114
README.md CHANGED
@@ -1,114 +1,118 @@
1
- ---
2
- license: apache-2.0
3
- ---
4
- # Cisco Time Series Model
5
- The Cisco Time Series Model is a foundation model trained to perform univariate zero-shot forecasting. Its core is a sequence of decoder-only transformer layers. It is heavily based on the [TimesFM2.0 model](https://huggingface.co/google/timesfm-2.0-500m-pytorch), with multiresolution modifications aimed at efficient use of long context. It expects a multiresolution context (x<sub>c</sub>, x<sub>f</sub>), where the resolution (i.e., space between data points) of x<sub>c</sub> is 60 times the resolution of x<sub>f</sub>. Both x<sub>c</sub> and x<sub>f</sub> can have length up to 512. The input contexts should be aligned “on the right,” e.g., if x<sub>f</sub> consists of the 512 minutes terminating at 11:00AM on November 11, then x<sub>c</sub> should consist of the 512 hours terminating at the same time. The output is a forecast of 128 points, which should be interpreted at the finer resolution; and corresponding quantiles for these points.
6
-
7
- For convenience, we provide utilities for preparing a multiresolution context from a single resolution context (with length up to 512 x 60 = 30,720) directly.
8
-
9
- ## Model Architecture and Training Details
10
- <figure>
11
- <img src="images/mr_model_architecture.png" alt="Multiresolution model architecture">
12
- <figcaption><em>Architecture diagram illustrating our novel additions of Resolution Embeddings and Special Token.</em></figcaption>
13
- </figure>
14
-
15
- Despite not conforming to the TimesFM architecture, the pre-training of the Cisco Time Series Model began from the weights of TimesFM. The dataset used for the additional training contains over 300B unique datapoints. Slightly more than 50% of the data is derived from metric time series data from internal deployments of the Splunk Observability Cloud, with about 35% at (1-hour, 1-minute) resolution, and the remaining 15% at (5-hour, 5-minute) resolution. Additional multiresolution data, comprising about 30% of the training set, was derived from the [GIFT-Eval](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain) pretraining corpus. Another 5% was derived from the [Chronos](https://huggingface.co/datasets/autogluon/chronos_datasets) dataset collection (less overlap with GIFT-Eval test). The final 15% is synthetic multiresolution data.
16
-
17
- **Note:** A PyTorch implementation of the model architecture can be found in our [GitHub repository](https://github.com/splunk/cisco-time-series-model). A more detailed technical report will be released on arXiv soon; you can also access it [here](https://github.com/splunk/cisco-time-series-model/blob/main/1.0-preview/technical_report/Cisco-Time-Series-Model-Techincal-Report.pdf).
18
-
19
- ### Example Visualization of Multiresolution Time Series Input to the Model
20
- <figure>
21
- <img src="images/multi_resolution_time_series_example.png" alt="Multiresolution time series example with padded 1-hour context">
22
- <figcaption><em>Multiresolution time series example with padded 1-hour context.</em></figcaption>
23
- </figure>
24
-
25
- ## Usage notes
26
- - If the input time series is missing some values, imputation via last value is recommended; if the time series is naturally sparse and this leads to excessive imputation (e.g., more than 30% of values are imputed), the model forecasts will deteriorate.
27
- - The model generally works better when more coarse resolution history is provided. Its performance may suffer on very short inputs.
28
- - The quantiles have not been calibrated or rigorously evaluated, e.g., we currently do not have evidence to support a claim along the lines of “the range from q=0.1 to q=0.9 contains the true value 80% of the time (under some mild conditions).”
29
-
30
- ## Checkpoint
31
- We currently provide one open checkpoint, [cisco-time-series-model-1.0-preview](https://huggingface.co/cisco-ai/cisco-time-series-model-1.0-preview).
32
-
33
- ## Minimal Installation Instructions
34
- Clone the repository:
35
- ```shell
36
- git clone https://github.com/splunk/cisco-time-series-model.git
37
- cd cisco-time-series-model
38
- pip install -r requirements.txt
39
- ```
40
-
41
- For more detailed instructions and virtual environment setup, please refer to the [GitHub repository](https://github.com/splunk/cisco-time-series-model).
42
-
43
- ## Example Usage
44
- ```python
45
- import torch
46
- import numpy as np
47
- from modeling import CiscoTsmMR, TimesFmHparams, TimesFmCheckpoint
48
-
49
- rng = np.random.default_rng(42)
50
-
51
- ## Sample data
52
- T = 512 * 60
53
- hours = (T + 59) // 60
54
- k = np.arange(hours, dtype=np.float32)
55
- h = (80 + 0.1 * k) * (1 + 0.25 * np.sin(2 * np.pi * k / 24))
56
- t = np.arange(T, dtype=np.float32)
57
-
58
- input_series = h[(t // 60).astype(int)] * (1 + 0.05 * np.sin(2 * np.pi * t / 30)) + rng.normal(0, 0.4, size=T)
59
-
60
- # Hyperparameters
61
- hparams = TimesFmHparams(
62
- num_layers=50,
63
- use_positional_embedding=False,
64
- backend="gpu" if torch.cuda.is_available() else "cpu",
65
- )
66
-
67
- ckpt = TimesFmCheckpoint(huggingface_repo_id="cisco-ai/cisco-time-series-model-1.0-preview")
68
-
69
- model = CiscoTsmMR(
70
- hparams=hparams,
71
- checkpoint=ckpt,
72
- use_resolution_embeddings=True,
73
- use_special_token=True,
74
- )
75
-
76
- # Model Inference
77
- forecast_preds = model.forecast(input_series, horizon_len=128)
78
-
79
- # Access forecast mean and quantiles of each series
80
- mean_forecast = forecast_preds[0]['mean'] # (128,)
81
- quantiles = forecast_preds[0]['quantiles'] # dict with keys as quantile levels (0.1, 0.2, ...., 0.9) and values as (128,) numpy arrays
82
-
83
- # You can also forecast multiple series at once
84
- T = 25_000
85
- hours = (T + 59) // 60
86
- k = np.arange(hours, dtype=np.float32)
87
- h = 120 / (1 + np.exp(-0.01 * (k - 300))) + 10 * np.cos(2 * np.pi * k / (24*7))
88
- t = np.arange(T, dtype=np.float32)
89
- input_series_2 = h[(t // 60).astype(int)] + 2 * np.sin(2 * np.pi * t / 60) + rng.normal(0, 0.5, size=T)
90
-
91
- multi_series_forecasts = model.forecast([input_series_1, input_series_2], horizon_len=128)
92
-
93
- # Long horizon forecasting is also supported and can be invoked as follows
94
- long_horizon_forecasts = model.forecast(input_series_1, horizon_len=240)
95
-
96
- ```
97
-
98
- <b>Authored by:</b>
99
- - Liang Gou \*
100
- - Archit Khare \*
101
- - Praneet Pabolu \*
102
- - Prachi Patel \*
103
- - Joseph Ross \*
104
- - Hercy Shen \*‡
105
- - Yuhan (Ellen) Song \*
106
- - Jingze Sun \*
107
- - Kristal Curtis
108
- - Vedant Dharnidharka
109
- - Abhinav Mathur
110
- - Hao Yang
111
-
112
- \* These authors contributed equally to the core development of this work, listed alphabetically by last name. <br>
113
- These authors contributed equally to supporting and extending this work, listed alphabetically by last name. <br>
114
- Hercy Shen contributed to this work while an intern at Splunk.<br>
1
+ ---
2
+ license: apache-2.0
3
+ datasets:
4
+ - Salesforce/GiftEvalPretrain
5
+ - autogluon/chronos_datasets
6
+ pipeline_tag: time-series-forecasting
7
+ ---
8
+ # Cisco Time Series Model
9
+ The Cisco Time Series Model is a foundation model trained to perform univariate zero-shot forecasting. Its core is a sequence of decoder-only transformer layers. It is heavily based on the [TimesFM2.0 model](https://huggingface.co/google/timesfm-2.0-500m-pytorch), with multiresolution modifications aimed at efficient use of long context. It expects a multiresolution context (x<sub>c</sub>, x<sub>f</sub>), where the resolution (i.e., space between data points) of x<sub>c</sub> is 60 times the resolution of x<sub>f</sub>. Both x<sub>c</sub> and x<sub>f</sub> can have length up to 512. The input contexts should be aligned “on the right,” e.g., if x<sub>f</sub> consists of the 512 minutes terminating at 11:00AM on November 11, then x<sub>c</sub> should consist of the 512 hours terminating at the same time. The output is a forecast of 128 points, which should be interpreted at the finer resolution; and corresponding quantiles for these points.
10
+
11
+ For convenience, we provide utilities for preparing a multiresolution context from a single resolution context (with length up to 512 x 60 = 30,720) directly.
12
+
13
+ ## Model Architecture and Training Details
14
+ <figure>
15
+ <img src="images/mr_model_architecture.png" alt="Multiresolution model architecture">
16
+ <figcaption><em>Architecture diagram illustrating our novel additions of Resolution Embeddings and Special Token.</em></figcaption>
17
+ </figure>
18
+
19
+ Despite not conforming to the TimesFM architecture, the pre-training of the Cisco Time Series Model began from the weights of TimesFM. The dataset used for the additional training contains over 300B unique datapoints. Slightly more than 50% of the data is derived from metric time series data from internal deployments of the Splunk Observability Cloud, with about 35% at (1-hour, 1-minute) resolution, and the remaining 15% at (5-hour, 5-minute) resolution. Additional multiresolution data, comprising about 30% of the training set, was derived from the [GIFT-Eval](https://huggingface.co/datasets/Salesforce/GiftEvalPretrain) pretraining corpus. Another 5% was derived from the [Chronos](https://huggingface.co/datasets/autogluon/chronos_datasets) dataset collection (less overlap with GIFT-Eval test). The final 15% is synthetic multiresolution data.
20
+
21
+ **Note:** A PyTorch implementation of the model architecture can be found in our [GitHub repository](https://github.com/splunk/cisco-time-series-model). A more detailed technical report will be released on arXiv soon; you can also access it [here](https://github.com/splunk/cisco-time-series-model/blob/main/1.0-preview/technical_report/Cisco-Time-Series-Model-Technical-Report.pdf).
22
+
23
+ ### Example Visualization of Multiresolution Time Series Input to the Model
24
+ <figure>
25
+ <img src="images/multi_resolution_time_series_example.png" alt="Multiresolution time series example with padded 1-hour context">
26
+ <figcaption><em>Multiresolution time series example with padded 1-hour context.</em></figcaption>
27
+ </figure>
28
+
29
+ ## Usage notes
30
+ - If the input time series is missing some values, imputation via last value is recommended; if the time series is naturally sparse and this leads to excessive imputation (e.g., more than 30% of values are imputed), the model forecasts will deteriorate.
31
+ - The model generally works better when more coarse resolution history is provided. Its performance may suffer on very short inputs.
32
+ - The quantiles have not been calibrated or rigorously evaluated, e.g., we currently do not have evidence to support a claim along the lines of “the range from q=0.1 to q=0.9 contains the true value 80% of the time (under some mild conditions).”
33
+
34
+ ## Checkpoint
35
+ We currently provide one open checkpoint, [cisco-time-series-model-1.0-preview](https://huggingface.co/cisco-ai/cisco-time-series-model-1.0-preview).
36
+
37
+ ## Minimal Installation Instructions
38
+ Clone the repository:
39
+ ```shell
40
+ git clone https://github.com/splunk/cisco-time-series-model.git
41
+ cd cisco-time-series-model
42
+ pip install -r requirements.txt
43
+ ```
44
+
45
+ For more detailed instructions and virtual environment setup, please refer to the [GitHub repository](https://github.com/splunk/cisco-time-series-model).
46
+
47
+ ## Example Usage
48
+ ```python
49
+ import torch
50
+ import numpy as np
51
+ from modeling import CiscoTsmMR, TimesFmHparams, TimesFmCheckpoint
52
+
53
+ rng = np.random.default_rng(42)
54
+
55
+ # Sample data
56
+ T = 512 * 60
57
+ hours = (T + 59) // 60
58
+ k = np.arange(hours, dtype=np.float32)
59
+ h = (80 + 0.1 * k) * (1 + 0.25 * np.sin(2 * np.pi * k / 24))
60
+ t = np.arange(T, dtype=np.float32)
61
+
62
+ input_series = h[(t // 60).astype(int)] * (1 + 0.05 * np.sin(2 * np.pi * t / 30)) + rng.normal(0, 0.4, size=T)
63
+
64
+ # Hyperparameters
65
+ hparams = TimesFmHparams(
66
+ num_layers=50,
67
+ use_positional_embedding=False,
68
+ backend="gpu" if torch.cuda.is_available() else "cpu",
69
+ )
70
+
71
+ ckpt = TimesFmCheckpoint(huggingface_repo_id="cisco-ai/cisco-time-series-model-1.0-preview")
72
+
73
+ model = CiscoTsmMR(
74
+ hparams=hparams,
75
+ checkpoint=ckpt,
76
+ use_resolution_embeddings=True,
77
+ use_special_token=True,
78
+ )
79
+
80
+ # Model Inference
81
+ forecast_preds = model.forecast(input_series, horizon_len=128)
82
+
83
+ # Access forecast mean and quantiles of each series
84
+ mean_forecast = forecast_preds[0]['mean'] # (128,)
85
+ quantiles = forecast_preds[0]['quantiles'] # dict with keys as quantile levels (0.1, 0.2, ...., 0.9) and values as (128,) numpy arrays
86
+
87
+ # You can also forecast multiple series at once
88
+ T = 25_000
89
+ hours = (T + 59) // 60
90
+ k = np.arange(hours, dtype=np.float32)
91
+ h = 120 / (1 + np.exp(-0.01 * (k - 300))) + 10 * np.cos(2 * np.pi * k / (24*7))
92
+ t = np.arange(T, dtype=np.float32)
93
+ input_series_2 = h[(t // 60).astype(int)] + 2 * np.sin(2 * np.pi * t / 60) + rng.normal(0, 0.5, size=T)
94
+
95
+ multi_series_forecasts = model.forecast([input_series, input_series_2], horizon_len=128)
96
+
97
+ # Long horizon forecasting is also supported and can be invoked as follows
98
+ long_horizon_forecasts = model.forecast(input_series, horizon_len=240)
99
+
100
+ ```
101
+
102
+ <b>Authored by:</b>
103
+ - Liang Gou \*
104
+ - Archit Khare \*
105
+ - Praneet Pabolu \*
106
+ - Prachi Patel \*
107
+ - Joseph Ross \*
108
+ - Hercy Shen \*‡
109
+ - Yuhan (Ellen) Song \*
110
+ - Jingze Sun \*
111
+ - Kristal Curtis †
112
+ - Vedant Dharnidharka
113
+ - Abhinav Mathur
114
+ - Hao Yang
115
+
116
+ \* These authors contributed equally to the core development of this work, listed alphabetically by last name. <br>
117
+ † These authors contributed equally to supporting and extending this work, listed alphabetically by last name. <br>
118
+ ‡ Hercy Shen contributed to this work while an intern at Splunk.<br>
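
Note on the README text in this diff: it describes a multiresolution context (x_c, x_f) aligned "on the right", with x_c at 60 times the spacing of x_f and each context capped at 512 points, and it recommends last-value imputation for missing values. The following is a minimal sketch of those two steps using only the Python standard library. It is not the repository's own utility: the helper names `forward_fill` and `build_multires_context` are hypothetical, and aggregating each 60-point block by its mean is an assumption, since the README does not say how coarse points are derived.

```python
MAX_LEN = 512  # maximum length of each context, per the README
RATIO = 60     # x_c spacing is 60x the x_f spacing, per the README


def forward_fill(series):
    """Last-value imputation: replace None with the most recent observed value."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None only before the first observation
    return filled


def build_multires_context(fine_series, ratio=RATIO, max_len=MAX_LEN):
    """Return (x_c, x_f), both ending at the last sample of fine_series.

    ASSUMPTION: each coarse point is the mean of `ratio` consecutive fine
    points; the actual utility may aggregate differently.
    """
    fine_series = list(fine_series)
    x_f = fine_series[-max_len:]
    # Drop the oldest samples so the final coarse block ends at the last sample.
    usable = len(fine_series) - len(fine_series) % ratio
    aligned = fine_series[len(fine_series) - usable:]
    blocks = [aligned[i:i + ratio] for i in range(0, usable, ratio)]
    x_c = [sum(block) / len(block) for block in blocks][-max_len:]
    return x_c, x_f


# Example: 200 one-minute samples with a short gap of missing values.
minutes = [float(i) for i in range(200)]
minutes[50:55] = [None] * 5
x_c, x_f = build_multires_context(forward_fill(minutes))
print(len(x_c), len(x_f))
```

Trimming from the left rather than the right keeps both contexts terminating at the same timestamp, which is the alignment the README asks for. With fewer than 60 fine points the coarse context comes back empty, and how the model pads such short inputs is not covered by this sketch.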