Example: Multivariate forecast

This example shows how to use ATOM to work with a multivariate time series dataset with exogenous features.

Import the macroeconomic dataset from sktime.datasets. It is a small dataset of quarterly US macroeconomic indicators, covering 1959Q1 through 2009Q3.

Load the data

In [1]:

# Import packages
import numpy as np
from sktime.datasets import load_macroeconomic
from atom import ATOMForecaster

In [2]:

# Load the data
X = load_macroeconomic()

print(X)

          realgdp  realcons   realinv  realgovt  realdpi      cpi      m1  \
Period                                                                      
1959Q1   2710.349    1707.4   286.898   470.045   1886.9   28.980   139.7   
1959Q2   2778.801    1733.7   310.859   481.301   1919.7   29.150   141.7   
1959Q3   2775.488    1751.8   289.226   491.260   1916.4   29.350   140.5   
1959Q4   2785.204    1753.7   299.356   484.052   1931.3   29.370   140.0   
1960Q1   2847.699    1770.5   331.722   462.199   1955.5   29.540   139.6   
...           ...       ...       ...       ...      ...      ...     ...   
2008Q3  13324.600    9267.7  1990.693   991.551   9838.3  216.889  1474.7   
2008Q4  13141.920    9195.3  1857.661  1007.273   9920.4  212.174  1576.5   
2009Q1  12925.410    9209.2  1558.494   996.287   9926.4  212.671  1592.8   
2009Q2  12901.504    9189.0  1456.678  1023.528  10077.5  214.469  1653.6   
2009Q3  12990.341    9256.0  1486.398  1044.088  10040.6  216.385  1673.9   

        tbilrate  unemp      pop  infl  realint  
Period                                           
1959Q1      2.82    5.8  177.146  0.00     0.00  
1959Q2      3.08    5.1  177.830  2.34     0.74  
1959Q3      3.82    5.3  178.657  2.74     1.09  
1959Q4      4.33    5.6  179.386  0.27     4.06  
1960Q1      3.50    5.2  180.007  2.31     1.19  
...          ...    ...      ...   ...      ...  
2008Q3      1.17    6.0  305.270 -3.16     4.33  
2008Q4      0.12    6.9  305.952 -8.79     8.91  
2009Q1      0.22    8.1  306.547  0.94    -0.71  
2009Q2      0.18    9.2  307.226  3.37    -3.19  
2009Q3      0.12    9.6  308.013  3.56    -3.44  

[203 rows x 12 columns]

Analyze the data

In [3]:

# Select the last two columns (infl and realint) as the target columns
atom = ATOMForecaster(X, y=(-2, -1), verbose=2, random_state=1)

<< ================== ATOM ================== >>

Configuration ==================== >>
Algorithm task: Multivariate forecast.

Dataset stats ==================== >>
Shape: (203, 12)
Train set size: 163
 --> From: 1959Q1  To: 1999Q3
Test set size: 40
 --> From: 1999Q4  To: 2009Q3
-------------------------------------
Memory: 29.41 kB
Scaled: False
Outlier values: 9 (0.5%)
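
The negative positions passed to `y` count from the end of the column list, as with ordinary Python indexing. A minimal sketch with plain pandas, using the column names printed above (how ATOM resolves the positions internally is an assumption here):

```python
import pandas as pd

# Column names as printed by load_macroeconomic() above
columns = [
    "realgdp", "realcons", "realinv", "realgovt", "realdpi", "cpi",
    "m1", "tbilrate", "unemp", "pop", "infl", "realint",
]

# A one-row placeholder frame; only the column order matters for this sketch
X_demo = pd.DataFrame([[0.0] * len(columns)], columns=columns)

# Positions -2 and -1 point at the last two columns
targets = [X_demo.columns[i] for i in (-2, -1)]
features = [c for c in X_demo.columns if c not in targets]

print(targets)
print(len(features))
```

The remaining ten columns are the ones treated as exogenous features in this example.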

In [4]:

atom.dataset

Out[4]:

	realgdp	realcons	realinv	realgovt	realdpi	cpi	m1	tbilrate	unemp	pop	infl	realint
Period
1959Q1	2710.349	1707.4	286.898	470.045	1886.9	28.980	139.7	2.82	5.8	177.146	0.00	0.00
1959Q2	2778.801	1733.7	310.859	481.301	1919.7	29.150	141.7	3.08	5.1	177.830	2.34	0.74
1959Q3	2775.488	1751.8	289.226	491.260	1916.4	29.350	140.5	3.82	5.3	178.657	2.74	1.09
1959Q4	2785.204	1753.7	299.356	484.052	1931.3	29.370	140.0	4.33	5.6	179.386	0.27	4.06
1960Q1	2847.699	1770.5	331.722	462.199	1955.5	29.540	139.6	3.50	5.2	180.007	2.31	1.19
...	...	...	...	...	...	...	...	...	...	...	...	...
2008Q3	13324.600	9267.7	1990.693	991.551	9838.3	216.889	1474.7	1.17	6.0	305.270	-3.16	4.33
2008Q4	13141.920	9195.3	1857.661	1007.273	9920.4	212.174	1576.5	0.12	6.9	305.952	-8.79	8.91
2009Q1	12925.410	9209.2	1558.494	996.287	9926.4	212.671	1592.8	0.22	8.1	306.547	0.94	-0.71
2009Q2	12901.504	9189.0	1456.678	1023.528	10077.5	214.469	1653.6	0.18	9.2	307.226	3.37	-3.19
2009Q3	12990.341	9256.0	1486.398	1044.088	10040.6	216.385	1673.9	0.12	9.6	308.013	3.56	-3.44

203 rows × 12 columns

In [5]:

# Examine the targets
atom.plot_series()

In [6]:

atom.plot_decomposition()

Run the pipeline

In [7]:

# Exogenous features are transformed as usual, with atom's regular methods
atom.normalize()

Fitting Normalizer...
Normalizing features...

In [8]:

atom.warnings = True  # Temporarily turn on warnings

In [9]:

# Use the apply method to transform the target columns
atom.apply(np.sqrt, columns=atom.target)

Fitting FunctionTransformer...

C:\Users\Mavs\Documents\Python\ATOM\venv311\Lib\site-packages\pandas\core\internals\blocks.py:366: RuntimeWarning:

invalid value encountered in sqrt

In [10]:

# The RuntimeWarning above indicates that np.sqrt produced NaNs for negative values
atom.nans

Out[10]:

realgdp      0
realcons     0
realinv      0
realgovt     0
realdpi      0
cpi          0
m1           0
tbilrate     0
unemp        0
pop          0
infl         6
realint     52
dtype: int64
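
The NaN counts follow from taking the square root of negative numbers. A small NumPy-only sketch, using the last three untransformed `infl` values from the dataset printed earlier, reproduces both the warning and the NaNs:

```python
import warnings

import numpy as np

# infl for 2008Q3, 2008Q4 and 2009Q1, taken from the printed dataset
infl = np.array([-3.16, -8.79, 0.94])

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")  # make sure the warning is recorded
    transformed = np.sqrt(infl)

print(transformed)  # the two negative inputs become NaN
print([str(w.message) for w in caught])
print(int(np.isnan(transformed).sum()))  # number of NaNs created
```

Only the non-negative observation survives the transformation, which matches the pattern visible in the target columns.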

In [11]:

# And, indeed, we can see them in the target columns
atom.y

Out[11]:

	infl	realint
Period
1959Q1	0.000000	0.000000
1959Q2	1.529706	0.860233
1959Q3	1.655295	1.044031
1959Q4	0.519615	2.014944
1960Q1	1.519868	1.090871
...	...	...
2008Q3	NaN	2.080865
2008Q4	NaN	2.984962
2009Q1	0.969536	NaN
2009Q2	1.835756	NaN
2009Q3	1.886796	NaN

203 rows × 2 columns

In [12]:

# Impute the missing values created by the transformation
atom.impute(strat_num="bfill", columns=atom.target)

Fitting Imputer...
Imputing missing values...
 --> Imputing 6 missing values with bfill in column infl.
 --> Imputing 52 missing values with bfill in column realint.

In [13]:

atom.y

Out[13]:

	infl	realint
Period
1959Q1	0.000000	0.000000
1959Q2	1.529706	0.860233
1959Q3	1.655295	1.044031
1959Q4	0.519615	2.014944
1960Q1	1.519868	1.090871
...	...	...
2008Q3	0.969536	2.080865
2008Q4	0.969536	2.984962
2009Q1	0.969536	2.984962
2009Q2	1.835756	2.984962
2009Q3	1.886796	2.984962

203 rows × 2 columns
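
The imputed output shows two different behaviours: the NaNs in `infl` take the next valid value, while the trailing NaNs in `realint` have no later value and end up with the last valid one. A pandas-only sketch of that pattern follows; the forward-fill fallback is an assumption inferred from the output, not a statement about ATOM's internals.

```python
import numpy as np
import pandas as pd

# Last five target rows before imputation, copied from the output above
y_tail = pd.DataFrame(
    {
        "infl": [np.nan, np.nan, 0.969536, 1.835756, 1.886796],
        "realint": [2.080865, 2.984962, np.nan, np.nan, np.nan],
    },
    index=["2008Q3", "2008Q4", "2009Q1", "2009Q2", "2009Q3"],
)

# Backward fill replaces each NaN with the next valid value in the column
backfilled = y_tail.bfill()

# Trailing NaNs have no next value, so they survive a plain bfill
print(backfilled["realint"].isna().sum())

# Assumed fallback: forward fill whatever bfill could not reach
imputed = backfilled.ffill()
print(imputed)
```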

In [14]:

atom.run(["BATS", "MSTL"], n_trials=10, warnings=False)

Training ========================= >>
Models: BATS, MSTL
Metric: mape


Running hyperparameter tuning for BATS...
| trial | use_box_cox | use_trend | use_damped_trend | use_arma_errors |    mape | best_mape | time_trial | time_ht |    state |
| ----- | ----------- | --------- | ---------------- | --------------- | ------- | --------- | ---------- | ------- | -------- |
| 0     |       False |      True |             None |            True | -0.6224 |   -0.6224 |     5.472s |  5.472s | COMPLETE |
| 1     |        None |     False |             True |           False |  -0.638 |   -0.6224 |     0.159s |  5.631s | COMPLETE |
| 2     |        None |      True |            False |           False | -0.9856 |   -0.6224 |     0.322s |  5.954s | COMPLETE |
| 3     |       False |     False |            False |           False |  -0.638 |   -0.6224 |     0.158s |  6.112s | COMPLETE |
| 4     |        None |      True |            False |           False | -0.9856 |   -0.6224 |     0.001s |  6.113s | COMPLETE |
| 5     |       False |     False |            False |           False |  -0.638 |   -0.6224 |     0.002s |  6.115s | COMPLETE |
| 6     |        None |     False |            False |           False |  -0.638 |   -0.6224 |     0.157s |  6.272s | COMPLETE |
| 7     |       False |      True |             None |           False | -0.6224 |   -0.6224 |     1.245s |  7.517s | COMPLETE |
| 8     |        True |      True |             None |            True | -0.6224 |   -0.6224 |     5.364s | 12.881s | COMPLETE |
| 9     |        True |      None |             None |           False | -0.5939 |   -0.5939 |     1.417s | 14.299s | COMPLETE |
Hyperparameter tuning ---------------------------
Best trial --> 9
Best parameters:
 --> use_box_cox: True
 --> use_trend: None
 --> use_damped_trend: None
 --> use_arma_errors: False
Best evaluation --> mape: -0.5939
Time elapsed: 14.299s
Fit ---------------------------------------------
Train evaluation --> mape: -31036189868254.254
Test evaluation --> mape: -0.7341
Time elapsed: 1.168s
-------------------------------------------------
Time: 15.467s


Running hyperparameter tuning for MSTL...
| trial | seasonal_deg | trend_deg | low_pass_deg |  robust |    mape | best_mape | time_trial | time_ht |    state |
| ----- | ------------ | --------- | ------------ | ------- | ------- | --------- | ---------- | ------- | -------- |
| 0     |            1 |         1 |            0 |   False | -0.6357 |   -0.6357 |    20.777s | 20.777s | COMPLETE |
| 1     |            1 |         1 |            1 |   False | -0.6357 |   -0.6357 |     0.095s | 20.872s | COMPLETE |
| 2     |            1 |         1 |            1 |   False | -0.6357 |   -0.6357 |     0.001s | 20.873s | COMPLETE |
| 3     |            1 |         0 |            1 |   False | -0.6357 |   -0.6357 |     0.104s | 20.977s | COMPLETE |
| 4     |            0 |         0 |            1 |   False | -0.6357 |   -0.6357 |     0.119s | 21.096s | COMPLETE |
| 5     |            0 |         1 |            1 |    True | -0.6357 |   -0.6357 |     0.123s | 21.219s | COMPLETE |
| 6     |            0 |         1 |            1 |    True | -0.6357 |   -0.6357 |     0.003s | 21.222s | COMPLETE |
| 7     |            0 |         1 |            1 |    True | -0.6357 |   -0.6357 |     0.002s | 21.224s | COMPLETE |
| 8     |            1 |         0 |            0 |    True | -0.6357 |   -0.6357 |     0.107s | 21.331s | COMPLETE |
| 9     |            1 |         0 |            0 |    True | -0.6357 |   -0.6357 |     0.001s | 21.332s | COMPLETE |
Hyperparameter tuning ---------------------------
Best trial --> 0
Best parameters:
 --> stl_kwargs: {'seasonal_deg': 1, 'trend_deg': 1, 'low_pass_deg': 0, 'robust': False}
Best evaluation --> mape: -0.6357
Time elapsed: 21.332s
Fit ---------------------------------------------
Train evaluation --> mape: -40954763135468.12
Test evaluation --> mape: -0.7308
Time elapsed: 0.107s
-------------------------------------------------
Time: 21.439s


Final results ==================== >>
Total time: 37.028s
-------------------------------------
BATS --> mape: -0.7341
MSTL --> mape: -0.7308 !
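
The enormous train scores (on the order of 1e13) are most likely an artifact of MAPE dividing by the true values: the first training row has 0.0 for both targets. The sketch below is a plain-Python MAPE that floors the denominator at machine epsilon, which is how scikit-learn's `mean_absolute_percentage_error` avoids division by zero. Whether ATOM computes the metric exactly this way is an assumption, and the predictions are made up for illustration. ATOM's log also reports the score with a negative sign, which this sketch does not reproduce.

```python
import sys


def mape(y_true, y_pred, eps=sys.float_info.epsilon):
    """Mean absolute percentage error with a floor on the denominator."""
    terms = [abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)]
    return sum(terms) / len(terms)


# First four infl values after the square-root transform (from the output above)
y_true = [0.0, 1.529706, 1.655295, 0.519615]
# Hypothetical predictions, for illustration only
y_pred = [1.0, 1.5, 1.5, 0.5]

with_zero = mape(y_true, y_pred)          # includes the zero-valued observation
without_zero = mape(y_true[1:], y_pred[1:])  # same data without it

print(f"{with_zero:.3e}")
print(f"{without_zero:.4f}")
```

A single zero in the true values is enough to dominate the average, so the train scores say little about the quality of the fit.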

Analyze the results

In [15]:

atom.evaluate()

Out[15]:

	mae	mape	mse	r2	rmse
BATS	-0.553500	-0.734100	-0.505500	-0.012300	-0.693300
MSTL	-0.552600	-0.730800	-0.504800	-0.011100	-0.692900

In [16]:

with atom.canvas():
    atom.winner.plot_forecast(target=0)
    atom.winner.plot_forecast(target=1)