Creating an Azure ML forecast scenario
This topic describes how to create an Azure ML forecast scenario and all available options.
From the Data model home page, under the "Analytics" tile, click on "Azure ML". Here you can configure your Azure ML forecast scenario once you have met the Azure ML feature requirements and set up an Azure ML Data source connection. Once you create an Azure AutoML scenario, you can create a Procedure step to execute the scenarios under the Analytics Action group.
Creating a new Azure AutoML scenario
To create a new Azure AutoML forecast scenario, proceed as follows:
- Go to the Analytics tile of the desired Data model in the Data model area and then click on the Azure ML tile
- Click on "+FORECAST SCENARIO" in the upper left corner to bring up the configuration panel
- Enter the name of the Scenario in the "Scenario name" field
- Choose an existing Azure ML cloud Data source from the "Data source connection" drop-down list
- Configure the following options under the "DATA" menu:
- (Optional) SELECT. Here you can click on the "SELECT" button to apply a selection. In this case, the forecast analyzes the time series only related to members included in the selection
- Forecast horizon. Here you must enter the number of time periods you want to forecast. For example, if you are doing a monthly forecast, you can enter "1" to forecast only one future month
The type of time period depends on the time Entity in the Structure of the target Cube. The time period can only be Day or Month.
- Observed data. Here you must click on the "LAYOUT" button to configure the Layout that contains the historical data set. Here you should enter the Data Blocks that will be analyzed by the forecasting algorithms, which include the main observed data and the covariates
- Target Cube. Here you must choose the target Cube that will contain the result of the forecast scenario after it is run. The target Cube is usually the one that contains the main observed data
Except for the time Entity, all other Entities in the Structure of the target Cube are used as time series identifiers. While all other Data Blocks in the Layout are used as covariates.
- Observed time range. Here you can click on the "FROM" and "TO" buttons to choose the start and end of the observed time range. The default values for the start and end of the time range are "First loaded period" and "Last loaded period" respectively, which cover the values of the observed data by starting from the oldest to the newest.
- Validation set. Here you must choose one of the following validation options, which determines the set of historical data that will be used as a test to validate the forecast accuracy:
- Kfold cross validation. This type of validation uses the Kfold cross method, which divides the dataset into K equally sized subsets, or folds. The model is then trained and evaluated K times, each time using a different fold as the validation set and the remaining folds as the training set
- Validation percentage. This type of validation uses the indicated percentage size of the historical dataset
- Validation periods. This type of validation uses the indicated periods of the historical dataset
- Choose the preferred competition models that will be used for the training in the field under the "COMPETITION MODELS TRAINING" menu. You can also click on "All" to select all competition models
By selecting "All", Azure ML will also include any custom models created directly in your Azure ML Workspace.
The more models you include, the more time is required to complete the training process.
- Click on "SAVE CHANGES" to save the forecast scenario.
Managing and running Azure ML forecast scenarios
You can perform different actions on one or more existing Azure ML forecast scenarios by selecting them and clicking on the different buttons that appear above the forecast scenario list.
The available actions are described below:
- To delete one or more forecast scenarios, select them and click on the "DELETE" button
- To run a forecast scenario with the training phase, select the desired one and click on "RUN". This will execute the training process and first inference phase, which are done in the following macro steps:
- Connection to the indicated Azure ML workspace and Blob storage
- Serialization of the time series of the Board Layout and creation of the MLTable. The MLTable name will be in the following format: BoardScenarioName (i.e. BoardSalesForecast)
The MLTable is created in the following way:
- An MLTable column is created for each Entity that is in the Structure of the target Cube
- An MLTable column is created to contain the values of the target Cube
- An MLTable column is created for each covariate Data Block - Upload of the historical data, configured in the Board Layout, in the automatically created MLTable Data asset, in the "Data" area of the Azure ML Workspace
- Creation of the Azure ML training experiment and automated ML job, in the "Jobs" area of the Azure ML Workspace. The job name will be in the following format: BoardScenarioNameXP-AAAAMMDDHHMMSS (i.e. BoardSalesForecastXP-20230704100953)
- Execution of the training job by initiating the competition of the selected models based on the forecast scenario configuration
- Creation of an Azure ML Environment that contains the necessary dependencies
- Creation of an Azure ML Endpoint in the "Endpoints" area of your Azure ML Workspace, where the winning model is deployed. The Endpoint name will be in the following format: board-ScenarioName (i.e. board-SalesForecast)
- Execution of the first inference phase on the created Azure ML endpoint
The first inference phase is executed based on the forecast horizon configured in the forecast scenario.
- Retrieval of the inference phase result (prediction) and storage in the target Cube
The training phase in Azure is the process that requires the most time to complete. The time required heavily depends on the size of your historical dataset and the selected competition models.
In the case where you perform a retraining phase, all related resources in Azure ML will be deleted and recreated based on the new forecast scenario configuration.
A pop-up window that shows the progress of this process appears when you click on the "RUN" button. You can click on the "X" button to hide the pop-up window.
However, the whole process explained above is still seamlessly executed in the background and you can check its progress in the Running tasks area.
Once the process is finished, the user will receive a notification near the Top Menu.
You can also check each step of the training process by selecting the forecast scenario and clicking on the "TRAINING LOG" tab in the forecast scenario configuration panel. Click on "RELOAD LOG" to refresh the log details
Board automatically checks the progress of the training phase every 20 seconds in order to notify the user when it is completed.
- To run a forecast scenario without the training phase, select the desired one and click on "RUN INFERENCE ONLY". A pop-up window that shows the progress of this process will appear. You can click on the "X" button to hide the pop-up window, however, the whole process is still seamlessly executed in the background and you can check its progress in the Running tasks area. Once the process is finished, the user will receive a notification near the Top Menu.
The "RUN INFERENCE ONLY" button will be enabled only after the training process has been performed at least once.
Board data is serialized in the same way as in the training process. The MLTable is created in the following way:
- An MLTable column is created for each Entity that is in the Structure of the target Cube
- An MLTable column is for the target Cube
- An MLTable column is created for each covariate Data BlockNew time series cannot be used for inference calls. In case there are new time series, you must perform the training phase again, otherwise the result of the inference call will be zero.
To edit a forecast scenario, select it to bring up the scenario configuration panel and modify the desired settings explained in the paragraph above.
Related Articles:
Azure ML Data source connection