Integration of external library (Weka forecast tool)

Larocs · June 21, 2021, 3:50pm

Hello all,

I’m a master’s student working on an EMS project. Right now I’m working on a power forecasting feature. After some searching I found the WEKA machine learning software, which can help me do the forecast (Weka 3 - Data Mining with Open Source Machine Learning Software in Java). It has been tested and works properly, but now I need to integrate it into OpenEMS.

I know that there are some issues because it does not have the OSGi headers, but can someone guide me on how to do this process manually? I got the jar file and other information from: https://mvnrepository.com/artifact/nz.ac.waikato.cms.weka/timeseriesForecasting/1.1.27

Also, I saw some licence issues, since WEKA is licensed under the GPL, which might be incompatible with the OpenEMS EPL. Does anyone know about this issue and how I can deal with it?

Best regards,
Carlos

Larocs · June 22, 2021, 4:03pm

Hello all,

Here I add more information about my issue and my progress:

I have read this article that shows how to wrap libraries to OSGi bundles:

Therefore I created my .bnd file in my package and it looks like the following:


Bundle-Version: 1.1.27

-classpath: timeseriesForecasting1.1.27.jar

Bundle-SymbolicName: io.openems.edge.forecastermodel

ver: 1.1.27

-output: timeseriesForecasting-1.1.27.jar

Export-Package: *

After that, I tried to rebuild the project, but it doesn’t work: it reports a compilation/path problem and also a “circular dependency error”.

I also saw that in some cases it is necessary to modify or create a manifest file, but I don’t know how this is done. Does anyone know how to do this, or any other way to solve this problem?
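
As a quick check before wrapping, one could inspect whether a jar's manifest already carries OSGi headers. The following is only a minimal sketch using the standard `java.util.jar` API; the class name `OsgiHeaderCheck` and its methods are made up for illustration, and it assumes that a missing `Bundle-SymbolicName` header is a good enough sign that the jar is not yet an OSGi bundle.

```java
import java.io.IOException;
import java.util.jar.JarFile;
import java.util.jar.Manifest;

// Hypothetical helper, not part of OpenEMS, bnd or WEKA.
public class OsgiHeaderCheck {

    /** Returns true if the manifest declares a Bundle-SymbolicName header. */
    public static boolean hasOsgiHeaders(Manifest manifest) {
        if (manifest == null) {
            return false; // jar without META-INF/MANIFEST.MF
        }
        return manifest.getMainAttributes().getValue("Bundle-SymbolicName") != null;
    }

    /** Opens the jar at the given path and checks its manifest. */
    public static boolean isOsgiBundle(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return hasOsgiHeaders(jar.getManifest());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java OsgiHeaderCheck <path-to-jar>");
            return;
        }
        System.out.println(args[0] + " has OSGi headers: " + isOsgiBundle(args[0]));
    }
}
```

If this prints `false` for a jar, the jar needs wrapping (for example via a .bnd file) before OSGi can resolve it as a bundle.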

Thanks
Regards,
Carlos

stefan.feilmeier · June 23, 2021, 8:29pm

Hi Carlos,

I spent some time on this issue, but so far without much success. Mainly I am struggling to build a simple, self-contained example Java application that uses WEKA. I found the GitHub gist “Weka time series forecasting with pentaho's plugin and their Java API. Here's a full working example to get datetimes along with the forecasted data.”, but was not able to get it to work quickly.

Could you provide the code for such a simple maven project?

I see that timeseriesForecasting-1.1.27.jar has quite a long dependency tree, so we will probably have to build a wrapper like the one for InfluxDB (openems/influxdb-java.bnd at develop · OpenEMS/openems · GitHub).

Regards,
Stefan

Larocs · June 23, 2021, 9:37pm

Hi Stefan,

I’m using basically the same example, adapted to my data and with different parameters, such as the algorithm for the prediction. Maybe the problem is the path to the .arff file? (My arff file is in: User>wekafile>packages>sample-data)

Below is my code. Nonetheless, I suggest using the sample data that WEKA provides, downloadable from the package URL link (Waikato Environment for Knowledge Analysis (WEKA)), since mine has been modified and requires some other changes.

    package io.openems.edge.forecaster;

    import java.io.*;

    import java.util.List;
    import weka.core.Instances;
    import weka.classifiers.functions.SMOreg;
    import weka.classifiers.evaluation.NumericPrediction;
    import weka.classifiers.timeseries.WekaForecaster;
    //import weka.classifiers.timeseries.core.TSLagMaker;
     
    public class TimeSeriesForecaster {
     
      public static void main(String[] args) {
        try {
          // path to the sample data inside the timeseriesForecasting package directory
          String pathToWindData = weka.core.WekaPackageManager.PACKAGES_DIR.toString()
            + File.separator + "timeseriesForecasting" + File.separator + "sample-data"
            + File.separator + "wind.arff";
     
          // load the data
          Instances wind = new Instances(new BufferedReader(new FileReader(pathToWindData)));
     
          // new forecaster
          WekaForecaster forecaster = new WekaForecaster();
     
          // set the targets we want to forecast. This method calls
          // setFieldsToLag() on the lag maker object for us
          forecaster.setFieldsToForecast("capacity (kW)");
     
          // default underlying classifier is SMOreg (SVM)
          forecaster.setBaseForecaster(new SMOreg());
          
          forecaster.getTSLagMaker().setTimeStampField("Date"); // date time stamp
          forecaster.getTSLagMaker().setMinLag(1);
          forecaster.getTSLagMaker().setMaxLag(3); // monthly data
          
          // add a month of the year indicator field
          //forecaster.getTSLagMaker().setAddMonthOfYear(true);
     
          // add a quarter of the year indicator field
          //forecaster.getTSLagMaker().setAddQuarterOfYear(true);
     
          // build the model
          forecaster.buildForecaster(wind, System.out);
     
          // prime the forecaster with enough recent historical data
          // to cover up to the maximum lag.
          forecaster.primeForecaster(wind);
     
          // forecast for 2 days
          List<List<NumericPrediction>> forecast = forecaster.forecast(192, System.out);
     
          // output the predictions. Outer list is over the steps; inner list is over
          // the targets
          for (int i = 0; i < 192; i++) {
            List<NumericPrediction> predsAtStep = forecast.get(i);
            for (int j = 0; j < 1; j++) {
              NumericPrediction predForTarget = predsAtStep.get(j);
              System.out.print("" + predForTarget.predicted() + " ");
            }
            System.out.println();
          }
     
          // we can continue to use the trained forecaster for further forecasting
          // by priming with the most recent historical data (as it becomes available).
          // At some stage it becomes prudent to re-build the model using current
          // historical data.
     
        } catch (Exception ex) {
          ex.printStackTrace();
        }
      }
    }

Another issue that might be stopping it from running is the format of the .arff file. Below is a screenshot of what the header should look like:

Regards,
Carlos

stefan.feilmeier · July 1, 2021, 3:03pm

I again spent a few hours trying to integrate the required dependencies into OSGi one by one. This is a tedious task… maybe this work in progress helps you somehow. I don’t know when I will have time for this again.

Larocs · July 1, 2021, 5:54pm

Thanks Stefan,

I’ll check it, and I can do it if you can tell me exactly what the process is for integrating the required dependencies into OSGi. Basically, not knowing that has been the reason I haven’t done it. Is there a guide or tutorial for it?

stefan.feilmeier · July 2, 2021, 9:01am

What I did was:

1. Try to ‘Resolve’ EdgeApp.bndrun
2. Get missing java packages
3. Add java libraries via pom.xml
4. Add java libraries to bnd file
5. Add ‘Bundle Export’ to bnd file
6. Continue with 1…
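
To help with the export step in this loop, it can be useful to see which Java packages a library jar actually contains. The following is only a rough sketch using the standard `java.util.jar` API; the class `JarPackages` and its methods are hypothetical helpers, not part of bnd or OpenEMS, and the sketch simply skips anything under `META-INF/`.

```java
import java.io.IOException;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for inspecting a jar before writing its bnd file.
public class JarPackages {

    /** Derives sorted package names from entry names like "weka/core/Instances.class". */
    public static Set<String> packagesOf(Collection<String> entryNames) {
        Set<String> packages = new TreeSet<>();
        for (String name : entryNames) {
            int slash = name.lastIndexOf('/');
            // Only class files inside a package; ignore META-INF content.
            if (name.endsWith(".class") && slash > 0 && !name.startsWith("META-INF/")) {
                packages.add(name.substring(0, slash).replace('/', '.'));
            }
        }
        return packages;
    }

    /** Lists the packages found in the jar at the given path. */
    public static Set<String> packagesInJar(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            return packagesOf(jar.stream().map(JarEntry::getName).collect(Collectors.toList()));
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java JarPackages <path-to-jar>");
            return;
        }
        packagesInJar(args[0]).forEach(System.out::println);
    }
}
```

Comparing this list with the packages reported as missing by the resolver shows which jar each missing package comes from.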

I am not really an OSGi expert either. It proved to be a very good framework for the dynamic models and dependency injection in OpenEMS; but dependencies are always a problem. It might be best to ask the experts for the best approach: https://bnd.discourse.group/

For the actual deployment in your case, it might be best to run a separate Python executable or a web service (e.g. via a Docker container) that does the actual prediction.
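
On the Java side, calling such an external service could look roughly like the sketch below. It uses only the JDK's `java.net.http` client (Java 11+). Everything about the service itself is an assumption made for illustration: the class name `RemoteForecastClient`, the endpoint passed on the command line, and the response format of one predicted value per line.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.ArrayList;
import java.util.List;

// Hypothetical client; the service and its response format are assumptions.
public class RemoteForecastClient {

    /** Parses a plain-text body with one predicted value per line (assumed format). */
    public static List<Double> parseForecast(String body) {
        List<Double> values = new ArrayList<>();
        for (String line : body.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                values.add(Double.parseDouble(trimmed));
            }
        }
        return values;
    }

    /** Sends a GET request to the prediction service and parses its response. */
    public static List<Double> fetchForecast(URI endpoint) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(endpoint)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            throw new IOException("Forecast service returned HTTP " + response.statusCode());
        }
        return parseForecast(response.body());
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("usage: java RemoteForecastClient <service-url>");
            return;
        }
        fetchForecast(URI.create(args[0])).forEach(System.out::println);
    }
}
```

With this split, the machine learning library and its dependencies stay outside the OSGi container, so no bundle wrapping is needed for them.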

I have researched a lot in that direction before and there are just no really good native Java libraries for machine learning out there. Best bet might be Deeplearning4J, but even they struggle to support OSGi: Create OSGi bundles for ND4J by timothyjward · Pull Request #8083 · eclipse/deeplearning4j · GitHub