Cannot activate template controller

Hello,
I am trying to implement a simple controller using the bnd controller template (in Eclipse: File > New > Project > Bndtools > Bnd OSGi Project). The creation wizard in Eclipse creates an entire package with a new controller and config file in it. Comparing the created java files to already existing controllers it looks fuctional to me.
I then add my newly generated controller bundle to the EdgeApp.bndrun “runrequires” and “runbundles” sections.
Building the newly created package also works without errors (using ./gradlew build -x test and ./gradlew buildEdge or running the EdgeApp in Eclipse).

Now, when go to the Apache Felix Webconsole to configure and instantiate my bundle, I can see it in the list. The problem is, that when I try to create a new instance and activate it in the Webconsole the bundle never gets activated for some reason and the basic log prints:

[tManager] WARN  [nentmanager.OsgiValidateWorker] [_componentManager] Component(s) configured but not active: ctrlio.openems.edge.controller.valicontroller0

I tried to get more debug log output using an OSGi options file and enabling a lot of logging options with the runproperty osgi.debug=/path/to/osgi-debug.options which gives me the following output in after clicking “save” in the configuration window in the Webconsole:

2020-04-02 18:44:49,759 [re.Cycle] WARN  [ms.edge.core.cycle.CycleWorker] [_cycle] There are no Schedulers configured!
2020-04-02 18:44:50,443 [tManager] WARN  [nentmanager.OsgiValidateWorker] [_componentManager] Component(s) configured but not active: ctrlio.openems.edge.controller.valicontroller0
ModuleClassLoader[org.apache.felix.webconsole_4.3.4].getResource(Controller.io.openems.edge.controller.valicontroller.0893f2c5-ac56-4e3d-bc9d-ed2e2d2800c3)
BundleLoader[org.apache.felix.webconsole_4.3.4].findResource(Controller.io.openems.edge.controller.valicontroller.0893f2c5-ac56-4e3d-bc9d-ed2e2d2800c3)
ModuleClassLoader[org.apache.felix.webconsole_4.3.4].getResource(Controller.io.openems.edge.controller.valicontroller.0893f2c5-ac56-4e3d-bc9d-ed2e2d2800c3) failed.
getServiceReferences(org.osgi.service.cm.ManagedService, "null")
getServiceReferences(org.osgi.service.cm.ManagedServiceFactory, "null")
getServiceReferences(org.osgi.service.cm.ManagedService, "null")
getServiceReferences(org.osgi.service.cm.ManagedServiceFactory, "null")
2020-04-02 18:44:50,759 [re.Cycle] WARN  [ms.edge.core.cycle.CycleWorker] [_cycle] There are no Schedulers configured!

When trying to step through the constructor and activation function of my controller with the debugger in Eclipse, I saw that neither of them get executed and the whole OSGi bundle doesn’t seem to be loaded in the first place.

Does someone have any lead or idea what I am doing wrong here?

Cheers,
Valentin

Hello Valentin,
here is what I do, when this is happening to me. Please open EdgeApp.bndrun in the “Bnd Run File Editor” and switch the view to “Source” un the lower left corner. You should no see the raw file. Under

-runbundles: \

  1. check that your bundle uses the snapshot version:

io.openems.edge.your.bundle;version=snapshot,\

  1. If that does not help, try the latest version:
    io.openems.edge.your.bundle;version=latest,\

  2. If that does not help, take your line and put it between those two lines:
    io.openems.common;version=snapshot,\
    io.openems.edge.application;version=snapshot,\

  3. If that does not help concentrate on
    -runrequires: \

Move your bundle line, which is probably the last one and put it directly above this line:
bnd.identity;id='io.openems.edge.application',\

Don’t forget to append the ",\". I don’t know why, but this sometimes helps me getting a new bundle activated.

  1. If the above does not help please delete the complete directory c:\openems\config

Regards, Chris

1 Like

Also you can try to set a break/watchpoint directly on the class name. Because sometimes before the constructor is called the static variables are already initialized and this could be a problem, if you work with JNI libraries for example. In that case you won’t hit the breakpoint in the constructor.

Hello Chris,
I have tested steps 1,2 and 3 and finally by placing my bundle between common and edge.application in the EdgeApp.bndrun file and using version=latest like so

-runbundles: \
# ...
	io.openems.common;version='[1.0.0,1.0.1)',\
	io.openems.edge.controller.valicontroller;version=latest,\
	io.openems.edge.application;version='[1.0.0,1.0.1)',\
# ...

i got it to activate and now the run() method is finally being executed! (Given a scheduler is configured, ofcourse.)

Vielen Dank Chris!

P.S.: I’m curious what the root cause is here, any idea why this “hackery” is necessary to get the bundle to activate?

Hello Valentin,
nice to hear, that your bundle is working now!

It is unclear where it comes from. I had the same issue, but it only happens sporadically. Other Users do not have this problem. I personally think, that it comes from one of the last “runbundles” which throws an unhandled and/or unlogged exception. Could you please try to move the line
io.openems.edge.controller.valicontroller;version=latest,\
downwards some lines until the error happens again. Maybe by doing that, you could identify the bundle which stops the load and activation process. This would help us finding the problem.

Thanks, Chris

Hi Valentin,

to add to @c.lehne’s answer: it would be very helpful if you could provide the source code (e.g. as a fork of the Github repository) that shows this error. I could then try to clone it and see if I can reproduce and possibly fix the issue.

Regards,
Stefan

Hello Chris and Stefan,

first off, let me thank you for your quick and helpful support!

The repository I’m working on is not publicly available (e.g. GitHub), but currently only TU Vienna internally. That’s why I can’t provide the source directly, but I believe it should still be possible to reproduce this error since I tried to eliminate other variables by using a “fresh” repo clone.

As described in my first post in this topic I simply created a new controller with the creation wizard in Eclipse in a freshly cloned version of openems (commit eefcb965c). See Cannot activate template controller.
Now, I’ve tried Chris’ suggestion to move the “runbudles” line for my controller downwards in EdgeApp.bndrun and I was able to narrow it down to the line org.ops4j.pax.logging.pax-logging-service;version='[1.11.2,1.11.3)',\ (see https://github.com/OpenEMS/openems/blob/eefcb965c6308885ae97f9b57fd639e7a946af5f/io.openems.edge.application/EdgeApp.bndrun#L250). When my controller is referenced before pax-logging-service I can activate it without any apparent errors, but if my controller is referenced after the logging service, it doesn’t get activated.

Cheers,
Valentin

Hi Valentin, hi Stefan,

thanks for your effort, Valentin!
I can also reproduce the error by using the official git repository. I cleaned the openems configuration and reorganized the EdgeApp.bndrun runbundles sequence to this:


org.apache.felix.http.servlet-api;version=‘[1.1.2,1.1.3)’,
org.ops4j.pax.logging.pax-logging-api;version=‘[1.11.2,1.11.3)’,
org.ops4j.pax.logging.pax-logging-service;version=‘[1.11.2,1.11.3)’,
io.openems.edge.controller.debug.log;version=‘[1.0.0,1.0.1)’,
org.apache.felix.scr;version=‘[2.1.10,2.1.11)’,

Then I activated the DebugLog-Bundle in the osgi configuration web interface and put a breakpoint in the
DebugLog:activate() method. Afterwards I started the edge application again and I got the following exception:


2020-04-08 19:56:20,216 [d108040a] DEBUG [.osgi.framework.FrameworkEvent] FrameworkEvent ERROR
! org.osgi.framework.FrameworkEvent[source=org.apache.felix.scr_2.1.10 [125]]
org.osgi.framework.BundleException: Exception in org.apache.felix.scr.impl.Activator.start() of bundle org.apache.felix.scr.
at org.eclipse.osgi.internal.framework.BundleContextImpl.startActivator(BundleContextImpl.java:803)
at org.eclipse.osgi.internal.framework.BundleContextImpl.start(BundleContextImpl.java:732)
at org.eclipse.osgi.internal.framework.EquinoxBundle.startWorker0(EquinoxBundle.java:1005)
at org.eclipse.osgi.internal.framework.EquinoxBundle$EquinoxModule.startWorker(EquinoxBundle.java:357)
at org.eclipse.osgi.container.Module.doStart(Module.java:584)
at org.eclipse.osgi.container.Module.start(Module.java:452)

Caused by: java.lang.AbstractMethodError: org.ops4j.pax.logging.service.internal.PaxLoggingServiceImpl$1ManagedPaxLoggingService.getLogger(Ljava/lang/String;Ljava/lang/Class;)Lorg/osgi/service/log/Logger;
at org.apache.felix.scr.impl.logger.R7LogServiceLogger.(R7LogServiceLogger.java:35)

at org.apache.felix.scr.impl.config.ScrConfigurationImpl.start(ScrConfigurationImpl.java:118)
at org.apache.felix.scr.impl.Activator.start(Activator.java:100)
at org.eclipse.osgi.internal.framework.BundleContextImpl$3.run(BundleContextImpl.java:782)
at org.eclipse.osgi.internal.framework.BundleContextImpl$3.run(BundleContextImpl.java:1)
at java.security.AccessController.doPrivileged(Native Method)
at org.eclipse.osgi.internal.framework.BundleContextImpl.startActivator(BundleContextImpl.java:775)
… 12 more
2020-04-08 19:56:20,224 [d108040a] DEBUG [rg.osgi.framework.ServiceEvent] ServiceEvent REGISTERED - {org.apache.felix.webconsole.ConfigurationPrinter}={service.id=41, service.bundleid=126, service.scope=singleton}

Note also, that the breakpoint was not reached.

(*)
Then I moved the line org.apache.felix.scr;version='[2.1.10,2.1.11)',\
directly above the line org.apache.felix.http.servlet-api;version='[1.1.2,1.1.3)',\

Starting again, I do not see any exception. Also I still do not reach my breakpoint.

So we have three issues here:

  1. Why is the activation process being stopped when an exception is thrown?
  2. Why do we get an exception from the pax logging service?
  3. Why do we not see the exception in the logfile, when using configuration setting (*)?

@Stefan:
It seems that the Pax Logging Service is used inside org.apache.felix.scr. I have no experience with the felix activation process or with the Pax Logging Service. I was unable to debug org.apache.felix.scr. Do you have any helpful idea?

Thanks for digging into this problem. I like OSGi, but logging is really really annoying. I already spent too many hours trying to fix those issues…

I’ll try to find some time in the next days to look into this. Thanks again!

Update: I was able to reproduce the problem and created a stripped down, simple example that shows the same behaviour. Will continue from here…

I was able to fix the problem in https://github.com/OpenEMS/openems/pull/1104.

After I was able to reproduce the problem as shown by Christian, I googled for the exception and found a ticket in Apache Karaf (https://issues.apache.org/jira/browse/KARAF-6259?focusedCommentId=16843468&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16843468) which led to the solution: for OSGi v7 it is required to update PAX logging to v2.

I had seen that before, but eventually after a couple more hours at the PAX Github page (https://github.com/ops4j/org.ops4j.pax.logging) that the had renamed “pax-logging-service” to “pax-logging-log4j” with the update to v2. (“pax-logging-log4j1 was named pax-logging-service before version 2.0.0”). This is why I had never found a v2 for pax-logging-service.

Thanks for bringing this up and helping to find a solution!

1 Like