Here is some additional information regarding the CAN bundle:
There was a change in the CAN socket structure between Linux kernel V4 and V5: the C structure changed in both size and meaning. Therefore we were forced to compile different JNI libraries for the two kernel versions. Luckily, OpenEMS only loads a JNI library when a component is activated, so the easiest way to handle the different JNI libraries was to have two different OpenEMS bundles. It may be possible to use only one JNI library, but compiling one library against two different sets of kernel header files looks really complicated to me.
General thought: I am unsure whether a JNI library is needed at all. It is called the CAN Socket API, so it should be doable to control the CAN interface directly from Java sockets (but there is still the problem of different CAN Socket APIs on different systems).
This library provides mechanisms to write CAN messages directly to the CAN bus. Due to the concept of OpenEMS, you may reach CAN cycle times down to 500 ms with that. But note that due to Java garbage collection you may see a jitter of several hundred ms. Because in one case we were forced to have a cycle time of 100 ms with almost no jitter, we introduced the concept of a JNI low-level CAN send task. With that, your device-specific OpenEMS CAN driver bundle decides whether CAN frames are sent directly from Java or from a low-level C thread, in which case your OpenEMS driver bundle just modifies the low-level data. That means you can have jitter in your data preparation on the Java side, but the CAN frames are always sent approx. every 100 ms.
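To illustrate the split between data preparation and periodic sending, here is a minimal sketch in plain Java. It is not the actual implementation: the real send task runs in a C thread behind JNI, and a Java scheduler like the one below would still be exposed to garbage collection pauses. The class name `CyclicFrame` and its methods are hypothetical, and the `Consumer<byte[]>` stands in for whatever actually writes the frame to the bus.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Consumer;

/** Hypothetical sketch: one frame that is re-sent periodically with the latest payload. */
final class CyclicFrame {

    // Latest payload prepared by the driver; swapped atomically so the sender never sees a half-written update.
    private final AtomicReference<byte[]> payload;
    // Stand-in for the code that writes a frame to the CAN bus (assumption, not the real API).
    private final Consumer<byte[]> sender;

    CyclicFrame(byte[] initialPayload, Consumer<byte[]> sender) {
        this.payload = new AtomicReference<>(initialPayload.clone());
        this.sender = sender;
    }

    /** Called by the driver whenever new data is ready; this side may be late or jittery. */
    void update(byte[] data) {
        this.payload.set(data.clone());
    }

    /** Called once per cycle by the sender; always transmits the most recent payload. */
    void sendOnce() {
        this.sender.accept(this.payload.get());
    }

    /** Starts a fixed-rate sender thread; the caller shuts the returned executor down when done. */
    ScheduledExecutorService start(long periodMs) {
        ScheduledExecutorService executor = Executors.newSingleThreadScheduledExecutor();
        executor.scheduleAtFixedRate(this::sendOnce, 0, periodMs, TimeUnit.MILLISECONDS);
        return executor;
    }
}
```

The point of the design is that `update` and `sendOnce` never wait for each other: if the driver misses a cycle, the previous payload is simply sent again.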
Note that the combination of OSGi and OpenEMS with this OpenEMS CAN bundle is not the way to go if you want high-speed CAN access. On a Kunbus you may be able to send/receive 20 CAN frames/s easily. Nevertheless, it can become a huge challenge to increase the CAN load beyond that.
It’s nice to see these exciting new developments! I’m still reviewing the code, so let me leave this disclaimer here in case I’m missing something. In the following, let me summarize the code and discuss the points mentioned by @c.lehne.
As mentioned before, the module adds support for the CAN bus over the SocketCAN driver. The hardware support currently targets Kunbus devices, but since the drivers are portable, it can also be transferred to other devices. The module also contains code for the cyclic sending of CAN frames.
The module implements a single-threaded reader/writer model and exposes an interface similar to the Modbus bridge. Testing functionality is also included in the pull request.
I like the overall structure of the code; it is similar to my developments, and the implementation details fit well into OpenEMS. As mentioned before, the interface of the module is similar to the “Modbus Bridge” and is therefore easy to understand.
Is it your requirement to support these old kernels? Otherwise I’d focus only on kernels from version 5.1x, since they provide newer features.
As far as I know, the Java Socket API only binds to network sockets. Binding to SocketCAN requires opening a raw socket, and therefore it can’t be done from Java. However, I do not really like the idea of having compiled JNI binaries in the source tree. The JNA library offers a way of calling native functions without the need to write and compile our own shared libraries. The new GPIO library I am working on will also be based on JNA.
JNA would improve many points compared to JNI:
Transferability: JNA bindings are only Java code. Transferring from Linux SocketCAN to Windows would be possible without C/C++ skills.
Testability: Mocking JNA interfaces is easier since they are written in Java.
If I understand you correctly, you refer to low- and fixed-latency communication. A feasible workaround, which you also describe, is to write an application-specific communication agent that cares about low-latency communication. I think your current implementation covers 90% of all use cases, and this issue is outside the scope of this discussion and implementation.
As a final word, I like the overall implementation, but would change it to a generic Linux driver and do the native function calls with JNA in order to avoid compiled native libraries.
Thanks @hydroid7 for your in-depth analysis of the CAN bridge code!
Unfortunately yes. The Kunbus Connect+ is delivered with kernel V4 only. And we have a large number of systems running this hardware.
You are absolutely right. Writing Java and adding JNI code is not the best way to go. I do not like the way this is solved right now, especially as we want to support the Consolinno Leaflet with CAN soon, which would probably mean a third JNI library just for the CAN bridge.
I did not know JNA before, but I very much like the idea of using it. I will add this to our todo list. But this is probably nothing we can do in the coming weeks. I would suggest that we switch to JNA when we add support for the Consolinno Leaflet, which is on the roadmap for Q4-2023.
When I started coding, the Modbus bridge architecture was a tough nut to crack for me. So, again, thanks for your time and your in-depth analysis!
@c.lehne and @hydroid7: Thank you both for this technical discussion. What is the conclusion for now? I agree that switching to JNA would be an improvement. Also, there were big changes to the Modbus Bridge recently, which should maybe be considered (i.e. reused) here as well. Nevertheless, I believe it would be good to finally merge a CAN bridge into OpenEMS and then continuously improve the existing code base.
Would it be a feasible approach to just merge this implementation, or are there any major refactorings explicitly required before? (e.g. change number/name of bundles, change API interface for defining a protocol, etc.)
For now, I think this implementation could be merged if the following comments are addressed:
It would be nice to merge the two binaries libsocket-can-java... in order to have a generic linux version and a corresponding package io.openems.edge.bridge.can.socketcan.
The packages io.openems.edge.bridge.can.linuxv4 and io.openems.edge.bridge.can.linuxv5 are literally the same. Is there really a need to have two different packages? One could do a platform check in Java and load the corresponding library.
Setting baud rate is not implemented in the drivers for SocketCAN v4 and SocketCAN v5.
Kunbus should be removed from the code and replaced with generic SocketCAN v4 and SocketCAN v5.
SocketCAN driver should not have a simulator mode. Integration tests should be possible using Linux vcan and unit tests with the plain Java implementation.
Unimplemented features and warnings should be removed.
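The platform check mentioned in the list above could look roughly like the following sketch. It assumes that the kernel major version is enough to choose between the two builds, which has not been verified against the native code. The library names `can-native-v4` and `can-native-v5` are placeholders, not the real file names from the PR.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: choose a native library name from the running kernel version. */
final class CanLibrarySelector {

    // Placeholder names (assumption); replace with the actual library names.
    static final String LIB_KERNEL_V4 = "can-native-v4";
    static final String LIB_KERNEL_V5 = "can-native-v5";

    private static final Pattern MAJOR = Pattern.compile("^(\\d+)\\.");

    /** Extracts the major version from a kernel release string such as "5.10.103-v7l+". */
    static int kernelMajor(String osVersion) {
        Matcher m = MAJOR.matcher(osVersion);
        if (!m.find()) {
            throw new IllegalArgumentException("Unrecognized kernel version: " + osVersion);
        }
        return Integer.parseInt(m.group(1));
    }

    /** Maps OS name and kernel release to one of the two library names. */
    static String libraryFor(String osName, String osVersion) {
        if (!osName.toLowerCase().contains("linux")) {
            throw new UnsupportedOperationException("SocketCAN requires Linux, got: " + osName);
        }
        return kernelMajor(osVersion) <= 4 ? LIB_KERNEL_V4 : LIB_KERNEL_V5;
    }

    /** Loads the matching library; on Linux the JVM reports the kernel release in "os.version". */
    static void loadForCurrentPlatform() {
        System.loadLibrary(libraryFor(System.getProperty("os.name"), System.getProperty("os.version")));
    }
}
```

With something like this, a single bundle could call `loadForCurrentPlatform()` on activation instead of shipping two nearly identical packages.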
Besides that, here is a nice-to-have wishlist in priority order:
Explaining cyclic sending in more detail, e.g. min cycle time, max cycle time, jitter. Unfortunately it is hard to reason about them, since the source is not available.
Think about possible other drivers: Windows, CAN over TCP/IP or UDP, CAN over SPI (for embedded boards).
Adding different CAN message queues, with and without persistence. For example, for some IDs only the last message is interesting. Old messages can be thrown away. For other values, maybe all changes are relevant.
Reading CAN database files. Such DBC files describe the CAN protocol of the device. Reading them allows automatic message mapping.
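As a rough illustration of the message-queue item in the wishlist above, the following sketch keeps only the newest frame for most CAN IDs and a full history for IDs that were explicitly registered. The class `CanReceiveBuffer` and its methods are hypothetical and not part of the PR.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: per-ID receive buffer with "latest only" and "keep all" modes. */
final class CanReceiveBuffer {

    private final Set<Integer> keepAllIds = new HashSet<>();
    private final Map<Integer, byte[]> latest = new HashMap<>();
    private final Map<Integer, Deque<byte[]>> history = new HashMap<>();

    /** Marks an ID whose frames must all be kept until they are read. */
    synchronized void keepAll(int canId) {
        this.keepAllIds.add(canId);
    }

    /** Stores a received frame; for "latest only" IDs the previous frame is overwritten. */
    synchronized void onFrame(int canId, byte[] data) {
        byte[] copy = data.clone();
        if (this.keepAllIds.contains(canId)) {
            this.history.computeIfAbsent(canId, id -> new ArrayDeque<>()).addLast(copy);
        } else {
            this.latest.put(canId, copy);
        }
    }

    /** Returns and removes all pending frames for an ID, oldest first. */
    synchronized List<byte[]> drain(int canId) {
        List<byte[]> out = new ArrayList<>();
        if (this.keepAllIds.contains(canId)) {
            Deque<byte[]> queue = this.history.get(canId);
            if (queue != null) {
                out.addAll(queue);
                queue.clear();
            }
        } else {
            byte[] last = this.latest.remove(canId);
            if (last != null) {
                out.add(last);
            }
        }
        return out;
    }
}
```

A real implementation would probably also bound the history per ID so that a slow reader cannot exhaust memory.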
@c.lehne I can help you address these points, especially if you can grant me access to the native code.
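To make the DBC idea from the wishlist a bit more concrete, here is a small sketch that parses one signal line (`SG_`) of a DBC file and decodes the value from a frame payload. It only covers plain little-endian signals; multiplexed and big-endian signals are left out. The class name `DbcSignal` and the example signal used in the note below are made up for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical sketch: one signal definition read from a DBC "SG_" line. */
final class DbcSignal {

    // Matches: SG_ <name> : <start>|<length>@<byteOrder><sign> (<factor>,<offset>) ...
    private static final Pattern SG = Pattern.compile(
            "^\\s*SG_\\s+(\\w+)\\s*:\\s*(\\d+)\\|(\\d+)@([01])([+-])\\s*\\(([^,]+),([^)]+)\\)");

    final String name;
    final int startBit;
    final int length;
    final boolean littleEndian; // "@1" in DBC means little-endian (Intel) byte order
    final boolean signed;       // "-" in DBC means signed
    final double factor;
    final double offset;

    private DbcSignal(String name, int startBit, int length, boolean littleEndian,
            boolean signed, double factor, double offset) {
        this.name = name;
        this.startBit = startBit;
        this.length = length;
        this.littleEndian = littleEndian;
        this.signed = signed;
        this.factor = factor;
        this.offset = offset;
    }

    static DbcSignal parse(String line) {
        Matcher m = SG.matcher(line);
        if (!m.find()) {
            throw new IllegalArgumentException("Not a plain SG_ line: " + line);
        }
        return new DbcSignal(m.group(1), Integer.parseInt(m.group(2)), Integer.parseInt(m.group(3)),
                m.group(4).equals("1"), m.group(5).equals("-"),
                Double.parseDouble(m.group(6).trim()), Double.parseDouble(m.group(7).trim()));
    }

    /** Extracts the raw bits from up to 8 payload bytes and applies factor and offset. */
    double decode(byte[] data) {
        if (!this.littleEndian) {
            throw new UnsupportedOperationException("Big-endian signals are not covered by this sketch");
        }
        long word = 0;
        for (int i = 0; i < data.length && i < 8; i++) {
            word |= (data[i] & 0xFFL) << (8 * i);
        }
        long mask = this.length >= 64 ? -1L : (1L << this.length) - 1;
        long raw = (word >>> this.startBit) & mask;
        // Sign-extend when the top bit of a signed signal is set.
        if (this.signed && this.length < 64 && (raw & (1L << (this.length - 1))) != 0) {
            raw -= 1L << this.length;
        }
        return raw * this.factor + this.offset;
    }
}
```

For example, a line such as `SG_ BatteryVoltage : 8|16@1+ (0.1,0) [0|6553.5] "V" EMS` describes an unsigned 16-bit value starting at bit 8 that is scaled by 0.1, so a channel mapping could be generated from it instead of being written by hand.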
The link to the shared library and the build instructions are documented within PR 2288 → readme.adoc.
Oops, I found a copy-and-paste error in the readme.adoc. The library is libsocket-can-java, not the documented librevpi-dio-java. I will change this in the PR soon. Between the two kernel versions there was an enhancement to a CAN socket structure (I am on holiday right now and don’t remember which structure it was). The enhancement leads to a different size of the structure.
Due to the different structure sizes I would not suggest using a V4 library on a V5 kernel, as the kernel may access memory that was not allocated by the V4 library.
Using a V5 library on a V4 kernel may work. Kernel operations try to avoid copying complete structures, but from time to time the kernel needs to copy a structure, which in our case may lead to corrupt data within the structure.
Technically it should be possible to build one library for both kernels. But this must be done carefully, and I did not have the time to do so.
I will answer your previous post when I am back in the office.
By UI, do you mean the OpenEMS UI? The CAN bridge does not have an associated UI widget (it is analogous to the Modbus bridge).
But I assume you mean the Felix configuration UI. That is never easy to say: there are a lot of possible reasons for the component not showing up. Only a guess, but have you added the shared library to the bundle?
The lib folder is within the .gitignore list. So the library within the directory must be explicitly added with git add.
If that is not the reason, you may have to look for other causes. You may search the forum for bundle not activated.
I’ve tried to build your library. Unfortunately, some methods are missing from the source code.
Can you please take a look why these methods are missing? Maybe your version is not pushed to the public repository?
Here is an overview of the missing methods:
- public void sendCyclicallyAdd(org.clehne.revpi.canbus.CanSocket$CanFrame, int) throws java.io.IOException;
- public void sendCyclicallyRemove(org.clehne.revpi.canbus.CanSocket$CanFrame) throws java.io.IOException;
- public void sendCyclicallyAdopt(org.clehne.revpi.canbus.CanSocket$CanFrame) throws java.io.IOException;
- public void removeCyclicalAll() throws java.io.IOException;
- public void enableCyclicallyAutoIncrement(int, int) throws java.io.IOException;
- public int statsGetCanFrameErrorCntrCyclicalSend() throws java.io.IOException;
- public int statsGetCanFrameErrorCntrSend() throws java.io.IOException;
- public int statsGetCanFrameErrorCntrReceive() throws java.io.IOException;
- public int statsGetCanFrameFramesSendPerCycle() throws java.io.IOException;
We could also implement here the platform detection for the different Linux kernels.
Thank you for your effort. Great to hear that you have started working on a better integration.
Regarding your missing files: building these libraries is not self-explanatory. I added two readme.adoc files, one for the CAN library V4 and one for the CAN library V5. I checked the PR just now and saw that a wrong reference is given. The readme should be adapted to:
The CAN library for kernel V4 is built using the branch V2023.02
Good news. I’ve tested the new module with a Linux virtual CAN interface and it looks good for kernel >= 5.10.
@c.lehne what were your intentions with “setting the baudrate”? As far as I know there is no way to set the baudrate from userspace unless you grant yourself CAP_NET_ADMIN, which also includes various other network device configuration capabilities, like enabling promiscuous mode. If there is no point in setting the baudrate, we can remove it from the bundle completely.