pcurtis asked

Venus OS large periodically becomes unresponsive and needs restart when using Tethered connection

I have recently started having a problem: the system has become unresponsive 4 times over the last three weeks. It is a Raspberry Pi 3B+ which was running Venus OS large v2.80~33-large-24 throughout this period. The system is on a narrowboat and has a SmartSolar MPPT 100/20, a SmartShunt and a Phoenix Smart Inverter 24/1600. WiFi and internet access are provided by a dedicated tethered Android 4G phone. All the incidents were very similar in that:

  • The VRM portal stopped updating and all access to Node-RED was lost.
  • SSH access to the Pi over the network was lost, whilst other devices connected and worked, including internet access.
  • VictronConnect could see the Raspberry Pi using Bluetooth but could not connect.
  • WiFi on the Raspberry Pi still seemed to be connected to the local network, but even a ping did not get a response.
  • It may be a coincidence, but I previously had several months of operation without a single access problem with Venus OS 2.73 or 2.80~21-large-23.

Initially I thought the OS was completely dead, but I found some log files in /data/log. The messages log files show major changes at the time the portal stopped updating, but it is not obvious to me what has happened. There are no actual errors showing and the kernel still seems to be running. I am not sure where to look for further information and am not familiar with the log file structure in Venus OS.

I have attached the last two log files (I had to add the .txt extension before the site would accept them). The last portal update was at 16:01 on 20/01/2022, which is in messages.0 at about line 991. I also have a clone of the SD card, taken on a separate machine before I restarted the Raspberry Pi 3B+.

The above was posted as a potential problem with the Venus OS large v2.80~33-large-24 here as I previously had several months operation without a single access problem with Venus OS 2.73 and then 2.80~21-large-23.

@mvader (Victron Energy) pointed out that the log file showed WiFi issues. For example:

Jan 16 13:13:48 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: Trying to associate with SSID 'Big Blue'
Jan 16 13:13:51 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: CTRL-EVENT-ASSOC-REJECT bssid=00:00:00:00:00:00 status_code=16
Jan 16 13:13:51 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=1 ssid="Big Blue" auth_failures=1 duration=10 reason=CONN_FAILED
Jan 16 13:13:51 raspberrypi2 daemon.warn connmand[673]: Skipping disconnect of 42696720426c7565_managed_psk, network is connecting.
Jan 16 13:14:21 raspberrypi2 daemon.info connmand[673]: Connection Manager version 1.33

shows a connection failure followed by a restart of the connection manager (connman) and had nothing to do with Venus OS large in itself. He suggested this should be moved to the community area for further discussion.
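
For anyone wanting to scan their own logs for the same pattern, the following is a minimal Python sketch that picks out the three event types visible in the excerpt above. The regular expression is derived only from the five lines quoted here, the labels in `PATTERNS` are my own naming, and the path `/data/log/messages` in the usage note is an assumption based on the attached file names.

```python
import re

# Sample lines copied verbatim from the log excerpt quoted above.
SAMPLE_LOG = """\
Jan 16 13:13:48 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: Trying to associate with SSID 'Big Blue'
Jan 16 13:13:51 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: CTRL-EVENT-ASSOC-REJECT bssid=00:00:00:00:00:00 status_code=16
Jan 16 13:13:51 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=1 ssid="Big Blue" auth_failures=1 duration=10 reason=CONN_FAILED
Jan 16 13:13:51 raspberrypi2 daemon.warn connmand[673]: Skipping disconnect of 42696720426c7565_managed_psk, network is connecting.
Jan 16 13:14:21 raspberrypi2 daemon.info connmand[673]: Connection Manager version 1.33
"""

# Layout seen in the excerpt: "Mon DD HH:MM:SS host facility.level process[pid]: message"
LINE_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s+\S+\s+\S+\s+"
    r"(?P<proc>[\w.-]+)\[\d+\]:\s(?P<msg>.*)$"
)

# Substrings to look for; the labels on the left are illustrative names only.
PATTERNS = {
    "assoc_reject": "CTRL-EVENT-ASSOC-REJECT",
    "ssid_temp_disabled": "CTRL-EVENT-SSID-TEMP-DISABLED",
    "connman_started": "Connection Manager version",
}


def wifi_events(lines):
    """Yield (timestamp, process, label) for each line matching a known pattern."""
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not follow the expected layout
        for label, needle in PATTERNS.items():
            if needle in match["msg"]:
                yield match["stamp"], match["proc"], label


events = list(wifi_events(SAMPLE_LOG.splitlines()))
for stamp, proc, label in events:
    print(stamp, proc, label)
```

To run it against a real log rather than the sample, pass an open file instead, for example `wifi_events(open("/data/log/messages"))` (path assumed, adjust to the file you want to inspect). A `connman_started` event that appears without a reboot is the kind of connection manager restart described above.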

messages.txt (92.6 KiB)
messages0.txt (100.1 KiB)
12 comments
I'm not a network expert, but the first thing I would try is a different internet device (router, another phone etc.) and hook up your GX device to that. I can't recall where I saw it, but it's not the first time I have read that a tethered Android connection causes problems.

From your log:

Jan 16 13:13:51 raspberrypi2 daemon.notice wpa_supplicant[742]: wlan0: CTRL-EVENT-SSID-TEMP-DISABLED id=1 ssid="Big Blue" auth_failures=1 duration=10 reason=CONN_FAILED
mvader (Victron Energy) ♦♦ Stefanie (Victron Energy Staff) ♦♦ commented ·

Indeed, and preferably use a LAN cable rather than WiFi.

pcurtis Stefanie (Victron Energy Staff) ♦♦ commented ·

@Stefanie Thanks for the heads-up on tethering problems. I have searched the community for "tether" and there are some references to problems, but they have little in common. I have also had weeks without issues before the recent problems.

There are also mentions of powersaving modes being activated causing similar problems.

I hope I can find a way to do an automatic clean restart of WiFi without a full reboot although I do not understand why a WiFi problem has apparently locked up Victron services and Node-RED.
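
One way such an automatic WiFi restart could be approached is a small watchdog that pings the phone and restarts the connection manager after several consecutive failures. This is only a sketch under stated assumptions: the gateway address `192.168.43.1` is a placeholder, the restart command `/etc/init.d/connman restart` is an assumption about how connman is managed on this image, and nothing in this thread establishes that restarting connman actually clears the fault.

```python
import subprocess
import time


def ping_ok(host, timeout_s=3):
    """Return True if a single ping to `host` gets a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", str(timeout_s), host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def restart_wifi():
    """Restart the connection manager.

    ASSUMPTION: the init script path below is a guess for this system;
    check how connman is started on your image before relying on it.
    """
    subprocess.run(["/etc/init.d/connman", "restart"], check=False)


class WifiWatchdog:
    """Call `recover` after `max_failures` consecutive failed probes."""

    def __init__(self, probe, recover, max_failures=3):
        self.probe = probe
        self.recover = recover
        self.max_failures = max_failures
        self.failures = 0

    def tick(self):
        """Run one probe. Return True if recovery was triggered this tick."""
        if self.probe():
            self.failures = 0
            return False
        self.failures += 1
        if self.failures >= self.max_failures:
            self.failures = 0  # start counting again after the recovery attempt
            self.recover()
            return True
        return False


def run_forever(gateway, interval_s=60):
    """Probe `gateway` every `interval_s` seconds until the process is stopped."""
    dog = WifiWatchdog(lambda: ping_ok(gateway), restart_wifi)
    while True:
        dog.tick()
        time.sleep(interval_s)


# Example (not executed here; the address is a hypothetical phone gateway):
# run_forever("192.168.43.1")
```

Requiring several consecutive failures avoids restarting WiFi on a single dropped ping, which matters on a mobile connection with frequent short signal dropouts.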

mvader (Victron Energy) ♦♦ pcurtis commented ·
Hi again, my advice: tethering from a phone is not designed to be reliable. I would simply skip that, get something serious (like the rutronik) and spend your time elsewhere.


With regards to the services locking up: I'm not convinced they lock up. I haven't seen any proof of that. The only thing I see in the logs are network issues.

My suspicion is that they don't lock up: they are still running, but unreachable because of network issues.

And with the combination being an RPi (so not an official Victron product) + WiFi + tethering from a phone + only one such complaint, I have to disappoint you: that is not something we'll be looking into from the Victron side.

pcurtis mvader (Victron Energy) ♦♦ commented ·
@mvader (Victron Energy) My earlier reply seems to have gone missing, so I am repeating it with some additional information!

I am extremely grateful for your responses but never expected any official support from Victron.

After 5 days the system has finally become unresponsive again, this time with much better diagnostics in place. You are correct that the basic Venus system is running and storing data, and my checks indicate Node-RED is also updating persistent global context data.

I have changed the title to include tethering and will eventually add an answer summarising the various comments which have identified and offered ways forwards and intend it to act as a warning to those thinking of using tethering.

There are still some unanswered questions in my mind so I will also explore a little more deeply.
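
The kind of check described above (confirming that the system kept running while it was unreachable) can be sketched as a heartbeat file: something on the device appends a timestamp at a fixed interval, and after an outage the file is scanned for gaps. This is a minimal sketch and not the diagnostics actually used here; the file path `/data/heartbeat.log` and the one-minute interval are hypothetical.

```python
from datetime import datetime, timedelta

HEARTBEAT_FILE = "/data/heartbeat.log"  # hypothetical path, pick your own


def append_heartbeat(path, now=None):
    """Append the current time (or `now`) to the heartbeat file."""
    now = now or datetime.now()
    with open(path, "a") as handle:
        handle.write(now.isoformat(timespec="seconds") + "\n")


def read_heartbeats(path):
    """Return all timestamps stored in the heartbeat file, in file order."""
    with open(path) as handle:
        return [datetime.fromisoformat(line.strip()) for line in handle if line.strip()]


def find_gaps(stamps, expected=timedelta(minutes=1), tolerance=2.0):
    """Return (before, after) pairs where two consecutive heartbeats are
    more than `tolerance` times the expected interval apart."""
    limit = expected * tolerance
    return [(a, b) for a, b in zip(stamps, stamps[1:]) if b - a > limit]
```

If `append_heartbeat(HEARTBEAT_FILE)` is called once a minute (for example from a scheduled job or a Node-RED flow) and `find_gaps(read_heartbeats(HEARTBEAT_FILE))` shows no gap covering the period when the portal stopped updating, the system was running throughout and only the network was unavailable. A gap that matches the outage points to a hang or reboot instead.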

mvader (Victron Energy) ♦♦ commented ·
Hi @pcurtis , thanks for moving it here. You write LAN, but I thought the GX was connected using WiFi?

To distinguish between a network issue and anything else, I recommend connecting either a screen + keyboard or a serial console cable.

pcurtis mvader (Victron Energy) ♦♦ commented ·
LAN corrected. A screen and keyboard are not easy, as the system is on a boat which is unoccupied and the problem only occurs every few days, so I have to go to it for many tests. I have added an automatic reboot if Node-RED stops, which should help distinguish network problems from general problems. I will swap phones at some point.
mvader (Victron Energy) ♦♦ pcurtis commented ·
Try and make it forget a few WiFi networks. Who knows, it might solve your issue.

And if not, then at least it cleans up the system logs.

pcurtis mvader (Victron Energy) ♦♦ commented ·

@mvader (Victron Energy) It has only been connected to two networks, both on tethered phones, as it went straight to Corinna and 99% of my development has been via VRM. It might also have seen the home network, but I think that was via Ethernet cable during initial setup. I will check anyway.

vassilis-bourdakis pcurtis commented ·

@pcurtis if you can afford a Teltonika RUT240 it's worth getting one. Rock solid internet, and it can connect to the Raspberry Pi using an Ethernet cable. I have had it onboard for over a year now and am extremely happy with it: no crashes or anything, running the latest Large on an RPi 3B+. Even when running out of network coverage (or running out of data on the SIM card) the system is rock solid. Add data or reconnect and it's up again, no reboots or lock-ups.

pcurtis vassilis-bourdakis commented ·

Thanks for the suggestion. I really wanted a single solution for internet and display/control of Victron devices on our narrowboat. Bluetooth was marginal in range, so my solution was a dedicated phone which would provide tethered internet, access to the Victron devices over a local network, and a simple interface for the wife using Node-RED Dashboard on the same device for when we are on board, with the bonus of remote access via the portal when we are not. I will persevere for a while, but splitting the functions looks to be the only reliable solution.

Can the RUT240 run off 12 V, and what is the approximate power consumption?

vassilis-bourdakis pcurtis commented ·
It has a dedicated 230V -> 7.something (iirc) power brick. However, a friend who also bought one checked and noticed that it allows a fairly wide DC input, so yes, wiring it straight to 12 VDC will work fine. I haven't bothered checking the actual power consumption, as I have the RUT240, an RPi 3B+ and a whole bunch of NMEA2000 devices running 24/7 onboard for approximately one amp in total (as reported by a BMV700, and I'm not really sure how accurate that is at the lower end of its range).

Maybe have a look at the specsheet.

2 Answers
pcurtis answered ·

I think it is time to report the progress I have made before I update to Large 2.82 and add another variable. It will bring together the many helpful replies and comments I have received which ended up so deeply nested that they have become difficult to follow.

I have now followed up many of these suggestions as well as my own thoughts and I have added a lot of diagnostics to my Node-RED dashboard which have helped narrow down the problem considerably:

  • Both Venus OS and Node-RED keep running; it is only the WiFi network interface which is down. All data is safely cached and uploaded to the portal after a restart.
  • I am using a tethered WiFi connection from an Android phone, and there have been previous problems with such connections.
  • Other users have had rock-stable mobile connections using dedicated mobile routers such as the Teltonika RUT240, connected locally via an Ethernet cable.
  • There is nothing to indicate it is related to the use of Venus OS Large or to particular versions. Any differences I have seen are more likely to be related to the phones used.
  • I have switched phones, but it is too early to say if that has an impact.

This problem is unlikely to affect most other users, but I have edited the title to contain "Tethered" to act as a warning to anyone investigating the use of tethered phones.

Anybody with an installation they cannot reach easily, or needing high reliability, should avoid tethered phone connections. They are not designed for such use. An Ethernet cable is also likely to be much more reliable for local connections. Occasional use of a tethered phone to upload to the VRM Portal should be fine.

In my case I am using it to do my development via the portal to avoid frequent visits to a cold boat. If it fails I only have a short walk and even if I can not reach the boat due to floods I can always switch my shore power off and back on to reboot!

I have, however, not given up completely on investigating further, but it is a slow job. I have been through the previous log files and found I had uninterrupted operation with Venus OS 2.73 for 19 days with an old 3G phone, and typical periods of 5 days more recently with a 4G phone and 2.80 large. So it is very much a waiting game. I have also found at least one or two cases where a restart seems to have occurred that was not initiated by any of my own software watchdog code, which is interesting.

The problems may well be caused by the 'tethered' connection to Android, but I have to note that I have been using similar shared connections over cables, Bluetooth and more recently WiFi for internet connection sharing since 2003 and the days of USB modems, certainly long before tethering was used as a description (2010??)! I have used 2G Sony phones, an XDA Exec running Windows Mobile, Blackberry phones, Edimax 6200n 3G wireless routers, various MiFi boxes and most recently several different tethered Android phones. I have done this for many months every year whilst touring and sailing in NZ, on our narrowboat in the UK and on cruise ships. I have never had problems up to now, and a quick check back showed about 60 GBytes passed through my tethered phone during last summer's boating without any need to remake a connection, even when frequent mobile signal dropouts occurred due to tunnels or just poor signal. The connected devices were, however, always Android or Linux Mint (Debian-based) devices rather than an OpenEmbedded-based distribution (Dunfell in 2.80) which uses connman and busybox.

I will report further when[if] I come up with anything useful and always welcome any comments or suggestions.

I am also happy to make my Node-RED diagnostic dashboard available. It is still under development but is getting steadily more useful. The attached screenshot (Screenshot from 2022-02-09 09-31-39.png) shows good margins on all parameters.


pcurtis answered ·

Update: It is now coming up to 300 hours since the last anomaly, so it could be a long wait to see if all my extra diagnostics are useful. The only two changes that I can see are a change of phone for the tethering and much higher ambient temperatures. The phone may be significant, as it uses a much earlier version of Android and 3G connections, whilst the new phone can handle 5G and may have extra network capabilities which are not being handled properly (IPv6?).

I have also noticed that searches for tethering problems mainly turn up iPhones which randomly disconnect, whilst my hotspot has always remained live and usable by other devices. I have not updated to 2.82, to maintain continuity in the test.
