On-premise connector host Docker container keeps crashing

Hi everyone,

We have an on-prem connector host that is running in a docker container on a Linux VM.

It was working fine for about 6 months, but last Friday the Docker container started crashing and restarting for no apparent reason.

I was able to fail over to a pair of Edge IO boxes to use as temporary connector hosts to keep the line running, but we cannot figure out what is causing the docker container to constantly crash.

We tried removing the container and downloading a new one following the instructions here: Overview of On-Premise Connector Hosts (tulip.co)

But the new container is doing the same thing the old one did. It will come online for a few seconds, then crash and restart.

Has anyone experienced something like this? Our IT department does not have resources for troubleshooting Linux.

Thanks,

-Steve

Hi Steve,

I know you have been talking with the Tulip team about this already, but I wanted to tap a couple of Community members to see if they have any advice for learning more about Linux / if they have any insights to provide for your specific situation here.

@Dave.ESSInc, @Alan.Madorin, @mellerbeck, @gaurav.garg - feel free to chime in here if there is any help you can provide :slight_smile:

It might be easiest to just start from scratch: build a new VM (on-prem or in the cloud?) and request a net-new connector host. We have had some crash loops in the past. They were related to having an incredibly large number of connector functions (even archived ones were being loaded in). I haven’t seen this issue for about a year now with recent versions of the connector host. What version are you running? Another thing to check: does your host have enough resources?

We recently ran into a similar issue. Our setup is similar to yours: we host our connectors on an EC2 machine. Our containers were failing because the VM was short on resources, specifically RAM and disk space. I think you might be facing resource issues too. Please verify that you have enough resources to run a Docker container. Tulip recommends 4 GB of RAM and 32 GB of disk space.
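If nobody on hand is comfortable with Linux, a small script can do the resource check. This is only a rough sketch: it reads `/proc/meminfo` and the root filesystem usage and compares them with the 4 GB / 32 GB figures mentioned above. The function names and the choice of `/` as the path to check are my own assumptions, so point it at whichever filesystem holds your Docker data.

```python
import shutil

# Figures quoted in this post (4 GB RAM, 32 GB disk); adjust to your own sizing.
MIN_RAM_GB = 4
MIN_DISK_GB = 32


def parse_meminfo(text):
    """Return /proc/meminfo values as a dict of field name -> kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def check_resources(meminfo_text, disk_total_bytes, disk_free_bytes):
    """Compare total RAM and total disk against the thresholds above."""
    mem = parse_meminfo(meminfo_text)
    ram_total_gb = mem.get("MemTotal", 0) / 1024**2
    ram_available_gb = mem.get("MemAvailable", 0) / 1024**2
    disk_total_gb = disk_total_bytes / 1024**3
    return {
        "ram_total_gb": round(ram_total_gb, 2),
        "ram_available_gb": round(ram_available_gb, 2),
        "disk_total_gb": round(disk_total_gb, 2),
        "disk_free_gb": round(disk_free_bytes / 1024**3, 2),
        "ram_ok": ram_total_gb >= MIN_RAM_GB,
        "disk_ok": disk_total_gb >= MIN_DISK_GB,
    }


def check_local_host(path="/"):
    """Run the check on this machine (Linux only, since it reads /proc/meminfo)."""
    with open("/proc/meminfo") as f:
        meminfo_text = f.read()
    usage = shutil.disk_usage(path)  # assumption: Docker's data lives under `path`
    return check_resources(meminfo_text, usage.total, usage.free)
```

Running `print(check_local_host())` on the VM prints the numbers. Note that a VM sized at exactly 4 GB usually reports slightly less than that in `MemTotal`, so treat `ram_ok` as a hint, and pay more attention to the available RAM and free disk values.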

Thanks for the comments everyone. Our container is stable and has been up for 3 days now (but nobody seems to have done anything to fix it). Unfortunately, we did not get logs from the time it was failing so I can’t check if we had resource constraints at the time.
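For anyone who hits the same thing and wants to capture evidence while the container is still crash-looping, here is a minimal sketch of what I would try next time. It shells out to `docker inspect` and `docker logs` and pulls out the restart count, exit code, and the out-of-memory flag. The container name `connector-host` and the function names are hypothetical, so substitute your own.

```python
import json
import subprocess


def docker_inspect(container):
    """Run `docker inspect` and return the parsed JSON for one container."""
    result = subprocess.run(
        ["docker", "inspect", container],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)[0]


def summarize_crash_state(info):
    """Pick the crash-related fields out of `docker inspect` output."""
    state = info.get("State", {})
    return {
        "status": state.get("Status"),
        "restart_count": info.get("RestartCount"),
        "exit_code": state.get("ExitCode"),
        "oom_killed": state.get("OOMKilled"),
        "error": state.get("Error"),
        "finished_at": state.get("FinishedAt"),
    }


def save_recent_logs(container, out_path, tail=500):
    """Save the last `tail` log lines (stdout and stderr) to a file."""
    result = subprocess.run(
        ["docker", "logs", "--timestamps", "--tail", str(tail), container],
        capture_output=True, text=True,
    )
    with open(out_path, "w") as f:
        f.write(result.stdout)
        f.write(result.stderr)


# Example usage (container name is hypothetical):
# print(summarize_crash_state(docker_inspect("connector-host")))
# save_recent_logs("connector-host", "connector-host-crash.log")
```

An `oom_killed` value of `True`, or an exit code of 137 (the process was killed with SIGKILL), would point toward memory pressure, though neither is proof on its own.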

One thought from Tulip support was that log rotation was not enabled, so the log file eventually got too big and caused memory constraints (again, I’m not able to verify this was the case).

Here’s the article they pointed me to for enabling log rolling if anyone else experiences the same issue: Enabling log-rotations for existing on-premise Connector Host container (tulip.co)
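Separately from that article, here is a rough way to check whether a container has log size limits set at all. It is a sketch based on the `HostConfig.LogConfig` and `LogPath` fields that `docker inspect` reports, not the procedure from the Tulip article. The container name and function names are hypothetical, and the `max-size` check is only meaningful for Docker's `json-file` logging driver.

```python
import json
import os
import subprocess


def docker_inspect(container):
    """Run `docker inspect` and return the parsed JSON for one container."""
    result = subprocess.run(
        ["docker", "inspect", container],
        capture_output=True, text=True, check=True,
    )
    return json.loads(result.stdout)[0]


def log_rotation_summary(info):
    """Report the logging driver and any size limits set on the container."""
    log_config = info.get("HostConfig", {}).get("LogConfig", {})
    options = log_config.get("Config") or {}
    return {
        "driver": log_config.get("Type"),
        "max_size": options.get("max-size"),
        "max_file": options.get("max-file"),
        # With the json-file driver, the log grows without limit unless max-size is set.
        "max_size_set": "max-size" in options,
        "log_path": info.get("LogPath"),
    }


def log_file_size_mb(path):
    """Return the log file size in MB, or None if it cannot be read."""
    if not path:
        return None
    try:
        return round(os.path.getsize(path) / 1024**2, 1)
    except OSError:  # missing file, or not running as root
        return None


# Example usage (container name is hypothetical):
# summary = log_rotation_summary(docker_inspect("connector-host"))
# print(summary, log_file_size_mb(summary["log_path"]))
```

Reading the log file size usually needs root, because the file lives under Docker's data directory. If `max_size_set` comes back `False` and the file is many gigabytes, that would at least be consistent with the theory from support.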

Ah yes, log rotation will break you!