Please think about a more robust way to react to global connector outages.
Maybe even on an app-by-app basis so that apps that heavily rely on connector interaction stop working when the connector host goes down.
We have run into situations here where trigger-by-trigger checks and error handling are not adequate to ensure that an app continues to function as expected.
Recovery from an outage has proven far too complex to implement using the current features and limited error handling capabilities.
Preventing the operator from continuing when the connector host goes down would ensure that a) operators are immediately aware that there is a major issue and b) a trigger cascade cannot execute logic that is bound to fail while the connector function is not working.
Thanks for your patience; this drove an interesting conversation internally.
The root goal is to block the execution of logic if the connection to an external system fails. This can happen for a few reasons:
1. The Player loses its connection to the internet. This is already addressed by the Player (and the red bar we all know and love), so let's pretend this doesn't exist.
2. The Connector host loses its connection to the cloud. This is what you are describing; it is not relevant for the cloud connector host.
3. The underlying service we are connected to has some sort of issue. This could affect all calls, or only specific routes or inputs. In a perfect world, API specifications would be perfectly met and external systems would always be available and performant; unfortunately, this is not usually the case.
Ideally we deliver a solution that can address failure modes 2 and 3, otherwise you would still need some countermeasures in your logic to catch these failures. This is easier said than done, especially catching failure mode 3 gracefully.
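To make the distinction concrete, here is a minimal, hypothetical sketch (plain Python with the requests library, not Tulip trigger logic; the URL and function name are placeholders) of how failure modes 2 and 3 surface differently to the caller:

```python
import requests

def call_connector(url: str, payload: dict) -> dict:
    """Call a connector-style HTTP endpoint and separate the two failure modes."""
    try:
        response = requests.post(url, json=payload, timeout=5)
    except (requests.exceptions.ConnectionError, requests.exceptions.Timeout) as exc:
        # Failure mode 2: the connector host (or the network path to it) is down.
        raise RuntimeError(f"connector host unreachable: {exc}") from exc

    try:
        response.raise_for_status()
        return response.json()
    except (requests.exceptions.HTTPError, ValueError) as exc:
        # Failure mode 3: the service answered, but with an error status
        # or a body that does not match the expected contract.
        raise RuntimeError(f"service-level failure: {exc}") from exc
```

Mode 2 shows up as a transport error before any response arrives, while mode 3 can only be detected after the fact, call by call, which is why it is the harder one to catch gracefully.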
There is no way to proactively ensure a connector call will succeed, so we are limited to the actions we enable when something fails. There are a few approaches we have discussed:
1. Adding an "on error" section to triggers, allowing you to build additional logic that runs when a trigger fails. This would let you lock down an application if a specific call fails.
2. Exposing this failure as an event for automations, so a similar flow could be built, but centralized and shared across all usages of the respective function.
3. Allowing users to decide whether to block any following actions if a connector call fails.
4. Adding a fully user-defined workflow at the function level that would run synchronously if a call fails (see the sketch after this list).
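To illustrate what options 3 and 4 could amount to in practice, here is a rough, hypothetical sketch (again plain Python rather than Tulip trigger configuration; all names are made up): wrap the connector call so that a failure either blocks all following actions, runs a user-defined error workflow, or both.

```python
from typing import Callable, Optional

class ConnectorOutage(Exception):
    """Raised to stop any following actions after a failed connector call."""

def run_with_fallback(
    call: Callable[[], dict],
    on_error: Optional[Callable[[Exception], None]] = None,
    block_following_actions: bool = True,
) -> Optional[dict]:
    try:
        return call()
    except Exception as exc:
        if on_error is not None:
            # Option 4: a user-defined workflow runs synchronously on failure,
            # e.g. flagging the app as degraded or notifying an operator.
            on_error(exc)
        if block_following_actions:
            # Option 3: refuse to continue, so a trigger cascade cannot run
            # actions that are bound to fail while the connector is down.
            raise ConnectorOutage(str(exc)) from exc
        return None
```

Conceptually this is the same decision you already make with "stop execution of further triggers on error", extended with an explicit place to put fallback logic.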
Do any of these strike you as the best solution to address your needs?
Pete
Thank you, Pete, for taking a closer look at this. I totally understand that this is not easy to resolve.
I think option 3 is already there, given that you can ask Tulip to stop execution of any further triggers if an error has occurred.
I would certainly look for option 1 to be implemented to gain more control over a graceful fallback. See also this post for additional discussion of the pattern: Multi-level conditions in triggers - #14 by pete
The automations approach could be a nice addition for sure, but for our challenge I think it is too generic, as you would be losing the context in which the error occurred within the app.