Programmatically Getting Data Out of Tulip: Pseudo-code

The Tulip Tables interface offers several ways to export CSV data manually. However, some use cases require getting data out of Tulip Tables programmatically. The Tulip Table API returns at most 100 records per request, which is usually all that is needed. But sometimes you need to pull an entire Tulip table into a CSV file or a database, so I wanted to provide a general explanation of how this can be done.

The following diagram outlines pseudocode for what is, in my opinion, the simplest possible way to “paginate” through the Tulip Table API and pull every row. This algorithm can be implemented in many programming languages and tools, so long as you understand the logic.
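As a rough illustration, the pagination loop can be sketched in Python as follows. To avoid guessing at endpoint details, the sketch takes a caller-supplied function, fetch_page(limit, offset), which is a hypothetical name for whatever code you write to request one page of records from the Tulip Table API. The loop keeps asking for pages of 100 until it receives a page with fewer rows than it asked for.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]


def fetch_all_rows(
    fetch_page: Callable[[int, int], List[Row]],
    page_size: int = 100,
) -> List[Row]:
    """Collect every row by requesting pages until a short page comes back.

    fetch_page(limit, offset) is a hypothetical callable supplied by the
    caller; it should return at most `limit` rows starting at `offset`.
    """
    rows: List[Row] = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        rows.extend(page)
        # A page shorter than the requested size means there is nothing left.
        if len(page) < page_size:
            break
        offset += page_size
    return rows


# In-memory stand-in for a table with 250 rows, used only to try the loop.
_fake_table = [{"id": str(i)} for i in range(250)]


def _fake_fetch_page(limit: int, offset: int) -> List[Row]:
    return _fake_table[offset:offset + limit]


all_rows = fetch_all_rows(_fake_fetch_page)
```

With the 250-row stand-in, the loop makes three requests (100, 100, and 50 rows). When the row count is an exact multiple of the page size, it makes one extra request that returns an empty page before stopping.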

It can be very time-consuming to download an entire Tulip table every time you need to update your data. If you regularly pull data from the same table, you can speed this up by filtering on the updated_at column, a default metadata column of Tulip table records. By filtering on this column and logging the timestamp of the latest execution of your script, you can be sure that the next run pulls only records that have been created or changed since the last time the script was run. The following diagram outlines how this might look:
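In code, the same incremental pull might look like the minimal sketch below. It assumes a hypothetical caller-supplied function, fetch_page(updated_since, limit, offset), that asks the Tulip Table API for one page of records whose updated_at value is at or after updated_since (or for all records when updated_since is None). The names pull_changed_rows and upsert, and the "id" key used to match rows, are also illustrative assumptions rather than part of any Tulip library.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, object]
# Hypothetical signature: fetch_page(updated_since, limit, offset) -> rows
FetchChanged = Callable[[Optional[datetime], int, int], List[Row]]


def pull_changed_rows(
    fetch_page: FetchChanged,
    last_run: Optional[datetime],
    page_size: int = 100,
) -> Tuple[List[Row], datetime]:
    """Return rows changed since last_run, plus the timestamp to log."""
    # Take the timestamp before the first request so that rows edited while
    # the pull is in progress are picked up again on the next run.
    started_at = datetime.now(timezone.utc)
    rows: List[Row] = []
    offset = 0
    while True:
        page = fetch_page(last_run, page_size, offset)
        rows.extend(page)
        if len(page) < page_size:
            break
        offset += page_size
    return rows, started_at


def upsert(warehouse: Dict[str, Row], rows: List[Row]) -> None:
    """Insert new rows and overwrite changed ones, keyed on an assumed 'id'."""
    for row in rows:
        warehouse[str(row["id"])] = row
```

A script would call pull_changed_rows with the timestamp it stored after the previous run, upsert the returned rows into its copy of the data, and then store the returned timestamp for next time. Because a row can be returned by two consecutive runs, the write into the destination should be an upsert rather than a plain append.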

NOTE – Using this method, rows that are deleted in Tulip will not be detected by the algorithm, so they will not be removed from the data warehouse. Solutions for this problem can vary. If detecting removals is a must-have, we suggest you avoid deleting rows and instead use a column value to mark data as archived.

This is why building an event-driven pipeline that publishes new Tulip data to a broker will simplify this “pulling” process and make Tulip data available in real time to other solutions or platforms.