Table Aggregations on a whole Table / Query limit of 1000

Hi all,

I’d like to run aggregations on a complete table. Unfortunately, they are only possible on queries, and queries are still limited to 1000 records.
This makes min, max, average, sum, etc. useless whenever the table can have more than 1000 entries.

Please allow aggregations on a complete table.

I also find limiting queries to 1000 records questionable in general.

A while ago I was asked why I prefer our own database via the SQL Connector over Tulip Tables.
This is one major reason!

Thanks

Thorsten

Hey @thorsten.langner -

Great suggestion! This is one that we are constantly discussing internally. This arbitrary limit is in place to try to safeguard users from themselves. Queries and aggregations can take many seconds when looking at a table with tens of millions of rows, and we wanted to remove the need for users to understand the intricacies of data science work in order to sort or filter their tables in a user-facing application, and to ensure that the performance of the query they make is consistent regardless of its size.

This is something we occasionally run into as well, and it is something we are looking to unblock, but it's something we need to do with caution so that users unfamiliar with what is going on in the background have visibility into the expected performance of their applications.

Pete

Hey @Pete_Hartnett,

Thanks for your answer.
On the one hand, I completely agree with your concerns; on the other hand, I am really worried about receiving incorrect data without any warning.

At the moment it is very likely that you get return values without knowing whether they refer to all entries or whether there are more than 1000. In that case the data is not only useless, but potentially dangerous.
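
To make the risk concrete, here is a rough sketch of the kind of guard we currently have to build ourselves. It is only an illustration assuming a record-fetch endpoint of the Tables API; the base URL, table ID, endpoint path, parameter names, and credentials below are placeholders, not confirmed API details.

```python
import requests

# Illustrative sketch only: the base URL, table ID, endpoint path, and
# parameter names are assumptions/placeholders, not a confirmed Tulip API.
BASE_URL = "https://your-instance.tulip.co/api/v3"
TABLE_ID = "your-table-id"
AUTH = ("api-key", "api-secret")  # placeholder credentials

QUERY_LIMIT = 1000  # the limit discussed in this thread


def fetch_query_page(limit=QUERY_LIMIT, offset=0):
    """Fetch one page of records (assumed endpoint and parameter names)."""
    resp = requests.get(
        f"{BASE_URL}/tables/{TABLE_ID}/records",
        params={"limit": limit, "offset": offset},
        auth=AUTH,
    )
    resp.raise_for_status()
    return resp.json()


records = fetch_query_page()
if len(records) >= QUERY_LIMIT:
    # A full page means there may be rows the aggregation never saw.
    print("WARNING: query hit the 1000-record limit; aggregates may be incomplete.")
```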


Totally makes sense. This feedback is incredibly valuable. I will log it and make sure it is tracked as we continue internal discussions on how to approach this limitation.

Pete


Hey @Pete_Hartnett

At present, are queries still limited to 1000 records? :fearful:

The limit is still 1000 records at this time. This remains a very active topic internally at Tulip and here in the forum, but there’s no date for when we would raise the limit or deploy new features or settings related to the 1000-record limit.

Hi, any update on this topic? We have some internal issues because of this query limit, and now we need to change some things and build workarounds to avoid it. Hopefully this query limit gets increased soon.


@RussWaddell Hey following up here as well. Thanks!

Hi @emendoza can you explain more about your use case? How many records are you looking to query?

Hi Pete, based on our current queries, up to 2k records would be great for now; however, having the capability to query up to 5k would be awesome!

Thanks

Even 5K is a relatively small number for high-volume production or (soon) collecting lots of high-speed machine data with automations. The focus should probably be on limiting the number of records returned from a query, and optimizing aggregates, rather than overly limiting the number of raw table records that might be involved in calculating those aggregates. Ideally, couldn’t most of that work be delegated to the data store itself?

Exactly.

The query limit must be the same as the limit on the number of records in a table. As long as the query limit is smaller, I run the risk of getting wrong results.

This is always a huge issue, because the user will not notice that something is wrong.

@pete Why is the use case a question? It is an issue for any use case.
Every time you do any aggregation, you get wrong results as soon as the table has more than 1000 entries.
This is dangerous for any serious productive usage.
Even sorting and querying only one record (as a workaround for min and max) is not reliable, because null values will be returned as the “max” (to me that’s misleading).

So as long as a table could possibly have more than 1000 entries at some point, you cannot use aggregations, because at some point you will get wrong results. And we have to be able to rely on the results.
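
As an illustration of the workaround this forces on us: a minimal sketch that pages through all records and computes min/max client-side, skipping null values. It reuses the assumed fetch_query_page helper from the earlier sketch, and the field name "value" is just a placeholder for whatever column you aggregate on.

```python
def client_side_min_max(field="value", page_size=100):
    """Compute min/max over the whole table by paginating through all records.

    Assumes the fetch_query_page(limit, offset) helper sketched above;
    "value" is a placeholder for the column being aggregated.
    """
    lo, hi, offset = None, None, 0
    while True:
        page = fetch_query_page(limit=page_size, offset=offset)
        if not page:
            break
        for record in page:
            v = record.get(field)
            if v is None:  # skip nulls instead of reporting them as the "max"
                continue
            lo = v if lo is None else min(lo, v)
            hi = v if hi is None else max(hi, v)
        offset += page_size
    return lo, hi
```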


Completely agree. Getting inaccurate data is the worst possible outcome.


@thorsten.langner In this case, I ask about the use case to better understand the performance requirements. For example, does the result need to be immediate (less than 1 second) or is a longer response time acceptable? Our customers have a wide range of use cases, so the extra context is always helpful. We agree that, in all cases, the data must be accurate and are reviewing this now.


Sorry for the late reply, I just didn’t see your post:

Here are some use cases:

  • Show all inventory items in an interactive table that are not marked as archived (status)
    Filtering in the app should be additional, so it’s possible to add those filters with an “or”
  • Find an order with the max or min value
  • Find the average cost of a quality report in a year
  • Show all unique materials of a storage list (we have a storage list with ~1,500 unique materials and ~10,000 entries in total)

In some cases the performance is more relevant, in others less.
But:

  • Performance is one main criterion for user acceptance
  • SQL databases deliver those results in milliseconds, even with 100,000 entries. This is one additional aspect to consider when deciding between Tulip Tables and an SQL database. Aggregations were released to increase the possibilities of Tables and reduce the need for database connections.

I wonder: what is the benefit of slower results, or what is the cost of good performance?
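
To illustrate the comparison: a minimal, self-contained sketch using SQLite from Python (table and column names are made up) that computes the same kind of aggregates over 100,000 rows, typically in a few milliseconds.

```python
import random
import sqlite3
import time

# Made-up table and column names, purely to illustrate the performance point.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, cost REAL)")
con.executemany(
    "INSERT INTO orders (cost) VALUES (?)",
    [(random.uniform(1, 500),) for _ in range(100_000)],
)

start = time.perf_counter()
count, lo, hi, avg = con.execute(
    "SELECT COUNT(*), MIN(cost), MAX(cost), AVG(cost) FROM orders"
).fetchone()
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"count={count}, min={lo:.2f}, max={hi:.2f}, avg={avg:.2f} "
      f"({elapsed_ms:.1f} ms)")
```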

Hi @pete
time goes by, but the limitation is still there.
So, if you (Tulip) can’t give us a solution, give us a workaround.

In my case (using the table aggregation (TA) for filtering tasks against stored finished tasks), I reached 1000 unique records of finished tasks within the last 2 months, and then the interactive table filter based on the TA no longer works; performance isn’t very important here.
So for me a workaround could be to start the aggregation from the newest entry backward and not from the first entry in the table.

So please give us more frequent updates on the state of this process (the last update was 3 months ago).

Regards Chris

@ChrisF

If you change the sort of the query, the limit should apply to the first 1000 rows that the query pulls. This should resolve your issue.
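
For anyone who wants the equivalent outside the app editor, a rough sketch of the same idea: sort descending by creation date so the 1000-record window covers the newest entries. It reuses the assumed BASE_URL, TABLE_ID, and AUTH placeholders from the earlier sketch; the sortBy/sortDir parameter names and the _createdAt field are likewise assumptions for illustration, not confirmed API details.

```python
def fetch_newest_records(limit=1000):
    """Fetch the newest records first, so the 1000-record window covers
    the most recent entries rather than the oldest ones.

    The sortBy/sortDir parameters and the _createdAt field name are
    assumptions for illustration, not confirmed API details.
    """
    resp = requests.get(
        f"{BASE_URL}/tables/{TABLE_ID}/records",
        params={
            "limit": limit,
            "sortBy": "_createdAt",  # assumed system field for creation time
            "sortDir": "desc",       # newest first
        },
        auth=AUTH,
    )
    resp.raise_for_status()
    return resp.json()
```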

We just ran into another issue where we needed the number of records.
Unfortunately, the aggregation always returns 1000 :upside_down_face:

So @pete … is there any update on this issue?

This is a critical feature gap. It leads to incorrect aggregation results, which is a no-go. Of course we can think about workarounds (e.g. downloading all table data and performing the aggregation outside of Tulip), but then, why are we using Tulip for basic things?


I must have missed this, since I have been using lots of aggregations and never thought to check whether they were pulling in all records in the table vs. just the top 1000. Most should be fine based on the sorting and the number of records in the tables, but a few may have been displaying incorrect info… I’ll need to investigate.

I understand the performance concerns; however, developers will quickly determine where aggregations can and can’t be used based on these performance issues while testing apps.

Is there a risk to our instances as a whole when one station is utilizing an aggregation from a massive table, or does it just cause the one station to slow down/freeze?