Sunday, June 14, 2009

Quick load of max field value

Sometimes, after loading a big table from several different sources I need to find the largest value of a field, or the smallest, or the average, etc. If the name of the big table is BIGTABLE and the field where I want to find the largest value is FDate then I could do something like this in the loadscript:

MAX_FDATE:
Load Max(FDate) resident BIGTABLE;

Depending on the size of BIGTABLE that could take a long time. There's a faster way - the first step is to build a table that contains all of the values of FDate. That looks like this:

ALL_FDATE:
Load FieldValue('FDate',IterNo()) as FDate
AutoGenerate(1)
While not Isnull(FieldValue('FDate',IterNo()));

That will load all of the existing distinct values of FDate almost instantaneously. It really is all of the values of FDate from all tables loaded up to that point in the loadscript and not just BIGTABLE so be aware of that if you use this method.
The second step is just to load the max FDate from our smaller ALL_FDATE table which should run much faster than loading from the resident BIG_TABLE:

MAX_FDATE:
Load Max(FDate) resident ALL_FDATE;
Drop table ALL_FDATE; //no longer needed

11 comments:

Anonymous said...

Thanks a lot for this piece of code!
It's really great!

Dae

Anonymous said...

Thanks a lot for the code!
It's great!

Dae

Anonymous said...

This is great, I wasn't aware that fielvalue returns distinct values only - is this much faster than creating a table with 'Load Distinct FDate'?

Lukas

-TB said...

The technique shown in the blog post is much faster than "load distinct fdate".

But, in QlikView version 9 there is a new loadscript LOAD syntax that can load data from existing field values ("LOAD ... FROM_FIELD...). That new syntax would be just as fast and easier to understand for someone reading the code.

Nicolas said...

I need to retrieve the max(Year) from a table that I previously loaded. How would you write the syntax using the 'From_field' statement?

Graham said...

To Tim,
Like Nicholas, I'm facing the same task and would appreciate some guidance concerning what the actual code should look like - would you be kind enough to post the script code here please?

Nerver Say No said...

Hi i need to save the output in text formatted save the file into desired location using file browser.how it possible. and i need use the file browser for importing the bookmark xml files.Plz help me on this,,
Many Thanks
vasu

Anonymous said...

Hi,

Thanks for the trick. Can this code be modified for the following problem ?

Suppose FieldX exists in Table A and FieldX also exists in Table B and both are associated via FieldX. However, I need to find min and max of FieldX only in Table A. Since the size of Table A is very big, it takes a lot of time to scan through all content.

Anonymous said...

I am also stuck in a similar situation where FieldX exists in Table A and FieldX also exists in Table B and both are associated via FieldX. However, I need to find min and max of FieldX only in Table A. Since the size of Table A is very big, it takes a lot of time to scan through all content. Therefore, please share your thoughts if you got a resolution to this problem.

Ben Wiens said...

This code snippet was very useful and reduce our incremental loadings from 7 minutes to 30 seconds. The new code snippet also didn't let the RAM explode like the old one.

So simple but so effective - thanks a lot!!!

Anonymous said...

(V11SR4)

Interestingly you cannot change the name of the source field in the LOAD statement or event a preceding LOAD. If you do, the table will contain a '?' in every row

original code:
Load FieldValue('FDate',IterNo()) as FDate

bad code:
Load FieldValue('FDate',IterNo()) as FDate_renamed_to_something_else

You will get no error but field 'FDate_renamed_to_something_else' will not contain what you expect.