
U-SQL How can I get the current filename being processed to add to my extract output?

Tags: u-sql

I need to add metadata about the row being processed. Specifically, I need the name of the source file added as a column. I looked at the ambulance demos in the Git repo, but I can't figure out how to implement this.

Carolus Holman asked Dec 06 '16 15:12


1 Answer

You can use two U-SQL features called 'file sets' and 'virtual columns'. In my simple example, I have two files in my input directory. The file set pattern in the FROM clause defines the virtual columns {filename} and {extension}, and the EXTRACT statement lists them alongside the regular columns, e.g.

// File set with virtual columns: {filename} and {extension} are taken from each matching file's path
@q =
    EXTRACT rowId int,
            filename string,
            extension string
    FROM "/input/filesets example/{filename}.{extension}"
    USING Extractors.Tsv();


@output =
    SELECT filename,
           extension,
           COUNT( * ) AS records
    FROM @q
    GROUP BY filename,
             extension;


OUTPUT @output TO "/output/output.csv"
USING Outputters.Csv();

My results:

[Image: U-SQL results]

Read more about both features here:

https://msdn.microsoft.com/en-us/library/azure/mt621320.aspx
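
The example above aggregates per file. If the goal is to keep every row and carry the file name along as a column, as the question asks, a minimal sketch could look like the following. It assumes the same input folder and single-column TSV layout as above; the rowset name @withSource, the column name sourceFile, and the output path are made up for illustration.

```usql
// Sketch only: same file set pattern as above, but without the GROUP BY,
// so each extracted row keeps the name of the file it came from.
@rows =
    EXTRACT rowId int,
            filename string,
            extension string
    FROM "/input/filesets example/{filename}.{extension}"
    USING Extractors.Tsv();

// Rebuild the full file name from the two virtual columns.
// sourceFile is a hypothetical column name.
@withSource =
    SELECT rowId,
           filename + "." + extension AS sourceFile
    FROM @rows;

// Hypothetical output path; outputHeader writes the column names as the first line.
OUTPUT @withSource
TO "/output/rows_with_source.csv"
USING Outputters.Csv(outputHeader : true);
```

Because the virtual columns behave like ordinary columns once extracted, they can also be used in a WHERE clause, for example to restrict the job to files with a particular extension.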

wBob answered Jan 02 '23 23:01