Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

FileTable Delimited String Split

EDIT - To be clear I will put the goal sentence at the top. The test and my question is whether there is a way to get the same performance as a temp table without using a temp table.

I feel this should be an easy question but I am stuck. I am experimenting with FileTables in SQL2014. I know of a few alternatives that would work well but the goal is to establish feasibility of extracting substrings of text from a filetable.

This test has 35,000 text files with one line of text as follows with each file having an average of 100 bytes of non-unicode text.

Aaa|Bbb|Ccc|Ddd|Eee|Fff|Ggg

The desired output is one row for each file and the delimited string to be split into seven columns.

I have found a quick string parser function, but running on the filestream has a significant performance impact compared to a varchar column.

This query takes 18 seconds to run. I am trying to have the conversion from filestream to varchar execute only once but I think calling the UDF may be causing it to happen for every row(file).

Create View vAddresses As
Select file_type, Convert(Varchar(8000),file_stream) TextData /* Into #Temp */ From InputFiles Where file_type = 'adr'
Go
Select  --TextData,
        dbo.udf_StringSplit(TextData, 1, '|'), dbo.udf_StringSplit(TextData, 2, '|'), dbo.udf_StringSplit(TextData, 3, '|'),
        dbo.udf_StringSplit(TextData, 4, '|'), dbo.udf_StringSplit(TextData, 5, '|'), dbo.udf_StringSplit(TextData, 6, '|'),
        dbo.udf_StringSplit(TextData, 7, '|')--, TextData
    From vAddresses

I have tried it as a view, a cte and a subquery. The only thing that seems to help is creating a temp table. Creating the temp table takes 1 second and the query takes one second. So for 35k rows 2 seconds total query time vs. 18 seconds.

Drop Table #Temp
(Select file_type, Convert(Varchar(8000),file_stream) TextData Into #Temp From HumanaInputFiles Where file_type = 'adr')
Select  --TextData,
        dbo.udf_StringSplit(TextData, 1, '|'), dbo.udf_StringSplit(TextData, 2, '|'), dbo.udf_StringSplit(TextData, 3, '|'),
        dbo.udf_StringSplit(TextData, 4, '|'), dbo.udf_StringSplit(TextData, 5, '|'), dbo.udf_StringSplit(TextData, 6, '|'),
        dbo.udf_StringSplit(TextData, 7, '|')--, TextData
    From #Temp

I have read many posts and blogs on both the topics of filetables and temp table vs single query performance but I cant seem to figure it out. It might have something to do with sargable or statistics? Any advice is much appreciated.

Here is the UDF, I found it on an MSDN blog / forum and is the best performer I have found so far.

ALTER FUNCTION [dbo].[udf_StringSplit](
 @TEXT      varchar(8000)
,@COLUMN    tinyint
,@SEPARATOR char(1)
)RETURNS varchar(8000)
AS
  BEGIN
       DECLARE @POS_START  int = 1
       DECLARE @POS_END    int = CHARINDEX(@SEPARATOR, @TEXT, @POS_START)

       WHILE (@COLUMN >1 AND @POS_END> 0)
         BEGIN
             SET @POS_START = @POS_END + 1
             SET @POS_END = CHARINDEX(@SEPARATOR, @TEXT, @POS_START)
             SET @COLUMN = @COLUMN - 1
         END 

       IF @COLUMN > 1  SET @POS_START = LEN(@TEXT) + 1
       IF @POS_END = 0 SET @POS_END = LEN(@TEXT) + 1 

       RETURN SUBSTRING (@TEXT, @POS_START, @POS_END - @POS_START)
  END

This is the execution plan for the temp table.

<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.2" Build="12.0.4100.1" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="17486" StatementId="1" StatementOptmLevel="TRIVIAL" CardinalityEstimationModelVersion="120" StatementSubTreeCost="0.166487" StatementText="Select --TextData,&#xD;&#xA;       dbo.udf_StringSplit(TextData, 1, '|'), dbo.udf_StringSplit(TextData, 2, '|'), dbo.udf_StringSplit(TextData, 3, '|'),&#xD;&#xA;      dbo.udf_StringSplit(TextData, 4, '|'), dbo.udf_StringSplit(TextData, 5, '|'), dbo.udf_StringSplit(TextData, 6, '|'),&#xD;&#xA;      dbo.udf_StringSplit(TextData, 7, '|')--, TextData&#xD;&#xA; From #Temp" StatementType="SELECT" QueryHash="0xC4D6F0215D332F3D" QueryPlanHash="0xC50CFAF9494B5DBE" RetrievedFromCache="true">
          <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
          <QueryPlan DegreeOfParallelism="0" NonParallelPlanReason="CouldNotGenerateValidParallelPlan" CachedPlanSize="24" CompileTime="1" CompileCPU="1" CompileMemory="168">
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="838735" EstimatedPagesCached="419367" EstimatedAvailableDegreeOfParallelism="4" />
            <RelOp AvgRowSize="28023" EstimateCPU="0.0017486" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="17486" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.166487">
              <OutputList>
                <ColumnReference Column="Expr1003" />
                <ColumnReference Column="Expr1004" />
                <ColumnReference Column="Expr1005" />
                <ColumnReference Column="Expr1006" />
                <ColumnReference Column="Expr1007" />
                <ColumnReference Column="Expr1008" />
                <ColumnReference Column="Expr1009" />
              </OutputList>
              <RunTimeInformation>
                <RunTimeCountersPerThread Thread="0" ActualRows="17486" ActualEndOfScans="1" ActualExecutions="1" />
              </RunTimeInformation>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1003" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(1),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(1)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1004" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(2),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(2)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1005" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(3),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(3)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1006" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(4),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(4)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1007" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(5),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(5)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1008" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(6),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(6)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1009" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([tempdb].[dbo].[#Temp].[TextData],(7),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(7)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="4011" EstimateCPU="0.0193131" EstimateIO="0.145426" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="17486" LogicalOp="Table Scan" NodeId="1" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.164739" TableCardinality="17486">
                  <OutputList>
                    <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                  </OutputList>
                  <RunTimeInformation>
                    <RunTimeCountersPerThread Thread="0" ActualRows="17486" ActualEndOfScans="1" ActualExecutions="1" />
                  </RunTimeInformation>
                  <TableScan Ordered="false" ForcedIndex="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" Column="TextData" />
                      </DefinedValue>
                    </DefinedValues>
                    <Object Database="[tempdb]" Schema="[dbo]" Table="[#Temp]" IndexKind="Heap" Storage="RowStore" />
                  </TableScan>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

This is the plan for the view.

<?xml version="1.0" encoding="utf-16"?>
<ShowPlanXML xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" Version="1.2" Build="12.0.4100.1" xmlns="http://schemas.microsoft.com/sqlserver/2004/07/showplan">
  <BatchSequence>
    <Batch>
      <Statements>
        <StmtSimple StatementCompId="1" StatementEstRows="17486" StatementId="1" StatementOptmLevel="FULL" StatementOptmEarlyAbortReason="GoodEnoughPlanFound" CardinalityEstimationModelVersion="120" StatementSubTreeCost="0.905265" StatementText="Select    --TextData,&#xD;&#xA;       dbo.udf_StringSplit(TextData, 1, '|'), dbo.udf_StringSplit(TextData, 2, '|'), dbo.udf_StringSplit(TextData, 3, '|'),&#xD;&#xA;      dbo.udf_StringSplit(TextData, 4, '|'), dbo.udf_StringSplit(TextData, 5, '|'), dbo.udf_StringSplit(TextData, 6, '|'),&#xD;&#xA;      dbo.udf_StringSplit(TextData, 7, '|')--, TextData&#xD;&#xA; From vAddresses" StatementType="SELECT" QueryHash="0xB4F8A0B288802C4E" QueryPlanHash="0x28DA02D774B1AF53" RetrievedFromCache="true">
          <StatementSetOptions ANSI_NULLS="true" ANSI_PADDING="true" ANSI_WARNINGS="true" ARITHABORT="true" CONCAT_NULL_YIELDS_NULL="true" NUMERIC_ROUNDABORT="false" QUOTED_IDENTIFIER="true" />
          <QueryPlan DegreeOfParallelism="0" NonParallelPlanReason="CouldNotGenerateValidParallelPlan" CachedPlanSize="32" CompileTime="3" CompileCPU="3" CompileMemory="520">
            <Warnings>
              <PlanAffectingConvert ConvertIssue="Cardinality Estimate" Expression="CONVERT(varchar(8000),[DmProd01].[dbo].[HumanaInputFiles].[file_stream],0)" />
            </Warnings>
            <MemoryGrantInfo SerialRequiredMemory="0" SerialDesiredMemory="0" />
            <OptimizerHardwareDependentProperties EstimatedAvailableMemoryGrant="838735" EstimatedPagesCached="419367" EstimatedAvailableDegreeOfParallelism="4" />
            <RelOp AvgRowSize="28023" EstimateCPU="0.0017486" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="17486" LogicalOp="Compute Scalar" NodeId="0" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.905265">
              <OutputList>
                <ColumnReference Column="Expr1004" />
                <ColumnReference Column="Expr1005" />
                <ColumnReference Column="Expr1006" />
                <ColumnReference Column="Expr1007" />
                <ColumnReference Column="Expr1008" />
                <ColumnReference Column="Expr1009" />
                <ColumnReference Column="Expr1010" />
              </OutputList>
              <RunTimeInformation>
                <RunTimeCountersPerThread Thread="0" ActualRows="17486" ActualEndOfScans="1" ActualExecutions="1" />
              </RunTimeInformation>
              <ComputeScalar>
                <DefinedValues>
                  <DefinedValue>
                    <ColumnReference Column="Expr1004" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(1),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(1)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1005" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(2),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(2)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1006" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(3),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(3)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1007" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(4),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(4)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1008" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(5),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(5)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1009" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(6),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(6)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                  <DefinedValue>
                    <ColumnReference Column="Expr1010" />
                    <ScalarOperator ScalarString="[DmProd01].[dbo].[udf_StringSplit]([Expr1011],(7),'|')">
                      <UserDefinedFunction FunctionName="[DmProd01].[dbo].[udf_StringSplit]">
                        <ScalarOperator>
                          <Identifier>
                            <ColumnReference Column="Expr1011" />
                          </Identifier>
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="(7)" />
                        </ScalarOperator>
                        <ScalarOperator>
                          <Const ConstValue="'|'" />
                        </ScalarOperator>
                      </UserDefinedFunction>
                    </ScalarOperator>
                  </DefinedValue>
                </DefinedValues>
                <RelOp AvgRowSize="4019" EstimateCPU="0.0034972" EstimateIO="0" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="17486" LogicalOp="Compute Scalar" NodeId="1" Parallel="false" PhysicalOp="Compute Scalar" EstimatedTotalSubtreeCost="0.88673">
                  <OutputList>
                    <ColumnReference Column="Expr1011" />
                  </OutputList>
                  <ComputeScalar>
                    <DefinedValues>
                      <DefinedValue>
                        <ColumnReference Column="Expr1011" />
                        <ScalarOperator ScalarString="CONVERT(varchar(8000),[DmProd01].[dbo].[HumanaInputFiles].[file_stream],0)">
                          <Convert DataType="varchar" Length="8000" Style="0" Implicit="false">
                            <ScalarOperator>
                              <Identifier>
                                <ColumnReference Database="[DmProd01]" Schema="[dbo]" Table="[HumanaInputFiles]" Column="file_stream" />
                              </Identifier>
                            </ScalarOperator>
                          </Convert>
                        </ScalarOperator>
                      </DefinedValue>
                    </DefinedValues>
                    <RelOp AvgRowSize="4043" EstimateCPU="0.0386262" EstimateIO="0.844606" EstimateRebinds="0" EstimateRewinds="0" EstimatedExecutionMode="Row" EstimateRows="17486" LogicalOp="Table Scan" NodeId="2" Parallel="false" PhysicalOp="Table Scan" EstimatedTotalSubtreeCost="0.883233" TableCardinality="34972">
                      <OutputList>
                        <ColumnReference Database="[DmProd01]" Schema="[dbo]" Table="[HumanaInputFiles]" Column="file_stream" />
                      </OutputList>
                      <RunTimeInformation>
                        <RunTimeCountersPerThread Thread="0" ActualRows="17486" ActualEndOfScans="1" ActualExecutions="1" />
                      </RunTimeInformation>
                      <TableScan Ordered="false" ForcedIndex="false" ForceScan="false" NoExpandHint="false" Storage="RowStore">
                        <DefinedValues>
                          <DefinedValue>
                            <ColumnReference Database="[DmProd01]" Schema="[dbo]" Table="[HumanaInputFiles]" Column="file_stream" />
                          </DefinedValue>
                        </DefinedValues>
                        <Object Database="[DmProd01]" Schema="[dbo]" Table="[HumanaInputFiles]" IndexKind="Heap" Storage="RowStore" />
                        <Predicate>
                          <ScalarOperator ScalarString="[DmProd01].[dbo].[HumanaInputFiles].[file_type]=N'adr'">
                            <Compare CompareOp="EQ">
                              <ScalarOperator>
                                <Identifier>
                                  <ColumnReference Database="[DmProd01]" Schema="[dbo]" Table="[HumanaInputFiles]" Column="file_type" ComputedColumn="true" />
                                </Identifier>
                              </ScalarOperator>
                              <ScalarOperator>
                                <Const ConstValue="N'adr'" />
                              </ScalarOperator>
                            </Compare>
                          </ScalarOperator>
                        </Predicate>
                      </TableScan>
                    </RelOp>
                  </ComputeScalar>
                </RelOp>
              </ComputeScalar>
            </RelOp>
          </QueryPlan>
        </StmtSimple>
      </Statements>
    </Batch>
  </BatchSequence>
</ShowPlanXML>

EDIT: Searched on this differently and found an answer is to use top and order by. This got it down to 4 seconds. Seems kind of hokey and still dont explain how looking at query plans helps figure this out, so not going to answer this myself, instead leave it open.

like image 756
Joe C Avatar asked Jul 04 '17 15:07

Joe C


People also ask

What is the difference between FileStream and FileTable?

FileStream and FileTable are features of SQL Server for storing unstructured data in SQL Server alongside other data. The FileStream feature stores unstructured data in the file system and keeps a pointer of the data in the database, whereas FileTable extends this feature even further allowing non-transactional access.

How do I split a string in SQL?

The STRING_SPLIT(string, separator) function in SQL Server splits the string in the first argument by the separator in the second argument. To split a sentence into words, specify the sentence as the first argument of the STRING_SPLIT() function and ' ' as the second argument.


1 Answers

There are recomendations in MSDN to not use FileTable in your case:

The FileTable feature builds on top of SQL Server FILESTREAM technology.

For small objects perfomance of FileStream is not good. Filestream is designed to work with files about 1MB and more, but you have only 100 bytes(https://docs.microsoft.com/en-us/sql/relational-databases/blob/filestream-sql-server):

When to Use FILESTREAM In SQL Server, BLOBs can be standard varbinary(max) data that stores the data in tables, or FILESTREAM varbinary(max) objects that store the data in the file system. The size and use of the data determines whether you should use database storage or file system storage. If the following conditions are true, you should consider using FILESTREAM:

  • Objects that are being stored are, on average, larger than 1 MB.
  • Fast read access is important.
  • You are developing applications that use a middle tier for application logic.

For smaller objects, storing varbinary(max) BLOBs in the database often provides better streaming performance.

You can emulate persistent column - create another table and populate it with triggers. In this case you can receive bonuses from both cases

P.S. You can use inline TVF + union all to implement @TT 's comment

like image 200
Mikhail Lobanov Avatar answered Sep 24 '22 14:09

Mikhail Lobanov