Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

matching for group (Expressions) in same column in Spotfire

Regarding the previous question which I have posted: calculation the difference for same column for the specific rows in Spotfire

I have a new problem for it, below is the sample:

enter image description here

The new function that i want to realize is,

  • searching for the next nearest TYPE=0 for TYPE=1
  • if the TYPE=1 has nearest TYPE=0, then mark it as 'T' in a new calculated column, otherwise as NULL

Data Rules:

  • Status column contains{1,2} in order, the default value of the null space is same as the last nearest Status value above it.
  • Type column contains only 0 and 1 randomly

output should be like this:

enter image description here

the solution what i have tried:

    If(([type]=1) and (first([type]) OVER (intersect(previous([type]),AllNext([status])))=0),"T",Null)

it looks fine, but the problem is in each status group, for example the last TYPE=1(the 5th row) in the first status=1 group, it has not next nearest TYPE=0, so the judgement would be Null. But based on the code, it is T! :(

any suggestion and idea for it? thanks a lot'! PS: some details:

  • the first Type Value of the status is NULL
  • Other null space in the status column can be filled as below, if it is helpful for the expression :):

enter image description here

like image 872
ZAWD Avatar asked Sep 27 '16 12:09

ZAWD


2 Answers

Ok, this one really tested my limits and I'm curious if it will scale. It works for your data, given where you have NULL. It took a few hours to figure out.

  1. Insert a calculated column RowId() and name it RowNum
  2. Insert a calculated column RankReal([status],"ties.method=first") and name it Rank
  3. Insert a calculated column If((first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0,[Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum])))) and name it GroupOfTypes
  4. Inert a calculated column If([type]>Min([type]) over (Intersect([GroupOfTypes],AllNext([RowNum]))),"T") and name it Marking. This is the row you really care about.

Results

enter image description here

EXPLANATION

RankReal([status],"ties.method=first")

This is done to essentially create a pseudo rownumber based on a segment of statuses. You'll see it ranks based on the status, sequentially. It's the first step in my method to group your data.

If((first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0,[Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum]))))

The first part of this, (first([status]) over (Previous([RowNum])) - First([status]) over ([RowNum]))=0 is evaluating if the previous row's [status] column is the same as the current row. If it is, it returns a boolean TRUE value. If it is the same as it's previous row, we know that it belongs in the same block / grouping, so we do some math to label this column with the same value for the entire block. That is [Rank] - Max([RowNum]) OVER (Intersect([status],AllPrevious([RowNum]))). Therefore, each row within our grouping will equate to the same value.

If([type]>Min([type]) over (Intersect([GroupOfTypes],AllNext([RowNum]))),"T")

Lastly we evaluate if the [type] is larger than the mnimal [type] over all the next rows, based on the rownumber. This limits the data we focus on to those where [type] = 1 without actually filtering the rows, while only looking forward in the dataset. If this is true we flag it with T.

like image 179
S3S Avatar answered Nov 11 '22 12:11

S3S


@ZAWD - Another way of solving this.

Step 1: Created a RowID using the expression RowId()

Step 2: Created a calculated column 'Mark0' using the expression below. This step is to find current type is not 0 and its consecutive type is 0.

Note:This column runs in the background. Need not be included in the table. Also, 100 is some dummy value which is only used to ensure that the condition is satisfied

If((Sum([type]) over ([RowID])!=0) and (Sum([type]) over (Next([RowID]))=0),100)

Step 3: Created a calculated column 'Mark1' using the expression below. This step is to find current type is not 0 and its consecutive type is also not 0 and Mark0 is filled.

Note: This column runs in the background. Need not be included in the table

If((Sum([type]) over ([RowID])!=0) and (Sum([type]) over (Next([RowID])) Is Not Null) and (first([Mark0]) over (allNext([RowID]))=100),100)

Step 4: Finally, created 'final mark' column using the expression below which marks 100 in Mark0 and Mark1 columns as 'T'

If(([Mark0]=100) or ([Mark1]=100),"T",null)

Final Table:

nearest I have tested this with your data as well as different scenarios like three consecutive 1s in 'type' column instead of two and seems to be working fine. Please test it and let me know if this is stable.

like image 44
ksp585 Avatar answered Nov 11 '22 12:11

ksp585