
gnuplot: use regular expressions to parse string

How can I do the following in a gnuplot script?

1) Parse a string and extract a number and a letter/string from it.

2) Is it possible to use associative arrays, to avoid a chain of if statements?

files = system(sprintf("dir /b \"%s*.csv\"", inputPath))

do for [name in files]{

    # MY TROUBLE IS HERE
    [value, typeID] = parse(name, "*[%d%s]*"); # pseudocode
    typesList = {"h": 3600, "m": 60, "s": 1};

    scale = value * typesList[typeID];
    # MY TROUBLE IS ABOVE

    myfunc(y) = y * scale

    outputName = substr(name, 0, strlen(name) - strlen(".csv"))

    inputFullPath = inputPath.name
    outputFullPath = outputPath.outputName.outputExt

    plot inputFullPath using 1:(myfunc($2)) with lines ls 1 notitle
}

In my case, I need to get the number of seconds from file names of the form ...[d=17s]..., ...[d=2m]..., ...[d=15h]..., etc.

A more complicated case would be ...[d=2h7m31s]... (this is the general case; it is unlikely to be useful to me, but it would be interesting to know how to handle it).

Zhihar asked Sep 19 '25 06:09


1 Answer

gnuplot does not support regular expressions, but you can write a function that extracts the time in seconds from your filename. If the filename and the duration have a strict format, e.g. "...[d=2h7m31s]...", you can use the following code. Otherwise you will have to adapt it accordingly.

  1. First extract the 2h7m31s part as a substring, using strstrt() to find its boundaries
  2. parse it with strptime()
  3. and make an integer out of it with int()

Script:

### parse special time string

NAME = "Filename[d=2h7m31s].csv"

TimeExtract(s) = int(strptime("%Hh%Mm%Ss",s[strstrt(s,'[d=')+3:strstrt(s,']')-1]))
    
print TimeExtract(NAME)
### end of code

Result:

7651
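
To see what each of the three steps contributes, the one-liner can be unrolled as in the following sketch. The variable names iStart, iEnd and part are not from the script above; they are only chosen here for illustration.

```gnuplot
### unrolled version of TimeExtract(), for illustration only
NAME = "Filename[d=2h7m31s].csv"

iStart = strstrt(NAME,'[d=') + 3    # position of the first character after "[d="
iEnd   = strstrt(NAME,']') - 1      # position of the last character before "]"
part   = NAME[iStart:iEnd]          # the substring "2h7m31s"

print part
print int(strptime("%Hh%Mm%Ss", part))    # same value as TimeExtract(NAME)
### end of code
```

Note that strstrt() returns 0 if the search string is not found, so a file name without "[d=" or "]" would give a meaningless substring. If such names can occur, you would have to check for that first.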

Addition:

The following code also covers the other combinations, as long as the units appear in the order ...[d=..h..m..s]....

Update: (hopefully the final version)

The time format %H would wrap at 24 hours (actually, here it does so at 100 h). So, in order to get the correct time in seconds, the specifiers should be %tH, %tM and %tS (check help time_specifiers). With these, you can also parse strange formats like [d=100h100m100s].

Script:

### parse special time string
reset session

$Data <<EOD
abcd[d=31s]somethingelse.csv
efghi[d=7m]somethingelse.csv
jklmn[d=2h]somethingelse.csv
op[d=7m31s]somethingelse.csv
qr[d=2h31s]somethingelse.csv
uvw[d=2h7m]somethingelse.csv
xyz[d=2h7m31s]somethingelse.csv
aaa[d=100h100m100s]strangetime.csv
EOD

getTimeString(s) = s[strstrt(s,'[d=')+3:strstrt(s,']')-1]

getTimeFormat(s) = \
    (strstrt(getTimeString(s),'h') ? '%tHh' : '').\
    (strstrt(getTimeString(s),'m') ? '%tMm' : '').\
    (strstrt(getTimeString(s),'s') ? '%tSs' : '')

extractTime(s) = int(strptime(getTimeFormat(s),getTimeString(s)))

do for [i=1:|$Data|] {
    s = $Data[i]
    print sprintf("% 12s   %d",getTimeString(s),extractTime(s))
}
### end of script

Result:

         31s   31
          7m   420
          2h   7200
       7m31s   451
       2h31s   7231
        2h7m   7620
     2h7m31s   7651
100h100m100s   366100
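
Applied to the loop from the question, extractTime() could replace the pseudocode part directly, so that no lookup table for the units is needed. The following is only a sketch: the file names are made up, the list in files stands in for the result of the system() call, and the plot command is left as a comment because the actual data files are not available here.

```gnuplot
### sketch: use the extracted duration as scale factor (hypothetical file names)
getTimeString(s) = s[strstrt(s,'[d=')+3:strstrt(s,']')-1]
getTimeFormat(s) = \
    (strstrt(getTimeString(s),'h') ? '%tHh' : '').\
    (strstrt(getTimeString(s),'m') ? '%tMm' : '').\
    (strstrt(getTimeString(s),'s') ? '%tSs' : '')
extractTime(s) = int(strptime(getTimeFormat(s),getTimeString(s)))

# assumption: stand-in for the list returned by system("dir /b ...")
files = "run[d=17s].csv run[d=2m].csv run[d=15h].csv"

do for [name in files] {
    scale = extractTime(name)    # duration in seconds taken from the file name
    myfunc(y) = y * scale
    print sprintf("%s   scale = %d", name, scale)
    # plot name using 1:(myfunc($2)) with lines notitle
}
### end of sketch
```

Since do for [name in files] splits the list at whitespace, this only works as long as the file names themselves contain no spaces.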

theozh answered Sep 22 '25 03:09