
Python one-liner to extract field

Tags: python

Input:

$ ./ffmpeg -i test020.3gp
ffmpeg version UNKNOWN, Copyright (c) 2000-2011 the FFmpeg developers
  built on May  5 2011 14:30:25 with gcc 4.4.3
  configuration: 
  libavutil    51.  2. 0 / 51.  2. 0
  libavcodec   53.  3. 0 / 53.  3. 0
  libavformat  53.  0. 3 / 53.  0. 3
  libavdevice  53.  0. 0 / 53.  0. 0
  libavfilter   2.  4. 0 /  2.  4. 0
  libswscale    0. 14. 0 /  0. 14. 0
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'test020.3gp':
  Metadata:
    major_brand     : 3gp4
    minor_version   : 512
    compatible_brands: 3gp4
    creation_time   : 2004-07-01 09:59:21
  Duration: 00:01:02.20, start: 0.000000, bitrate: 284 kb/s
    Stream #0.0(und): Audio: aac, 44100 Hz, stereo, s16, 96 kb/s
    Metadata:
      creation_time   : 2004-07-01 09:59:21
    Stream #0.1(und): Video: mpeg4, yuv420p, 176x120 [PAR 1:1 DAR 22:15], 184 kb/s, 15 fps, 15 tbr, 30k tbn, 15 tbc
    Metadata:
      creation_time   : 2004-07-01 09:59:23
At least one output file must be specified

Let's say I would like to extract the width & height using the following regexp:

(\d+x\d+)

Using Perl, I'd do something like this:

$ ./ffmpeg -i test020.3gp 2>&1 | perl -lane 'print $1 if /(\d+x\d+)/'
176x120

Then I tried to construct a similar Python one-liner. It sort of works, but not perfectly:

$ ./ffmpeg -i test020.3gp 2>&1 | python -c "import sys,re;[sys.stdout.write(str(re.findall(r'(\d+x\d+)', line))) for line in sys.stdin]"
[][][][][][][][][][][][][][][][][][][]['176x120'][][][]

What does a Python one-liner that corresponds to the Perl one look like?

Fredrik Pihl asked Jan 18 '23 18:01


2 Answers

What you want is re.search instead of re.findall.

This does the trick, even if the one-liner itself is a bit "ugly" (/tmp/p is just the sample data you gave):

% cat /tmp/p 2>&1 | python -c "import re,sys; print(re.search(r'(\d+x\d+)', sys.stdin.read()).group())"
176x120
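
If you want something closer to what the Perl one-liner does (test each line and print only when it matches), the same re.search idea can be applied line by line. This is just a minimal sketch, assuming Python 3. The helper name first_matches and the SAMPLE list are made up for illustration, with SAMPLE holding three lines copied from the ffmpeg output in the question:

```python
import re

# Same pattern as in the question, compiled once for reuse.
DIMENSIONS = re.compile(r'\d+x\d+')

# Three lines taken from the ffmpeg output shown in the question.
SAMPLE = [
    "  Duration: 00:01:02.20, start: 0.000000, bitrate: 284 kb/s",
    "    Stream #0.0(und): Audio: aac, 44100 Hz, stereo, s16, 96 kb/s",
    "    Stream #0.1(und): Video: mpeg4, yuv420p, 176x120 [PAR 1:1 DAR 22:15], "
    "184 kb/s, 15 fps, 15 tbr, 30k tbn, 15 tbc",
]

def first_matches(lines):
    """Yield the first WIDTHxHEIGHT match of every line that has one."""
    for line in lines:
        match = DIMENSIONS.search(line)
        if match:  # lines without a match are skipped, like `print $1 if /.../`
            yield match.group()

for dims in first_matches(SAMPLE):
    print(dims)  # prints 176x120
```

In a real script you would pass sys.stdin instead of SAMPLE, since a file object iterates line by line.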

Any reason you're not just using grep (egrep in this case)?

% cat /tmp/p | egrep -o '[0-9]+x[0-9]+'
176x120
jathanism answered Jan 28 '23 03:01


cat sample.txt | python -c "import sys,re; print('\n'.join(re.findall(r'(\d+x\d+)',sys.stdin.read())))"
176x120
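
For what it's worth, the attempt in the question printed [] for every line because re.findall always returns a list (empty when nothing matches) and it was called once per line. Here is a rough sketch of the difference, assuming Python 3. The text variable is a made-up two-line sample, abridged from the ffmpeg output in the question:

```python
import re

# Two lines abridged from the ffmpeg output in the question.
text = (
    "  Duration: 00:01:02.20, start: 0.000000, bitrate: 284 kb/s\n"
    "    Stream #0.1(und): Video: mpeg4, yuv420p, 176x120 [PAR 1:1 DAR 22:15], 184 kb/s\n"
)
pattern = r'(\d+x\d+)'

# Called per line, findall gives one list per line, mostly empty.
per_line = [re.findall(pattern, line) for line in text.splitlines()]

# Called once on the whole input, findall gives a single list of all matches.
whole = re.findall(pattern, text)

# search returns only the first match, as a match object (or None).
first = re.search(pattern, text)

print(per_line)        # [[], ['176x120']]
print(whole)           # ['176x120']
print(first.group(1))  # 176x120
```

Joining the whole-input list with '\n' is what keeps the output clean: no matches means an empty string rather than a row of brackets.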
MattH answered Jan 28 '23 02:01