Find comments using %, ignore printf with %d using regex

Tags: regex, matlab

I'm trying to pull out the comments in a MATLAB file. In MATLAB, comments start with %, so the obvious approach is to search for %.*. However, MATLAB also has functions such as sprintf and fprintf that take format strings, as in sprintf('x = %d', 5), and that regex would also match %d', 5), which I don't want. The same goes for other format specifiers such as %s or %f. Is there a way to capture only the segments that match %.* but are not enclosed in ' characters?

To clarify: I want to capture comments starting with %, while ignoring any % inside a string literal. The sprintf call is just one example of an occurrence I want to ignore.

I found this question, which seems related, but no solutions posted there solve my problem.

zephyr asked Dec 11 '25 19:12


2 Answers

My final regex:

  • ^(^[^']+|[^']+('.*')+[^']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)

regexp('%i am a comment', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')

Response:

com2: 'i am a comment'
com: []

regexp('printf () ; %i am a comment after a command','^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')

Response:

com2: []
com: 'i am a comment after a command'

regexp('printf ('' % i m not a comment '') , %i am a comment after a command followed by comma', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')

Response:

com2: []
com: 'i am a comment after a command followed by comma'

This case checks that a % inside a string literal is not caught as a comment:

regexp('printf('' ;%i m not a comment '');', '^(^[^'']+|[^'']+(''.*'')+[^'']+)?(;|,)\s*%(?<com>.*)|^(\s)*%(?<com2>.*)', 'names')

ans =

0x0 struct array with fields:
com2
com

The comments are returned in the fields com and com2 of the resulting struct.
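A regex has to approximate "is this % inside a string?", so as a cross-check here is a minimal sketch of the same idea written as an explicit scan that tracks the quote state character by character. The function name extractComment is hypothetical, not a MATLAB built-in. It assumes single-quoted char arrays only and treats ' directly after an identifier, closing bracket, dot or another ' as a transpose. It does not handle "double-quoted" strings, text after a ... continuation, or %{ ... %} blocks.

```matlab
function comment = extractComment(line)
% extractComment  (hypothetical helper) Return the text after the first %
% that lies outside a single-quoted string, or '' if there is none.
    comment = '';
    inString = false;
    n = numel(line);
    k = 1;
    while k <= n
        c = line(k);
        if inString
            if c == ''''
                if k < n && line(k+1) == ''''
                    k = k + 1;          % '' inside a string is an escaped quote
                else
                    inString = false;   % closing quote
                end
            end
        elseif c == '%'
            comment = line(k+1:end);    % everything after the % is the comment
            return
        elseif c == ''''
            % Decide between transpose and string start from the previous char.
            if k > 1
                prev = line(k-1);
            else
                prev = ' ';
            end
            isTranspose = isstrprop(prev, 'alphanum') || any(prev == '_)]}.''');
            inString = ~isTranspose;
        end
        k = k + 1;
    end
end
```

Saved as extractComment.m, a call such as extractComment('fprintf(''%d'',n)%do this') should return 'do this', while extractComment('printf('' ;%i m not a comment '');') should return an empty char array.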

Abr001am answered Dec 13 '25 09:12


This doesn't meet the question's requirements, but I thought I'd share it anyway.

If MATLAB is available, you can run the publish function and then pull the comments out of the generated HTML with grep.

So for the following function in myfun.m:

function [out] = myfun(n) 
% Comment
out = ['% Not a ',... this is a comment too
    'comment'];
fprintf('%d',n)%do this
%{
 Multiline
 comment
%}

we run

publish('myfun.m')

which produces the file html/myfun.html. Now with e.g. bash, we can run

egrep -o -e "<span class=\"comment\">.*?</span>" html/myfun.html

which returns

<span class="comment">% Comment</span>
<span class="comment"> this is a comment too</span>
<span class="comment">%do this</span>
<span class="comment">%}</span>

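This is not quite there: most of the multiline block comment is missing, because publish splits it across several lines and grep matches one line at a time. The HTML looks like this: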

<span class="comment">%{
</span><span class="comment"> Multiline
</span><span class="comment"> comment, n&gt;2
</span><span class="comment">%}</span>

Matching these spans needs a multiline search; see the question "How can I search for a multiline pattern in a file?"
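Since MATLAB is already in the loop here, one possible way around the line-by-line limitation is to do the extraction in MATLAB itself, where . in regexp matches newlines by default. The following is only a sketch: the function name commentsFromPublishedHtml is hypothetical, and it assumes the published HTML marks comments with <span class="comment"> exactly as in the output shown above.

```matlab
function comments = commentsFromPublishedHtml(htmlFile)
% commentsFromPublishedHtml  (hypothetical helper) Collect the text of every
% <span class="comment">...</span> in a file produced by publish.
    html = fileread(htmlFile);

    % The lazy (.*?) stops at the first closing tag; since . also matches
    % newlines in MATLAB's regexp, spans broken across lines are found too.
    tokens = regexp(html, '<span class="comment">(.*?)</span>', 'tokens');
    comments = cellfun(@(t) t{1}, tokens, 'UniformOutput', false);

    % Drop the trailing newline that publish leaves inside some spans.
    comments = regexprep(comments, '\s+$', '');

    % Undo the basic HTML escaping (decode &amp; last).
    comments = strrep(comments, '&lt;', '<');
    comments = strrep(comments, '&gt;', '>');
    comments = strrep(comments, '&amp;', '&');
end
```

A call such as c = commentsFromPublishedHtml(fullfile('html', 'myfun.html')) followed by fprintf('%s\n', c{:}) would then list each comment piece on its own line, including the pieces of the %{ ... %} block.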

Steve answered Dec 13 '25 09:12


