
Check that a substring matches the start of a string and is at least 4 characters long

Tags: bash, awk

The example string is:

abcdefghijklmno

For the following inputs, the expected results are:

abc                 FALSE    #at least 4 characters.
abcd                TRUE
cdefg               FALSE    #because the match must start from the first character.
abcde               TRUE
abcdeghi            FALSE    #because the characters must be contained consecutively.
abcdefgh            TRUE
abcdefghi           TRUE
abcdefghijklmno     TRUE
abcdefghijklmnop    FALSE    #because it exceeds the example string.

I have tried:

set -- abc
i=1
[[ abcdefghijklmno == ${!i}* ]]
echo $?

But echo "$?" also returns 0 when the input has 3, 2, 1 or 0 characters.

The following code is obviously wrong, but it conveys what I would like to do:

set -- abc
i=1
[[ abcdefghijklmno == ${!i}{4}* ]]
echo $?

EDIT:

The solution that suits me is the following:

set -- abc
i=1
[[ abcdefghijklmno == ${!i}* && $(expr length "${!i}") -ge 4 ]]
echo $?
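
A variant of the check above, as a minimal sketch: it takes the length with the shell's own ${#...} expansion instead of calling expr, and it quotes the candidate so that any glob characters in it are matched literally. The function name is_prefix_min4 and its argument order are assumptions for illustration, not part of the question. It is written with case so that it should also run under a plain POSIX sh.

```shell
# Hypothetical helper: the name and argument order are assumptions for illustration.
# Returns 0 (success) when $2 is a prefix of $1 and $2 has at least 4 characters.
is_prefix_min4() {
    # ${#2} expands to the length of the second argument; no `expr` call is needed.
    [ "${#2}" -ge 4 ] || return 1
    # The quoted "$2" is matched literally; the unquoted * matches any remainder.
    case $1 in
        "$2"*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Run a few of the inputs from the table above through the helper.
for input in abc abcd cdefg abcdeghi abcdefghijklmnop; do
    if is_prefix_min4 abcdefghijklmno "$input"; then
        echo "$input TRUE"
    else
        echo "$input FALSE"
    fi
done
```

In bash, the body of the function could equivalently be written as [[ ${#2} -ge 4 && $1 == "$2"* ]].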
Mario Palumbo asked Mar 25 '21 09:03



2 Answers

You may use this awk:

awk -v s='abcdefghijklmno' '{
print $0, (length($1) > 3 && index(s, $1) == 1 ? "TRUE" : "FALSE")}' file | column -t

abc               FALSE
abcd              TRUE
cdefg             FALSE
abcde             TRUE
abcdeghi          FALSE
abcdefgh          TRUE
abcdefghi         TRUE
abcdefghijklmno   TRUE
abcdefghijklmnop  FALSE

Explained:

  • The column command is used only to produce tabular output.
  • length($1) > 3 && index(s, $1) == 1: checks that the first field is longer than 3 characters and that $1 is found at the first position of the given string s.

Alternatively, a regex can check that $1 occurs at the start of s:

awk -v s='abcdefghijklmno' '{
   print $0, (length($1) > 3 && s ~ "^" $1 ? "TRUE" : "FALSE")
}' file
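
One difference between the two variants is worth illustrating: index() compares its arguments literally, while the regex variant interprets $1 as a regular expression, so metacharacters in the input are not matched as plain characters. The sketch below wraps both tests in small shell functions; the names check_index and check_regex and the input a.cd are assumptions chosen for the demonstration. The first call prints FALSE and the second prints TRUE, because "." matches the "b" of the example string.

```shell
s='abcdefghijklmno'

# Hypothetical wrappers (names are assumptions) around the two awk tests above.
check_index() {
    printf '%s\n' "$1" |
        awk -v s="$s" '{ print (length($1) > 3 && index(s, $1) == 1 ? "TRUE" : "FALSE") }'
}
check_regex() {
    printf '%s\n' "$1" |
        awk -v s="$s" '{ print (length($1) > 3 && s ~ "^" $1 ? "TRUE" : "FALSE") }'
}

check_index 'a.cd'   # literal comparison: "a.cd" is not a prefix of s
check_regex 'a.cd'   # regex comparison: "." matches any single character
```

If the inputs may contain characters such as ".", "*" or "[", the index() form is the safer of the two.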
anubhava answered Oct 21 '22 16:10


Perl's index function seems well suited: given two strings, it returns the index at which the second one first occurs in the first one, or -1 if it does not occur. What you want is thus to check that the second string appears in the first one at index 0. Then you can use the length function to make sure that the second string is at least 4 characters long.

For instance,

length("abc") >= 4 && index("abcdefghijklmno", "abc") == 0                # false
length("cdefg") >= 4 && index("abcdefghijklmno", "cdefg") == 0            # false
length("abcdefghijklmno") >= 4 && index("abcdefghijklmno", "abcdefghijklmno") == 0    # true

To use it in a one-liner, one way is to provide both strings on the command line. For instance:

perl -e 'print length($ARGV[1]) >= 4 && index($ARGV[0], $ARGV[1]) == 0 ? "TRUE" : "FALSE"' abcdefghijklmno abc

Alternatively, you can sacrifice readability for conciseness by using a regular expression:

perl -e 'print $ARGV[0] =~ /^\Q$ARGV[1]\E(?<=.{4})/ ? "TRUE" : "FALSE"' abcdefghijklmno abcde

Here the regex checks that the first string starts with the second one (/^\Q$ARGV[1]\E) and that the second one is at least 4 characters long ((?<=.{4}); see perlre#lookaround-assertions).

Dada answered Oct 21 '22 18:10