I have a table in Oracle with four columns.

Now user can enter input string as "operation Knee right" (which is valid) to my query and my query should return the ICD Code (IKR123) which matches most of the word in DiagnosisName column.
Following is my current query.(Not giving the proper output)
SELECT diagnosisname
FROM
  (SELECT diagnosisname,
    UTL_MATCH.jaro_winkler_similarity('%operation Knee right%',diagnosisname)
  FROM icd_code
  ORDER BY UTL_MATCH.EDIT_DISTANCE_SIMILARITY('%operation Knee right%',diagnosisname) DESC
  )
WHERE ROWNUM<2;
This query giving me the output as "Left Knee Operation" but my expectation is "Right Knee Operation".
There are few things to note about your usage of UTL_MATCH:
ORDER BY UTL_MATCH.EDIT_DISTANCE_SIMILARITY('%operation Knee right%',diagnosisname) DESC
This is not going to give you correct result. Since, you are only considering the possible similarity, however, not considering the data entry errors. So, you must use JARO_WINKLER_SIMILARITY.
operation Knee right
You need to keep in mind the CASE of the input and the column values to compare. They must be in similar case for a correct match. You are passing the input in LOWERCASE, however, your column values are in INITCAP. Better convert both the column values and the input to a similar case.
Let's look at the below demonstration to understand:
SQL> WITH DATA AS(
  2  SELECT 'Heart Operation' diagnosis_name, 'IH123' icd_code FROM dual UNION ALL
  3  SELECT 'Knee Operation' diagnosis_name, 'IK123' icd_code FROM dual UNION ALL
  4  SELECT 'Left Knee Operation' diagnosis_name, 'IKL123' icd_code FROM dual UNION ALL
  5  SELECT 'Right Knee Operation' diagnosis_name, 'IKR123' icd_code FROM dual UNION ALL
  6  SELECT 'Fever' diagnosis_name, 'IF123' icd_code FROM dual
  7  )
  8  SELECT t.*,
  9    utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation Knee right')) eds,
 10    UTL_MATCH.jaro_winkler_similarity (upper(diagnosis_name),upper('operation Knee right')) jws
 11  FROM DATA t
 12  ORDER BY jws DESC
 13  /
DIAGNOSIS_NAME       ICD_CO        EDS        JWS
-------------------- ------ ---------- ----------
Right Knee Operation IKR123         20         72
Knee Operation       IK123          20         70
Heart Operation      IH123          25         68
Left Knee Operation  IKL123         25         64
Fever                IF123          15         47
SQL>
So, you see how both are different from each other. jaro_winkler_similarity does a better job in identifying the data entry errors and giving the most close match. Based on that, simply choose the first row after sorting it in descending order:
SQL> WITH DATA AS(
  2  SELECT 'Heart Operation' diagnosis_name, 'IH123' icd_code FROM dual UNION ALL
  3  SELECT 'Knee Operation' diagnosis_name, 'IK123' icd_code FROM dual UNION ALL
  4  SELECT 'Left Knee Operation' diagnosis_name, 'IKL123' icd_code FROM dual UNION ALL
  5  SELECT 'Right Knee Operation' diagnosis_name, 'IKR123' icd_code FROM dual UNION ALL
  6  SELECT 'Fever' diagnosis_name, 'IF123' icd_code FROM dual
  7  )
  8  SELECT diagnosis_name
  9  FROM
 10    (SELECT t.*,
 11      utl_match.edit_distance_similarity(upper(diagnosis_name),upper('operation Knee right')) eds,
 12      UTL_MATCH.jaro_winkler_similarity (upper(diagnosis_name),upper('operation Knee right')) jws
 13    FROM DATA t
 14    ORDER BY jws DESC
 15    )
 16  WHERE rownum = 1
 17  /
DIAGNOSIS_NAME
--------------------
Right Knee Operation
SQL>
                        If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With