
Bash: Sort text file by bytewise case sensitive sort command or using python sort command

My text file, sorted using sort -s (case sensitive):

Act
Bad
Bag
Card
East
about
across
back
ball
camera
canvas
danger
dark
early
edge

The same file, sorted using sort -f (not case sensitive):

about
across
Act
back
Bad
Bag
ball
camera
canvas
Card
danger
dark
early
East
edge

With sort -f, the words starting with an uppercase letter are sorted alphabetically among the lowercase words.

What I want is for the capitalized words to come first within each letter group (themselves sorted alphabetically), followed by the lowercase words for that letter.

Expected output:

Act
about
across
Bad
Bag
back
ball
Card
camera
canvas
danger
dark
East
early
edge

How can I achieve this using the bash sort command or Python's sort?

Reman asked Dec 25 '22 08:12


2 Answers

This command will do it:

LC_ALL=C sort -k 1.1f,1.1 PATH

where PATH is your file path.

Explanation:

  • sort collation order is affected by the current locale, so LC_ALL=C is used to set the locale to a known value (the POSIX locale, collation order based on ASCII character code values)
  • -k 1.1f,1.1 tells sort to use the first character as the primary sort key in a case-insensitive manner
  • Lines that compare equal on the primary key are resolved by sort's last-resort comparison of the whole line, which in the C locale is bytewise and therefore case-sensitive (uppercase ASCII letters sort before lowercase ones).

The output is exactly as requested in the question.
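
For readers who want the same ordering in Python, here is a minimal sketch of the logic described above. It is not part of the original answer: the function name first_letter_fold_key is hypothetical, the word list is the sample from the question (shuffled), and it assumes Python 3, where strings compare by code point (which matches the C locale for ASCII text).

```python
# Sample words from the question, deliberately out of order.
words = ["edge", "Act", "ball", "Card", "about", "East", "back", "Bad",
         "camera", "dark", "across", "Bag", "early", "canvas", "danger"]


def first_letter_fold_key(word):
    # Hypothetical helper name, used for illustration only.
    # Primary key: the first character folded to uppercase,
    # mirroring what -k 1.1f,1.1 does.
    # Secondary key: the whole word compared by code point, so
    # uppercase ASCII letters win ties, like the C-locale tie-break.
    return (word[:1].upper(), word)


result = sorted(words, key=first_letter_fold_key)
print(result)
```

On the sample data this should print the words in the order given under "Expected output" in the question.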

isedev answered Jan 30 '23 21:01


I managed to do it! Here it is (in Python):

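import string

words = [...]  # the list of words to sort
# Key: a lowercase ASCII letter maps to its code point; any other character maps
# to the code point of its lowercase form minus 0.5, so "A" sorts just before "a".
# string.ascii_lowercase is the Python 3 name (Python 2 called it string.lowercase).
newwords = sorted(words, key=lambda x: [ord(y) if y in string.ascii_lowercase else ord(y.lower()) - 0.5 for y in x])
print(newwords)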

Output:

['Act', 'about', 'across', 'Bad', 'Bag', 'back', 'ball', 'Card', 'camera', 'canvas', 'danger', 'dark', 'East', 'early', 'edge']
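
A self-contained Python 3 version of the same idea, as a sketch rather than the answerer's exact code: the helper name upper_first_key is hypothetical, the word list is the sample from the question, and ASCII-only input is assumed (ord() would fail on characters whose lowercase form is more than one code point).

```python
import string

# Sample words from the question, deliberately out of order.
words = ["edge", "Act", "ball", "Card", "about", "East", "back", "Bad",
         "camera", "dark", "across", "Bag", "early", "canvas", "danger"]


def upper_first_key(word):
    # Hypothetical helper name; assumes ASCII-only words.
    # Lowercase letters keep their code point. Every other character
    # gets the code point of its lowercase form minus 0.5, which
    # places "A" just before "a", "B" just before "b", and so on.
    return [ord(ch) if ch in string.ascii_lowercase else ord(ch.lower()) - 0.5
            for ch in word]


newwords = sorted(words, key=upper_first_key)
print(newwords)

# Case is treated this way at every position, not only the first one.
mixed = sorted(["aZ", "ab"], key=upper_first_key)
print(mixed)
```

Because this key handles case at every character position, it can order words differently from the sort command in the other answer when case differs after the first letter. For "aZ" and "ab", this key yields ab, aZ, whereas a bytewise tie-break would put aZ first. On the question's sample data both approaches agree.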

zondo answered Jan 30 '23 21:01