
How to sort a large text file alphabetically?

Tags: python, sorting

So I have a text file and I need to sort the lines alphabetically. Example input:

This is the first sentence
A sentence here as well
But how do I reorder them?

Output:

A sentence here as well
But how do I reorder them?
This is the first sentence

Here's the thing: This file is so large, I don't have enough RAM to actually split it into a list/array. I tried to use Python's built-in sorted() function and the process got killed.

To give you an idea:

wc -l data
21788172 data
Rstevoa asked Apr 07 '14 01:04



1 Answer

It sounds like you need an external merge sort: divide the file into blocks small enough to fit in memory, sort each block and write it to its own temporary file, then merge the sorted blocks back together. See the related question "Python class to merge sorted files, how can this be improved?"
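The divide/sort/merge steps could be sketched roughly as follows. This is a minimal sketch, not the code from the linked question: the function name `sort_large_file`, the `lines_per_chunk` parameter, and its default value are assumptions chosen for illustration, and the file is assumed to be UTF-8 text. It uses only the standard library (`itertools.islice` to read a bounded number of lines, `heapq.merge` to merge the sorted blocks lazily).

```python
import heapq
import itertools
import os
import tempfile


def sort_large_file(src_path, dst_path, lines_per_chunk=1_000_000):
    """Sort the lines of src_path into dst_path without loading the whole file.

    Hypothetical helper: the name and the default chunk size are illustrative.
    Tune lines_per_chunk so that one chunk fits comfortably in RAM.
    """
    chunk_paths = []
    try:
        # Pass 1: read a bounded number of lines, sort them, spill to a temp file.
        with open(src_path, "r", encoding="utf-8") as src:
            while True:
                chunk = list(itertools.islice(src, lines_per_chunk))
                if not chunk:
                    break
                # Ensure the file's last line ends with a newline, so lines
                # from different chunks never run together after merging.
                if not chunk[-1].endswith("\n"):
                    chunk[-1] += "\n"
                chunk.sort()
                fd, path = tempfile.mkstemp(suffix=".chunk", text=True)
                with os.fdopen(fd, "w", encoding="utf-8") as tmp:
                    tmp.writelines(chunk)
                chunk_paths.append(path)

        # Pass 2: k-way merge. heapq.merge pulls one line at a time from each
        # sorted chunk file, so memory use stays small.
        files = [open(p, "r", encoding="utf-8") for p in chunk_paths]
        try:
            with open(dst_path, "w", encoding="utf-8") as dst:
                dst.writelines(heapq.merge(*files))
        finally:
            for f in files:
                f.close()
    finally:
        # Remove the temporary chunk files even if something failed.
        for p in chunk_paths:
            os.remove(p)
```

For the file in the question, a call might look like `sort_large_file("data", "data.sorted")`; with about 21.8 million lines and the assumed default chunk size, that would produce roughly 22 temporary files. Lines are compared as plain strings, the same way `sorted()` compares them, so uppercase letters sort before lowercase ones.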

Hugh Bothwell answered Oct 16 '22 22:10
