I have a file, which consists of a one row:
1 , 1 2 , 1 3 6 , 4 ,...
In this representation, spaces separate the integers and commas. This string is so huge that I can't read it with RandomAccessFile.readLine() (almost 4 Gb needed). So that I created a buffer, which can contain 10 integers. My task is to sort all integers in the string.
Could you, please, help?
EDIT
@Oscar Reyes
I need to write some sequences of integers to a file and then to read from it. Actually I don't know, how to do it. I'm a newbie. So I decided to use chars to write integers, delimiters between integers are ",", and delimeters between sequences are "\n\r" which. So that I created a monster that reads it:
public BinaryRow getFilledBuffer(String filePath, long offset) throws IOException{
mainFile = new RandomAccessFile(filePath, "r");
if (mainFile.length() == 0){
return new BinaryRow();
}
StringBuilder str = new StringBuilder();
mainFile.seek(mainFile.length()-4); //that is "\n" symbol
char chN = mainFile.readChar();
mainFile.seek(offset);
int i = 0;
char nextChar = mainFile.readChar();
while (i < 11 && nextChar != chN){
str.append(nextChar);
if (nextChar == ','){
i++;
if (i == 10){
break;
}
}
nextChar = mainFile.readChar();
}
if (nextChar == chN){
position = -1;
}else{
position = mainFile.getFilePointer();
}
BinaryRow br = new BinaryRow();
StringBuilder temp = new StringBuilder();
for (int j = 0; j < str.length(); j++){
if ((str.charAt(j) != ',')){
temp.append(str.charAt(j));
if (j == str.length() - 1){
br.add(Integer.parseInt(temp.toString()));
}
}else{
br.add(Integer.parseInt(temp.toString()));
temp.delete(0, temp.length());
}
}
mainFile.close();
return br;
}
If you could advise how to do it, please do it =)
This is exactly the origin QuickSort back then there was not enough RAM to sort in memory so they procedure is to store partial results in disk.
So what you can do is:
When parts are small enough ( like 2 just direct swap them Enough to be sorted in memory )
This way you'll be able to sort in chunks and store the partial results in temp files and you'll have a final file with the result sorted.
EDIT I told you a quick sort was possible.
It seems like you would need some extra space for the temp files after all.
Here's what I did.
I create a 40 mb file with numbers separated by commas.
I name it input
:
input http://img200.imageshack.us/img200/5129/capturadepantalla201003t.png
Input is 40mb
During the sort, the tmp files with the buckets of "greater than", "lower than" values are created and when the sort is finished, the values are sent to a file called ( guess what ) output
processing http://img200.imageshack.us/img200/1672/capturadepantalla201003y.png
Temp files are created with the partial results
Finally all the tmp files are deleted and the result is kept in the file "output" with the correct sorted sequence of numbers:
output http://img203.imageshack.us/img203/5950/capturadepantalla201003w.png
Finally the file "output" is created, notice it is 40 mb too
Here's the full program.
import java.io.*;
import java.util.*;
public class FileQuickSort {
static final int MAX_SIZE = 1024*1024*16; // 16 megabytes in this sample, the more memory your program has, less disk writing will be used.
public static void main( String [] args ) throws IOException {
fileQuickSort( new File("input"), new File("output"));
System.out.println();
}
//
static void fileQuickSort( File inputFile, File outputFile ) throws IOException {
Scanner scanner = new Scanner( new BufferedInputStream( new FileInputStream( inputFile ), MAX_SIZE));
scanner.useDelimiter(",");
if( inputFile.length() > MAX_SIZE && scanner.hasNextInt()) {
System.out.print("-");
// put them in two buckets...
File lowerFile = File.createTempFile("quicksort-","-lower.tmp",new File("."));
File greaterFile = File.createTempFile("quicksort-","-greater.tmp", new File("."));
PrintStream lower = createPrintStream(lowerFile);
PrintStream greater = createPrintStream(greaterFile);
PrintStream target = null;
int pivot = scanner.nextInt();
// Read the file and put the values greater than in a file
// and the values lower than in other
while( scanner.hasNextInt() ){
int current = scanner.nextInt();
if( current < pivot ){
target = lower;
} else {
target = greater;
}
target.printf("%d,",current);
}
// avoid dropping the pivot
greater.printf("%d,",pivot);
// close the stream before reading them again
scanner.close();
lower.close();
greater.close();
// sort each part
fileQuickSort( lowerFile , outputFile );
lowerFile.delete();
fileQuickSort( greaterFile , outputFile);
greaterFile.delete();
// And you're done.
} else {
// Else , if you have enough RAM to process it
//
System.out.print(".");
List<Integer> smallFileIntegers = new ArrayList<Integer>();
// Read it
while( scanner.hasNextInt() ){
smallFileIntegers.add( scanner.nextInt() );
}
scanner.close();
// Sort them in memory
Collections.sort( smallFileIntegers );
PrintStream out = createPrintStream( outputFile);
for( int i : smallFileIntegers ) {
out.printf("%d,",i);
}
out.close();
// And your're done
}
}
private static PrintStream createPrintStream( File file ) throws IOException {
boolean append = true;
return new PrintStream( new BufferedOutputStream( new FileOutputStream( file, append )));
}
}
The format of the files is number,number,number,number
Your current format is: n u m b e r , n u m b , b e r
To fix that you just have to read it all and skip the blanks.
Add another question for that.
Read it to memory in chunks (100 MB each?), one chunk at a time, sort it and save to disk.
Then open all the ordered chunks, read the first element of each, and append the lowest to the output. Then read the next element of the chunk you just read from and repeat.
When merging you can keep an array of the last int read from each chunk and just iterate over it to get the lowest. Then you substitute the value you just used with the next element in the chunk it was taken from.
example with chunks [1, 5, 16] [2, 9, 14] [3, 8, 10]
array [(1), 2, 3], lowest 1 --> to output
[5, (2), 3], lowest 2 --> to output
[5, 9, (3)], lowest 3 -->
[(5), 9, 8], 5
[16, 9, (8)], 8
[16, (9), 10], 9
...
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With