
Iterating through a large 20+ GB file from a server with Python

Hi, I have about sixteen 20+ GB files on a server that I need to read specific entries from. I have code working that reads a file in the correct order when I have one of the files saved on my computer:

import array

f = open('biodayk1.H2009', 'rb')
lbl = array.array('f')

# nx, ny, km, iday and bio are defined earlier in the script
bio = 0
for day in range(iday):
    f.seek(nx*ny*km*bio*4, 1)        # skip the ibios before the one we want
    lbl.fromfile(f, nx*ny*km)        # read the desired ibio
    f.seek(nx*ny*km*(10 - bio)*4, 1) # skip the remaining ibios for this day
f.close()

Now I need to read the files from the server without downloading each file. I was looking into paramiko and was able to connect to the server, but I'm not quite sure how to iterate through a file and return just what I want. If you need any more info or need me to answer any questions, please ask. Thanks in advance.
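For reference, this is roughly the direction I was exploring with paramiko's SFTP client (an untested sketch; the host, login details, and remote path are placeholders, and nx, ny, km, iday are the same values as in the local version):

import array
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('server.example.com', username='user', password='pass')
sftp = ssh.open_sftp()

# SFTP file objects support seek() and read(), so the same skip/read
# pattern as the local version can run without downloading the file
remote = sftp.open('/data/biodayk1.H2009', 'rb')
lbl = array.array('f')

bio = 0
for day in range(iday):
    remote.seek(nx*ny*km*bio*4, 1)           # skip the ibios before the one we want
    lbl.frombytes(remote.read(nx*ny*km*4))   # read one ibio worth of floats
    remote.seek(nx*ny*km*(10 - bio)*4, 1)    # skip the remaining ibios for this day

remote.close()
sftp.close()
ssh.close()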

asked Jul 16 '12 by pter


2 Answers

You're... in for pain. I recommend you follow the rsync route and write a script that runs on the server which serves up the bytes you're interested in. You can communicate with it via a text channel created by paramiko.
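A minimal sketch of that idea, assuming dd is available on the server and the layout from the question (11 ibio blocks of nx*ny*km floats per day); the host, login, and path below are placeholders:

import array
import paramiko

ssh = paramiko.SSHClient()
ssh.set_missing_host_key_policy(paramiko.AutoAddPolicy())
ssh.connect('server.example.com', username='user', password='pass')

lbl = array.array('f')
record = nx * ny * km                    # floats per ibio block, as in the question

bio = 0
for day in range(iday):
    # work out which block we want in units of 4-byte floats and have
    # dd on the server emit only those bytes to stdout
    offset = (day * 11 + bio) * record
    cmd = 'dd if=/data/biodayk1.H2009 bs=4 skip=%d count=%d' % (offset, record)
    stdin, stdout, stderr = ssh.exec_command(cmd)
    lbl.frombytes(stdout.read())

ssh.close()

Each exec_command call is a fresh round trip, so for many reads the longer-lived server-side script this answer describes, fed offsets over a single channel, would scale better.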

answered by Ignacio Vazquez-Abrams


I'd recommend execnet to run a bit of Python (a local function or module) remotely.

No setup required.
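A rough sketch of what that could look like (the ssh login string, remote path, and the nx, ny, km, iday values carried over from the question are placeholders); the remote code runs in the server's Python and streams each block back over the channel:

import array
import execnet

remote_code = """
path, nx, ny, km, iday, bio = channel.receive()    # parameters from the local side
record = nx * ny * km
with open(path, 'rb') as f:
    for day in range(iday):
        f.seek(record * bio * 4, 1)                # skip the ibios before the one we want
        channel.send(f.read(record * 4))           # ship one ibio block back as bytes
        f.seek(record * (10 - bio) * 4, 1)         # skip the remaining ibios for this day
"""

# nx, ny, km, iday: same values as in the question
gw = execnet.makegateway("ssh=user@server.example.com")    # placeholder login
ch = gw.remote_exec(remote_code)
ch.send(('/data/biodayk1.H2009', nx, ny, km, iday, 0))     # bio = 0

lbl = array.array('f')
for day in range(iday):
    lbl.frombytes(ch.receive())                            # one ibio block per day

gw.exit()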

answered by Dima Tisnek