summary of all bq jobs

Is there a way to list all job IDs for a given timeframe using the bq command-line tool? I need to loop through the IDs and check whether any job reported an error.

At the moment I look up each job ID in the web interface and then run:

bq show -j --format=prettyjson job_id

I then manually copy and paste the "error" part of the output. Compiling the job summary for a single day this way takes a lot of time.

shantanuo asked Sep 20 '12 06:09


2 Answers

Sure, you can list up to the last 1,000 jobs for a project you have access to by running:

bq ls -j --max_results=1000 project_number

If you have more than 1,000 jobs, you can also write a Python script that lists all jobs by paging through the results in batches of 1,000, like so:

import httplib2
import pprint
import sys

from apiclient.discovery import build
from apiclient.errors import HttpError

from oauth2client.client import AccessTokenRefreshError
from oauth2client.client import OAuth2WebServerFlow
from oauth2client.client import flow_from_clientsecrets
from oauth2client.file import Storage
from oauth2client.tools import run


# Enter your Google Developer Project number
PROJECT_NUMBER = 'XXXXXXXXXXXX'

FLOW = flow_from_clientsecrets('client_secrets.json',
                               scope='https://www.googleapis.com/auth/bigquery')



def main():

  storage = Storage('bigquery_credentials.dat')
  credentials = storage.get()

  if credentials is None or credentials.invalid:
    credentials = run(FLOW, storage)

  http = httplib2.Http()
  http = credentials.authorize(http)

  bigquery_service = build('bigquery', 'v2', http=http)
  jobs = bigquery_service.jobs()

  page_token=None
  count=0

  while True:
    response = list_jobs_page(jobs, page_token)
    # list_jobs_page returns None when the request fails; stop paging.
    if response is None:
      break
    # The 'jobs' key may be missing when there is nothing to list.
    for job in response.get('jobs', []):
      count += 1
      print('%d. %s\t%s\t%s' % (count,
                                job['jobReference']['jobId'],
                                job['state'],
                                job['errorResult']['reason'] if job.get('errorResult') else ''))
    if response.get('nextPageToken'):
      page_token = response['nextPageToken']
    else:
      break


def list_jobs_page(jobs, page_token=None):
  try:
    jobs_list = jobs.list(projectId=PROJECT_NUMBER,
                          projection='minimal',
                          allUsers=True,
                          maxResults=1000,
                          pageToken=page_token).execute()

    return jobs_list

  except HttpError as err:
    # pprint.pprint() returns None, so format the error body with pformat.
    print('Error: %s' % pprint.pformat(err.content))
    return None


if __name__ == '__main__':
  main()
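
To narrow such a listing to a given timeframe and keep only the failures, the job entries returned by each page can be filtered in plain Python. The sketch below is an illustration rather than part of the script above: it is written for Python 3 with only the standard library, the helper names `to_millis` and `failed_jobs_between` are hypothetical, and it assumes each entry carries `statistics.creationTime` as milliseconds since the epoch (as a string) next to the `jobReference` and `errorResult` fields the script already reads. The sample entries are made up.

```python
from datetime import datetime, timezone


def to_millis(dt):
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return int(dt.timestamp() * 1000)


def failed_jobs_between(jobs, start, end):
    """Return (job_id, reason, message) for failed jobs created in [start, end)."""
    lo, hi = to_millis(start), to_millis(end)
    failures = []
    for job in jobs:
        # Assumption: creationTime is a string of epoch milliseconds.
        created = int(job.get('statistics', {}).get('creationTime', 0))
        error = job.get('errorResult')
        if error and lo <= created < hi:
            failures.append((job['jobReference']['jobId'],
                             error.get('reason', ''),
                             error.get('message', '')))
    return failures


# Made-up entries shaped like the items of response['jobs'].
sample_jobs = [
    {'jobReference': {'jobId': 'job_a'}, 'state': 'DONE',
     'statistics': {'creationTime': '1348120800000'}},
    {'jobReference': {'jobId': 'job_b'}, 'state': 'DONE',
     'statistics': {'creationTime': '1348124400000'},
     'errorResult': {'reason': 'invalid', 'message': 'Too many errors'}},
    {'jobReference': {'jobId': 'job_c'}, 'state': 'DONE',
     'statistics': {'creationTime': '1348300000000'},
     'errorResult': {'reason': 'notFound', 'message': 'Table missing'}},
]

day_start = datetime(2012, 9, 20, tzinfo=timezone.utc)
day_end = datetime(2012, 9, 21, tzinfo=timezone.utc)
for job_id, reason, message in failed_jobs_between(sample_jobs, day_start, day_end):
    print('%s\t%s\t%s' % (job_id, reason, message))
```

Filtering on the client keeps the paging loop unchanged. If the API version in use accepts creation-time bounds on `jobs.list`, applying the window on the server side would avoid fetching jobs outside it.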
Michael Manoochehri answered Sep 27 '22 19:09


The following shell script is close to what I need to report.

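#!/bin/sh
# Read the project from the "Project" line of `bq show` once, not on every iteration.
project=$(bq show | grep ^Project | awk '{print $2}')

# Save the IDs of jobs whose listing line contains today's date (for example "20 Sep").
bq ls -j "$project" | grep "$(date +'%d %b')" | awk '{print $1}' > tosave.txt

for myjob in $(cat tosave.txt)
do
  # Print the listing line for this job, then the lines around its error message.
  bq ls -j "$project" | grep "$myjob"
  bq show --format=prettyjson -j "$myjob" | grep -C2 "message" | head
done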
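
Grepping for "message" depends on how the JSON happens to be laid out. As a possible alternative, the output of `bq show --format=prettyjson -j` can be parsed as JSON and the error read from it directly. This is a minimal Python 3 sketch, not a drop-in replacement for the script: the function name `summarize_job` is hypothetical, the sample text is made up, and the field layout (`status.errorResult` with `reason` and `message`) is an assumption that should be checked against real output.

```python
import json


def summarize_job(job_json_text):
    """Return (job_id, state, error_message) from one job's JSON text."""
    job = json.loads(job_json_text)
    status = job.get('status', {})
    # Assumption: a failed job reports its error under status.errorResult.
    error = status.get('errorResult') or {}
    return (job.get('jobReference', {}).get('jobId', ''),
            status.get('state', ''),
            error.get('message', ''))


# Made-up text standing in for the output of
# `bq show --format=prettyjson -j job_id`.
sample_output = """
{
  "jobReference": {"projectId": "my-project", "jobId": "job_b"},
  "status": {
    "state": "DONE",
    "errorResult": {"reason": "invalid", "message": "Too many errors"}
  }
}
"""

print('%s\t%s\t%s' % summarize_job(sample_output))
```

In a real report, the text could be obtained per job with `subprocess.check_output(['bq', 'show', '--format=prettyjson', '-j', job_id])` and passed to the function after decoding.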
shantanuo answered Sep 27 '22 20:09