Open Source Profiling Frameworks?

Have you ever wanted to test and quantitatively show whether your application performs better as a static or shared build, stripped or non-stripped, with or without UPX, gcc -O2 or gcc -O3, hash or btree, and so on? If so, this is the thread for you. There are hundreds of ways to tune an application, but how do we collect, organize, process, and visualize the consequences of each experiment?

I have been looking for several months for an open source application performance engineering/profiling framework, similar in concept to Mozilla's Perftastic, in which I can develop, build, test, and profile hundreds of incarnations of different tuning experiments.

Some requirements:

Platform

SUSE32 and SUSE64

Data Format

Very flexible, compact, simple, and hierarchical. There are several possibilities, including:

  • Custom CSV
  • RRD
  • Protocol Buffers
  • JSON
  • No XML. There is lots of data, and XML is too verbose
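As a point of comparison, here is a minimal sketch of what one compact, hierarchical experiment record could look like as a single JSON line. Every field name here is an assumption for illustration, not part of any existing schema:

```python
import json

# Hypothetical record for one profiling run; all field names are
# illustrative assumptions, not a real framework's schema.
record = {
    "experiment": "gcc-O3-static",
    "build": {"tag": "r1042", "gcc": "4.3.2", "cflags": "-O3 -static"},
    "metrics": {
        "wall_time_s": 12.7,
        "sys_time_s": 1.3,
        "max_rss_kb": 48212,
    },
}

# Compact separators keep one record per line and the file small.
line = json.dumps(record, separators=(",", ":"))
print(line)
```

One-record-per-line JSON keeps the format append-friendly for a long-running CI pipeline while staying trivially parseable.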

Data Acquisition

Flexible and customizable plugins. There is lots of data to collect from the application, including performance data from /proc, sys time, wall time, CPU utilization, memory profile, leaks, Valgrind logs, arena fragmentation, I/O, localhost sockets, binary size, open fds, etc., and some from the host system. My language of choice for this is Python, and I would develop these plugins to monitor and/or parse data in all the different formats and store them in the framework's data format.
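A collection plugin along these lines can be very small. The following is a sketch, assuming a Linux host (so /proc exists), of pulling a few memory fields from /proc/&lt;pid&gt;/status; the function name and the choice of fields are my own, not part of any framework:

```python
import os

def read_proc_status(pid):
    """Parse /proc/<pid>/status and return selected memory fields in kB.

    Linux-only sketch; the set of fields collected here is illustrative.
    """
    metrics = {}
    with open("/proc/%d/status" % pid) as f:
        for line in f:
            key, _, rest = line.partition(":")
            if key in ("VmRSS", "VmSize", "VmPeak"):
                metrics[key.lower()] = int(rest.split()[0])  # value is in kB
    return metrics

# Sample the current process as a stand-in for the application under test.
result = read_proc_status(os.getpid())
print(result)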

Tagging

All experiments would be tagged with data such as GCC version and compile options, platform, host, app options, experiment, and build tag.
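A tag set like that can be assembled mostly from the standard library. This is a hedged sketch; the function name, field names, and the idea of passing the GCC version in as a string (rather than shelling out to gcc) are all my assumptions:

```python
import platform
import socket
import time

def experiment_tags(experiment, build_tag, gcc_version, app_options):
    """Assemble the tag set attached to every experiment record.

    Field names are illustrative assumptions, not an existing schema.
    """
    return {
        "experiment": experiment,
        "build_tag": build_tag,
        "gcc": gcc_version,
        "platform": "%s-%s" % (platform.system(), platform.machine()),
        "host": socket.gethostname(),
        "app_options": app_options,
        "timestamp": int(time.time()),
    }

tags = experiment_tags("static-vs-shared", "r1042", "4.3.2", "-O3 -static")
print(sorted(tags))
```

Attaching the same tag dict to every record is what later makes comparative graphing by tag (build vs. build, platform vs. platform) possible.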

Graphing

History, Comparative, Hierarchical, Dynamic and Static.

  • The application builds are done by a custom CI server that has released a new app version several times per day for the last 3 years straight. This is why we need continuous trend analysis: when we add new features, fix bugs, or change build options, we want to automatically gather profiling data and see the trend. This is where generating the various static builds comes in.
  • For analysis, Mozilla's dynamic graphs are great for comparative graphing. It would be great to have comparative graphing between different tags: for example, compare N build versions, compare platforms, compare build options, etc.
  • We have a test suite of 3K tests; data will be gathered per test and rolled up at every level, from intra-test data to per test, to per tagged group, to the complete regression suite.
  • Possibilities include RRDtool, Orca, and Graphite
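If Graphite were the choice, feeding it is just a matter of emitting its plaintext protocol, one "path value timestamp" line per sample (normally sent to port 2003). A small sketch; the metric path layout here is a made-up example of encoding build tag and test name into the hierarchy:

```python
import time

def graphite_line(metric_path, value, timestamp=None):
    """Format one sample in Graphite's plaintext protocol:
    '<dotted.path> <value> <unix-timestamp>\n'.
    """
    if timestamp is None:
        timestamp = int(time.time())
    return "%s %s %d\n" % (metric_path, value, timestamp)

# Hypothetical path: suite prefix, build tag, test name, metric name.
line = graphite_line("perf.r1042.test_0042.wall_time_s", 12.7, 1224662400)
print(line.strip())
```

Because the path is hierarchical, grouping by build, by test, or by metric falls out of the naming scheme for free.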

Analysis on a grouping basis

  • Min
  • Max
  • Median
  • Avg
  • Standard Deviation
  • etc
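Those per-group summaries are cheap to compute with the standard library once records carry a group tag. A minimal sketch (the `(tag, value)` input shape is my assumption):

```python
import statistics
from collections import defaultdict

def group_stats(samples):
    """samples: iterable of (group_tag, value) pairs.

    Returns {tag: {min, max, median, avg, stdev}} per group.
    """
    groups = defaultdict(list)
    for tag, value in samples:
        groups[tag].append(value)
    return {
        tag: {
            "min": min(vals),
            "max": max(vals),
            "median": statistics.median(vals),
            "avg": statistics.mean(vals),
            # Sample stdev needs at least two points; report 0.0 otherwise.
            "stdev": statistics.stdev(vals) if len(vals) > 1 else 0.0,
        }
        for tag, vals in groups.items()
    }

stats = group_stats([("gcc-O2", 12.1), ("gcc-O2", 12.5), ("gcc-O3", 11.0)])
print(stats["gcc-O2"])
```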

Presentation

All of this would be presented and controlled through a web app server; Django or TurboGears would be best.

Inspiration

  • Centreon
  • Cacti
Gregory asked Oct 22 '08 07:10



1 Answer

There was a talk at PyCon this week discussing the various profiling methods available for Python today. I don't think anything is as complete as what you're looking for, but it may be worth a look: http://us.pycon.org/2009/conference/schedule/event/15/

You should be able to find the actual talk later this week on blip.tv: http://blip.tv/search?q=pycon&x=0&y=0

PKKid answered Oct 13 '22 03:10