
Interrogating the Protein Data Bank from F# 3.0

This practical Haskell example encouraged me to install the Visual Studio 2012 trial in order to use F# type providers. However, I am completely stumped as to how to use it to solve that problem. There is an RCSB SOAP web service. I copied an example (which doesn't work because the web service API has changed) using the URL for the WSDL web service from the RCSB:

open Microsoft.FSharp.Data.TypeProviders

type pdb = WsdlService<"http://www.rcsb.org/pdb/services/pdbws?wsdl">

do
    let ws = pdb.Getpdbws()
    ws.getCurrentPdbIds()
    |> printfn "%A"

But this crashes at run-time with the error:

Unhandled Exception: System.InvalidOperationException: RPC Message blastPDBRequest1 in operation blastPDB1 has an invalid body name blastPDB. It must be blastPDB1
   at System.ServiceModel.Description.XmlSerializerOperationBehavior.Reflector.OperationReflector.EnsureMessageInfos()
   at System.ServiceModel.Description.XmlSerializerOperationBehavior.Reflector.EnsureMessageInfos()
   at System.ServiceModel.Description.XmlSerializerOperationBehavior.CreateFormatter()
   at System.ServiceModel.Description.XmlSerializerOperationBehavior.System.ServiceModel.Description.IOperationBehavior.ApplyClientBehavior(OperationDescription description, ClientOperation proxy)
   at System.ServiceModel.Description.DispatcherBuilder.BindOperations(ContractDescription contract, ClientRuntime proxy, DispatchRuntime dispatch)
   at System.ServiceModel.Description.DispatcherBuilder.ApplyClientBehavior(ServiceEndpoint serviceEndpoint, ClientRuntime clientRuntime)
   at System.ServiceModel.Description.DispatcherBuilder.BuildProxyBehavior(ServiceEndpoint serviceEndpoint, BindingParameterCollection& parameters)
   at System.ServiceModel.Channels.ServiceChannelFactory.BuildChannelFactory(ServiceEndpoint serviceEndpoint, Boolean useActiveAutoClose)
   at System.ServiceModel.ChannelFactory.CreateFactory()
   at System.ServiceModel.ChannelFactory.OnOpening()
   at System.ServiceModel.Channels.CommunicationObject.Open(TimeSpan timeout)
   at System.ServiceModel.ChannelFactory.EnsureOpened()
   at System.ServiceModel.ChannelFactory`1.CreateChannel(EndpointAddress address, Uri via)
   at System.ServiceModel.ChannelFactory`1.CreateChannel()
   at System.ServiceModel.ClientBase`1.CreateChannel()
   at System.ServiceModel.ClientBase`1.CreateChannelInternal()
   at System.ServiceModel.ClientBase`1.get_Channel()
   at Program.pdb.ServiceTypes.PdbWebServiceClient.getCurrentPdbIds()
   at Program.pdb.ServiceTypes.SimpleDataContextTypes.PdbWebServiceClient.getCurrentPdbIds()
   at <StartupCode$ConsoleApplication2>.$Program.main@() in c:\users\jon\documents\visual studio 11\Projects\ConsoleApplication2\ConsoleApplication2\Program.fs: line 5

Also, the SOAP web service is being deprecated in favor of a RESTful one. How might I use that from F# 3.0? What does the simplest working example look like?

J D asked Aug 19 '12 12:08




1 Answer

Looks like the blastPDB operation is overloaded, and the code generation underlying the type provider does not handle that properly (it does not append the "1" to the body name). See this answer for the same issue when using svcutil directly: "WCF: Svcutil generates invalid client proxy, Apache AXIS Web Service, overload operations". This page shows that the type provider uses svcutil internally.

AFAIK you will not be able to access the REST service using a type provider, as REST services don't provide a schema (see this answer: "F# Type Providers and REST apis"). You will likely have to fall back on a REST client library (see some options here: ".NET Rest Client Frameworks") or do raw HTTP.

What does the simplest working example look like?

The following is a simple example that gets the current list of PDB IDs using the "List all current PDB IDs" REST API (I believe this is equivalent to the call you attempted with the web service). You will need to add a reference to System.Net.Http.

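open System.Net.Http

[<EntryPoint>]
let main argv =
    // HttpClient is disposable, so bind it with `use`.
    use httpClient = new HttpClient()
    // Start the GET request; the task completes with the response body as a string.
    let task = httpClient.GetStringAsync("http://www.rcsb.org/pdb/rest/getCurrent")
    // Reading Result blocks until the request finishes, which keeps the example short.
    printfn "%s" task.Result

    0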
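
The example above only prints the raw response body. If the individual IDs are needed, the string can be parsed afterwards. The following is a minimal sketch, assuming the response is XML in which each entry is a PDB element carrying a structureId attribute. That shape, the parsePdbIds helper and the sample payload are assumptions made for illustration and are not confirmed by the question or the answer; adjust the names to match the real response. On .NET Framework it also needs references to System.Xml and System.Xml.Linq.

```fsharp
open System.Xml.Linq

/// Collects the `structureId` attribute of every `PDB` element in the document.
/// The element and attribute names are assumptions about the response shape.
let parsePdbIds (xml: string) : string list =
    let doc = XDocument.Parse xml
    doc.Descendants(XName.Get "PDB")
    |> Seq.choose (fun e ->
        // Attribute returns null when the attribute is missing; skip such elements.
        match e.Attribute(XName.Get "structureId") with
        | null -> None
        | a -> Some a.Value)
    |> List.ofSeq

// Hypothetical payload standing in for the body returned by the REST call.
let sample = """<current><PDB structureId="100D" /><PDB structureId="101D" /></current>"""

printfn "%A" (parsePdbIds sample)
```

In the program above, task.Result could be passed to parsePdbIds in place of the sample string.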
bentayloruk answered Jan 04 '23 02:01
