
XDMP-TOOBIG error occurs while using xdmp:http-post

I have an XQuery file that returns more than 2.2 GB of text data. When I request the file directly in the browser (Chrome), all of the text loads.

But when I make a POST call to that XQuery file using xdmp:http-post($url, $options), it throws an XDMP-TOOBIG error. The trace is below.

XDMP-TOOBIG: xdmp:http-post("http://server:8278/services/getText...", <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>) -- Document size exceeds text document size limit of 2048 megabytes
in /services/invoke.xqy, at 20:7 [1.0-ml]
$HTTP_CALL = <configurations xmlns:config="" xmlns=""><credentails><username>admin</username><password>admin</password...</configurations>
$userName = text{"admin"}
$password = text{"admin"}
$timeOut = text{"600000"}
$url = "http://server:8278/services/getText..."
$responseType = "text/plain"
$options = <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>
$response = xdmp:http-post("http://server:8278/services/getText...", <options xmlns="xdmp:http"><timeout>600000</timeout><authentication method="basic"><usernam...</options>)
$set-reponse-type = ()

Is there a limit I can configure in the file that calls xdmp:http-post, or is there another way to solve this?

Help is appreciated.

Karthick asked Nov 08 '22 16:11



1 Answer

When you use HTTP to call an outside server from within MarkLogic, the result must fit into memory, possibly as multiple copies depending on what you do with it. Text variables are not optimized for extremely large data. Depending on the details of your remote service, you can accommodate large data by paginating the HTTP requests with Range request headers, so that each call returns only a slice of the response.
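A minimal sketch of one such paginated request follows. It assumes the remote service answers GET and honors Range headers (replying with 206 Partial Content), which the question does not confirm. The chunk size, the `local:fetch-chunk` helper, and the `RANGE-UNSUPPORTED` error name are made up for illustration; the URL is left truncated exactly as it appears in the trace.

```xquery
xquery version "1.0-ml";

declare namespace http = "xdmp:http";

(: URL as shown (truncated) in the trace; substitute the real endpoint. :)
declare variable $url := "http://server:8278/services/getText...";

(: Hypothetical chunk size: 64 MB per request. :)
declare variable $chunk-size := 64 * 1024 * 1024;

(: Fetch the byte range [$offset, $offset + $chunk-size - 1].
   Returns the usual xdmp:http-get sequence: response metadata, then body. :)
declare function local:fetch-chunk($offset as xs:integer) as item()*
{
  let $last := $offset + $chunk-size - 1
  return
    xdmp:http-get($url,
      <options xmlns="xdmp:http">
        <timeout>600000</timeout>
        <authentication method="basic">
          <username>admin</username>
          <password>admin</password>
        </authentication>
        <headers>
          <range>bytes={$offset}-{$last}</range>
        </headers>
      </options>)
};

let $response := local:fetch-chunk(0)
let $status := xs:integer($response[1]/http:code)
return
  if ($status eq 206)
  then $response[2]  (: this slice only; process or store it before fetching the next :)
  else fn:error(xs:QName("RANGE-UNSUPPORTED"),
         fn:concat("Expected 206 Partial Content, got ", $status))
```

A caller would loop over offsets until the service returns a short or empty slice. Note that byte ranges can split a multi-byte UTF-8 character at a chunk boundary, so slices are safer to handle as binary data and decode only after reassembly.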

Even if the 2 GB limit were removed, performance would be poor and unreliable: transferring large amounts of data in a single HTTP request becomes increasingly fragile, because any severe network error requires a full retry.

Alternatively, the service or a local proxy service could be changed to store the data in a shared location, such as a mounted filesystem or S3, and return a reference to the data instead of its body. The xdmp:filesystem-xxx and xdmp:binary-xxx functions can then be used to access the data.
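As a rough illustration of the shared-location approach, the sketch below assumes the service has already written its output to a mounted filesystem and returned the path. The path and the window size are hypothetical, and the offset passed to `xdmp:subbinary` assumes 1-based byte positions, which is worth checking against the function documentation for your MarkLogic version.

```xquery
xquery version "1.0-ml";

(: Hypothetical path that the remote service returned instead of the body. :)
let $path := "/mnt/shared/getText-output.txt"

(: Check the size on disk without reading the file into memory. :)
let $size := xdmp:filesystem-file-length($path)

(: Reference the file as an external binary rather than loading it as a string. :)
let $binary := xdmp:external-binary($path)

(: Hypothetical window: take at most the first 16 MB as a binary slice. :)
let $window := fn:min((16 * 1024 * 1024, $size))
let $slice := xdmp:subbinary($binary, 1, $window)

return (
  fn:concat("file size in bytes: ", $size),
  fn:concat("slice size in bytes: ", xdmp:binary-size($slice))
)
```

Working on binary slices keeps each in-memory value small; decoding to text (for example with `xdmp:binary-decode`) is best done per slice, with care at character boundaries.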

Even once the data is in memory, manipulating large text as a single string will be problematic as well. If you need to access it as a single large object, binary documents (internal or external) are more reliable.

If the HTTP request can be converted to use GET instead of POST, xdmp:document-load may be used to stream the results directly into a document.
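A sketch of that approach, assuming the endpoint responds to GET (the question reports that it loads in a browser) and accepts the same basic credentials as in the trace. The target document URI is hypothetical, and the source URL is left truncated as in the original. The document is loaded as binary here on the assumption that a text document would run into the same 2048 MB text document limit reported in the error.

```xquery
xquery version "1.0-ml";

(: Load the HTTP response straight into a database document.
   The HTTP options ride along in the xdmp:http namespace. :)
xdmp:document-load("http://server:8278/services/getText...",
  <options xmlns="xdmp:document-load" xmlns:http="xdmp:http">
    <uri>/large/getText-output.txt</uri>  <!-- hypothetical target URI -->
    <format>binary</format>
    <http:authentication method="basic">
      <http:username>admin</http:username>
      <http:password>admin</http:password>
    </http:authentication>
  </options>)
```

Once stored, the document can be read back in pieces with the xdmp:binary-xxx functions rather than as one string.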

Comments on the documentation for xdmp:document-load suggest that a "rest:" URI prefix can be used with POST or GET to stream results directly to the database, although I don't know how one would issue a POST this way.

DALDEI answered Jan 04 '23 02:01
