Get the content (text) of a URL after JavaScript has run with PHP

Tags:

http-headers

javascript

jquery

php

curl

Is it possible to get the content of a URL with PHP (using some sort of function like file_get_contents or header) but only after the execution of some JavaScript code?

Example:

mysite.com has a script that calls loadUrlAfterJavascriptExec('http://exampletogetcontent.com/') and prints/echoes the content. Imagine that some jQuery runs on http://exampletogetcontent.com/ and changes the DOM; loadUrlAfterJavascriptExec would then return the resulting HTML.

Can we do that?

Just to be clear, what I want is to get the content of a page through a URL, but only after JavaScript has run on the target page (the one whose content PHP is fetching).

I am aware that PHP runs before the page is sent to the client, and JS only after that, but I thought that maybe there was an expert workaround.

509

asked Feb 13 '15 17:02

Victor Ferreira

3 Answers

Update 2: Added more details on how to use phantomjs from PHP.

Update 1 (after clarification that the JavaScript on the target page needs to run first)

Method 1: Use phantomjs (will execute JavaScript)

1. Download phantomjs and place the executable in a path that your PHP binary can reach.

2. Place the following 2 files in the same directory:

get-website.php

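<?php
// Path to the PhantomJS script that sits next to this file.
$phantom_script = dirname(__FILE__) . '/get-website.js';

// shell_exec() returns the full output; exec() returns only its last line.
$response = shell_exec('phantomjs ' . escapeshellarg($phantom_script));

echo htmlspecialchars((string) $response);
?>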

get-website.js

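var webPage = require('webpage');
var page = webPage.create();

page.open('http://google.com/', function (status) {
  // Print the DOM as it stands once the page has loaded.
  if (status === 'success') {
    console.log(page.content);
  }
  phantom.exit(status === 'success' ? 0 : 1);
});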

3. Browse to get-website.php; it returns the contents of the target site, http://google.com, after its inline JavaScript has executed. You can also call this from a command line using php /path/to/get-website.php.
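
The script above hard-codes the target and prints the DOM as soon as the load callback fires, which can be too early if the page keeps changing the DOM afterwards. Below is a minimal sketch of a variant that takes the URL as a command-line argument and waits a fixed delay before printing. The helper name `pickUrl` and the 2000 ms delay are assumptions for illustration, not part of the original answer, and the `typeof phantom` check is only there so the helper can also be loaded outside PhantomJS.

```javascript
// get-website.js (variant): phantomjs get-website.js <url>

// Hypothetical helper: returns the URL passed as the first script
// argument, or null when it is missing or not http(s).
function pickUrl(args) {
  var url = args[1]; // args[0] is the script name
  return /^https?:\/\//i.test(url || '') ? url : null;
}

// Only wire things up when actually running inside PhantomJS.
if (typeof phantom !== 'undefined') {
  var system = require('system');
  var page = require('webpage').create();
  var url = pickUrl(system.args);

  if (!url) {
    phantom.exit(1);
  } else {
    page.open(url, function (status) {
      if (status !== 'success') {
        phantom.exit(1);
        return;
      }
      // Assumed delay: give the page's scripts time to change the DOM.
      setTimeout(function () {
        console.log(page.content);
        phantom.exit(0);
      }, 2000);
    });
  }
}
```

On the PHP side, the URL would then be appended to the phantomjs command, passed through escapeshellarg() like the script path. A fixed delay is a blunt instrument; pages that load data slowly may need a longer wait.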

Method 2: Use Ajax with PHP (no phantomjs, so it won't run JavaScript)

/get-website.php

<?php
// Fetches the raw HTML only; scripts on the target page are not executed.
$html = file_get_contents('http://google.com');
echo $html;
?>

test.html

<!doctype html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>on demo</title>
<style>
p {
color: red;
}
span {
color: blue;
}
</style>
<script src="https://code.jquery.com/jquery-1.10.2.js"></script>
</head>
<body>
<button id='click_me'>Click me</button>
<span style="display:none;"></span>
<script>

$( "#click_me" ).click(function () {
    $.get("/get-website.php", function(data) {
        var json = {
            html: JSON.stringify(data),
            delay: 1
        };
        alert(json.html);
    });
});
</script>
</body>
</html>
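
In the callback above, `data` is the raw markup returned by get-website.php as a string, and `JSON.stringify` only wraps it in quotes and escapes it for display. Here is a small sketch of what that call does to an HTML string (the sample markup is made up for illustration):

```javascript
// Made-up sample standing in for the `data` argument of the $.get callback.
var data = '<p class="intro">Hello</p>\n';

// JSON.stringify quotes the string and escapes inner quotes and newlines.
var shown = JSON.stringify(data);
console.log(shown);

// Parsing it again gives back the original markup unchanged.
console.log(JSON.parse(shown) === data);
```

Nothing here executes scripts contained in the markup, which is why this method only ever sees the HTML as the server sent it.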

108

answered Oct 16 '22 09:10

AndrewD

I found a fantastic page on this: an entire tutorial on how to use PHP to process the DOM of a page that is created entirely with JavaScript.

https://www.jacobward.co.uk/using-php-to-scrape-javascript-jquery-json-websites/

Note: "PhantomJS development is suspended until further notice", so that option isn't a good one.

answered Oct 16 '22 09:10

Adamantus

I think the easiest and best way is to use this package: https://github.com/spatie/browsershot. Just install it completely and use the code below:

use Spatie\Browsershot\Browsershot;
$html = Browsershot::url('https://example.com')->bodyHtml();

answered Oct 16 '22 09:10

Mahdi mehrabi
