I'm using Mechanize to scrape a password-protected website at a user's request. I'm trying to decouple the login and search functionality by running a Rake task that logs into the site and saves the cookies to a database, so that subsequent Mechanize requests can reuse them.
My problem is that the cookie_jar.save_as method doesn't save session cookies/tokens into the cookies file. Here's a simple example that demonstrates this:
require 'mechanize'
# Setup Mechanize agents
agent1 = Mechanize.new
agent2 = Mechanize.new
# Fetch page and save cookies to local file
agent1.get('http://www.my-secure-website.com')
agent1.post('http://www.my-secure-website.com/login', {
  'user[login]' => 'my_login',
  'user[password]' => 'my_password',
  'submit' => 'Login'
})
# Verify and save cookies
agent1.cookie_jar.save_as 'cookies'
p agent1.cookie_jar
# #<Mechanize::CookieJar:0x8cf60b8 @jar={"www.my-secure-website.com"=>{"/"=>{"JSESSIONID"=>JSESSIONID=1NqLRc4dm0Qp5465N82Zwz4N0yXxy5jP1pXpyKp9jG8ssX2nMp5q!-334818122}, "/login/"=>{"Account"=>Account=my_account_number}}, "evr.my-secure-website.com"=>{"/APBDBQ"=>{"JSESSIONID"=>JSESSIONID=A74D230DEAFF50098557FBE76DD2E0C5}}}>
########################################################
# Now let's load cookies into the second Mechanize agent
# Version 1 - This works only partially. Session cookies are missing:
agent2.cookie_jar.load 'cookies'
p agent2.cookies
# [
# [0] Account=my_account_number
# ]
p agent2.cookie_jar
# #<Mechanize::CookieJar:0x914c658 @jar={"www.my-secure-website.com"=>{"/"=>{}, "/login/"=>{"Account"=>Account=my_account_number}}, "evr.my-secure-website.com"=>{"/APBDBQ"=>{}}}>
# Version 2 - This works, but cannot be saved into file/db!
agent2.cookie_jar = agent1.cookie_jar
p agent2.cookies
# [
# [0] JSESSIONID=1NqLRc4dm0Qp5465N82Zwz4N0yXxy5jP1pXpyKp9jG8ssX2nMp5q!-334818122,
# [1] Account=my_account_number,
# [2] JSESSIONID=A74D230DEAFF50098557FBE76DD2E0C5
# ]
p agent2.cookie_jar
# #<Mechanize::CookieJar:0x8cf60b8 @jar={"www.my-secure-website.com"=>{"/"=>{"JSESSIONID"=>JSESSIONID=1NqLRc4dm0Qp5465N82Zwz4N0yXxy5jP1pXpyKp9jG8ssX2nMp5q!-334818122}, "/login/"=>{"Account"=>Account=my_account_number}}, "evr.my-secure-website.com"=>{"/APBDBQ"=>{"JSESSIONID"=>JSESSIONID=A74D230DEAFF50098557FBE76DD2E0C5}}}>
And this is what my saved cookies file looks like:
---
www.my-secure-website.com:
  /: {}
  /login/:
    Account: !ruby/object:Mechanize::Cookie
      version: 0
      port:
      discard:
      comment_url:
      expires: Thu, 22 May 2014 07:48:46 GMT
      max_age:
      comment:
      secure: true
      path: /login/
      domain: www.my-secure-website.com
      accessed_at: 2013-05-22 00:48:47.227628764 -07:00
      created_at: 2013-05-22 00:48:47.227628764 -07:00
      name: Account
      value: S4633
      for_domain: false
      domain_name: !ruby/object:DomainName
        ipaddr:
        hostname: www.my-secure-website.com
        uri_host: www.my-secure-website.com
        tld: com
        canonical_tld_p: true
        domain: my-secure-website.com
      session: false
evr.my-secure-website.com:
  /APBDBQ: {}
You can see the session token (JSESSIONID) in the console output, but it's missing from the local cookie file. My question is: how do I make Mechanize's cookie_jar.save_as also save the session data?
As of Mechanize version 2.6.0, the cookie_jar.save_as method accepts a :session option that enables saving session cookies as well:
agent1.cookie_jar.save_as 'cookies', :session => true, :format => :yaml
P.S. I was using v2.5.1, which lacked this functionality.