
Is there a good method for parsing the user-agent string?

I have a Java module that receives the User-Agent string from an end user's browser and needs to behave slightly differently depending on the type of browser, the version of the browser, and maybe even the operating system. E.g.: {"Firefox", "7.0", "Win7"}, {"Safari", "3.2", "iOS9"}

I understand that the User-Agent string can vary in format for the exact same configuration because of different plug-in installations, etc.

My questions:

  1. Is the structure of the User-Agent well defined? If yes - where can I find it exactly? (From my understanding of the RFC there is not much standardization here).
  2. Assuming the answer to #1 is no - is there a proper way to parse it to get the info I need?
  3. Is there a better way to get the info I need other than the User-Agent string?

Important note - I'm talking about a web app, so my data collection abilities are limited to JavaScript.

Asked by RonK on Oct 17 '11 at 16:10




2 Answers

Have a look at the Java library I wrote for this purpose: Yauaa

I made a very simple servlet where you can try it out to see if it gives the answers you are looking for: https://try.yauaa.basjes.nl/

It is Apache 2 licensed and published into Maven so using it in a Java application is really easy. It is currently used in production on one of the busiest websites of the Netherlands (where I work).

See this blog post about it: https://techlab.bol.com/making-sense-user-agent-string/

Answered by Niels Basjes on Sep 19 '22 at 18:09


For Java, take a look at User-Agent-Utils. It's fairly compact (< 50kB) and has no dependencies.

Note that although the latest release is fairly recent (1.21, released 2018-01-24), the library's page states:

Warning: This project is end-of-life and will not be updated regularly any longer

And the GitHub page says:

EOL WARNING

This library has reached end-of-life and will not see regular updates any longer.

Version 1.21 was the last official release in 2018.
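
Given the end-of-life status, it may help to see what coarse detection looks like without any library. The following is only a minimal sketch in plain Java, not the User-Agent-Utils API: the class name `UaSketch`, the `BrowserInfo` holder, and the regex rules are all assumptions made up for illustration. It covers a handful of mainstream desktop and mobile browsers and will misclassify many real-world strings, which is exactly why a maintained library is usually the better choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class UaSketch {

    /** Hypothetical holder for the coarse result. */
    public static final class BrowserInfo {
        public final String name;
        public final String version;

        BrowserInfo(String name, String version) {
            this.name = name;
            this.version = version;
        }
    }

    // Rules are tried in insertion order. Order matters because tokens overlap:
    // Chrome strings also contain "Safari/", and Edge/Opera strings also contain "Chrome/".
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("Edge", Pattern.compile("Edg(?:e|A|iOS)?/([\\d.]+)"));
        RULES.put("Opera", Pattern.compile("OPR/([\\d.]+)"));
        RULES.put("Firefox", Pattern.compile("Firefox/([\\d.]+)"));
        RULES.put("Chrome", Pattern.compile("Chrome/([\\d.]+)"));
        // Safari reports its browser version in the "Version/" token, not in "Safari/".
        RULES.put("Safari", Pattern.compile("Version/([\\d.]+).*Safari/"));
    }

    /** Returns the first matching rule, or "Unknown" with an empty version. */
    public static BrowserInfo parse(String userAgent) {
        if (userAgent != null) {
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                Matcher m = rule.getValue().matcher(userAgent);
                if (m.find()) {
                    return new BrowserInfo(rule.getKey(), m.group(1));
                }
            }
        }
        return new BrowserInfo("Unknown", "");
    }

    public static void main(String[] args) {
        String ua = "Mozilla/5.0 (Windows NT 6.1; rv:7.0) Gecko/20100101 Firefox/7.0";
        BrowserInfo info = parse(ua);
        System.out.println(info.name + " " + info.version); // prints "Firefox 7.0"
    }
}
```

The ordered rule list is the main design choice here: because User-Agent strings embed each other's tokens for compatibility, the most specific token has to be checked first. Operating-system detection would need a similar, separately ordered rule set.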

Answered by Ted Hopp on Sep 19 '22 at 18:09