Hello, I am very new to the regex world. I would like to extract the timestamp, the location and the "id_str" field from my test string in Java.
20110302140010915|{"user":{"is_translator":false,"show_all_inline_media":false,"following":null,"geo_enabled":true,"profile_background_image_url":"http:\/\/a3.twimg.com\/a\/1298918947\/images\/themes\/theme1\/bg.png","listed_count":0,"favourites_count":2,"verified":false,"time_zone":"Mountain Time (US & Canada)","profile_text_color":"333333","contributors_enabled":false,"statuses_count":152,"profile_sidebar_fill_color":"DDEEF6","id_str":"207356721","profile_background_tile":false,"friends_count":14,"followers_count":13,"created_at":"Mon Oct 25 04:05:43 +0000 2010","description":null,"profile_link_color":"0084B4","location":"WaKeeney, KS","profile_sidebar_border_color":"C0DEED",
I have tried this:
(\d*).*?"id_str":"(\d*)",.*"location":"([^"]*)"
It backtracks a lot if I use the lazy quantifier .*? (3000 steps in RegexBuddy), and the number of characters between the anchors "id_str" and "location" is not always the same. It could also be catastrophic if no location is found in the string.
How can I
1) avoid unnecessary backtracking, and
2) fail faster on strings that do not match?
Thanks.
This looks like JSON, and trust me, it's pretty easy to parse it that way.
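import org.json.JSONObject; // assumption: the org.json library, which matches the calls below

// split() takes a regex, so the pipe must be escaped; the limit of 2 keeps the JSON part in one piece
String[] input = inputStr.split("\\|", 2);
System.out.println("Timestamp: " + input[0]); // 20110302140010915
JSONObject user = new JSONObject(input[1]).getJSONObject("user");
System.out.println("ID: " + user.getString("id_str")); // 207356721
System.out.println("Location: " + user.getString("location")); // WaKeeney, KS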
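One detail worth spelling out: String.split treats its argument as a regular expression, and a bare | is the alternation operator, so it has to be escaped to split on a literal pipe. A minimal sketch using only the standard library (the names SplitDemo and splitRecord are made up for illustration):

```java
public class SplitDemo {
    // Splits "timestamp|json" at the first literal pipe only.
    // The limit of 2 means any later '|' inside the JSON text is left alone.
    static String[] splitRecord(String line) {
        return line.split("\\|", 2);
    }

    public static void main(String[] args) {
        String line = "20110302140010915|{\"user\":{\"id_str\":\"207356721\"}}";
        String[] parts = splitRecord(line);
        System.out.println("Timestamp: " + parts[0]);
        System.out.println("JSON:      " + parts[1]);
    }
}
```

Pattern.quote("|") should work in place of the hand-written escape if you prefer not to count backslashes.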
Reference:
JSON Java API docs
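If a regex really has to stay (say, no JSON library is available), here is a rough sketch of a less backtrack-prone pattern. It is not a replacement for a real parser. It assumes the timestamp comes first and is followed by a pipe, that "id_str" appears before "location", and that the location is a quoted string with no escaped quotes in it; a null location is simply reported as no match. The names TweetFields and extract are hypothetical.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TweetFields {
    // ^ anchors the attempt at the start of the input, so a failing match
    // is not retried from every later offset. Both gaps use lazy .*? and
    // the captured values use narrow classes (\d+ and [^"]*).
    private static final Pattern FIELDS = Pattern.compile(
        "^(\\d+)\\|.*?\"id_str\":\"(\\d+)\".*?\"location\":\"([^\"]*)\"");

    // Returns {timestamp, id_str, location}, or null when the line does not match.
    static String[] extract(String line) {
        Matcher m = FIELDS.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new String[] { m.group(1), m.group(2), m.group(3) };
    }

    public static void main(String[] args) {
        String line = "20110302140010915|{\"user\":{\"id_str\":\"207356721\","
                    + "\"friends_count\":14,\"location\":\"WaKeeney, KS\",";
        String[] fields = extract(line);
        if (fields != null) {
            System.out.println(String.join(" / ", fields));
        }
    }
}
```

The JSON approach above is still the more robust choice, since it does not depend on field order or on how the values are quoted.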