Currently I am using Solr 6 and I want to index log data like the line shown below:
2016-06-22T03:00:04Z|INFO|ip-10-11-0-241|1301|DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider|DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider.CheckValidGameDataRequestFilter|Invalid UserAgent=%E3%83%94%E3%82%B3/1.07.41149 CFNetwork/758.2.8 Darwin/15.0.0, PlayerId=player_a2a7d1a4-0a31-4c4d-b5bf-10be67dc85d6|
I am unsure how to separate the data on the pipe character. The layout I use in NLog is this:
${date:universalTime=True:format=yyyy-MM-ddTHH\:mm\:ssZ}|${level:uppercase=true}|${machinename}|${processid}|${logger}|${callsite:className=true:methodName=true}|${message}|${exception:format=tostring}${newline}
I tried the CSV upload, but Solr gives me the JSON response below, which is not conducive to querying. Please help.
{
  "responseHeader":{
    "status":0,
    "QTime":77,
    "params":{
      "q":"*:*",
      "indent":"on",
      "wt":"json",
      "_":"1466745065000"}},
  "response":{"numFound":8,"start":0,"docs":[
      {
        "id":"b28049bb-d49e-4b4d-80db-d7d77351527b",
        "2016-06-23T02_37_18Z_INFO_web.chubi.development1_6326_DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider_DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider.CheckValidGameDataRequestFilter_Invalid_UserAgent_PIKO_0.00.41269_CFNetwork_711.5.6_Darwin_14.0.0":["2016-06-23T02:37:28Z|INFO|web.chubi.development1|6326|DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider|DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider.CheckValidGameDataRequestFilter|Invalid UserAgent=PIKO/0.00.41269 CFNetwork/711.5.6 Darwin/14.0.0"],
        "_PlayerId_player_407defcf-7032-4ef4-81a6-91bb62b9150b_":[" PlayerId=player_905266b2-9ce3-4fa1-b0a7-4663b9509731|"],
        "_version_":1537919142165741568}]}}
It looks like you want to extract clean data out of the logs so it can be indexed and searched without any ambiguity. Why don't you try analyzing your data with a custom Analyzer that uses a regex to split the data for you? I would strongly suggest solr.PatternTokenizerFactory to break your text on the pipe character.
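An untested sketch of such a field type for your schema (the names pipe_delimited and logline are mine, pick whatever fits your schema):

```xml
<!-- Hypothetical field type: PatternTokenizerFactory splits the incoming
     text wherever the regex matches, here on the literal pipe character. -->
<fieldType name="pipe_delimited" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\|"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Example field using it; adjust indexed/stored to your needs. -->
<field name="logline" type="pipe_delimited" indexed="true" stored="true"/>
```

With this, each pipe-separated segment of a log line becomes its own token instead of the whole line being treated as one value.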
Also, you can use the Analysis tab in the Solr admin UI for an exhaustive view of how your log data is treated by the Analyzer. For the encoded text, like in the Invalid UserAgent field, you can use solr.ASCIIFoldingFilterFactory when indexing those characters. You may also need to tokenize the data at dots; I don't know whether that's a requirement for you or not. In your data, PatternTokenizer does the trick, and if you still need further refinement, you can use solr.WordDelimiterFilterFactory to tune your index better.
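As a quick sanity check of the pipe pattern before wiring it into the schema, you can reproduce the split client-side. This is just an illustration; the field labels below are my own names, not anything Solr or NLog defines:

```python
import re

# Sample line in the asker's NLog layout (message shortened for readability).
line = ("2016-06-22T03:00:04Z|INFO|ip-10-11-0-241|1301|"
        "DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider|"
        "DreamRocket.Game.ServiceInterface.GameCredentialsAuthProvider.CheckValidGameDataRequestFilter|"
        "Invalid UserAgent=PIKO/1.07.41149, PlayerId=player_a2a7d1a4-0a31-4c4d-b5bf-10be67dc85d6|")

# Same regex PatternTokenizerFactory would get: split on the literal pipe.
# The trailing pipe produces an empty segment, so drop empties.
fields = [f for f in re.split(r"\|", line) if f]

# Illustrative labels only, matching the order of the NLog layout.
labels = ["timestamp", "level", "machine", "pid", "logger", "callsite", "message"]
doc = dict(zip(labels, fields))
print(doc["level"])
```

If the split yields exactly one token per layout segment here, the same pattern should behave the same inside the analyzer.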