I have a few documents in my Elasticsearch v1.2.1 index like:
{
  "tempSkipAfterSave": "false",
  "variation": null,
  "images": null,
  "name": "Dolce & Gabbana Short Sleeve Coat",
  "sku": "MD01575254-40-WHITE",
  "user_id": "123foo",
  "creation_date": null,
  "changed": 1
}
where sku can be a variation such as MD01575254-40-BlUE or MD01575254-38-WHITE.
I can get my Elasticsearch query to work with this:
{
  "size": 1000,
  "from": 0,
  "filter": {
    "and": [
      {
        "regexp": {
          "sku": "md01575254.*"
        }
      },
      {
        "term": {
          "user_id": "123foo"
        }
      },
      {
        "missing": {
          "field": "project_id"
        }
      }
    ]
  },
  "query": {
    "match_all": {}
  }
}    
With this I get back all the variations of sku MD01575254*.
However, the dash '-' is really tripping me up. When I change the regexp to:
"regexp": {
  "sku": "md01575254-40.*"
}
I can't get any results back. I've also tried other variations, but I just can't seem to make it work. What am I doing wrong here?
Problem:
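This is because the default analyzer tokenizes at - (and lowercases), so your field is most likely indexed as separate tokens like:
md01575254
40
blue
Since a regexp has to match a whole token, no single token can match a pattern that contains the dash.
Solution: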
You can update your mapping to have a sku.raw field that would not be analyzed when indexed.  This will require you to delete and re-index.
{
  "<type>" : {
    "properties" : {
      ...,
      "sku" : {
        "type": "string",
        "fields" : {
          "raw" : {"type" : "string", "index" : "not_analyzed"}
        }
      }
    }
  }
}
Then you can query this new field, which is not analyzed. Note that the raw value keeps its original case, so the pattern has to be uppercase here:
{
  "query" : {
    "regexp" : {
      "sku.raw": "MD01575254-40.*"
    }
  }
}
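To make the difference between the analyzed field and the raw field concrete, here is a rough Python sketch. It is only an approximation: `approx_standard_analyzer` and `regexp_matches_any_term` are hypothetical helpers, not Elasticsearch APIs, and Python's `re` stands in for Lucene's regexp syntax (the two agree for simple patterns like these). It assumes the default analyzer splits on non-alphanumeric characters and lowercases, and that a regexp must match an entire indexed term.

```python
import re


def approx_standard_analyzer(text):
    """Rough stand-in for the default analyzer: split on
    non-alphanumeric characters and lowercase each token."""
    return [tok.lower() for tok in re.split(r"[^0-9A-Za-z]+", text) if tok]


def regexp_matches_any_term(pattern, terms):
    """Regexp queries are anchored, so the pattern must match a
    whole term; fullmatch models that behaviour."""
    compiled = re.compile(pattern)
    return any(compiled.fullmatch(term) for term in terms)


sku = "MD01575254-40-WHITE"
analyzed_terms = approx_standard_analyzer(sku)  # terms behind the analyzed "sku" field
raw_terms = [sku]                               # single term behind "sku.raw"

print(analyzed_terms)
print(regexp_matches_any_term("md01575254.*", analyzed_terms))     # matches one token
print(regexp_matches_any_term("md01575254-40.*", analyzed_terms))  # no token contains "-"
print(regexp_matches_any_term("MD01575254-40.*", raw_terms))       # whole raw value matches
print(regexp_matches_any_term("md01575254-40.*", raw_terms))       # case differs, no match
```

Under these assumptions the dash pattern can only ever match against the not_analyzed field, and only when the case of the pattern agrees with the stored value.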
HTTP Endpoints:
The API to delete your current mapping and data is:
DELETE http://localhost:9200/<index>/<type>
The API to add your new mapping, with the raw SKU is:
PUT http://localhost:9200/<index>/<type>/_mapping
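If you want to script those two calls, here is a minimal sketch using only the Python standard library. The index name `my_index` and type name `product` are placeholders I made up for `<index>` and `<type>`; substitute your own. The sketch only builds and prints the requests. The `urlopen` call is left commented out because the DELETE wipes your data.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # host and port as used in the answer
INDEX = "my_index"                  # hypothetical placeholder for <index>
DOC_TYPE = "product"                # hypothetical placeholder for <type>


def build_mapping_body(doc_type):
    """Mapping with an extra not_analyzed sku.raw sub-field."""
    return {
        doc_type: {
            "properties": {
                "sku": {
                    "type": "string",
                    "fields": {
                        "raw": {"type": "string", "index": "not_analyzed"}
                    },
                }
            }
        }
    }


def build_requests(base_url, index, doc_type):
    """Return the DELETE request and the PUT mapping request, in order."""
    delete_req = urllib.request.Request(
        f"{base_url}/{index}/{doc_type}", method="DELETE"
    )
    body = json.dumps(build_mapping_body(doc_type)).encode("utf-8")
    put_req = urllib.request.Request(
        f"{base_url}/{index}/{doc_type}/_mapping",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    return delete_req, put_req


for req in build_requests(BASE_URL, INDEX, DOC_TYPE):
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req)  # uncomment to send; the DELETE is destructive
```

After the new mapping is in place you still have to index your documents again so that sku.raw gets populated.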