Well, I'm not an Elasticsearch expert, not at all. And, as usual, my English sucks.
Let's say that you have indexed a bunch of stuff, and now you need to change how a field is mapped in an existing index.
For instance, in my case, I noticed that a field named "hostname" was split whenever it contained a dash: "tc-pi.pacs.mydomain" showed up as two separate parts when creating graphs in Kibana.
The solution, in order to avoid this hostname splitting, is to define "index" : "not_analyzed" for that field in the index mapping.
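To make the difference a bit more concrete, here is a small Python sketch. It is not Elasticsearch's real analyzer, only a very rough imitation I wrote for illustration (the function names are made up): it lowercases the value and cuts it at the dash, while the not_analyzed variant keeps the whole value as a single term.

```python
import re


def analyzed_terms(value):
    """Rough stand-in for an analyzed string field (NOT the real analyzer):
    lowercase the value, then keep runs of letters/digits, allowing dots
    inside a run. Anything else, such as a dash, separates terms."""
    return re.findall(r"[a-z0-9]+(?:\.[a-z0-9]+)*", value.lower())


def not_analyzed_terms(value):
    """A not_analyzed field is indexed as one single, unmodified term."""
    return [value]


hostname = "tc-pi.pacs.mydomain"
print(analyzed_terms(hostname))      # ['tc', 'pi.pacs.mydomain']
print(not_analyzed_terms(hostname))  # ['tc-pi.pacs.mydomain']
```

With the analyzed variant Kibana sees two terms for the same host, which is the splitting described above.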
Well, from what I've read, it is not possible to change the mapping of a field once documents have been indexed.
So a solution, a workaround, thanks to this post, is the following.
"Download" the mapping of the old index:
curl -XGET 'http://127.0.0.1:9200/dcmaudit/_mappings/'
Copy the result into a text editor for your convenience, then change the mapping, for example:
...
"hostname":{"type":"string", "index" : "not_analyzed"}
...
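If you'd rather not edit the JSON by hand, the same change could probably be scripted. The sketch below is only my assumption of how that might look: the helper name is made up, the sample mapping is a trimmed-down invention, and it assumes the response has the shape {index: {"mappings": {type: {"properties": {...}}}}}.

```python
import copy
import json


def with_not_analyzed(mapping_response, index, doc_type, field):
    """Return a create-index body in which `field` is marked not_analyzed.

    Assumes the response looks like
    {index: {"mappings": {doc_type: {"properties": {...}}}}}.
    The original response is left untouched.
    """
    body = copy.deepcopy(mapping_response[index])
    properties = body["mappings"][doc_type]["properties"]
    properties[field] = {"type": "string", "index": "not_analyzed"}
    return body


# Trimmed-down, made-up example of what the mapping request might return.
old = {"dcmaudit": {"mappings": {"logs": {"properties": {
    "hostname": {"type": "string"},
    "message": {"type": "string"},
}}}}}

new_body = with_not_analyzed(old, "dcmaudit", "logs", "hostname")
print(json.dumps(new_body))
```

The printed JSON is the kind of body that goes after -d when creating the new index.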
Create a new index:
curl -XPOST http://localhost:9200/dcmaudit2 -d '{"mappings":{"logs":{"properties":{"@timestamp":{"type":"date","format":"dateOptionalTime"},"@version":{"type":"string"},"ParticipantObjectIdentification2.ParticipantObjectTypeCode.displayName":{"type":"string", "index" : "not_analyzed"},"hostname":{"type":"string", "index" : "not_analyzed"},"message":{"type":"string"},"tags":{"type":"string"},"timestamp":{"type":"date","format":"dateOptionalTime"}}}}}'
Now let's create a logstash configuration file like this:
input {
  # We read from the "old" index
  elasticsearch {
    hosts => [ "localhost" ]
    port => "9200"
    index => "dcmaudit"
    size => 500
    scroll => "5m"
    docinfo => true
  }
}
filter {
  mutate {
    remove_field => [ "@timestamp", "@version" ]
  }
}
output {
  elasticsearch {
    host => "localhost"
    port => "9200"
    protocol => "http"
    index => "dcmaudit2"
    index_type => "%{[@metadata][_type]}"
    document_id => "%{[@metadata][_id]}"
  }
  stdout {
    codec => rubydebug
  }
}
Launch Logstash:
./bin/logstash -f conf.json
Now all the stuff from one index (dcmaudit) will be copied to the new one (dcmaudit2).
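Before throwing anything away, it may be worth checking that both indices hold the same number of documents. Here is a minimal sketch using the _count API; the function names are mine, the sample response is made up, and it assumes Elasticsearch answers on http://localhost:9200.

```python
import json
from urllib.request import urlopen


def parse_count(response_text):
    """Extract the document count from a _count API response."""
    return json.loads(response_text)["count"]


def doc_count(index, base_url="http://localhost:9200"):
    """Ask Elasticsearch how many documents `index` holds.
    Needs a running node, so it is only defined here, not called."""
    with urlopen("%s/%s/_count" % (base_url, index)) as response:
        return parse_count(response.read().decode("utf-8"))


def copy_looks_complete(old_count, new_count):
    """True when both indices report the same number of documents."""
    return old_count == new_count


# Made-up sample of a _count response, just to show the parsing.
sample = '{"count": 1234, "_shards": {"total": 5, "successful": 5, "failed": 0}}'
print(parse_count(sample))  # 1234
```

With a running node, copy_looks_complete(doc_count("dcmaudit"), doc_count("dcmaudit2")) should return True once Logstash has finished. Freshly indexed documents can take a moment to show up in the count, so a mismatch right after the run is not necessarily a problem.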
Once the copy has finished, you can delete the old index:
curl -XDELETE localhost:9200/dcmaudit
If you want, and if you need it, you can run this task again: recreate the old index name (dcmaudit), but with the new mapping, and then repeat the Logstash task, swapping the input and output indices accordingly.