Disaster Management Case Study: Setting up Solr-powered data search for a disaster relief organization
- Industry: Disaster Management
- Year founded: 1990s
The client is a disaster management center that works with government companies and non-profit organizations across the globe. It provides technology for early warning, risk assessment, first response, and coordination of recovery efforts for natural and man-made disasters (floods, earthquakes, hurricanes, etc.).
- Too much disaster-related data, too many duplicates (data comes from over 5,000 sources).
- Data providers use their own maps, datasets, and indexing rules. The data is also poorly organized: some documents have no metadata, and different formats are used.
- A need to tie all data points to their geographical locations.
- Bottom line: the client needed a search solution that would index these disparate datasets and let users find relevant documents in the sea of data for a defined area on the map.
The client uses an ArcGIS-based solution for cartography. They also have proprietary systems written around the ArcGIS core. The goal is to be able to search through those datasets for documents that are (a) relevant to the query, and (b) relevant to the specified area.
ObjectStyle built the search functionality using Apache Solr. For higher fault tolerance and availability, we ran Solr in SolrCloud mode. Solr was deployed to Amazon Web Services (AWS), and configuration was managed via ZooKeeper.
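To illustrate what a SolrCloud setup of this kind involves, here is a minimal sketch that builds a Solr Collections API request for creating a sharded, replicated collection from a configset stored in ZooKeeper. The host, collection name, configset name, and shard/replica counts are placeholders chosen for the example; the case study does not state the actual values.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, config_name,
                          num_shards=2, replication_factor=2):
    """Build a Collections API CREATE URL for a SolrCloud cluster.

    All argument values are illustrative assumptions, not the client's setup.
    """
    params = {
        "action": "CREATE",
        "name": name,
        # Shards split the index; replicas provide fault tolerance.
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # Name of a configset already uploaded to ZooKeeper.
        "collection.configName": config_name,
    }
    return f"{base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and names, for demonstration only.
    print(create_collection_url("http://localhost:8983/solr",
                                "disaster_docs", "disaster_conf"))
```

With more than one replica per shard, the cluster can keep serving queries when a single node is lost, which is the availability benefit mentioned above.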
ObjectStyle built an application for indexing relevant datasets. It relies on metadata, where possible, to put together a meaningful, searchable index.
We also created a user interface for managing the indexing process and for including selected data in, or excluding it from, the index. The UI also lets one tweak search result titles. For example, one can name hurricane search results after the areas in which they occurred or according to their official names.
Since the client also provides data to partners through an API, ObjectStyle built a REST application that allows partners to use the search functionality on their end.
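As a rough sketch of metadata-driven indexing with a fallback for documents that have no metadata, the function below assembles a Solr-style document as a plain dictionary. The field names (`source_s`, `has_metadata_b`, `keywords_ss`, `location`) and the metadata keys are hypothetical; the actual schema and indexing rules are not described in this case study.

```python
import hashlib
from pathlib import PurePosixPath


def build_index_document(source_id, path, metadata=None):
    """Assemble an index document, using metadata where it exists.

    Field names and metadata keys are assumptions made for illustration.
    """
    metadata = metadata or {}
    # Fall back to a title derived from the file name when none is provided.
    stem = PurePosixPath(path).stem
    title = metadata.get("title") or stem.replace("_", " ").strip()
    doc = {
        # Deterministic id: re-indexing the same file from the same source
        # overwrites the earlier copy instead of adding a second one.
        "id": hashlib.sha1(f"{source_id}:{path}".encode("utf-8")).hexdigest(),
        "source_s": source_id,
        "title": title,
        "has_metadata_b": bool(metadata),
    }
    # Only add a location when both coordinates are known.
    if "lat" in metadata and "lon" in metadata:
        doc["location"] = f"{metadata['lat']},{metadata['lon']}"
    if metadata.get("keywords"):
        doc["keywords_ss"] = sorted(set(metadata["keywords"]))
    return doc


if __name__ == "__main__":
    print(build_index_document("provider-a", "maps/flood_zones_2018.shp"))
```

Documents flagged as lacking metadata could then be reviewed or excluded through a management UI like the one described in this case study.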
We had to fine-tune Solr to rank data by distance. Solr scores documents by relevance and gives the more relevant ones more “weight,” which helps the client find the right data. In addition, each data instance has to be matched to an outlined geographical area, be it a city or the area affected by a tsunami.
So, there are two dimensions to search: the keyword and the area radius. This is how you find relevant data for a given situation.
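The two dimensions can be expressed in a single Solr request. The sketch below builds query parameters that combine a keyword query with a radius filter (`geofilt`) and a distance-based boost (`geodist()` inside `recip`). It assumes an indexed spatial field named `location` and uses arbitrary boost constants; the client's actual field names and tuning are not given in this case study.

```python
from urllib.parse import urlencode


def build_geo_search_params(keywords, lat, lon, radius_km,
                            location_field="location", rows=10):
    """Build Solr parameters for a keyword search limited to a radius.

    The field name and boost constants are illustrative assumptions.
    """
    if radius_km <= 0:
        raise ValueError("radius_km must be positive")
    return {
        "defType": "edismax",
        "q": keywords,
        # Spatial parameters shared by the geofilt filter and geodist().
        "sfield": location_field,
        "pt": f"{lat},{lon}",
        "d": str(radius_km),
        # Keep only documents within d kilometers of pt.
        "fq": "{!geofilt}",
        # Multiply the keyword score by a factor that shrinks with distance:
        # recip(x, m, a, b) = a / (m * x + b).
        "boost": "recip(geodist(),1,10,10)",
        # Return the computed distance alongside each hit.
        "fl": "id,title,score,distance:geodist()",
        "rows": str(rows),
    }


if __name__ == "__main__":
    params = build_geo_search_params("flood evacuation", 21.3, -157.8, 50)
    print("/select?" + urlencode(params))
```

The filter handles the "area" dimension as a hard cutoff, while the boost lets nearer documents outrank equally relevant ones that lie farther from the center point.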
We handed a working search facility to the client. They are now testing the beta version.
- SolrCloud 7.3 (ZooKeeper, AWS)
- (Additional) jQuery, Bootstrap
Time Span and Resources
- Duration: 10 months
- Effort: 1,650 man-hours