Building the Index

This post gives a high-level view of how the DURA remote data source indexes its content.

The DURA remote data source is simply a standalone Elasticsearch server hosted on DigitalOcean.

General data model

A DURA object has a few key properties:

  1. title
  2. short description
  3. type: the corresponding class in the DURA data model
  4. json description
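As a rough sketch, the four properties above could be held in a small Python record that serialises to the dictionary sent to the index. The class name `DuraObject`, the snake_case field names, and the example values are my own placeholders, not the actual DURA schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Any, Dict


@dataclass
class DuraObject:
    """Hypothetical container for the four key properties listed above."""

    title: str
    short_description: str
    type: str  # name of the corresponding class in the DURA data model
    json_description: Dict[str, Any] = field(default_factory=dict)

    def to_document(self) -> Dict[str, Any]:
        """Return a plain dict that can be serialised to JSON for indexing."""
        return asdict(self)


# Placeholder values, for illustration only.
example = DuraObject(
    title="Example volume",
    short_description="A placeholder electron microscopy volume",
    type="ElectronMicroscopyVolume",
    json_description={"source": "placeholder"},
)

print(json.dumps(example.to_document(), indent=2))
```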

To make search useful, each object also needs tags that let me slice the index along different facets.

For example, I would like to predefine queries for:

  • Species

For species, I would return any content, whatever its type, that comes from an organism matching the species string.

  • General content type

For general content type, the query would return everything of one kind, so I could search for ‘all electron microscopy volumes’ or ‘all zoomify images’ or ‘all course lists’.
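A minimal sketch of what these two predefined queries might look like as Elasticsearch query bodies, built as plain Python dicts. It assumes each indexed object carries `species` and `content_type` tags (field names I made up) and that both are mapped as exact-match keyword fields. A `term` filter only matches the exact stored value; a `match` query would be the looser alternative if the species string should be analysed.

```python
from typing import Any, Dict


def facet_query(field_name: str, value: str) -> Dict[str, Any]:
    """Build a query body that filters on one exact tag value."""
    return {"query": {"bool": {"filter": [{"term": {field_name: value}}]}}}


def species_query(species: str) -> Dict[str, Any]:
    """All content, of any type, tagged with the given species."""
    return facet_query("species", species)  # "species" is an assumed field name


def content_type_query(content_type: str) -> Dict[str, Any]:
    """All content of one general type, e.g. every zoomify image."""
    return facet_query("content_type", content_type)  # assumed field name


print(species_query("Mus musculus"))
print(content_type_query("zoomify image"))
```

Keeping both facets behind one helper means a new facet only needs a new tag on the documents and a one-line wrapper.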

Thinking aloud

Adding relational search via graph-based mappings, while fun, is beyond the scope of what I’m attempting here. It would, however, open up the opportunity to do some great mappings, and possibly give me some overlap between the INCF NI-DM and DURA.

Everything on the collections page will be a search. Every entry in the defaults.json generated by ipy will consist of a search query, a description, and an icon.
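To make that shape concrete, here is a small sketch that parses a defaults.json payload and checks that every entry carries the three parts. The key names `query`, `description`, and `icon`, the top-level list layout, and the sample entry are assumptions for illustration, not the real file format.

```python
import json
from typing import Any, Dict, List

# Assumed key names for the three parts of each entry.
REQUIRED_KEYS = {"query", "description", "icon"}


def load_defaults(text: str) -> List[Dict[str, Any]]:
    """Parse defaults.json content and verify each entry has all three parts."""
    entries = json.loads(text)
    for position, entry in enumerate(entries):
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            raise ValueError(f"entry {position} is missing keys: {sorted(missing)}")
    return entries


# Placeholder content standing in for a real defaults.json file.
sample = json.dumps([
    {
        "description": "All zoomify images",
        "icon": "zoomify.png",
        "query": {"query": {"bool": {"filter": [
            {"term": {"content_type": "zoomify image"}}
        ]}}},
    }
])

for entry in load_defaults(sample):
    print(entry["description"], "->", json.dumps(entry["query"]))
```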

Steps to replace the current defaults

  1. Index the defaults in Elasticsearch
  2. Write a pre-canned query
  3. Write a pre-canned query generator
  4. Rebuild defaults.json
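The four steps could be sketched roughly as follows, using only the standard library. Step 1 builds the newline-delimited body that the Elasticsearch bulk endpoint expects; sending it to the server is left out. The index name `dura`, the tag field names, the empty icon value, and all function names are hypothetical.

```python
import json
from typing import Any, Dict, Iterable, List

INDEX_NAME = "dura"  # hypothetical index name


def bulk_index_body(docs: Iterable[Dict[str, Any]], index: str = INDEX_NAME) -> str:
    """Step 1: build an NDJSON body (action line, then document) for _bulk."""
    lines: List[str] = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    return "\n".join(lines) + "\n"  # the bulk body must end with a newline


def canned_query(field_name: str, value: str) -> Dict[str, Any]:
    """Step 2: one pre-canned query, an exact-match filter on a tag field."""
    return {"query": {"bool": {"filter": [{"term": {field_name: value}}]}}}


def generate_defaults(facets: Dict[str, List[str]]) -> List[Dict[str, Any]]:
    """Step 3: generate one entry per (tag field, value) pair."""
    return [
        {
            "description": f"All items with {field_name} = {value}",
            "icon": "",  # icon selection is not covered in this sketch
            "query": canned_query(field_name, value),
        }
        for field_name, values in facets.items()
        for value in values
    ]


def rebuild_defaults(path: str, facets: Dict[str, List[str]]) -> List[Dict[str, Any]]:
    """Step 4: regenerate the entries and write them to the given path."""
    entries = generate_defaults(facets)
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle, indent=2)
    return entries
```

Calling `rebuild_defaults("defaults.json", {"content_type": ["zoomify image", "course list"]})` would then write two entries, one per content type.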