Friday, March 13, 2009

NTS: Some ideas for a Windows-friendly approach to basic indexing of OAI-PMH content (amongst other sources).

Thinking about my brief flirtation with an EAV (entity-attribute-value) variant, I've realised that I should probably stick with a traditional structure (and possibly use views to produce the output I would like).

  1. Grab OAI-PMH content

  2. For each record from the OAI source, store it in Postgres (temp table? Alternatively, use one table per OAI source)

  3. Create Postgres indices/tables from XPath queries... a single standard XPath query wouldn't work with all OAI sources (some include non-Dublin Core content such as DIDL)

  4. Run Sphinx over select Postgres tables

  5. (Empty the Postgres temp table?)
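A rough sketch of the XPath part of step 3, just to make the namespace issue concrete. It uses Python's standard-library ElementTree (which only supports a subset of XPath) on a made-up two-record response; the sample XML, the `extract_rows` name and the row layout are all my own assumptions, not anything from a real repository. Records without an `oai_dc` block (DIDL here) are flagged rather than parsed, since they would need their own queries.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for OAI-PMH 2.0, the oai_dc container and Dublin Core elements.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response: one Dublin Core record, one DIDL record.
SAMPLE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>First record</dc:title>
          <dc:creator>A. Author</dc:creator>
          <dc:creator>B. Author</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:example.org:2</identifier></header>
      <metadata>
        <didl:DIDL xmlns:didl="urn:mpeg:mpeg21:2002:02-DIDL-NS"/>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def extract_rows(xml_text):
    """Turn each OAI record into a flat dict that could become a table row."""
    root = ET.fromstring(xml_text)
    rows = []
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        dc = record.find("oai:metadata/oai_dc:dc", NS)
        if dc is None:
            # Not Dublin Core (e.g. DIDL): keep the identifier, skip the fields.
            rows.append({"identifier": identifier, "format": "other",
                         "title": None, "creators": []})
            continue
        rows.append({
            "identifier": identifier,
            "format": "oai_dc",
            "title": dc.findtext("dc:title", default=None, namespaces=NS),
            "creators": [c.text for c in dc.findall("dc:creator", NS)],
        })
    return rows


if __name__ == "__main__":
    for row in extract_rows(SAMPLE):
        print(row)
```

The dicts map fairly directly onto columns in a per-source table, and the `format` flag is one possible way to decide which set of queries to run against a given record.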

Then I can use Sphinx for other sources as well...
