Custom outputting for pupa data #299

Closed
doubleswirve wants to merge 81 commits into opencivicdata:master from doubleswirve:custom-export

@doubleswirve
Apologies if this is a little rough around the edges (not sure what the etiquette is on some of this stuff), but I wanted to submit this as a work-in-progress PR to get feedback.

This PR allows pupa data to be sent to other targets besides being written to a file. This initially includes an output option for Google Cloud Pub/Sub (thanks @showerst for initial implementation), but could also be extended to additional targets/services (e.g., Kafka).

The basic idea is that we hook into the `__init__` method of the `Scraper` class and set up an instance variable, `output_target`, which defaults to `self` (the default file writing). Based on the `OUTPUT_TARGET` environment variable, we call `save_object` on either the default `Scraper` instance or the alternative output target instance (e.g., a Pub/Sub instance). So the only requirement is that the alternative output target class implements a `save_object` method.
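To make the dispatch concrete, here is a minimal sketch of the idea, not the code in this PR. The `PubSubOutput` class, the `emit` method, the `"pubsub"` value, and the in-memory lists are all assumptions made for illustration; the real classes write JSON files and publish to Google Cloud Pub/Sub.

```python
import os


class PubSubOutput:
    """Hypothetical alternative target; it only needs a save_object method."""

    def __init__(self, scraper):
        self.scraper = scraper
        self.sent = []  # stands in for publishing to a real service

    def save_object(self, obj):
        self.sent.append(obj)


class Scraper:
    """Simplified stand-in for the Scraper class described above."""

    def __init__(self):
        # Default: the scraper itself handles output (file writing).
        self.output_target = self
        self.saved = []
        # The "pubsub" value is an assumption; only the OUTPUT_TARGET
        # variable name comes from the description.
        if os.environ.get("OUTPUT_TARGET") == "pubsub":
            self.output_target = PubSubOutput(self)

    def save_object(self, obj):
        self.saved.append(obj)  # stands in for writing a JSON file

    def emit(self, obj):
        # Hypothetical call site: whichever target was selected gets the object.
        self.output_target.save_object(obj)
```

With `OUTPUT_TARGET` unset, `emit` ends up in `Scraper.save_object`; with it set to the assumed `"pubsub"` value, the same call goes to `PubSubOutput.save_object` instead.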

So far this works pretty well; however, a few steps are currently duplicated in each output target:

  • the `obj.pre_save` call
  • info/debug logging prior to writing to file/sending to service/etc.
  • object validation (i.e., `obj.validate()`)
  • iterating over `obj._related` and saving each related object

Seems like some of these could be moved to methods in the Scraper class so alternative output target classes wouldn't need to include them.
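One possible shape for that refactor, sketched under assumptions rather than taken from the PR: the `Scraper` runs the shared steps once and hands only the final write to the target. `FakeObject`, `ListTarget`, the `jurisdiction_id` argument to `pre_save`, and the logger name are hypothetical stand-ins.

```python
import logging

logger = logging.getLogger("output-sketch")  # hypothetical logger name


class FakeObject:
    """Hypothetical object exposing the hooks named in the list above."""

    def __init__(self, name, related=()):
        self.name = name
        self._related = list(related)
        self.calls = []

    def pre_save(self, jurisdiction_id):  # signature is an assumption
        self.calls.append("pre_save")

    def validate(self):
        self.calls.append("validate")


class ListTarget:
    """Hypothetical target: only the write itself lives here."""

    def __init__(self):
        self.written = []

    def save_object(self, obj):
        self.written.append(obj.name)


class Scraper:
    def __init__(self, output_target, jurisdiction_id="example"):
        self.output_target = output_target
        self.jurisdiction_id = jurisdiction_id

    def save_object(self, obj):
        # Shared steps run once here instead of in every target.
        obj.pre_save(self.jurisdiction_id)
        logger.info("save %s", obj.name)
        obj.validate()
        self.output_target.save_object(obj)
        # Related objects go through the same path recursively.
        for related in obj._related:
            self.save_object(related)
```

In this arrangement a new target (Kafka, for example) would implement only the write, and validation or logging changes would happen in one place.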

We probably need some unit tests in there as well. Anyway, I'm open to ideas and look forward to getting your feedback. Thanks!

doubleswirve and others added 28 commits January 18, 2018 23:33
…date meta data from google pubsub as it is already part of the message object (during subscription)
Google Pub/Sub env var adjustments and helper methods
3 participants