Installation

Requirements

Our projects

  • hadoopy (doc): Cython-based Hadoop library for Python. Efficient, simple, and powerful.
  • picarus_takeout: C/C++ module containing the core Picarus algorithms, kept separate so that it can be built as a standalone executable.

Third party

Useful Tools

These are all optional, but you may find them useful.

Our projects (ordered by relevance)

  • hadoopy_flow: Hadoopy monkey patch library to perform automatic job-level parallelism.
  • vision_data: Library of computer vision dataset interfaces with standardized output formats.
  • image_server: Server that displays all images in the current directory as a website (very convenient on headless servers).
  • static_server: Server that allows static file access to the current directory.
  • pycassa_server: Pycassa viewer.
  • vision_results: Library of HTML and JavaScript tools to display computer vision results.
  • hadoop_log: Tool to scrape Hadoop jobtracker logs and provide stderr output (simplifies debugging).
  • pyram: Tiny parameter optimization library (useful when tuning algorithms).
  • mturk_vision: Amazon Mechanical Turk scripts.