seesaw Package

ArchiveTeam seesaw kit

config Module

Configuration value manipulation.

class seesaw.config.ConfigInterpolation(s, c)[source]

Bases: object

realize(item)[source]
class seesaw.config.ConfigValue(name, title='', description='', default=None, editable=True, advanced=True)[source]

Bases: object

Configuration value validator.

The collection methods are useful for providing user configurable settings at run time. For example, when a pipeline file is executed by the warrior, the additional config values are presented in the warrior configuration panel.
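The collection mechanism can be pictured with a small stand-in. This is a minimal sketch, not seesaw's implementation: MiniConfigValue and its list-based collector are hypothetical, and only the names start_collecting(), stop_collecting() and collector are taken from this reference.

```python
class MiniConfigValue(object):
    """Hypothetical stand-in for a config value that registers itself."""

    # While collecting, this holds every value created; otherwise None.
    collector = None

    def __init__(self, name, default=None):
        self.name = name
        self.value = default
        # Register with the active collector, if there is one.
        if MiniConfigValue.collector is not None:
            MiniConfigValue.collector.append(self)

    @classmethod
    def start_collecting(cls):
        cls.collector = []

    @classmethod
    def stop_collecting(cls):
        collected = cls.collector
        cls.collector = None
        return collected


# A host program could wrap the loading of a pipeline file like this.
MiniConfigValue.start_collecting()
MiniConfigValue("downloader", default="anonymous")
MiniConfigValue("concurrency", default=2)
collected = MiniConfigValue.stop_collecting()
collected_names = [value.name for value in collected]
```

Under these assumptions, every value created between the two calls ends up in the collected list, which a configuration panel could then present to the user.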

check_value(value)[source]
collector = None
convert_value(value)[source]
is_valid()[source]
realize(dummy)[source]
set_value(value)[source]
classmethod start_collecting()[source]
classmethod stop_collecting()[source]
class seesaw.config.NumberConfigValue(*args, **kwargs)[source]

Bases: seesaw.config.ConfigValue

check_value(value)[source]
convert_value(value)[source]
class seesaw.config.StringConfigValue(*args, **kwargs)[source]

Bases: seesaw.config.ConfigValue

check_value(value)[source]
seesaw.config.realize(v, item=None)[source]

Makes objects contain concrete values from an item.

A silly example:

class AddExpression(object):
    def realize(self, item):
        return item['x'] + item['y']

pipeline = Pipeline(ComputeMath(AddExpression()))

In the example, we want to compute an addition expression. The values are defined in the Item.
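To make the idea concrete, here is a minimal sketch of how such a function might behave. The name realize_sketch is hypothetical, a plain dict stands in for the Item, and the recursion into lists and dicts is an assumption rather than documented behaviour.

```python
def realize_sketch(value, item=None):
    """Recursively replace realizable objects with concrete values."""
    if isinstance(value, dict):
        # Assumption: both keys and values of a dict are realized.
        return {realize_sketch(k, item): realize_sketch(v, item)
                for k, v in value.items()}
    if isinstance(value, list):
        return [realize_sketch(v, item) for v in value]
    if hasattr(value, "realize"):
        # The object knows how to compute its value from the item.
        return value.realize(item)
    # Anything else is already concrete.
    return value


class AddExpression(object):
    def realize(self, item):
        return item['x'] + item['y']


item = {'x': 2, 'y': 3}  # a plain dict standing in for an Item
args = realize_sketch(['--sum', AddExpression(), {'label': 'total'}], item)
```

Here args ends up as a list of plain values: the string and the dict pass through unchanged, and the AddExpression is replaced by the sum computed from the item.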

event Module

Actor model.

class seesaw.event.Event[source]

Bases: object

Lightweight event system.

Example:

my_event_system = Event()
my_event_system += my_listener_callback_function
my_event_system(my_event_data)
fire(*args, **kargs)[source]
getHandlerCount()[source]
handle(handler)[source]
unhandle(handler)[source]
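A minimal sketch of how an event object of this shape could work. MiniEvent is a hypothetical stand-in; the method names mirror the ones listed above, while the set-based storage and the operator aliases are assumptions.

```python
class MiniEvent(object):
    """Hypothetical stand-in for a lightweight event object."""

    def __init__(self):
        self.handlers = set()

    def handle(self, handler):
        self.handlers.add(handler)
        return self  # returning self lets "+=" rebind to the same object

    def unhandle(self, handler):
        self.handlers.discard(handler)
        return self

    def fire(self, *args, **kwargs):
        # Iterate over a copy so handlers may unregister themselves.
        for handler in list(self.handlers):
            handler(*args, **kwargs)

    def getHandlerCount(self):
        return len(self.handlers)

    # Assumed operator aliases: "+=", "-=" and call syntax.
    __iadd__ = handle
    __isub__ = unhandle
    __call__ = fire


received = []


def listener(data):
    received.append(data)


on_output = MiniEvent()
on_output += listener
on_output("line 1")   # delivered to listener
on_output -= listener
on_output("line 2")   # nobody is listening any more
```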

externalprocess Module

Running subprocesses asynchronously.

class seesaw.externalprocess.AsyncPopen(*args, **kwargs)[source]

Bases: object

Asynchronous version of subprocess.Popen.

Deprecated.

classmethod ignore_sigint()[source]
run()[source]
class seesaw.externalprocess.AsyncPopen2(*args, **kwargs)[source]

Bases: object

Adapter for the legacy AsyncPopen.

run()[source]
stdin
class seesaw.externalprocess.CurlUpload(target, filename, connect_timeout='60', speed_limit='1', speed_time='900', max_tries=None)[source]

Bases: seesaw.externalprocess.ExternalProcess

Upload with Curl process runner.

class seesaw.externalprocess.ExternalProcess(name, args, max_tries=1, retry_delay=30, accept_on_exit_code=None, retry_on_exit_code=None, env=None)[source]

Bases: seesaw.task.Task

External subprocess runner.

enqueue(item)[source]
handle_process_error(exit_code, item)[source]
handle_process_result(exit_code, item)[source]
on_subprocess_end(item, returncode)[source]
on_subprocess_stdout(pipe, item, data)[source]
process(item)[source]
stdin_data(item)[source]
class seesaw.externalprocess.RsyncUpload(target, files, target_source_path='./', bwlimit='0', max_tries=None, extra_args=None)[source]

Bases: seesaw.externalprocess.ExternalProcess

Upload with Rsync process runner.

stdin_data(item)[source]
class seesaw.externalprocess.WgetDownload(args, max_tries=1, accept_on_exit_code=None, retry_on_exit_code=None, env=None, stdin_data_function=None)[source]

Bases: seesaw.externalprocess.ExternalProcess

Download with Wget process runner.

stdin_data(item)[source]
seesaw.externalprocess.cleanup()[source]

item Module

Managing work units.

class seesaw.item.Item(pipeline, item_id, item_number, properties=None, keep_data=False, prepare_data_directory=True)[source]

Bases: object

A thing, or work unit, that needs to be downloaded.

It has properties that are filled by the Task.

An Item behaves like a mutable mapping.

Note

State belonging to an item should be stored on the item itself. That is, do not store variables on a Task unless you know what you are doing.
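The reason for this advice is easiest to see with two items passing through the same task instance. The sketch below uses plain dicts as stand-ins for Item objects, and both task classes are hypothetical.

```python
class CountingTask(object):
    """Hypothetical task: records how many URLs each item carries."""

    def process(self, item):
        # Good: the result lives on the item, so every item keeps its own.
        item['url_count'] = len(item['urls'])


class LeakyTask(object):
    """Hypothetical task that wrongly keeps per-item state on itself."""

    def process(self, item):
        # Bad: one task instance serves every item, so this is overwritten.
        self.url_count = len(item['urls'])


item_a = {'urls': ['http://example.com/1']}
item_b = {'urls': ['http://example.com/2', 'http://example.com/3']}

counting, leaky = CountingTask(), LeakyTask()
for item in (item_a, item_b):
    counting.process(item)
    leaky.process(item)
```

After both items have been processed, each item still carries its own count, while the value stored on LeakyTask only reflects whichever item was handled last.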

class TaskStatus[source]

Bases: object

completed = 'completed'
failed = 'failed'
running = 'running'
Item.cancel()[source]
Item.clear_data_directory()[source]
Item.complete()[source]
Item.description()[source]
Item.fail()[source]
Item.get(key)[source]
Item.log_error(task, *args)[source]
Item.log_output(data, full_line=True)[source]
Item.prepare_data_directory()[source]
Item.set_task_status(task, status)[source]
class seesaw.item.ItemInterpolation(s)[source]

Bases: object

Formats a string using the percent operator during realize().

realize(item)[source]
class seesaw.item.ItemValue(key)[source]

Bases: object

Get an item’s value during realize().

fill(item, value)[source]
realize(item)[source]
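The two helpers above can be pictured as follows. This is a minimal sketch with hypothetical class names; a plain dict stands in for the Item, and using the item as the right-hand operand of the percent operator is an assumption.

```python
class InterpolationSketch(object):
    """Hypothetical: formats a template with the item's values."""

    def __init__(self, s):
        self.s = s

    def realize(self, item):
        # "%(key)s" placeholders are looked up in the item mapping.
        return self.s % item


class ValueSketch(object):
    """Hypothetical: looks up a single key in the item."""

    def __init__(self, key):
        self.key = key

    def realize(self, item):
        return item[self.key]


item = {'item_name': 'user-42', 'data_dir': '/tmp/data'}
template = InterpolationSketch('%(data_dir)s/%(item_name)s.warc.gz')
warc_path = template.realize(item)
name = ValueSketch('item_name').realize(item)
```

Because nothing is evaluated until realize() is called, the same template object can be shared by a task and reused for every item that passes through it.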

pipeline Module

class seesaw.pipeline.Pipeline(*tasks)[source]

Bases: object

The sequence of tasks that process an Item.

Your pipeline will probably be something like this:

  1. Request an assignment from the tracker.
  2. Run Wget to download the file.
  3. Upload the downloaded file with rsync.
  4. Tell the tracker that the assignment is done.
add_task(task)[source]
cancel_items()[source]
enqueue(item)[source]
ui_task_list()[source]
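The four steps listed above can be sketched as a chain of callables applied to one item. Everything here is a hypothetical stand-in: the step functions do no network or disk work, a plain dict replaces the Item, and the sketch runs synchronously, which the real Pipeline may not.

```python
def get_assignment(item):
    # Hypothetical: a tracker would supply the item name.
    item['item_name'] = 'example-item'


def download(item):
    item['files'] = [item['item_name'] + '.warc.gz']


def upload(item):
    item['uploaded'] = list(item['files'])


def report_done(item):
    item['done'] = True


class PipelineSketch(object):
    """Hypothetical stand-in: runs each task in order on an item."""

    def __init__(self, *tasks):
        self.tasks = list(tasks)

    def add_task(self, task):
        self.tasks.append(task)

    def enqueue(self, item):
        for task in self.tasks:
            task(item)
        return item


pipeline = PipelineSketch(get_assignment, download, upload)
pipeline.add_task(report_done)
result = pipeline.enqueue({})
```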

project Module

Project information.

class seesaw.project.Project(title=None, project_html=None, utc_deadline=None)[source]

Bases: object

Briefly describes a project's metadata.

This class defines the title of the project, a short description with an optional project logo and an optional deadline. The information will be shown in the web interface when the project is running.

data_for_json()[source]

runner Module

Pipeline execution.

class seesaw.runner.Runner(stop_file=None, concurrent_items=1, max_items=None, keep_data=False)[source]

Bases: object

Executes and manages the lifetime of Pipeline instances.

add_items()[source]
check_stop_file()[source]
is_active()[source]
keep_running()[source]
set_current_pipeline(pipeline)[source]
should_stop()[source]
start()[source]
stop_file_changed()[source]
stop_file_mtime()[source]
stop_gracefully()[source]
class seesaw.runner.SimpleRunner(pipeline, stop_file=None, concurrent_items=1, max_items=None, keep_data=False)[source]

Bases: seesaw.runner.Runner

Executes a single Pipeline instance.

forced_stop()[source]
start()[source]

task Module

Managing steps in a work unit.

class seesaw.task.ConditionalTask(condition_function, inner_task)[source]

Bases: seesaw.task.Task

Runs a task optionally.

enqueue(item)[source]
fill_ui_task_list(task_list)[source]
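A minimal sketch of the idea behind ConditionalTask. ConditionalSketch and mark_uploaded are hypothetical, the inner task is a plain callable, and a dict stands in for the Item; what the real class does when the condition is false is not documented here, so the sketch simply passes the item through.

```python
class ConditionalSketch(object):
    """Hypothetical: run inner_task only when the condition holds."""

    def __init__(self, condition_function, inner_task):
        self.condition_function = condition_function
        self.inner_task = inner_task

    def enqueue(self, item):
        if self.condition_function(item):
            self.inner_task(item)
        # Assumption: the item continues unchanged when the task is skipped.
        return item


def mark_uploaded(item):
    item['uploaded'] = True


upload_if_files = ConditionalSketch(
    lambda item: bool(item.get('files')), mark_uploaded)
with_files = upload_if_files.enqueue({'files': ['a.warc.gz']})
without_files = upload_if_files.enqueue({'files': []})
```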
class seesaw.task.LimitConcurrent(concurrency, inner_task)[source]

Bases: seesaw.task.Task

Restricts the number of tasks of the same type that can be run at once.

enqueue(item)[source]
fill_ui_task_list(task_list)[source]
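One way to picture the limiting behaviour is a counter plus a waiting queue. The sketch below is an assumption about the general technique, not seesaw's code: LimitConcurrentSketch, start_function and item_finished() are hypothetical names.

```python
from collections import deque


class LimitConcurrentSketch(object):
    """Hypothetical: lets at most `concurrency` items run at once."""

    def __init__(self, concurrency, start_function):
        self.concurrency = concurrency
        self.start_function = start_function  # starts work on an item
        self.working = 0
        self.waiting = deque()

    def enqueue(self, item):
        if self.working < self.concurrency:
            self.working += 1
            self.start_function(item)
        else:
            # No free slot: park the item until another one finishes.
            self.waiting.append(item)

    def item_finished(self):
        # A slot opened up: hand it to the oldest waiting item, if any.
        self.working -= 1
        if self.waiting:
            self.working += 1
            self.start_function(self.waiting.popleft())


started = []
limiter = LimitConcurrentSketch(2, started.append)
for name in ('a', 'b', 'c'):
    limiter.enqueue(name)
started_before = list(started)  # 'c' is still queued at this point
limiter.item_finished()
started_after = list(started)
```

This kind of wrapper is typically placed around a bandwidth-heavy step such as an upload, so that many items can download in parallel while only a few upload at a time.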
class seesaw.task.PrintItem[source]

Bases: seesaw.task.SimpleTask

Output the name of the Item.

process(item)[source]
class seesaw.task.SetItemKey(key, value)[source]

Bases: seesaw.task.SimpleTask

Set a value onto an item.

process(item)[source]
class seesaw.task.SimpleTask(name)[source]

Bases: seesaw.task.Task

A subclassable Task that should do one small thing well.

Example:

class MyTask(SimpleTask):
    def process(self, item):
        item['my_message'] = 'hello world!'
enqueue(item)[source]
process(item)[source]
class seesaw.task.Task(name)[source]

Bases: object

A step in the download process of an Item.

complete_item(item)[source]
fail_item(item)[source]
fill_ui_task_list(task_list)[source]
start_item(item)[source]
task_cwd(*args, **kwds)[source]

tracker Module

Contacting the work unit server.

A Tracker refers to the Universal Tracker (https://github.com/ArchiveTeam/universal-tracker).

class seesaw.tracker.GetItemFromTracker(tracker_url, downloader, version=None)[source]

Bases: seesaw.tracker.TrackerRequest

Get information for a single work unit from the Tracker.

data(item)[source]
process_body(body, item)[source]
class seesaw.tracker.PrepareStatsForTracker(defaults=None, file_groups=None, id_function=None)[source]

Bases: seesaw.task.SimpleTask

Apply statistical values on the item.

process(item)[source]
class seesaw.tracker.SendDoneToTracker(tracker_url, stats)[source]

Bases: seesaw.tracker.TrackerRequest

Inform the Tracker that the work unit has been completed.

data(item)[source]
process_body(body, item)[source]
class seesaw.tracker.TrackerRequest(name, tracker_url, tracker_command, may_be_canceled=False)[source]

Bases: seesaw.task.Task

Represents a request to a Tracker.

DEFAULT_RETRY_DELAY = 60
data(item)[source]
enqueue(item)[source]
handle_response(item, response)[source]
increment_retry_delay(max_delay=300)[source]
process_body(body, item)[source]
reset_retry_delay()[source]
schedule_retry(item, message='')[source]
send_request(item)[source]
class seesaw.tracker.UploadWithTracker(tracker_url, downloader, files, version=None, rsync_target_source_path='./', rsync_bwlimit='0', rsync_extra_args=[], curl_connect_timeout='60', curl_speed_limit='1', curl_speed_time='900')[source]

Bases: seesaw.tracker.TrackerRequest

Upload work unit results.

One of the inner tasks is used, depending on where the Tracker’s response says to upload:

  • RsyncUpload
  • CurlUpload
data(item)[source]
process_body(body, item)[source]
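The choice between the two inner tasks might look like the sketch below. It assumes that the Tracker's response contains an upload target URL and that the URL scheme decides which uploader applies; neither detail is stated in this reference, and choose_uploader_sketch is a hypothetical helper that only returns the class name.

```python
def choose_uploader_sketch(upload_target):
    """Hypothetical: pick an uploader name from the target URL scheme."""
    if upload_target.startswith('rsync://'):
        return 'RsyncUpload'
    if upload_target.startswith(('http://', 'https://')):
        return 'CurlUpload'
    # Assumption: anything else is treated as an error.
    raise ValueError('Unsupported upload target: %r' % (upload_target,))


rsync_choice = choose_uploader_sketch('rsync://example.org/module/')
curl_choice = choose_uploader_sketch('https://example.org/upload/')
```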

util Module

Miscellaneous functions.

seesaw.util.find_executable(name, version, paths, version_arg='-V')[source]

Returns the path of a matching executable.

seesaw.util.test_executable(name, version, path, version_arg='-V')[source]

Try to run an executable and check its version.
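A minimal sketch of how these two helpers could be built on subprocess. The function names are hypothetical, the name parameter is omitted, and matching the version by substring is an assumption about what "check its version" means.

```python
import subprocess
import sys


def check_executable_sketch(version, path, version_arg='-V'):
    """Return True if `path version_arg` succeeds and mentions `version`."""
    try:
        process = subprocess.Popen(
            [path, version_arg],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # some tools print the version on stderr
        )
        output, _ = process.communicate()
    except OSError:
        return False  # the file is missing or cannot be executed
    if process.returncode != 0:
        return False
    # Assumption: a plain substring match is enough.
    return version in output.decode('utf-8', 'replace')


def find_executable_sketch(version, paths, version_arg='-V'):
    """Return the first path that passes the check, or None."""
    for path in paths:
        if check_executable_sketch(version, path, version_arg):
            return path
    return None


# The running interpreter is used here only because it is known to exist.
found = find_executable_sketch(
    'Python', ['/nonexistent/python', sys.executable])
```

A pipeline script could use a search like this at start-up to pick a suitable binary from several candidate locations and refuse to run when none matches.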

seesaw.util.unique_id_str()[source]

Returns a unique string suitable for IDs.

warrior Module

The warrior server.

The warrior phones home to Warrior HQ (https://github.com/ArchiveTeam/warrior-hq).

class seesaw.warrior.BandwidthMonitor(device)[source]

Bases: object

Extracts the bandwidth usage from the system stats.

current_stats()[source]
devre = <_sre.SRE_Pattern object>
update()[source]
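As an illustration of the technique, the sketch below pulls byte counters for one device out of text in the Linux /proc/net/dev layout and turns two readings into a rate. That this is the file BandwidthMonitor reads is an assumption, and the function names and sample text are hypothetical.

```python
import re

# Sample text in the /proc/net/dev layout (values are made up).
SAMPLE = """Inter-|   Receive                            |  Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
    lo:  1000  10 0 0 0 0 0 0  1000  10 0 0 0 0 0 0
  eth0:  5000  50 0 0 0 0 0 0  2000  20 0 0 0 0 0 0
"""


def read_counters_sketch(text, device):
    """Return (received_bytes, sent_bytes) for a device, or None."""
    pattern = re.compile(
        r'^\s*' + re.escape(device) + r':\s*(.+)$', re.MULTILINE)
    match = pattern.search(text)
    if match is None:
        return None
    fields = match.group(1).split()
    # Field 0 is received bytes; field 8 is transmitted bytes.
    return int(fields[0]), int(fields[8])


def rate_sketch(previous, current, seconds):
    """Bytes per second between two counter readings."""
    return tuple((c - p) / float(seconds) for p, c in zip(previous, current))


counters = read_counters_sketch(SAMPLE, 'eth0')
rates = rate_sketch(counters, (7000, 2500), 2)
```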
class seesaw.warrior.ConfigManager(config_file)[source]

Bases: object

Manages the configuration.

add(config_value)[source]
all_valid()[source]
editable_values()[source]
load()[source]
remove(name)[source]
save()[source]
set_value(name, value)[source]
class seesaw.warrior.Warrior(projects_dir, data_dir, warrior_hq_url, real_shutdown=False, keep_data=False)[source]

Bases: object

The warrior god object.

class Status[source]

Bases: object

INVALID_SETTINGS = 'INVALID_SETTINGS'
NO_PROJECT = 'NO_PROJECT'
REBOOTING = 'REBOOTING'
RESTARTING_PROJECT = 'RESTARTING_PROJECT'
RUNNING_PROJECT = 'RUNNING_PROJECT'
SHUTTING_DOWN = 'SHUTTING_DOWN'
STARTING_PROJECT = 'STARTING_PROJECT'
STOPPING_PROJECT = 'STOPPING_PROJECT'
SWITCHING_PROJECT = 'SWITCHING_PROJECT'
UNINITIALIZED = 'UNINITIALIZED'
Warrior.bandwidth_stats()[source]
Warrior.check_project_has_update(*args, **kwargs)[source]
Warrior.clone_project(project_name, project_path)[source]
Warrior.collect_install_output(data)[source]
Warrior.find_lat_lng()[source]
Warrior.fire_status()[source]
Warrior.forced_reboot()[source]
Warrior.forced_stop()[source]
Warrior.handle_lat_lng(response)[source]
Warrior.handle_runner_finish(runner)[source]
Warrior.install_project(*args, **kwargs)[source]
Warrior.keep_running()[source]
Warrior.load_pipeline(pipeline_path, context)[source]
Warrior.max_age_reached()[source]
Warrior.reboot_gracefully()[source]
Warrior.schedule_forced_reboot()[source]
Warrior.select_project(*args, **kwargs)[source]
Warrior.start()[source]
Warrior.start_selected_project(*args, **kwargs)[source]
Warrior.stop_gracefully()[source]
Warrior.update_project(*args, **kwargs)[source]
Warrior.update_warrior_hq(*args, **kwargs)[source]
Warrior.warrior_status()[source]

web Module

The warrior web interface.

class seesaw.web.ApiHandler(application, request, **kwargs)[source]

Bases: tornado.web.RequestHandler

Processes API requests.

get(command)[source]
get_template_path()[source]
initialize(warrior=None, runner=None)[source]
post(command)[source]
class seesaw.web.IndexHandler(application, request, **kwargs)[source]

Bases: tornado.web.RequestHandler

Shows the index.html.

get()[source]
class seesaw.web.ItemMonitor(item)[source]

Bases: object

Pushes item states and information to the client.

handle_item_cancel(item)[source]
handle_item_complete(item)[source]
handle_item_fail(item)[source]
handle_item_output(item, data)[source]
handle_item_property(item, key, new_value, old_value)[source]
handle_item_task_status(item, task, new_status, old_status)[source]
item_for_broadcast()[source]
item_status()[source]
class seesaw.web.SeesawConnection(session)[source]

Bases: sockjs.tornado.conn.SockJSConnection

A WebSocket server that communicates the state of the warrior.

classmethod broadcast(event, message)[source]
classmethod broadcast_bandwidth()[source]
classmethod broadcast_project_refresh()[source]
classmethod broadcast_projects()[source]
classmethod broadcast_timestamp()[source]
clients = set([])
emit(event_name, message)[source]

tornadoio to sockjs adapter.

classmethod handle_broadcast_message(warrior, message)[source]
classmethod handle_finish_item(runner, pipeline, item)[source]
classmethod handle_project_installation_failed(warrior, project, output)[source]
classmethod handle_project_installed(warrior, project, output)[source]
classmethod handle_project_installing(warrior, project)[source]
classmethod handle_project_refresh(warrior, project, runner)[source]
classmethod handle_project_selected(warrior, project)[source]
classmethod handle_projects_loaded(warrior, projects)[source]
classmethod handle_runner_status(runner, status)[source]
classmethod handle_start_item(runner, pipeline, item)[source]
classmethod handle_warrior_status(warrior, new_status)[source]
instance_id = '7735-0.956574'
item_monitors = {}
on_close()[source]
on_message(message)[source]
on_open(info)[source]
project = None
runner = None
warrior = None
seesaw.web.hash_string(text)[source]

Generate a digest for a broadcast message.

seesaw.web.start_runner_server(project, runner, bind_address='localhost', port_number=8001, http_username=None, http_password=None)[source]

Starts a web interface for a manually run pipeline.

Unlike start_warrior_server(), this UI does not contain a configuration or project management panel.

seesaw.web.start_warrior_server(warrior, bind_address='localhost', port_number=8001, http_username=None, http_password=None)[source]

Starts the warrior web interface.

web_util Module

class seesaw.web_util.AuthenticatedApplication(*args, **kwargs)[source]

Bases: tornado.web.Application

class seesaw.web_util.AuthenticationErrorHandler(application, request, **kwargs)[source]

Bases: tornado.web.RequestHandler

initialize(realm='Restricted')[source]
prepare()[source]