BLUMYCELIUM

BLUMYCELIUM: Async microservices, 100% in Python, from Bluwr

PLEASE NOTE: This documentation is a work in progress

This tool is provided for free by bluwr.com. We are building a 100% text-based publication platform: a calm space free of any addictive features.


BLUMYCELIUM is our tool for arm’s-length microservices management and orchestration. It allows for the splitting of a monolithic application into several small parts that run asynchronously in the same environment and can be tested separately.

It was developed to be easy to learn: nothing more than Python knowledge is required to achieve results that would normally call for more complex DevOps orchestration tools mediated through REST APIs. To achieve this, BLUMYCELIUM relies heavily on Python's introspection capabilities to follow the flow of variable updates and transparently derive execution and orchestration graphs. Because services can be separate programs, BLUMYCELIUM applications can also bypass the Python GIL. BLUMYCELIUM also remembers the source code of all tasks, as well as the tracebacks of all exceptions, for easy debugging. BLUMYCELIUM is implemented on top of the flexible multi-model ArangoDB database, which stores what we call the Mycelium: the repository of everything needed for variable execution graphs, source code, orchestration and failure reporting.
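The idea of deriving an execution graph from ordinary-looking calls can be illustrated in miniature. This is a generic sketch of the placeholder technique, not blumycelium's actual implementation: "calling" a task only records a node in a graph and returns a placeholder instead of a real value, so dependencies between tasks can be discovered before anything runs.

```python
class Placeholder:
    """Stands in for a value that a job will produce later."""
    def __init__(self, producer):
        self.producer = producer  # name of the job producing this value

graph = []  # recorded jobs: (task_name, inputs)

def record_job(name, *inputs):
    """Record a job in the graph and return a placeholder for its result."""
    graph.append((name, inputs))
    return Placeholder(name)

# "Calling" two tasks only records them; nothing runs yet.
# The second job's input links it to the first, forming a dependency edge:
msg = record_job("task_send")
record_job("task_print_it", msg)

print([name for name, _ in graph])  # ['task_send', 'task_print_it']
```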

BLUMYCELIUM allows you to write complex microservice orchestration all in python. You can divide a complex application into many smaller parts that can be tested independently and run asynchronously. These smaller parts are agents called Machine Elves, and the Mycelium is the database they use to communicate.

Machine Elf <- Mycelium -> Machine Elf

Arm’s-length microservices?

Arm’s-length microservices are microservices that trust each other completely. Only trusted elves should be allowed inside a mycelium. Myceliums are extremely convenient but not secure by default. It is your job to make sure that your myceliums are properly protected.

Glossary

Machine Elf: an independent agent that can perform a set of tasks. Elves can be processes or threads of the same application, or completely independent applications hosted locally or remotely.

Task: a function of a machine elf whose name starts with task_.

Job: a task to run.

Mycelium: a data structure that stores many things related to the application, such as:

  • Job orchestration

  • Variable flows (how to compute values from python instructions)

  • Elves' documentation, source code and revisions

  • Tasks' source code & documentation

For now the only mycelium implementation available uses the ArangoDB database and can thus be set up anywhere:

  • locally

  • on the cloud

  • in a separate container

  • inside the same container
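The task_ naming convention from the glossary can be sketched in miniature. This is a stand-in class, not a real MachineElf, shown only to illustrate how tasks are distinguished from ordinary methods by their name:

```python
class SketchElf:
    """A stand-in, not a real MachineElf: just illustrates task discovery."""
    def task_greet(self):   # discovered as a task: name starts with task_
        return {"greeting": "hello"}
    def helper(self):       # not a task: no task_ prefix
        return "internal"

task_names = [name for name in dir(SketchElf) if name.startswith("task_")]
print(task_names)  # ['task_greet']
```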

Feedback

If you have any suggestions about features or feedback about the documentation, please open a GitHub issue: https://github.com/bluwr-tech/blumycelium.

Installation

BLUMYCELIUM is available on PyPI. Check the changelog file on GitHub to pick a specific version.

pip install blumycelium

Note that you will need a working ArangoDB database for it to run (a very easy task with Docker). Please have a look at the Quickstart page for a detailed installation process including the database.

Quickstart

Get ArangoDB

Using Docker is the easiest way: https://www.arangodb.com/download-major/docker/

example:

docker run -p 8529:8529 -e ARANGO_ROOT_PASSWORD=openSesame arangodb/arangodb

Install Blumycelium from GitHub

Not available from PyPI yet. Once we think it is stable enough we will publish it, but for now please install it from GitHub.

pip install git+https://github.com/bluwr-tech/blumycelium.git

Run the simple example below

import pyArango.connection as ADB
import blumycelium.mycelium as myc
import blumycelium.machine_elf as melf

import time

#This is a very basic demo showing that BLUMYCELIUM
#is basically just python code. An Elf sends a message
#to another elf that prints it.
#NOTICE THAT TASKS ARE FUNCTIONS WITH NAMES STARTING WITH 'task_'

class PrinterElf(melf.MachineElf):
    """
    Notice the hint for the return. This is mandatory.
    Type hints can also be used for arguments to ensure that tasks get
    arguments of the right type.
    """
    def task_print_it(self, to_print) -> None:
        print(">>> Machine >>> Elf >>> Printer: '%s'" % to_print)

class SenderElf(melf.MachineElf):
    """
    Notice the hint for the return. This is mandatory.
    Type hints can also be used for arguments to ensure that tasks get
    arguments of the right type.
    Tasks must return either None or a dict. Here the return hint is a tuple of dictionary keys.
    A list would also work, as well as a dict of types example: {"value": float}
    """
    def task_send(self, to_send) -> ("value", ):
        return {
            "value": to_send
        }

def init_myc():
    connection = ADB.Connection(
        arangoURL = "http://127.0.0.1:8529",
        username = "root",
        password = "root" # must match the ARANGO_ROOT_PASSWORD set when starting ArangoDB
    )

    mycellium = myc.ArangoMycelium(
        connection=connection,
        name="mycellium"
    )

    mycellium.init(init_db=True)

    printer = PrinterElf("The Elf Printer", mycellium)
    printer.register(store_source=True)

    sender = SenderElf("The Sender Elf", mycellium)
    sender.register(store_source=True)

    ret = sender.task_send("a message sent on: %s" % time.ctime())

    printer.task_print_it(ret["value"])

    sender.start_jobs(store_failures=True, raise_exceptions=True)
    print("===> print")
    printer.start_jobs(store_failures=True, raise_exceptions=True)

if __name__ == '__main__':
    init_myc()
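As the SenderElf docstring notes, the return hint can take several equivalent forms. A quick sketch of the three shapes (plain function annotations, shown outside a class for brevity):

```python
def task_send_tuple(to_send) -> ("value", ):     # tuple of keys
    return {"value": to_send}

def task_send_list(to_send) -> ["value"]:        # list of keys
    return {"value": to_send}

def task_send_typed(to_send) -> {"value": str}:  # dict mapping keys to types
    return {"value": to_send}

print(task_send_typed.__annotations__["return"])  # {'value': <class 'str'>}
```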

Where to go from there

To go further, check the other examples we built to get you started: https://github.com/bluwr-tech/blumycelium/tree/main/demos

Code Design

Here are some code design considerations for building a tool with blumycelium. Note that these are just suggestions, not restrictions: your code will run even if you do things your own way.

Elf class parameter

As you know by now, an elf’s purpose is to execute tasks (python functions named task_*). Task functions take parameters; however, if a parameter stays constant no matter which task runs, you might consider storing it on the Elf class as a dedicated attribute. In that case we suggest implementing a separate initializer instead of overriding the default constructor.

The elf class instance will be created in 2 different places:

  • When writing the orchestration code (saving the tasks in arangodb).

  • When writing the code that will actually execute the tasks.

To avoid parameter-value synchronization issues, here is our suggested code design.

Suggestion:

import blumycelium.machine_elf as melf

class MyDummyElf(melf.MachineElf):

    def initialize(self, my_custom_parameter):
        self.my_custom_parameter = my_custom_parameter

Instead of:

import blumycelium.machine_elf as melf

class MyDummyElf(melf.MachineElf):

    def __init__(self, uid, mycelium, my_custom_parameter):
        self.my_custom_parameter = my_custom_parameter
        super().__init__(uid, mycelium)
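To see why the initialize pattern helps, here is a stand-in sketch (a dummy base class replaces melf.MachineElf so it runs without a database). The hypothetical build_elf helper is shared by the orchestration script and the daemon script, so the custom parameter is set in a single place and cannot drift between the two:

```python
class FakeMachineElf:
    """Stand-in for melf.MachineElf (which also takes a mycelium)."""
    def __init__(self, uid, mycelium=None):
        self.uid = uid
        self.mycelium = mycelium

class MyDummyElf(FakeMachineElf):
    def initialize(self, my_custom_parameter):
        self.my_custom_parameter = my_custom_parameter

def build_elf():
    """Hypothetical helper shared by both scripts: one place to sync."""
    elf = MyDummyElf("my dummy elf")
    elf.initialize(my_custom_parameter=42)
    return elf

print(build_elf().my_custom_parameter)  # 42
```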

Async demo

(WIP)

Introduction

Here are some details about the demo provided in the repo under demos/daemons/. The file async_orchestration.py is basically an asynchronous version of the demo in sync_orchestration.py. It starts 3 types of processes (elves):

  • Animals: this class represents an animal that will be stored in the database (here a simple json file). It just contains the animal’s species and weight.

  • Storage: a simple interface to a database that all elves connect to. It can be anything; here, for the sake of simplicity, it is just a json file.

  • Stats: computes some statistics using the data stored in the database (again, just a json file here). 3 of them will be started: one calculating the average weight, one the min and the last one the max weight.

This example shows how these elves run independently while their execution and dependencies are all handled by blumycelium using ArangoDB behind the scenes.

The files explained

  • elves.py: the code representing the elves (or processes). Simple classes that inherit from the MachineElf class and have at least one function whose name starts with task_ that tells what the elf will do.

  • async_orchestration.py: this code is responsible for two things.
    • Create the mycelium, which is an ArangoDB database storing the processes, their states and dependencies.

    • Tell the elves (processes) what they should do.

      for nb in range(100):
        print("Sending: %s" % nb)
        measurement = animals.task_get_animal_data()
        store.task_save_animal_data(species=measurement["species"], weight=measurement["weight"])
        # print stats every 5 iterations
        if nb % 5 == 0:
            mean = mean_calc.task_calculate_means()
            mins = min_calc.task_calculate_mins()
            maxs = max_calc.task_calculate_maxs()
            printer.task_print_stats(means=mean["means"], mins=mins["mins"], maxs=maxs["maxs"])
        time.sleep(1)
      

      Important note: the code above is not exactly the code that will be executed. blumycelium will do some introspection, store the processes and their dependencies in the mycelium, and execute the code when the elves are actually started with the start_jobs() function. In the sync_orchestration.py example this is done in the same file; here the purpose is to show that the elves can run independently and blumycelium will take care of any dependencies between them.

  • *_daemon.py: these files start the different elves separately. They create an elf instance, for example:

    elf = Storage("animals data store", mycellium)
    

    The elf is uniquely identified by its name “animals data store”, so when creating an elf with this name blumycelium knows to fetch the one already registered in the mycelium. Once the elf is fetched, call the start_jobs() function and blumycelium will check whether all the jobs it depends on are done (if any) and then start executing the tasks.

Make it run

  1. Create the mycelium, register the elves, and tell them what to do and how to interact together (orchestration):

python async_orchestration.py

  2. Start the elves in any order in separate windows or tabs and watch the magic happen!

# Start the elf generating random animals with species and weight
python animals_elf_deamon.py
# Start the elf storing the animals generated by the animals_elf_deamon.py and store them in the database (json file here)
python storage_elf_deamon.py
# Start the elves doing some calculations with the data
python stats_elf_deamon.py calc1
python stats_elf_deamon.py calc2
python stats_elf_deamon.py calc3
# Start the elf generating the report from the stats generated
python formater_elf_deamon.py
  3. You can now connect to the ArangoDB database and look at the graphs if you are curious about how it works behind the scenes.

  4. You should now be ready to build your own elves and orchestrations. If you have any suggestions about features or feedback about the documentation, please open a GitHub issue: https://github.com/bluwr-tech/blumycelium

Limits

Task parameters

Elves execute tasks, which are simply python functions. These functions can take arguments, which have to be stored in ArangoDB for later execution. Only a few restrictions apply to task parameters.

Allowed:

  • Numbers (integers, floats etc.)

  • Strings

  • Booleans

  • Lists

  • Dictionaries

They can’t be:

  • Objects

  • Functions

For lists and dictionaries, for now make sure they are not too big. Starting at around 100,000 elements, the way we store them in ArangoDB can cause performance issues. Check the corresponding issue to see where we are at: https://github.com/bluwr-tech/blumycelium/issues/5.
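As a rule of thumb (an approximation; blumycelium performs its own checks), the allowed parameter types above are roughly the JSON-serializable types:

```python
import json

# These mirror the allowed list above and all serialize cleanly:
allowed = [42, 3.14, "a string", True, [1, 2, 3], {"species": "cat", "weight": 4.2}]
for value in allowed:
    json.dumps(value)

# Objects and functions cannot be stored:
try:
    json.dumps(lambda x: x)
except TypeError:
    print("functions are not storable")
```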

Mycelium

The mycelium is the data structure that stores everything related to a BLUMYCELIUM application.

Mycelium

class blumycelium.mycelium.ArangoMycelium(connection, name)[source]

A mycelium over ArangoDB

complete_job(job_id)[source]

mark a job as successful

drop()[source]

delete all documents in the mycelium

drop_jobs()[source]

delete all information related to jobs

get_job(job_id)[source]

get a job

get_job_parameters(job_id)[source]

return the parameters for a job

get_job_status(job_id)[source]

return the status of a job

get_jobs(elf_uid: str, all_jobs=False, status_restriction=['pending'])[source]

return all jobs for an elf

get_result(result_id)[source]

return the result of a job

init(init_db=False, users_to_create=None)[source]

init_db: initialise the database; users_to_create: a list of dicts {username, password}

is_job_ready(job_id)[source]

return True if the job is ready to run

push_job(job)[source]

push a job to the mycelium

register_job_failure(exc_type, exc_value, exc_traceback, job_id)[source]

register a job failure

register_machine_elf(machine_elf, store_source)[source]

register an elf in the mycelium

start_job(job_id)[source]

mark a job as running

store_results(job_id, results: dict)[source]

store the results of a job

update_job_status(job_id, status)[source]

update the status of a job

Models

Database schemas

Models

class blumycelium.models.Failures(database, jsonData)[source]

Schema of a failure in the database

class blumycelium.models.JobFailures(database, jsonData)[source]

Edge connecting jobs to failures

class blumycelium.models.JobFailures_graph(database, jsonInit)[source]

Graph connecting jobs to failures

class blumycelium.models.JobParameters(database, jsonData)[source]

Schema of a job parameter in the database. A job parameter is an edge that associates a parameter with a task argument

class blumycelium.models.JobParameters_graph(database, jsonInit)[source]

Graph connecting jobs to parameters

class blumycelium.models.JobToJob(database, jsonData)[source]

Edge connecting jobs that depend on each other

class blumycelium.models.Jobs(database, jsonData)[source]

Schema of a job in the database

class blumycelium.models.Jobs_graph(database, jsonInit)[source]

The job orchestration graph

class blumycelium.models.MachineElves(database, jsonData)[source]

Schema of a machine elf in the database

class blumycelium.models.MachineElvesRevisions(database, jsonData)[source]

Schema of a machine elf revision in the database

class blumycelium.models.Parameters(database, jsonData)[source]

Schema of a parameter in the database. A parameter is a variable passed as an argument to a task

class blumycelium.models.Results(database, jsonData)[source]

Schema of a result in the database

Graph Parameters

This submodule handles the variable-flow part of BLUMYCELIUM. This is what allows BLUMYCELIUM to derive the set of instructions necessary to compute a variable.

Graph Parameters

class blumycelium.graph_parameters.CodeBlock(init_code=None, return_statement=None)[source]

a code block of python code to run

format(**string_kwargs)[source]

replace the name of variables by their myc_back_variables references

run(**myc_back_variables)[source]

run the code block

to_dict()[source]

return a dictionary version of the code block

exception blumycelium.graph_parameters.ExecutionError(code)[source]
class blumycelium.graph_parameters.GraphParameter(uid=None)[source]

A graph of Parameters

add_dependencies(*deps)[source]

add dependencies (other GraphParameters) needed to compute the value

classmethod build_from_traversal(trav: dict, pull_origin_function=None)[source]

build a parameter from a traversal dictionary

make(visited_nodes=None, is_root=False)[source]

compute the value of the parameter and return it

pp_traverse(full_representation=False, representation_attributes=['value', 'code_block', 'uid', 'python_id'], print_it=True)[source]

a pretty print of the graph representation with dependencies. representation_attributes: list of fields to print

set_code_block(init_code, return_statement, **dependencies)[source]

set the code block of the parameter

set_origin(uid, pull_origin_function=None)[source]

set the origin (address) of the parameter and the function to retrieve the value from the origin

set_pull_origin_function(fct)[source]

set the function to pull the value from the origin

set_value(value)[source]

set the static value of the parameter

to_dict(reccursive=False, visited_nodes=None, copy_values=True)[source]

return a dict representation of the parameter, recursively if asked to

traverse(visited_nodes=None, is_root=True, root_uid=None, to_dict=True)[source]

traverse the graph dependency tree and return a dictionary representing it

class blumycelium.graph_parameters.Value(*args, **kwargs)[source]

A wrapper for a graph parameter

make(force=False)[source]

compute and return the value represented by the Value object

pp_traverse(*args, **kwargs)[source]

pretty print the value and its dependencies

set_code_block(*args, **kwargs)[source]

set the code block to run

set_value(value)[source]

set the static value

to_dict(*args, **kwargs)[source]

return a dictionary representing the value

traverse(*args, **kwargs)[source]

return a dictionary representing the value and its dependencies

Machine Elf

An elf is an agent that performs tasks and shares information on the mycelium. All elves must inherit from the MachineElf class.

Machine Elf

class blumycelium.machine_elf.Job(task, run_job_id, worker_elf, parameters, start_date, completion_date, status, mycelium, return_placeholder, dependencies)[source]

a Job for an elf

commit()[source]

save the job to the mycelium

class blumycelium.machine_elf.MachineElf(uid, mycelium)[source]

An elf that runs tasks

find_tasks()[source]

find all functions whose names start with task_

get_jobs()[source]

return the list of jobs for an elf

inspect_self()[source]

inspect the elf to get the source code etc…

register(store_source=False)[source]

register the elf to the mycelium

run_task(job_id: str, task_name: str, parameters: dict, store_failures: bool, raise_exceptions: bool)[source]

run a task for an elf

start_jobs(store_failures=True, raise_exceptions=True)[source]

start the jobs for an elf

class blumycelium.machine_elf.Task(machine_elf, function, name)[source]

Represent a task for an elf

inspect_function()[source]

inspect the function of a task

run(job_id, *args, **kwargs)[source]

run the task

wrap(*args, **kwargs)[source]

wrap the function inside a new function that creates a Job

class blumycelium.machine_elf.TaskParameters(fct, run_job_id, worker_elf)[source]

Parameters of a task

classmethod develop(mycelium, dct_params: dict)[source]

compute the value of parameters

find_placeholders_in_key_value_iterrator(iterator)[source]

find placeholders in dicts lists etc…

get_job_dependencies()[source]

return the dependencies of a job (jobs that must run before) based on the parameters received

get_parameter_dict()[source]

return a dictionary of parameters

validate()[source]

returns placeholders

class blumycelium.machine_elf.TaskReturnPlaceHolder(worker_elf, task_function, run_job_id)[source]

A placeholder for the return of a task. Works as a dict, where each value is a key

get_result_id(name)[source]

return the result_id for value in the placeholder

make_placeholder()[source]

make and validate the placeholder

class blumycelium.machine_elf.ValuePlaceholder(*args, **kwargs)[source]

A placeholder for a value returned by a task

set_origin(result_id)[source]

set the result id represented by the placeholder

Utils

Useful functions

Utils

blumycelium.utils.get_hash_key(to_hash: str, prefix=None, suffix=None)[source]

return a hash of to_hash that can serve as a key

blumycelium.utils.get_random_variable_name()[source]

returns a string that can serve as a python variable name

blumycelium.utils.gettime()[source]

return timestamp

blumycelium.utils.getuid()[source]

returns a random id that can serve as a key

blumycelium.utils.inpsect_none_if_exception_or_empty(obj_to_inspect, inspect_fct_name)[source]

runs an inspect function on obj_to_inspect and returns None if the result is empty or raises an exception

blumycelium.utils.legalize_key(key)[source]

returns a string that can serve as a valid key for database

Exceptions

Custom exceptions thrown by BLUMYCELIUM

Custom Exceptions

exception blumycelium.the_exceptions.EmptyParameter[source]

Represents an empty parameter
