Executing MountainSort Commands from Windows (Using Python 3)

Posted by Geoff, Published: 4 years, 12 months ago (Updated: 4 years, 9 months ago)

Now that we know how to run commands in WSL using Python 3, let's get into executing MountainSort commands. Again, I have provided a Jupyter Notebook and all the relevant code in a GitHub repository [1].

Synthesizing Data

First we will synthesize a timeseries containing 20 waveforms, sampled at 30 kHz, using a few pipelines within MountainSort. I will follow the same process that the MountainSort developers describe in their bash example [2].

  1. We will begin by importing the relevant modules:
    import os
    
    # import the wsl_terminal code
    from wsl_terminal import BashConfigure
    from wsl_utils import get_windows_filename, get_ubuntu_path
  2. Below I have included a function, run_pipeline_js, that will aid in running MountainSort commands. You can use this function with any processor that can be executed via the ml-run-process command within MountainSort (I call it pipeline in the function, which is probably confusing; in MountainLab terms it is really a processor). You can list the available processors with the ml-list-processors command in the WSL terminal. The run_pipeline_js function requires the name of the processor (the pipeline variable), as well as inputs, outputs, and parameters dictionaries.
    def run_pipeline_js(pipeline, inputs, outputs, parameters=None, verbose=False):
        """
        This is a function that will run a pipeline given the inputs, outputs, and parameters
        """
        
        command = 'ml-run-process %s ' % (pipeline)
        
        command += '--inputs '
    
        for key, value in inputs.items():
            if type(value) != list:
                command += '%s:%s ' % (str(key), str(value))
            else:
                for x in value:
                    command += '%s:%s ' % (str(key), str(x))
    
        command += '--outputs '
    
        for key, value in outputs.items():
            command += '%s:%s ' % (str(key), str(value))
    
        if parameters is not None:
            command += '--parameters '
    
            for key, value in parameters.items():
                command += '%s:%s ' % (str(key), str(value))
    
        if verbose:
            print(command)
    
        cfg = BashConfigure()
    
        cfg.win32_wsl_open_bash(None, [
            command,
            'sleep 2'  # you can comment this out, I like to visualize the results for a brief moment
    ], None)
  3. Create the following directory: E:\dataset. We are going to save the waveforms that we create into this directory; if the directory does not exist, MountainSort will produce an error.
  4. Now we will create random spike waveforms that we will later visualize. This step depends on the ephys.synthesize_random_waveforms processor. You can learn more about the required inputs, outputs, and parameters for this processor by running ml-spec ephys.synthesize_random_waveforms in the WSL terminal. For now we will create a 4-channel signal with 20 waveforms (the default value for this processor). The waveform data will be saved to the waveforms_out filename set in the code below, and a file describing the geometry of the data will be saved to the geometry_out filename (also set below). Feel free to play around with the parameters; however, I used the following:
    # Create some random spike waveforms
    
    '''
    This will create random spike waveforms with 4 channels (M), 20 waveforms by default,
    upsamplefac*T samples each (T=500 by default), and an average peak amplitude of 100.
    '''
    
    pipeline = 'ephys.synthesize_random_waveforms'  # the pipeline that creates the random waveforms
    
    # we don't need inputs for this pipeline
    inputs = {}
    
    waveforms_out = get_ubuntu_path(r'E:\dataset\waveforms_true.mda')
    geometry_out = get_ubuntu_path(r'E:\dataset\geom.csv')
    
    # determining where to save our random spike waveform files
    outputs = {'waveforms_out': waveforms_out,
               'geometry_out': geometry_out}
    
    # adding an upsampling factor of 13, 4 channels, and an average peak amplitude of 100
    parameters = {'upsamplefac': 13,
                  'M': 4,
                  'average_peak_amplitude': 100}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
  5. Next we will generate randomized times at which to place the spike waveforms generated in the previous step. We will use the ephys.synthesize_random_firings processor this time (example provided below). The output containing the spike times will be saved as the firings_out filename.
    # Create random firing event timings
    
    '''
    This will create an output .mda containing a 3xL matrix, where L is the # of events.
    The second row contains time stamps; the third row contains integer unit labels.
    '''
    
    pipeline = 'ephys.synthesize_random_firings'  # the pipeline that creates the random firings
    
    # we don't need inputs for this pipeline
    inputs = {}
    
    firings_out = get_ubuntu_path(r'E:\dataset\firings.mda')
    
    # determining where to save the firing events file
    outputs = {'firings_out': firings_out}
    
    # adding 600 seconds as the duration
    parameters = {'duration': 600}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
  6. Next we will create the timeseries data containing the 20 spike waveforms from this example (generated in step 4, placed at the spike times from step 5). This uses the ephys.synthesize_timeseries processor. It requires the firings_out filename that we created in step 5 and the waveforms_out filename from step 4, and will produce a timeseries_out file containing the raw synthesized data for us to test the various packages with.
    # Make a synthetic ephys dataset
    
    pipeline = 'ephys.synthesize_timeseries'  # the pipeline that creates the ephys data
    
    # the inputs are the waveforms and firings files generated in steps 4 and 5
    inputs = {'firings': firings_out,
              'waveforms': waveforms_out}
    
    timeseries_out = get_ubuntu_path(r'E:\dataset\timeseries.mda')
    
    # determining where to save the timeseries file
    outputs = {'timeseries_out': timeseries_out}
    
    # using the same duration and upsampling factor as before, plus a noise level of 10
    parameters = {'duration': 600,  # same duration we used for the firings
                  'waveform_upsamplefac': 13,  # the same upsamplefac we used when creating the spike waveforms
                  'noise_level': 10}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
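Each of the steps above ultimately just assembles one ml-run-process string and hands it to a WSL terminal. If you ever need to debug a call, it can help to preview that string without opening a terminal. The sketch below mirrors run_pipeline_js's string-building logic; the /mnt/e/... paths are my assumption of what get_ubuntu_path returns for files on the E: drive.

```python
def build_ml_command(pipeline, inputs, outputs, parameters=None):
    """Assemble the ml-run-process string exactly as run_pipeline_js does,
    but return it instead of launching a WSL terminal."""
    command = 'ml-run-process %s ' % pipeline

    command += '--inputs '
    for key, value in inputs.items():
        # an input may map to a list of files; emit one key:value pair per file
        values = value if isinstance(value, list) else [value]
        for x in values:
            command += '%s:%s ' % (key, x)

    command += '--outputs '
    for key, value in outputs.items():
        command += '%s:%s ' % (key, value)

    if parameters is not None:
        command += '--parameters '
        for key, value in parameters.items():
            command += '%s:%s ' % (key, value)

    return command

# preview step 4's call, assuming E:\dataset maps to /mnt/e/dataset under WSL
cmd = build_ml_command('ephys.synthesize_random_waveforms',
                       inputs={},
                       outputs={'waveforms_out': '/mnt/e/dataset/waveforms_true.mda',
                                'geometry_out': '/mnt/e/dataset/geom.csv'},
                       parameters={'upsamplefac': 13, 'M': 4,
                                   'average_peak_amplitude': 100})
print(cmd)
```

Printing the assembled string is a quick way to spot malformed key:value arguments before MountainSort ever sees them.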

Visualizing Data with MountainView

Now that we have generated the data, it would be nice to use MountainLab's visualization packages. Before attempting to visualize the data from WSL, ensure that your X server (Xming) is running. You can then visualize the data by running the following code in Python:

view_command = 'qt-mountainview --raw %s --samplerate 30000 --firings %s' % (timeseries_out, 
                                                                             firings_out)

cfg = BashConfigure()
cfg.win32_wsl_open_bash(None, [view_command,
                               'sleep 2'  # you can comment this out, I like to visualize the results for a brief moment
                              ], None)

The view_command string includes --raw, which specifies the filename of the raw data; --samplerate, which tells MountainView the rate at which the data was recorded (in Hz); and --firings, the filename containing the spike firing times. After the code has been run, MountainView should pop up showing our twenty spikes (pictured below).
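All of the file paths above were converted with get_ubuntu_path from the repository's wsl_utils module. I have not reproduced its actual source here, but under WSL's standard drive mounting (where E:\ is exposed as /mnt/e) a minimal stand-in might look like the following; to_wsl_path is a hypothetical name of my own, not the repository's function.

```python
import re

def to_wsl_path(windows_path):
    """Convert a Windows path such as E:\\dataset\\geom.csv into the
    /mnt/<drive> form that WSL uses to expose Windows drives."""
    match = re.match(r'^([A-Za-z]):[\\/](.*)$', windows_path)
    if match is None:
        raise ValueError('not an absolute Windows path: %s' % windows_path)
    drive, rest = match.groups()
    # drive letters are lowercased under /mnt, and backslashes become slashes
    return '/mnt/%s/%s' % (drive.lower(), rest.replace('\\', '/'))

print(to_wsl_path(r'E:\dataset\timeseries.mda'))  # /mnt/e/dataset/timeseries.mda
```

Recent WSL releases also ship a wslpath utility that performs this same conversion from inside the WSL terminal.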

References

  1. Relevant Jupyter Notebook: https://github.com/GeoffBarrett/MountainSortWindows/blob/master/notebooks/ExecutingMountainSortfromWindows.ipynb

  2. MS4 Bash Example: https://github.com/flatironinstitute/mountainsort_examples/blob/master/bash_examples/001_ms4_bash_example/synthesize_dataset.sh
