Executing MountainSort Commands from Windows (Using Python 3)

Posted by Geoff, Published: 4 years, 12 months ago (Updated: 4 years, 9 months ago)

Now that we know how to run commands in WSL using Python 3, let's get into executing MountainSort commands. Again, I have provided a Jupyter Notebook and all the relevant code in a GitHub repository [1].

Synthesizing Data

First we will synthesize a timeseries containing 20 waveforms, sampled at 30 kHz, using a few pipelines within MountainSort. I will follow the same process that the MountainSort developers describe in their bash example [2].

  1. We will begin by importing the relevant modules:
    import os
    
    # import the wsl_terminal code
    from wsl_terminal import BashConfigure
    from wsl_utils import get_windows_filename, get_ubuntu_path
  2. Below I have included a function, run_pipeline_js, that will aid in running MountainSort commands. You can use this function with any processor that can be executed via the ml-run-process command within MountainSort (I call it pipeline in the function, which is probably confusing; in MountainLab terms it is really a processor). You can list the available processors with the ml-list-processors command in the WSL terminal. The run_pipeline_js function requires the name of the processor (the pipeline variable), as well as inputs, outputs, and parameters dictionaries.
    def run_pipeline_js(pipeline, inputs, outputs, parameters=None, verbose=False):
        """
        This is a function that will run a pipeline given the inputs, outputs, and parameters
        """
        
        command = 'ml-run-process %s ' % (pipeline)
        
        command += '--inputs '
    
        for key, value in inputs.items():
            if type(value) != list:
                command += '%s:%s ' % (str(key), str(value))
            else:
                for x in value:
                    command += '%s:%s ' % (str(key), str(x))
    
        command += '--outputs '
    
        for key, value in outputs.items():
            command += '%s:%s ' % (str(key), str(value))
    
        if parameters is not None:
            command += '--parameters '
    
            for key, value in parameters.items():
                command += '%s:%s ' % (str(key), str(value))
    
        if verbose:
            print(command)
    
        cfg = BashConfigure()
    
        cfg.win32_wsl_open_bash(None, [
            command,
            'sleep 2'  # you can comment this out, I like to visualize the results for a brief moment
    ], None)
  3. Create the following directory: E:\dataset. We are going to save the waveforms that we create into this directory; if the directory does not exist, MountainSort will produce an error.
  4. Now we will create random spike waveforms that we will later visualize. This step depends on the ephys.synthesize_random_waveforms processor. You can learn more about the required inputs, outputs, and parameters for this processor by running ml-spec ephys.synthesize_random_waveforms in the WSL terminal. For now we will create a 4-channel signal with 20 waveforms (the default value for this processor). The waveform data will be saved to the waveforms_out filename set in the code below, and a file describing the geometry of the data will be saved to the geometry_out filename (also set below). Feel free to play around with the parameters; however, I used the following:
    # Create some random spike waveforms
    
    '''
    This will create random spike waveforms with 4 channels (M), 20 waveforms by default,
    upsamplefac*T samples each (T=500 by default), and an average peak amplitude of 100.
    '''
    
    pipeline = 'ephys.synthesize_random_waveforms'  # the pipeline that creates the random waveforms
    
    # we don't need inputs for this pipeline
    inputs = {}
    
    waveforms_out = get_ubuntu_path(r'E:\dataset\waveforms_true.mda')
    geometry_out = get_ubuntu_path(r'E:\dataset\geom.csv')
    
    # determining where to save our random spike waveform files
    outputs = {'waveforms_out': waveforms_out,
               'geometry_out': geometry_out}
    
    # adding an upsampling factor of 13, 4 channels, and an average peak amplitude of 100
    parameters = {'upsamplefac': 13,
                  'M': 4,
                  'average_peak_amplitude': 100}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
  5. Next we will generate randomized times at which to place the spike waveforms generated in the previous step. We will use the ephys.synthesize_random_firings processor this time (example provided below). The output containing the spike times will be saved as the firings_out filename.
    # Create random firing event timings
    
    '''
    This will create an output .mda containing a 3xL matrix, where L is the # of events.
    The second row contains time stamps; the third row contains integer unit labels.
    '''
    
    pipeline = 'ephys.synthesize_random_firings'  # the pipeline that creates the random firings
    
    # we don't need inputs for this pipeline
    inputs = {}
    
    firings_out = get_ubuntu_path(r'E:\dataset\firings.mda')
    
    # determining where to save the firing events file
    outputs = {'firings_out': firings_out}
    
    # adding 600 seconds as the duration
    parameters = {'duration': 600}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
  6. Next we will create the timeseries data containing the 20 spike waveforms from this example (generated in step 4, placed at the spike times from step 5). This uses the ephys.synthesize_timeseries processor. It requires the firings_out filename that we created in step 5 and the waveforms_out filename from step 4, and will produce a timeseries_out file containing the raw synthesized data for us to test the various packages with.
    # Make a synthetic ephys dataset
    
    pipeline = 'ephys.synthesize_timeseries'  # the pipeline that creates the ephys data
    
    # the inputs are the waveforms and firings files generated in steps 4 and 5
    inputs = {'firings': firings_out,
              'waveforms': waveforms_out}
    
    timeseries_out = get_ubuntu_path(r'E:\dataset\timeseries.mda')
    
    # determining where to save the timeseries file
    outputs = {'timeseries_out': timeseries_out}
    
    # using the same duration and upsampling factor as before, plus a noise level of 10
    parameters = {'duration': 600,  # same duration we used for the firings
                  'waveform_upsamplefac': 13,  # the same upsamplefac we used when creating the spike waveforms
                  'noise_level': 10}
    
    run_pipeline_js(pipeline, inputs, outputs, parameters, verbose=True)​
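Each of the steps above ultimately just assembles one ml-run-process string and hands it to a WSL terminal. If you ever need to debug a call, it can help to preview that string without opening a terminal. The sketch below mirrors run_pipeline_js's string-building logic; the /mnt/e/... paths are my assumption of what get_ubuntu_path returns for files on the E: drive.

```python
def build_ml_command(pipeline, inputs, outputs, parameters=None):
    """Assemble the ml-run-process string exactly as run_pipeline_js does,
    but return it instead of launching a WSL terminal."""
    command = 'ml-run-process %s ' % pipeline

    command += '--inputs '
    for key, value in inputs.items():
        # an input may map to a list of files; emit one key:value pair per file
        values = value if isinstance(value, list) else [value]
        for x in values:
            command += '%s:%s ' % (key, x)

    command += '--outputs '
    for key, value in outputs.items():
        command += '%s:%s ' % (key, value)

    if parameters is not None:
        command += '--parameters '
        for key, value in parameters.items():
            command += '%s:%s ' % (key, value)

    return command

# preview step 4's call, assuming E:\dataset maps to /mnt/e/dataset under WSL
cmd = build_ml_command('ephys.synthesize_random_waveforms',
                       inputs={},
                       outputs={'waveforms_out': '/mnt/e/dataset/waveforms_true.mda',
                                'geometry_out': '/mnt/e/dataset/geom.csv'},
                       parameters={'upsamplefac': 13, 'M': 4,
                                   'average_peak_amplitude': 100})
print(cmd)
```

Printing the assembled string is a quick way to spot malformed key:value arguments before MountainSort ever sees them.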

Visualizing Data with MountainView

Now that we have generated the data, it would be nice to use MountainLab's visualization packages. Before attempting to visualize the data from WSL, ensure that your X server (Xming) is running. You can then visualize the data by running the following code in Python:

view_command = 'qt-mountainview --raw %s --samplerate 30000 --firings %s' % (timeseries_out, 
                                                                             firings_out)

cfg = BashConfigure()
cfg.win32_wsl_open_bash(None, [view_command,
                               'sleep 2'  # you can comment this out, I like to visualize the results for a brief moment
                              ], None)

The view_command string includes --raw, which specifies the filename of the raw data; --samplerate, which tells MountainView the rate at which the data was recorded (in Hz); and --firings, the filename containing the spike firing times. After the code has been run, MountainView should pop up showing our twenty spikes (pictured below).
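All of the file paths above were converted with get_ubuntu_path from the repository's wsl_utils module. I have not reproduced its actual source here, but under WSL's standard drive mounting (where E:\ is exposed as /mnt/e) a minimal stand-in might look like the following; to_wsl_path is a hypothetical name of my own, not the repository's function.

```python
import re

def to_wsl_path(windows_path):
    """Convert a Windows path such as E:\\dataset\\geom.csv into the
    /mnt/<drive> form that WSL uses to expose Windows drives."""
    match = re.match(r'^([A-Za-z]):[\\/](.*)$', windows_path)
    if match is None:
        raise ValueError('not an absolute Windows path: %s' % windows_path)
    drive, rest = match.groups()
    # drive letters are lowercased under /mnt, and backslashes become slashes
    return '/mnt/%s/%s' % (drive.lower(), rest.replace('\\', '/'))

print(to_wsl_path(r'E:\dataset\timeseries.mda'))  # /mnt/e/dataset/timeseries.mda
```

Recent WSL releases also ship a wslpath utility that performs this same conversion from inside the WSL terminal.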

References

  1. Relevant Jupyter Notebook: https://github.com/GeoffBarrett/MountainSortWindows/blob/master/notebooks/ExecutingMountainSortfromWindows.ipynb

  2. MS4 Bash Example: https://github.com/flatironinstitute/mountainsort_examples/blob/master/bash_examples/001_ms4_bash_example/synthesize_dataset.sh
