Abstract
Spike sorting is the process of retrieving the spike times of individual neurons that are present in an extracellular neural recording. Over the last decades, many spike sorting algorithms have been published. In an effort to guide a user towards a specific spike sorting algorithm, given a specific recording setting (i.e., brain region and recording device), we provide an open-source graphical tool for the generation of hybrid ground-truth data in Python. Hybrid ground-truth data is a data-driven modelling paradigm in which spikes from a single unit are moved to a different location on the recording probe, thereby generating a virtual unit whose spike times are known. The tool enables a user to efficiently generate hybrid ground-truth datasets and make informed decisions between spike sorting algorithms, fine-tune the algorithm parameters towards the used recording setting, or gain a deeper understanding of those algorithms.
Notes
 1. The tool is available at https://github.com/jwouters91/shybrid.
 2. Please consult the phy documentation (https://phy.readthedocs.io/en/latest/) for more information about the template-gui format.
Acknowledgment
The authors would like to thank Jonathan Dan and Jonathan Moeyersons for their time spent on thoroughly testing the software and for their valuable feedback.
This work was carried out at the ESAT Laboratory of KU Leuven, in the frame of KU Leuven Special Research Fund projects C14/16/057, and the Research Foundation Flanders (FWO) project FWO G0D7516N. This project has received funding from the European Research Council (ERC) under the European Union’s Horizon 2020 research and innovation programme (grant agreement No 802895). This research received funding from the Flemish Government under the “Onderzoeksprogramma Artificiële Intelligentie (AI) Vlaanderen” programme. The scientific responsibility is assumed by its authors.
Appendix
A Auto hybridization fitting factor bounds
The calculation of the fitting factor bounds during the automatic hybridization is based on robust statistics, which are commonly used for the detection and removal of outliers. The automatic bounds selection is rather conservative, i.e., it is likely that quite a few good spikes are excluded from the hybridization when using the automated approach.
Consider \({\mathscr{B}}^{(n)} = \left\{\log_{10} \beta_{s}^{(n)} \ \vert \ s \in \mathcal{S}^{(n)} \right\}\), the set of logarithms of the fitting factors (see “Hybrid Ground-Truth Model”) for a certain neuron \(n\). The logarithm is used so that fitting factors close to zero can also be removed based on simple statistics. Given \({\mathscr{B}}^{(n)}\), the first and third quartiles are calculated, denoted by \(Q_{1}\) and \(Q_{3}\), respectively. From those quartile values, the interquartile range (IQR) is calculated as \(\text{IQR} = Q_{3} - Q_{1}\). From those statistics the bounds are calculated:
and
where the IQR scaling factor (i.e. \(\frac {3}{4}\)) was determined experimentally.
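As an illustration, the bound selection can be sketched in Python. This is a minimal sketch and not the tool's implementation: it assumes that the bounds take the common fence form \(Q_{1} - \frac{3}{4}\,\text{IQR}\) and \(Q_{3} + \frac{3}{4}\,\text{IQR}\) in the \(\log_{10}\) domain. The function names, the quartile interpolation method and the example values are hypothetical.

```python
import math
import statistics


def fitting_factor_bounds(fitting_factors, iqr_scale=0.75):
    """Return (lower, upper) fitting-factor bounds from IQR fences in the log10 domain.

    Assumed form (not taken from the source): Q1 - iqr_scale * IQR and
    Q3 + iqr_scale * IQR, mapped back to the linear domain.
    """
    # Fitting factors must be strictly positive for the logarithm to exist.
    logs = [math.log10(b) for b in fitting_factors if b > 0]
    # With n=4, statistics.quantiles returns the three quartile cut points.
    q1, _median, q3 = statistics.quantiles(logs, n=4, method="inclusive")
    iqr = q3 - q1
    lower = 10 ** (q1 - iqr_scale * iqr)
    upper = 10 ** (q3 + iqr_scale * iqr)
    return lower, upper


def select_spikes(fitting_factors, iqr_scale=0.75):
    """Indices of the spikes whose fitting factor lies within the bounds."""
    lower, upper = fitting_factor_bounds(fitting_factors, iqr_scale)
    return [i for i, b in enumerate(fitting_factors) if lower <= b <= upper]


# Hypothetical fitting factors: six well-fitting spikes and two outliers.
factors = [0.9, 1.0, 1.1, 1.0, 0.95, 1.05, 0.001, 50.0]
kept = select_spikes(factors)
```

In this example the near-zero factor and the very large factor both fall outside the fences, which is the reason for working in the logarithmic domain.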
B Auto hybridization random unit relocation
During the automatic hybridization, a random unit relocation is calculated for every neuron. For this relocation, only a shift in the y-direction is considered. The random shift is determined by drawing a y-position on the probe grid model (see “Hybrid Ground-Truth Model”) from a discrete uniform distribution. This random y-position is the position to which the channel with the maximal deflection in the spike template is shifted. In this way, we prevent the complete template from being shifted off the probe. The actual shift is then calculated as the random y-position minus the y-position of the channel with maximal deflection in the original template. A minimum shift of two channels is enforced to make sure that the reinserted unit is sufficiently separable from the original unit.
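The relocation procedure can be sketched as follows. This is a minimal sketch under stated assumptions: the probe grid is reduced to a number of rows along the y-direction, the minimum shift is enforced by restricting the candidate rows (the text does not specify how it is enforced), and the names `random_y_shift`, `n_rows` and `peak_row` are hypothetical.

```python
import random


def random_y_shift(n_rows, peak_row, min_shift=2, rng=random):
    """Draw a random shift (in grid rows) for the template's peak channel.

    n_rows: number of y-positions on the (hypothetical) probe grid model.
    peak_row: y-index of the channel with maximal deflection in the original template.
    """
    # Only rows at least `min_shift` rows away from the original peak row qualify.
    candidates = [row for row in range(n_rows) if abs(row - peak_row) >= min_shift]
    if not candidates:
        raise ValueError("probe has too few rows for the requested minimum shift")
    target_row = rng.choice(candidates)  # discrete uniform draw over the candidates
    # Shift = random y-position minus the y-position of the original peak channel.
    return target_row - peak_row


shift = random_y_shift(n_rows=16, peak_row=3, rng=random.Random(0))
```

Because the target row is drawn for the peak channel itself, the peak of the shifted template always stays on the probe, even if other parts of the template are shifted off.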
C External template import
When an external template is imported, no spike times are available, nor is the scaling known. The spike occurrences are modelled as a Poisson point process. The interspike interval \(\Delta_{\text{ISI}}\) is then modelled by drawing from an exponential distribution:
where \(\lambda\) represents the desired spike rate. Every interspike interval sample \(\hat{\Delta}_{\text{ISI}}\) is enforced to last at minimum the user-defined refractory period \(\Delta_{\min}\):
The simulated spike times are obtained by calculating the cumulative sum over the interspike interval samples. Those spike times are then discretized by multiplying them by the recording sampling frequency and rounding each product to its nearest integer. This gives rise to a set of discrete spike times \(\mathcal{S}^{\text{ext}} = \left\{ k_{\text{sim}} \right\}\).
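The spike-time simulation can be sketched in Python as follows. This is a minimal sketch, not the tool's code: it assumes that the refractory period is enforced by clipping each interval sample to \(\Delta_{\min}\), and it introduces a hypothetical `duration_s` parameter to decide when to stop drawing intervals. All function and parameter names are illustrative.

```python
import random


def simulate_spike_times(rate_hz, duration_s, refractory_s, fs_hz, rng=random):
    """Simulate discrete spike times (sample indices) for an imported template."""
    spike_samples = []
    t = 0.0
    while True:
        isi = rng.expovariate(rate_hz)  # exponential interval with rate lambda
        isi = max(isi, refractory_s)    # assumed enforcement: clip to the refractory period
        t += isi                        # running cumulative sum of the intervals
        if t >= duration_s:
            break
        # Discretize: multiply by the sampling frequency, round to the nearest integer.
        spike_samples.append(round(t * fs_hz))
    return spike_samples


# Hypothetical settings: 20 Hz unit, 10 s recording, 2 ms refractory period, 30 kHz sampling.
spikes = simulate_spike_times(20.0, 10.0, 0.002, 30000, rng=random.Random(1))
```

Clipping short intervals is only one way to respect the refractory period. Redrawing them, or adding the refractory period to every sample, would give slightly different interval statistics.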
The template scaling is derived from the user-defined desired peak signal-to-noise ratio (PSNR \(= 10\log_{10}\frac{P_{\text{peak}}}{P_{\text{noise}}}\)). The scaling factor is calculated as follows:
with \(P_{\text{peak}}\) equal to the square of the peak absolute value over all channels of the external template and \(P_{\text{noise}}\) equal to a robust estimate (based on the median absolute deviation) of the noise variance of the channel on which the template reaches its peak absolute value.
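The scaling can be illustrated with a short sketch. It assumes that the scaling factor is chosen such that the scaled template attains the requested PSNR, i.e. it solves \(\text{PSNR} = 10\log_{10}\left(\text{scale}^{2}\, P_{\text{peak}} / P_{\text{noise}}\right)\) for the scale, and that the noise standard deviation is estimated as the median absolute deviation divided by 0.6745, which holds for Gaussian noise. The function names and the list-of-lists data layout are hypothetical.

```python
import math
import statistics


def mad_noise_variance(signal):
    """Robust noise variance estimate from the median absolute deviation (MAD)."""
    med = statistics.median(signal)
    mad = statistics.median(abs(x - med) for x in signal)
    sigma = mad / 0.6745  # MAD-to-standard-deviation factor for Gaussian noise
    return sigma ** 2


def template_scaling(template, recording, psnr_db):
    """Scaling factor that gives `template` the requested PSNR (in dB).

    template, recording: per-channel lists of samples with the same channel order.
    """
    # Channel index and value of the template's peak absolute deflection.
    peak_channel, peak_abs = max(
        ((c, max(abs(v) for v in chan)) for c, chan in enumerate(template)),
        key=lambda item: item[1],
    )
    p_peak = peak_abs ** 2
    p_noise = mad_noise_variance(recording[peak_channel])
    # Solve PSNR = 10*log10(scale^2 * P_peak / P_noise) for scale.
    return math.sqrt(10 ** (psnr_db / 10) * p_noise / p_peak)


# Hypothetical two-channel template and recording.
template = [[0.0, 1.0, -2.0, 1.0, 0.0], [0.0, 0.5, -1.0, 0.5, 0.0]]
recording = [[-1.0, 1.0] * 50, [-0.5, 0.5] * 50]
scale = template_scaling(template, recording, psnr_db=10.0)
```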
The hybrid data generated from an external template can then be described as follows:
where \(\mathbf {t}_{c,\left (x,y\right )}^{\text {ext}}\) denotes the imported external template at channel c. Note that the template temporal window is derived from the external template directly. The external template is assumed to match the sampling frequency of the recording data that is being hybridized.
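A simple additive insertion illustrates the idea. This sketch assumes that the scaled external template is added to the recording at every simulated spike time, with the template window centred on the spike sample, and that template samples falling outside the recording are ignored. The alignment, the edge handling and all names are assumptions made for illustration.

```python
def insert_template(recording, template, spike_samples, scale):
    """Add a scaled template to a copy of `recording` at every spike time.

    recording, template: per-channel lists of samples with the same channel order.
    The template window is assumed to be centred on the spike sample.
    """
    hybrid = [list(chan) for chan in recording]  # leave the input untouched
    half = len(template[0]) // 2
    for k in spike_samples:
        for c, waveform in enumerate(template):
            for i, value in enumerate(waveform):
                idx = k - half + i
                if 0 <= idx < len(hybrid[c]):  # skip samples that fall off the recording
                    hybrid[c][idx] += scale * value
    return hybrid


# Hypothetical two-channel example with a three-sample template.
silent = [[0.0] * 10, [0.0] * 10]
hybrid = insert_template(silent, [[1.0, 2.0, 1.0], [0.0, 1.0, 0.0]], [1, 8], scale=2.0)
```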
D Automatic merging
The merging framework for a specific ground-truth spike train consists of the following steps:

1) Compute the correspondence between the ground-truth spike train and all automatically recovered spike clusters in terms of precision and recall. More information on those performance metrics can be found in “Performance metrics calculation”.
2) Sort all clusters by descending precision, such that the cluster with the highest fraction of true spike times is at the top of the list.
3) Merge the ordered clusters together in a top-down fashion, i.e., starting from the cluster with the highest precision, as long as the merge operation increases the \(F_{1}\)-score of the new cluster that contains all previously merged clusters.
Initially, merging clusters with a high precision will increase the recall (sensitivity) at the cost of only a very small drop in precision. Such a merge will likely increase the \(F_{1}\)-score. At a certain point, the clusters will start to contain significant numbers of false positives, which notably decrease the precision of the merged cluster and thereby the \(F_{1}\)-score. The proposed approach tries to find the combination of clusters with maximal \(F_{1}\)-score without explicitly considering all possible combinations, thereby avoiding a combinatorial explosion.
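The three steps above can be sketched in Python as follows. This is a minimal sketch under simplifying assumptions: spike trains are represented as sets of sample indices and a recovered spike only counts as a true positive on an exact match, whereas a practical comparison would allow a small timing tolerance. The function names and the example data are hypothetical.

```python
def precision_recall_f1(ground_truth, cluster):
    """Precision, recall and F1 of `cluster` with respect to `ground_truth` (sets of samples)."""
    true_pos = len(ground_truth & cluster)
    precision = true_pos / len(cluster) if cluster else 0.0
    recall = true_pos / len(ground_truth) if ground_truth else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def auto_merge(ground_truth, clusters):
    """Greedily merge clusters, in order of descending precision, while F1 improves.

    clusters: dict mapping a cluster id to a set of spike samples.
    Returns (ids of the merged clusters, F1-score of the merged cluster).
    """
    # Steps 1 and 2: rank the clusters by their precision against the ground truth.
    ranked = sorted(
        clusters,
        key=lambda cid: precision_recall_f1(ground_truth, clusters[cid])[0],
        reverse=True,
    )
    # Step 3: merge top-down as long as the F1-score of the merged cluster increases.
    merged_ids, merged, best_f1 = [], set(), 0.0
    for cid in ranked:
        candidate = merged | clusters[cid]
        f1 = precision_recall_f1(ground_truth, candidate)[2]
        if f1 <= best_f1:
            break  # stop at the first merge that does not increase F1
        merged_ids.append(cid)
        merged, best_f1 = candidate, f1
    return merged_ids, best_f1


# Hypothetical example: ten ground-truth spikes spread over four recovered clusters.
truth = set(range(0, 100, 10))
found = {
    "a": {0, 10, 20, 30, 40},
    "b": {50, 60, 70, 5},
    "c": {80, 1, 2, 3, 4, 6, 7, 8, 9, 11},
    "d": {1000, 1001},
}
merged_ids, merged_f1 = auto_merge(truth, found)
```

In this example, clusters "a" and "b" are merged. Adding cluster "c" would recover one more true spike but nine false positives, which lowers the \(F_{1}\)-score, so the merging stops there.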
Cite this article
Wouters, J., Kloosterman, F. & Bertrand, A. SHYBRID: A Graphical Tool for Generating Hybrid Ground-Truth Spiking Data for Evaluating Spike Sorting Performance. Neuroinform 19, 141–158 (2021). https://doi.org/10.1007/s12021-020-09474-8
Keywords
 Spike sorting
 Validation
 Hybrid ground truth
 GUI