Vibha Patel, Krunal Gandhi, Darshak Bhatti


Among various algorithms for protein and nucleotide alignment, Needleman-Wunsch algorithm is widely accepted as it can divide the problem into sub-problems. We present two parallel approaches of the Needleman-Wunsch algorithm with the single kernel and multi-kernel invocation using skewing transformation which is used for traversing and calculation of dynamic programming matrix. We also compare these with traditional CPU sequential approach which resulted in six-fold performance improvement. Furthermore, we present same single kernel ideology on shared memory which resulted in two-fold performance improvement our non-shared memory approach.


Skewing Transformation; CUDA; Single Kernel; Multi-Kernel; Shared Memory; lock-free synchronization

Full Text:



S. B. Needleman and C. D. Wunsch, “A general method applicable to the search for similarities in the amino acid sequence of two proteins,” Journal of molecular biology, vol. 48, no. 3, pp. 443–453, 1970.

T. F. Smith and M. S. Waterman, “Identification of common molecular subsequences,” Journal of molecular biology, vol. 147, no. 1, pp. 195–197, 1981.

A. Layeb, S. Meshoul, and M. Batouche, “A hybrid method for effective multiple sequence alignment,” in . IEEE Symposium on Computers and Communications, 2009. ISCC 2009, pp. 970–975.

“NVIDIA CUDA Programming Guide, Version 4.2.” [Online]. Available: http://developer.download.nvidia.com/compute/DevZone/docs/html/C/doc/CUDA C Programming Guide.pdf

A. Chaudhary, D. Kagathara, and V. Patel, “A gpu based implementation of needleman-wunsch algorithm using skewing transformation,” in Eighth IEEE International Conference on Contemporary Computing (IC3), 2015, pp. 498–502.

S. F. Altschul, W. Gish, W. Miller, E. W. Myers, and D. J. Lipman, “Basic local alignment search tool,” Journal of molecular biology, vol. 215, no. 3, pp. 403–410, 1990.

C. Angerm ̈uller, A. Biegert, and J. Soding, “Discriminative modelling of context-specific amino acid substitution probabilities,” Bioinformatics, vol. 28, no. 24, pp. 3240–3247, 2012.

S. A. Manavski and G. Valle, “Cuda compatible gpu cards as efficient hardware accelerators for smith-waterman sequence alignment,” BMC bioinformatics, vol. 9, no. 2, p. 1, 2008.

L. Ligowski and W. Rudnicki, “An Efficient Implementation of Smith Waterman Algorithm on GPU Using CUDA, for Massively Parallel Scanning of Sequence Databases,” in Proceedings of the 2009 IEEE International Symposium on Parallel & Distributed Processing, ser. IPDPS ’09, 2009, pp. 1–8.

A. Khajeh-Saeed, S. Poole, and J. B. Perot, “Acceleration of the smith–waterman algorithm using single and multiple graphics processors,” Journal of Computational Physics, vol. 229, no. 11, pp. 4247–4258, 2010.

A. Khalafallah, H. F. Elbabb, O. Mahmoud, and A. Elshamy, “Optimizing smith-waterman algorithm on graphics processing unit,” in 2nd IEEE International Conference on Computer Technology and Development (ICCTD), 2010, pp. 650–654.

J. Li, S. Ranka, and S. Sahni, “Pairwise sequence alignment for very long sequences on gpus,” International Journal of Bioinformatics Research and Applications, vol. 10, no. 4-5, pp. 345–368, 2014.

Z. Cui, Y. Liang, K. Rupnow, and D. Chen, “An accurate gpu performance model for effective control flow divergence optimization,” in IEEE 26th International Symposium on Parallel & Distributed Processing Symposium (IPDPS), 2012. IEEE, 2012, pp. 83–94.

W. Liu, B. Schmidt, G. Voss, and W. M ̈uller-Wittig, “Gpu-clustalw: using graphics hardware to accelerate multiple sequence alignment,” in High Performance Computing - HiPC 2006, Springer, 2006, pp. 363–374.

C. S. Khaladkar, “An efficient implementation of needleman wunsch algorithm on graphical processing units,” Honours Programme of the School of Computer Science and Software Engineering, The University of Western Australia , 2009.

R. Farivar, H. Kharbanda, S. Venkataraman, and R. H. Campbell, “An algorithm for fast edit distance computation on gpus,” in Innovative Parallel Computing (InPar), 2012, IEEE, 2012, pp. 1–9.

T. Siriwardena and D. Ranasinghe, “Accelerating global sequence alignment using cuda compatible multi-core gpu,” in The 5th IEEE International Conference on Information and Automation for Sustainability (ICIAFs), 2010, pp. 201–206.

DOI: https://doi.org/10.26483/ijarcs.v8i9.4953


  • There are currently no refbacks.

Copyright (c) 2017 International Journal of Advanced Research in Computer Science