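I have recently been learning OpenCL and wrote a program that computes the score matrix of the Smith-Waterman algorithm. It performs worse on the FPGA than on the CPU. I am a beginner and cannot tell what is wrong with what I wrote.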
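Here is the code (diag_score is a helper that is not shown here):

    __kernel void __attribute__((reqd_work_group_size(512, 512, 1)))
    krnl_sw(__global int* ref, __global int* alt, __global int* sw,
            __global int* btrack, const int overhangStrategy,
            const int match, const int mismatch,
            const int open, const int extend,
            const int ncol, const int nrow)
    {
        int col = get_global_id(0);
        int row = get_global_id(1);
        int k, up_score, left_score, up_left_score;

        // One pass per anti-diagonal; only work-items with col + row == k do any work.
        for (k = 2; k < ncol + nrow - 1; k++) {
            // Row 0 and column 0 are initialized beforehand, so skip them.
            if (col + row == k && row >= 1 && col >= 1 && row < nrow && col < ncol) {
                up_score      = sw[(row - 1) * ncol + col] + extend * (row - 1) + open;
                left_score    = sw[row * ncol + col - 1] + extend * (row - 1) + open;
                up_left_score = sw[(row - 1) * ncol + col - 1]
                              + diag_score(ref[col - 1], alt[row - 1], match, mismatch);
                sw[row * ncol + col] = max(up_left_score, max(up_score, left_score));
            }
            // Diagonal k reads diagonals k-1 and k-2, so wait for them to be written.
            // Note: a barrier only synchronizes work-items within one work-group.
            barrier(CLK_GLOBAL_MEM_FENCE);
        }
    }

Roughly, the computation initializes the first row and first column of the matrix, then computes the scores one anti-diagonal at a time, moving along the diagonal direction.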
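For comparison, the anti-diagonal order described above can be sketched on the host in plain C. This is only a minimal sketch and not the code from the post: `diag_score` here is a hypothetical stand-in (match if the two symbols are equal, mismatch otherwise), `fill_wavefront` and `max2` are names made up for illustration, and the gap penalty is simplified to a single flat `gap` value instead of the `open`/`extend` expression used in the kernel. Like the kernel, it does not clamp scores at zero.

```c
#include <assert.h>

/* Hypothetical stand-in for the post's diag_score, which is not shown. */
static int diag_score(int a, int b, int match, int mismatch) {
    return a == b ? match : mismatch;
}

static int max2(int a, int b) { return a > b ? a : b; }

/* Fill sw (nrow x ncol, row-major) one anti-diagonal at a time.
 * Row 0 and column 0 must already be initialized by the caller.
 * Assumption: a flat gap penalty, unlike the kernel in the post. */
static void fill_wavefront(const int *ref, const int *alt, int *sw,
                           int nrow, int ncol,
                           int match, int mismatch, int gap) {
    assert(nrow >= 1 && ncol >= 1);
    for (int k = 2; k <= nrow + ncol - 2; k++) {
        /* Rows on diagonal k that lie inside the matrix interior. */
        int row_lo = max2(1, k - (ncol - 1));
        int row_hi = (k - 1 < nrow - 1) ? k - 1 : nrow - 1;
        for (int row = row_lo; row <= row_hi; row++) {
            int col = k - row;
            int up   = sw[(row - 1) * ncol + col] + gap;
            int left = sw[row * ncol + col - 1] + gap;
            int diag = sw[(row - 1) * ncol + col - 1]
                     + diag_score(ref[col - 1], alt[row - 1], match, mismatch);
            sw[row * ncol + col] = max2(diag, max2(up, left));
        }
    }
}
```

Cells on the same anti-diagonal depend only on the two previous diagonals, so the inner loop over `row` is the part that can run in parallel, while the outer loop over `k` has to stay sequential.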
Computing a 512*512 matrix 10 times takes less than 100 ms on the CPU, but tens of seconds on the FPGA, which is nearly 1000 times slower...