Parallel Vector Sum with MPI

The textbook is An Introduction to Parallel Programming (《并行程序设计导论》), and the code follows that textbook.

I used MPI_Scatter and MPI_Gather (MPI_Allgather works the same way; just drop the argument that names the destination (root) process).
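To make the data layout concrete, here is a minimal serial sketch of the block distribution that MPI_Scatter and MPI_Gather produce when every process gets local_n elements: rank r holds global elements r*local_n through (r+1)*local_n - 1. The helper names Copy_block_out and Copy_block_in are hypothetical stand-ins of mine, not MPI calls, and the code runs in a single process.

```c
#include <assert.h>
#include <string.h>

/* Serial stand-in for what rank `rank` receives from MPI_Scatter:
   the block a[rank*local_n .. (rank+1)*local_n - 1]. (Hypothetical helper.) */
void Copy_block_out(const double a[], int local_n, int rank, double local_a[]) {
    assert(local_n >= 0 && rank >= 0);
    memcpy(local_a, a + (size_t)rank * local_n, (size_t)local_n * sizeof(double));
}

/* Serial stand-in for where MPI_Gather places rank `rank`'s block
   in the root's receive buffer. (Hypothetical helper.) */
void Copy_block_in(const double local_b[], int local_n, int rank, double b[]) {
    assert(local_n >= 0 && rank >= 0);
    memcpy(b + (size_t)rank * local_n, local_b, (size_t)local_n * sizeof(double));
}
```

With n = 12 and four processes, local_n is 3, so rank 2 holds elements 6, 7 and 8.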

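With four processes, the parallel version ran slower than the serial one. Other people I asked got the same result. My guess is that communication overhead is the cause: with n = 10000 each process has very little work, and the timed region also includes the MPI_Gather inside Print_vector. If anyone knows for sure and is willing to explain, please leave a comment qwq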
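A rough count, assuming equal blocks, shows why communication can dominate here: each process performs only local_n additions, while the root has to receive the blocks of all other processes in MPI_Gather. The function names below are hypothetical, and the sketch only counts work and bytes; it does not predict actual run times.

```c
#include <assert.h>
#include <stddef.h>

/* Number of floating-point additions each process performs. */
int Adds_per_process(int n, int comm_sz) {
    assert(comm_sz > 0 && n % comm_sz == 0);
    return n / comm_sz;
}

/* Bytes the root receives from the other processes in the gather:
   every block except its own. */
size_t Gather_bytes_to_root(int n, int comm_sz) {
    assert(comm_sz > 0 && n % comm_sz == 0);
    int local_n = n / comm_sz;
    return (size_t)(n - local_n) * sizeof(double);
}
```

For n = 10000 and four processes this gives 2500 additions per process against 7500 doubles arriving at the root, so the arithmetic is small compared with the data that has to move.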

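The commented-out parts of the code were used during testing to read the vectors from standard input; n was set to 12 at that point. Since the assignment requires n = 10000, I simply set the i-th element to i, and print only the element at index 1126 together with the run time we need.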
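The program computes local_n as n/comm_sz, which only covers the whole vector when n is divisible by the number of processes (true for 12 and 10000 with four processes). A minimal sketch of one common way to handle the general case is to give the first n % comm_sz ranks one extra element; the resulting counts and displacements have the shape that MPI_Scatterv and MPI_Gatherv take. Block_counts is a hypothetical helper name, not part of the textbook code.

```c
#include <assert.h>

/* Fill counts[r] and displs[r] for a block distribution of n elements
   over comm_sz ranks; the first n % comm_sz ranks get one extra element. */
void Block_counts(int n, int comm_sz, int counts[], int displs[]) {
    assert(n >= 0 && comm_sz > 0);
    int base = n / comm_sz;
    int extra = n % comm_sz;
    int offset = 0;
    for (int r = 0; r < comm_sz; r++) {
        counts[r] = base + (r < extra ? 1 : 0);
        displs[r] = offset;   /* index of this rank's first element */
        offset += counts[r];
    }
}
```

With n = 10000 and three processes, for instance, the blocks have 3334, 3333 and 3333 elements.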

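One pitfall I ran into while coding: I had always assumed that char * and char [] were the same, so I passed char * variables directly to Read_vector and Print_vector and kept getting warnings (in fact, passing a "string" literal directly works fine qwq). As variable declarations the two do differ: an array owns its storage, while a pointer only refers to storage that must exist somewhere else.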
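For reference, a small sketch of the distinction, independent of MPI: as a function parameter, `char title[]` is adjusted by the compiler to `char *title`, so the same function accepts an array, a pointer, or a string literal. Declaring the parameter as `const char` is my suggestion rather than what the code below does; it documents that the function never writes to the string. Title_length is a hypothetical function.

```c
#include <assert.h>
#include <string.h>

/* As a parameter, `const char title[]` means exactly `const char *title`. */
size_t Title_length(const char title[]) {
    assert(title != NULL);
    return strlen(title);
}
```

Title_length("A"), Title_length(A) with `char A[10] = "A";`, and Title_length(p) with `char *p = A;` all compile and return 1. The difference shows up in the declarations: `char A[10] = "A";` is a writable 10-byte array, whereas `char *p = "A";` points at a literal that must not be modified.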

#include <stdio.h>
#include <stdlib.h>  /* malloc, free */
#include <string.h>
#include <mpi.h>

/* Rank 0 fills the full vector and scatters blocks of local_n elements to every process. */
void Read_vector(double local_a[], int local_n, int n, char vec_name[], int my_rank, MPI_Comm comm) {
    double* a = NULL;
    int i;

    if (my_rank == 0) {
        a = malloc(n * sizeof(double));
        // printf("Enter the vector %s\n", vec_name);
        for (i = 0; i < n; i++)
            a[i] = i;  // scanf("%lf", &a[i]);
        MPI_Scatter(a, local_n, MPI_DOUBLE, local_a, local_n, MPI_DOUBLE, 0, comm);
        free(a);
    } else {
        // The send buffer is only significant on the root, so a stays NULL here.
        MPI_Scatter(a, local_n, MPI_DOUBLE, local_a, local_n, MPI_DOUBLE, 0, comm);
    }
}

/* Gather every process's block on rank 0 and print the element at index 1126. */
void Print_vector(double local_b[], int local_n, int n, char title[], int my_rank, MPI_Comm comm) {
    double* b = NULL;
    // int i;

    if (my_rank == 0) {
        b = malloc(n * sizeof(double));
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
        printf("%s\n", title);
        printf("%f ", b[1126]);
        /* for (i = 0; i < n; i++)
               printf("%f ", b[i]); */
        printf("\n");
        free(b);
    } else {
        // The receive buffer is only significant on the root, so b stays NULL here.
        MPI_Gather(local_b, local_n, MPI_DOUBLE, b, local_n, MPI_DOUBLE, 0, comm);
    }
}


/* Each process adds its own block: local_z = local_x + local_y. */
void Parrel_vector_sum(double local_x[], double local_y[], double local_z[], int local_n) {
    int local_i;

    for (local_i = 0; local_i < local_n; local_i++)
        local_z[local_i] = local_x[local_i] + local_y[local_i];
}

int main(void) {
    int comm_sz;
    int my_rank;
    int n = 10000;
    double *local_a, *local_b, *local_c;
    char A[10] = "A";
    char B[10] = "B";
    char C[50] = "the 1126th element of 'A + B' is ";

    MPI_Init(NULL, NULL);
    MPI_Comm_size(MPI_COMM_WORLD, &comm_sz);
    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);

    // Assumes n is divisible by comm_sz (10000 / 4 = 2500).
    int local_n = n / comm_sz;
    local_a = malloc(local_n * sizeof(double));
    local_b = malloc(local_n * sizeof(double));
    local_c = malloc(local_n * sizeof(double));

    Read_vector(local_a, local_n, n, A, my_rank, MPI_COMM_WORLD);
    Read_vector(local_b, local_n, n, B, my_rank, MPI_COMM_WORLD);

    // The timed region covers the local sum plus the gather and printing in Print_vector.
    double start = MPI_Wtime();
    Parrel_vector_sum(local_a, local_b, local_c, local_n);
    Print_vector(local_c, local_n, n, C, my_rank, MPI_COMM_WORLD);
    double finish = MPI_Wtime();

    // Every process prints its own elapsed time.
    printf("the time needed is %e\n", finish - start);

    free(local_a);
    free(local_b);
    free(local_c);
    MPI_Finalize();
    return 0;
}

Running it gives the following:

If you spot any mistakes, corrections are welcome.