The following example is a parallel debugging session started by mpirun.
% mpirun -dbg=idb -np 8 cpi
Intel(R) Debugger for Itanium®-based applications, Version XX
Reading symbolic information ...done
stopped at [void* MPIR_Breakpoint(void):101 0x40000000000b3060]
101 {
Process has exited
(idb)
[0:7] Intel(R) Debugger for Itanium®-based applications, Version XX
[0:7] ------------------
[0:7] object file name: /home/user/examples/cpi
[0:7] Reading symbolic information ... [0:7] done
%1 [0:7] Attached to process id [30596;30636] ....
[1:7] stopped at [ 0x20000000001ef962]
[0] stopped at [void* MPIR_Breakpoint(void):101 0x40000000000b3060]
[0] 101 {
(idb)
[0:7] stopped at [int main(int, char**):20 0x4000000000003520]
[0:7] 20 MPI_Init(&argc,&argv);
(idb)
[0:7] 16 double startwtime = 0.0, endwtime;
[0:7] 17 int namelen;
[0:7] 18 char processor_name[MPI_MAX_PROCESSOR_NAME];
[0:7] 19
[0:7] > 20 MPI_Init(&argc,&argv);
[0:7] 21 MPI_Comm_size(MPI_COMM_WORLD,&numprocs);
[0:7] 22 MPI_Comm_rank(MPI_COMM_WORLD,&myid);
[0:7] 23 MPI_Get_processor_name(processor_name,&namelen);
[0:7] 24
(idb) stop in f
(idb)
[0:7] [#1: stop in double f(double) ]
(idb) focus [0:3]
[0:3]> cont
[0:3]> Process 3 on nht6005.spt.intel.com
Process 2 on nht6005.spt.intel.com
Process 0 on nht6005.spt.intel.com
Process 1 on nht6005.spt.intel.com
[0:3] [1] stopped at [double f(double):7 0x4000000000003390]
[0:3] 7 {
[0:3]> where
[0:3]>
[0:3] >0 0x4000000000003390 in f(a=<no value>) "cpi.c":7
%2 [0:3] #1 0x4000000000003a30 in main(argc=0, argv=0x[0;80000fffffffba7c]) "cpi.c":51
[0:3] #2 0x20000000000906b0 in /lib/libc.so.6.1
[0:3] #3 0x4000000000003220 in _start(...) in /home/user/examples/cpi
[0:3]> focus [4:7]
[4:7]>
[4:7]> cont
[4:7]> Process 7 on nht6005.spt.intel.com
Process 4 on nht6005.spt.intel.com
Process 6 on nht6005.spt.intel.com
Process 5 on nht6005.spt.intel.com
[4:7] [1] stopped at [double f(double):7 0x4000000000003390]
[4:7] 7 {
[4:7]> where
[4:7]>
[4:7] >0 0x4000000000003390 in f(a=<no value>) "cpi.c":7
%3 [4:7] #1 0x4000000000003a30 in main(argc=0, argv=0x[0;80000fffffffba7c]) "cpi.c":51
[4:7] #2 0x20000000000906b0 in /lib/libc.so.6.1
[4:7] #3 0x4000000000003220 in _start(...) in /home/user/examples/cpi
[4:7]> focus [*]
[0:7]>
[0:7]> next
[0:7]>
[0:7] stopped at [double f(double):8 0x40000000000033b1]
[0:7] 8 return (4.0 / (1.0 + a*a));
[0:7]> where
[0:7]>
%4 [0:7] >0 0x40000000000033b1 in f(a=[0.0050000000000000001;0.074999999999999997]) "cpi.c":8
%5 [0:7] #1 0x4000000000003a30 in main(argc=1, argv=0x[80000fffffffb768;6000000000014a50]) "cpi.c":51
[0:7] #2 0x20000000000906b0 in /lib/libc.so.6.1
[0:7] #3 0x4000000000003220 in _start(...) in /home/user/examples/cpi
[0:7]> show aggregated message
%1 [0:7] Attached to process id [30596;30636] ....
%2 [0:3] #1 0x4000000000003a30 in main(argc=0, argv=0x[0;80000fffffffba7c]) "cpi.c":51
%3 [4:7] #1 0x4000000000003a30 in main(argc=0, argv=0x[0;80000fffffffba7c]) "cpi.c":51
%4 [0:7] >0 0x40000000000033b1 in f(a=[0.0050000000000000001;0.074999999999999997]) "cpi.c":8
%5 [0:7] #1 0x4000000000003a30 in main(argc=1, argv=0x[80000fffffffb768;6000000000014a50]) "cpi.c":51
[0:7]>
[0:7]> expand aggregated message 1
%1 [0:7] Attached to process id [30596;30636] ....
[3] Attached to process id 30612 ....
[2] Attached to process id 30606 ....
[0] Attached to process id 30596 ....
[1] Attached to process id 30600 ....
[4] Attached to process id 30618 ....
[5] Attached to process id 30624 ....
[7] Attached to process id 30636 ....
[6] Attached to process id 30630 ....
[0:7]> disable 1
[0:7]>
[0:7]> cont
[0:7]> pi is approximately 3.1416009869231249, Error is 0.0000083333333318
wall clock time = 69.300781
[0:7] Process has exited with status 0
[0:7]> quit
The following are explanatory notes from the previous example:
Component of Example | Meaning |
---|---|
-np 8 | This parallel session creates 8 processes. |
[0:7] | This is a message from processes 0 to 7. |
%1 | This aggregated message combines messages that differ only in part (in this case, the process IDs differ from process to process); 1 is the message ID. |
focus [0:3] | This focus command sets the current process set to include processes 0, 1, 2, and 3. |
[0:3]> | This prompt shows the current process set. |
show aggregated message | This show aggregated message command displays all the aggregated messages saved in the message list. |
expand aggregated message 1 | This expand aggregated message command expands the aggregated message with message id 1. |
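
The aggregation described in the notes above can be illustrated with a short sketch. This is not the debugger's implementation; it is a minimal Python model written under stated assumptions: the names `rank_label` and `aggregate` are hypothetical, only decimal integers are treated as the differing portions, and the ranks in each group are assumed to be contiguous. It collapses per-process messages that share the same surrounding text into one line, showing each differing integer as `[min;max]`, in the same way the `%1` message above shows `[30596;30636]`.

```python
import re
from collections import defaultdict

# Assumption: only decimal integers count as "differing portions".
NUMBER = re.compile(r"\d+")


def rank_label(ranks):
    """Format ranks as [n] or [lo:hi]; assumes the ranks are contiguous."""
    lo, hi = min(ranks), max(ranks)
    return f"[{lo}]" if lo == hi else f"[{lo}:{hi}]"


def aggregate(messages):
    """Collapse per-process messages that differ only in integer fields.

    messages maps a process rank to its message text. The result is a
    list of (label, text) pairs, one pair per group of similar messages.
    """
    groups = defaultdict(dict)
    for rank, text in messages.items():
        # The non-numeric pieces of the text identify the group.
        key = tuple(NUMBER.split(text))
        groups[key][rank] = [int(n) for n in NUMBER.findall(text)]

    result = []
    for by_rank in groups.values():
        ranks = sorted(by_rank)
        # One column per integer position, holding that field for every rank.
        columns = list(zip(*(by_rank[r] for r in ranks)))
        fields = iter(
            str(col[0]) if len(set(col)) == 1 else f"[{min(col)};{max(col)}]"
            for col in columns
        )
        # Rewrite a representative message, replacing each integer in order.
        text = NUMBER.sub(lambda _match: next(fields), messages[ranks[0]])
        result.append((rank_label(ranks), text))
    return result


if __name__ == "__main__":
    pids = [30596, 30600, 30606, 30612, 30618, 30624, 30630, 30636]
    per_process = {
        rank: f"Attached to process id {pid} ...." for rank, pid in enumerate(pids)
    }
    for label, text in aggregate(per_process):
        print(label, text)
```

Fed the eight per-process messages listed by `expand aggregated message 1`, the sketch produces a single line for the process set `[0:7]` with the process ID shown as a range. Expanding an aggregated message corresponds to keeping the original per-process texts alongside the collapsed one.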
The following example demonstrates how to start a parallel debugging session with prun.
% idb -parallel `which prun` -n 16 -N 8 ./cpi
Intel(R) Debugger for Itanium®-based applications, Version 7.0,
Build 20021118
Reading symbolic information ...done
stopped at [void _rms_breakpoint(void):2150 0x20000000001913e0]
Source file not found or not readable, tried...
./loader.cc
/usr/bin/loader.cc
(Cannot find source file loader.cc)
stopped at [void _rms_breakpoint(void):2150 0x20000000001913e0]
Source file not found or not readable, tried...
./loader.cc
/usr/bin/loader.cc
(Cannot find source file loader.cc)
Process has exited
(idb)
[0:15] Intel(R) Debugger for Itanium®-based applications,
Version 7.0, Build 20021118
[0:15] ------------------
[0:15] object file name: cpi
[0:15] Reading symbolic information ... [0:15] done
[0:15] 13 int done = 0, n, myid, numprocs, i;
(idb) where
(idb)
[0:15] >0 0x4000000000000c60 in _start(...) in cpi
(idb) stop in main
(idb)
[0:15] [#1: stop in int main(int, char**) ]
(idb) cont
(idb)
[0:15] [1] stopped at [int main(int, char**):13 0x4000000000000f52]
[0:15] 13 int done = 0, n, myid, numprocs, i;
In this example, you can ignore the initial messages about being unable to find the file loader.cc; they appear because this file usually does not exist on a production system.